| ESP Journal of Engineering & Technology Advancements |
| © 2025 by ESP JETA |
| Volume 5 Issue 3 |
| Year of Publication : 2025 |
| Authors : Suresh Pairu Subramanyam |
| DOI : 10.56472/25832646/JETA-V5I3P111 |
Suresh Pairu Subramanyam, 2025. "AI-Enhanced Data Pipeline Optimization in Azure: A Scalable Framework for Enterprise Analytics Using Azure Data Factory", ESP Journal of Engineering & Technology Advancements 5(3): 76-83.
Traditional cloud data pipelines, especially those built on Azure Data Factory (ADF), are not dynamically or adequately optimized for throughput, reliability, and cost under varying enterprise analytics workloads. These limitations typically manifest as poor performance and elevated operating expenditure. This paper proposes an intelligent framework integrated into Azure Data Factory that uses predictive analytics and reinforcement learning for adaptive resource allocation, together with anomaly detection and self-healing capabilities. Experimental results on large-scale datasets from the finance and healthcare industries show a clear improvement in throughput, measurable reductions in latency and error rate, and lower total operational expenses compared with standard configurations. The architecture provides a scalable, intelligent solution for the enterprise analytics layer, enabling more efficient and reliable data processing and supporting data-driven decision making in critical sectors such as finance and healthcare. These results represent a step forward in the automated management of data pipelines.
Azure Data Factory (ADF), AI-Driven Optimization, Data Pipelines, Reinforcement Learning, Anomaly Detection, Predictive Analytics, Cloud Computing, Cost Efficiency, Enterprise Analytics, Self-Healing Systems.