ESP Journal of Engineering & Technology Advancements
© 2021 by ESP JETA
Volume 1 Issue 2
Year of Publication : 2021
Authors : Madan Mohan Tito Ayyalasomayajula, Sailaja Ayyalasomayajula |
DOI : 10.56472/25832646/ESP-V1I2P108
Madan Mohan Tito Ayyalasomayajula, Sailaja Ayyalasomayajula, 2021. "Proactive Scaling Strategies for Cost-Efficient Hyperparameter Optimization in Cloud-Based Machine Learning Models: A Comprehensive Review" ESP Journal of Engineering & Technology Advancements 1(2): 42-56.
Hyperparameter tuning is critical to machine learning model development because it significantly affects model performance. However, the process can be time-consuming and resource-intensive, especially for complex deep-learning models and large datasets. This paper reviews proactive scaling strategies for efficient and cost-effective hyperparameter configuration of machine learning models on cloud infrastructure. By using machine learning algorithms to analyze workload patterns and predict future requirements, cloud infrastructure can automatically allocate resources and adjust hyperparameters accordingly. Two approaches are examined: proactively scaling the cluster size before demand arrives, and scaling models in parallel to adapt to changing cloud conditions. Both are shown to deliver substantial cost savings and improved performance compared to reactive scaling methods. The paper also discusses the integration of a cloud service called "Scavenger", which identifies the job configuration parameters with the best statistical and parallel efficiency and returns a cost-optimized cluster configuration, further enhancing the benefits of proactive scaling. Overall, this research underscores the performance gains that proactive scaling strategies can achieve, thereby boosting the effectiveness of machine learning modeling in the cloud.
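The contrast between reactive and proactive cluster sizing described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or the Scavenger service: the function names, the per-node capacity, and the workload figures are all hypothetical, and a simple linear-trend forecast stands in for the machine learning predictors that the paper surveys.

```python
import math


def forecast_next(history, window=3):
    """One-step-ahead linear-trend forecast from the last `window` observations.

    Assumption: a linear trend is only a stand-in for a learned workload predictor.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    if len(recent) < 2:
        return float(recent[-1])
    # Average change per interval across the recent window.
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return recent[-1] + slope


def nodes_needed(load, per_node_capacity):
    """Smallest node count that covers `load`, never fewer than one node."""
    return max(1, math.ceil(load / per_node_capacity))


def reactive_nodes(history, per_node_capacity):
    """Reactive sizing: provision for the load that was just observed."""
    return nodes_needed(history[-1], per_node_capacity)


def proactive_nodes(history, per_node_capacity, window=3):
    """Proactive sizing: provision for the load predicted for the next interval."""
    return nodes_needed(forecast_next(history, window), per_node_capacity)


# Hypothetical workload: concurrent tuning trials requested per interval.
trials_per_interval = [8, 12, 16, 20]
capacity = 4  # hypothetical number of trials one node can run at once

print(reactive_nodes(trials_per_interval, capacity))   # sized for the last interval
print(proactive_nodes(trials_per_interval, capacity))  # sized for the forecast
```

With a rising workload, the proactive policy requests the extra node before the demand arrives, whereas the reactive policy would add it only after trials have already queued.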
Keywords : Artificial Intelligence, Auto-Scaling Techniques, Big Data, Bayesian Optimization, Cloud Computing, Cloud Infrastructure, Deep Learning, Genetic Algorithms, Grid Search, Hyperparameter Tuning, Hyperparameter Optimization, Pro-Active Scaling Strategies, Random Search, Resource Allocation.