ISSN : 2583-2646

Edge-Cloud Collaboration for Low-Latency Gen AI Applications

ESP Journal of Engineering & Technology Advancements
© 2024 by ESP JETA
Volume 4 Issue 4
Year of Publication : 2024
Authors : Rahul Vadisetty
DOI : 10.56472/25832646/JETA-V4I4P124

Citation:

Rahul Vadisetty, 2024. "Edge-Cloud Collaboration for Low-Latency Gen AI Applications", ESP Journal of Engineering & Technology Advancements, 4(4): 181-189.

Abstract:

The rapid development of generative AI (Gen AI) and large language models (LLMs) opens new application possibilities in information retrieval (IR) and question answering (QA). Because LLMs are computation-intensive, keeping latency low enough to guarantee real-time performance is difficult. This work addresses that latency challenge for emerging Gen AI applications by studying edge-cloud collaboration. It presents a framework that deploys LLMs efficiently by drawing on the strengths of both edge and cloud to balance load, reduce latency, and improve responsiveness for IR and QA applications. An experimental analysis of the proposed system captures its impact on latency and accuracy and establishes its applicability to real-world use. The work adds to the growing body of research on edge-cloud collaboration and offers insight into how LLMs can be optimized for low-latency AI solutions.
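
The idea of splitting requests between edge and cloud to reduce latency can be illustrated with a minimal sketch. This is not the paper's framework: the `Tier` fields, the latency estimate (round-trip time plus decoding time), and all numeric values below are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """A place where a model can run (hypothetical description)."""
    name: str
    tokens_per_second: float  # assumed decoding throughput of the model on this tier
    network_rtt_s: float      # assumed round-trip time to reach the tier, in seconds
    max_prompt_tokens: int    # assumed prompt size the tier's model can handle


def estimated_latency(tier: Tier, output_tokens: int) -> float:
    """Rough latency estimate: network round trip plus time to decode the answer."""
    return tier.network_rtt_s + output_tokens / tier.tokens_per_second


def route(prompt_tokens: int, output_tokens: int, tiers: list[Tier]) -> Tier:
    """Pick the feasible tier with the lowest estimated latency."""
    feasible = [t for t in tiers if prompt_tokens <= t.max_prompt_tokens]
    if not feasible:
        raise ValueError("no tier can serve this request")
    return min(feasible, key=lambda t: estimated_latency(t, output_tokens))


# Illustrative numbers only: a small, nearby edge model and a large, distant cloud model.
edge = Tier("edge", tokens_per_second=20.0, network_rtt_s=0.005, max_prompt_tokens=512)
cloud = Tier("cloud", tokens_per_second=200.0, network_rtt_s=0.120, max_prompt_tokens=8192)

short_answer = route(prompt_tokens=50, output_tokens=2, tiers=[edge, cloud])
long_answer = route(prompt_tokens=50, output_tokens=100, tiers=[edge, cloud])
long_prompt = route(prompt_tokens=1000, output_tokens=2, tiers=[edge, cloud])
```

Under these assumed numbers, a very short answer stays on the edge because the network round trip dominates, while a long answer or an oversized prompt goes to the cloud. A real system would also need to account for accuracy differences between the models and for current load on each tier.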

Keywords:

Edge-Cloud Collaboration, Low-Latency, Generative AI, Information Retrieval (IR), Question Answering (QA), Latency Reduction, Response Time, Accuracy Metrics, Resource Utilization, Task Distribution, Network Traffic, Data Transfer Optimization, Scalability, 5G Networks, Real-Time Processing, System Performance, Dynamic Task Offloading, Computational Efficiency, Healthcare Applications, Cloud Computing, Edge Devices, System Reliability.