ESP Journal of Engineering & Technology Advancements
© 2025 by ESP JETA
Volume 5, Issue 1
Year of Publication: 2025
Authors: Nikunj Agarwal
Nikunj Agarwal, 2025. "Leveraging Large Language Models for Intelligent Data Source Selection and Query Generation in Multi-Database Systems", ESP Journal of Engineering & Technology Advancements 5(1): 6-11.
Abstract: The exponential growth of data across industries has produced complex multi-database systems that demand intelligent, efficient methods for data source selection and query generation. This research explores how Large Language Models (LLMs) can address these challenges. Leveraging their natural language understanding and contextual reasoning capabilities, LLMs can dynamically select relevant data sources and generate optimized queries tailored to specific user inquiries and operational contexts. We propose a framework that integrates LLMs to streamline data retrieval and enhance decision-making across multiple domains. The approach demonstrates its applicability in building efficient chatbots and intelligent systems that deliver real-time, accurate, and context-aware responses. The system also enforces domain-specific rules and regulations while maintaining performance across diverse, distributed data environments. The findings highlight the versatility and efficiency of LLM-powered solutions for data-driven workflows across industries.
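The two stages the abstract describes — selecting a data source, then generating a query against it — can be sketched in a few lines. The following is a minimal illustration, not the paper's actual implementation: the source names, schema descriptions, and the keyword-overlap routing heuristic are hypothetical stand-ins (a production system would delegate both routing and query generation to an LLM).

```python
# Hypothetical registry of data sources and their schema descriptions.
DATA_SOURCES = {
    "orders_db": "SQL database of customer orders: order_id, customer_id, total, order_date",
    "support_docs": "Document store of product manuals and troubleshooting guides",
    "metrics_store": "Time-series store of service latency and error-rate metrics",
}

def select_source(question: str) -> str:
    """Pick the source whose description shares the most terms with the question.
    Stand-in heuristic; the paper's framework would ask an LLM to route instead."""
    q_terms = set(question.lower().split())
    def overlap(desc: str) -> int:
        return len(q_terms & set(desc.lower().replace(",", " ").split()))
    return max(DATA_SOURCES, key=lambda name: overlap(DATA_SOURCES[name]))

def build_query_prompt(question: str, source: str) -> str:
    """Assemble the prompt an LLM would receive to generate the actual query."""
    return (
        f"Schema: {DATA_SOURCES[source]}\n"
        f"Write a query for the source '{source}' answering: {question}\n"
        "Return only the query, with no explanation."
    )

question = "What is the total of customer orders placed last month?"
source = select_source(question)
print(source)  # routes to orders_db for this question
print(build_query_prompt(question, source))
```

Keeping routing and prompt construction as separate steps mirrors the framework's separation of source selection from query generation, which also makes it possible to enforce per-source rules before any query is executed.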
Keywords: Contextual Query Generation, Machine Learning for Databases, Multi-Database Query Optimization, Prompt Engineering, Large Language Models, Natural Language Processing.