DEVELOPMENT OF THE RAG METHOD FOR PROVIDING INFORMATION SUPPORT TO USERS
DOI: https://doi.org/10.32689/maup.it.2025.4.20

Keywords: Retrieval-Augmented Generation (RAG), large language models (LLM), information retrieval, cost-effectiveness, Ragas, embedding models

Abstract
The article addresses the problem of providing accurate and relevant information support to users working with large volumes of text data. Traditional search engines and standalone large language models (LLMs) have significant limitations, such as insufficient semantic understanding and the generation of hallucinations. To overcome these limitations, a hybrid approach is proposed: Retrieval-Augmented Generation (RAG), which combines the precision of information retrieval with the generative capabilities of LLMs.

Purpose of the work. Development and experimental testing of a complete RAG method for automated information support. The key goal is to ensure cost-effectiveness and high resource efficiency, making the method practical for small organizations, university departments, or startups with limited computing budgets.

Methodology. A modular system architecture is proposed, implemented with open libraries: LangChain for pipeline orchestration, ChromaDB as a local vector store, HuggingFace for access to embedding models, and the Groq API for high-speed, cost-effective LLM queries. A multi-stage evaluation was conducted: measuring retrieval accuracy (Top-k) for different embedding models on a Ukrainian-language dataset, assessing end-to-end generation quality with an LLM evaluator, and optimizing parameters (prompt templates, chunking strategies) using the Ragas framework.

Scientific novelty. A systematic comparison of the effectiveness of embedding models for semantic search over a Ukrainian-language text corpus was conducted. The optimal balance between the cost and quality of API-accessible generative models (LLMs) was identified experimentally. A cost-effective RAG pipeline, optimized with Ragas metrics to achieve high answer accuracy at minimal cost, was proposed and validated.

Conclusions. The study confirmed the viability of the developed method. The intfloat/multilingual-e5-large-instruct embedding model demonstrated the best retrieval accuracy, reaching 100% at Top-7. The meta-llama/llama-4-scout-17b-16e-instruct generative model showed the best price-quality ratio (88.3% correct answers). Optimizing the prompts and the chunking strategy (chunk size 1000, overlap 150) yielded the highest accuracy. Prospects for further work include testing specialized Ukrainian-language models and deploying the system in real chatbots.
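The reported chunking parameters (size 1000, overlap 150) can be illustrated with a minimal, dependency-free sketch. The article's pipeline uses LangChain's text splitters; the function below is a simplified stand-in under that assumption, not the authors' implementation:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    """Split text into fixed-size fragments with a sliding overlap.

    A simplified stand-in for a LangChain-style character splitter:
    consecutive chunks share `overlap` characters, so a sentence cut at
    a chunk boundary still appears whole in at least one fragment.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # with the article's settings: 1000 - 150 = 850
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Larger overlap raises retrieval recall at the cost of a bigger vector store; the 1000/150 setting is the combination the study found most accurate on its Ukrainian-language corpus.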
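The Top-k accuracy used to compare embedding models (e.g., 100% at Top-7) is the share of test questions whose gold fragment appears among the k highest-ranked retrieval results. A sketch of that metric (the function name and data layout are illustrative assumptions, not taken from the paper):

```python
def top_k_accuracy(ranked_results: list[list[str]], gold_ids: list[str], k: int) -> float:
    """Fraction of queries whose gold fragment id occurs in the top-k ranked ids.

    ranked_results[i] holds the fragment ids returned for query i,
    ordered from most to least similar; gold_ids[i] is the id of the
    fragment that actually answers query i.
    """
    if len(ranked_results) != len(gold_ids):
        raise ValueError("one gold id per query is required")
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_results, gold_ids))
    return hits / len(gold_ids)
```

Evaluating this metric at several values of k shows how quickly each embedding model surfaces the correct fragment; in the study, intfloat/multilingual-e5-large-instruct reached 1.0 already at k = 7.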






