AUTOMATED VERIFICATION OF STATEMENTS USING THE RAG MECHANISM AND SYMBOL CLASSIFICATION
DOI: https://doi.org/10.32689/maup.it.2026.1.1
Keywords: fact-checking, artificial intelligence, fake news, RAG, multimodal model, transformers, claim verification, disinformation
Abstract
The object of the study is automatic fact verification in a digital environment saturated with disinformation. The paper analyzes modern approaches to fake news detection, including transformer architectures, neurosemantic models, and graph models. It also identifies the limitations of existing methods, in particular their widespread reliance on static features and their poor generalization in a constantly changing flow of information. The author proposes a multimodal model architecture that combines style classification, AI-generated text detection, and a fact-checking module supported by retrieval of relevant evidence through the RAG mechanism. Experiments on a test set of 1660 examples show that the model achieves high recall (84.6 %) while maintaining an acceptable balance across the other metrics (accuracy 78.6 %, precision 74.4 %, F1 80.8 %). These results indicate that multi-task learning is sufficiently effective in fact-checking systems. The model detects fake news from various sources at the cost of a certain number of false positives; trading lower precision for higher recall is justified because the system is designed to reduce the chance of missing fake news. The effectiveness of the model is explained by the combination of several independent signals (style, origin, factuality) and a flexible signal integration scheme. In addition, the RAG mechanism adds a level of interpretability by linking the results to external sources. The proposed model is suitable for real-world monitoring of the information space, in particular for countering information threats, and can be used on online platforms with large volumes of unstructured messages. The approach can be extended with multimedia analysis and adapted to another specific language environment.
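The recall-versus-precision trade-off described in the abstract can be made concrete with a small sketch. The confusion counts below are hypothetical: they are chosen only so that a 1660-item test set roughly reproduces the reported accuracy, precision, and recall, and they are not taken from the paper. The names `ConfusionCounts`, `f_beta`, and `EXAMPLE` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix with 'fake' treated as the positive class."""
    tp: int  # fake items correctly flagged as fake
    fp: int  # genuine items wrongly flagged as fake (false alarms)
    fn: int  # fake items that were missed
    tn: int  # genuine items correctly passed as genuine

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(c: ConfusionCounts) -> float:
    """Share of all items that were classified correctly."""
    return (c.tp + c.tn) / c.total


def precision(c: ConfusionCounts) -> float:
    """Share of items flagged as fake that really are fake."""
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Share of fake items that the system caught."""
    return c.tp / (c.tp + c.fn)


def f_beta(c: ConfusionCounts, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the usual F1; beta > 1 weights recall more heavily.
    """
    p, r = precision(c), recall(c)
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# ASSUMPTION: illustrative counts only, picked to sum to 1660 and to land
# close to the accuracy, precision, and recall quoted in the abstract.
EXAMPLE = ConfusionCounts(tp=675, fp=232, fn=123, tn=630)

if __name__ == "__main__":
    print(f"accuracy  = {accuracy(EXAMPLE):.3f}")
    print(f"precision = {precision(EXAMPLE):.3f}")
    print(f"recall    = {recall(EXAMPLE):.3f}")
    print(f"F1        = {f_beta(EXAMPLE):.3f}")
    print(f"F2        = {f_beta(EXAMPLE, beta=2.0):.3f}")
```

With these illustrative counts the harmonic mean of precision and recall comes out near 79 %, slightly below the reported F1 of 80.8 %; the abstract does not say how its F1 was averaged, so the counts should not be read as the paper's confusion matrix. Setting `beta` above 1 weights recall more heavily, which matches the stated priority of not missing fake news.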