AUTOMATED CLAIM VERIFICATION USING THE RAG MECHANISM AND FEATURE CLASSIFICATION

Vitalii Dadyverin; Oleg Bisikalo

doi:10.32689/maup.it.2026.1.1

Authors

Vitalii Dadyverin, Vinnytsia National Technical University, https://orcid.org/0000-0001-5121-2263
Oleg Bisikalo, Vinnytsia National Technical University, https://orcid.org/0000-0002-7607-1943

Keywords:

fact-checking, artificial intelligence, fake news, RAG, multimodal model, transformers, claim verification, disinformation

Abstract

The object of the study is automatic fact verification in a digital environment saturated with disinformation. The paper analyzes modern approaches to fake news detection, including transformer architectures and neurosemantic and graph models. It also identifies the limitations of existing methods, in particular their widespread reliance on static features and their poor generalization in a constantly changing flow of information. The authors propose their own multimodal model architecture that combines style classification, AI-generated text detection, and a fact-checking module supported by retrieval of relevant evidence through the RAG mechanism. Experiments on a test set of 1660 examples showed that the model achieves high Recall (84.6 %) while maintaining an acceptable balance of the other metrics (Accuracy: 78.6 %, Precision: 74.4 %, F1: 80.8 %). These results indicate that multi-task learning is sufficiently effective in fact-checking systems. The model detects fake news from various sources at the cost of a certain number of false positives; the trade-off between high Recall and lower Precision is justified because the system is designed to reduce the chance of missing fake news. The model's effectiveness is explained by the combination of several independent features (style, origin, factuality) and a flexible signal integration system. In addition, the RAG mechanism adds a level of interpretability by linking the results to external sources. The proposed model is suitable for real-world monitoring of the information space, in particular for countering information threats, and can be used on online platforms with large volumes of unstructured messages. The approach can be extended with multimedia analysis and adapted to another specific language environment.
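
The pipeline summarized in the abstract (evidence retrieval followed by integration of style, origin, and factuality signals) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy three-dimensional vectors, the fusion weights, the bias, and the decision threshold are all hypothetical, and a plain logistic combination stands in for the paper's signal integration system, which the abstract does not specify.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_a * norm_b)

def retrieve_evidence(claim_vec, corpus, k=2):
    """Return the k (score, passage) pairs most similar to the claim, best first."""
    scored = [(cosine_similarity(claim_vec, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def fuse_signals(style, ai_generated, unsupported,
                 weights=(1.0, 1.0, 2.0), bias=-2.0):
    """Combine three scores in [0, 1] into a fake-news probability.

    Assumption: weights and bias are illustrative constants; in a trained
    system they would be learned.
    """
    z = (bias + weights[0] * style + weights[1] * ai_generated
         + weights[2] * unsupported)
    return 1.0 / (1.0 + math.exp(-z))

def classify(prob_fake, threshold=0.4):
    """Label a message; a threshold below 0.5 trades Precision for Recall."""
    return "fake" if prob_fake >= threshold else "real"

# Toy corpus: (passage, embedding) pairs with made-up vectors.
corpus = [
    ("passage A", [1.0, 0.0, 0.0]),
    ("passage B", [0.0, 1.0, 0.0]),
    ("passage C", [1.0, 1.0, 0.0]),
]
evidence = retrieve_evidence([1.0, 0.0, 0.0], corpus, k=2)
label = classify(fuse_signals(style=0.9, ai_generated=0.8, unsupported=0.7))
```

Lowering the hypothetical `threshold` flags more messages as fake, which raises Recall and lowers Precision. This is the kind of trade-off the abstract describes as acceptable for a system meant to avoid missing fake news. The retrieved passages are the external sources a result can be traced back to.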
