GENRE CLASSIFICATION OF LITERATURE BASED ON METRICS USING LARGE LANGUAGE MODELS

Authors

DOI:

https://doi.org/10.32689/maup.philol.2025.1.11

Keywords:

genre classification, large language models, text parameterization, genre ontologies, natural language processing, automated text analysis

Abstract

The paper explores the use of large language models (LLMs) for literary genre classification by employing metric-based parameterization and genre ontologies.

The study examines the theoretical foundations of genre classification, including traditional approaches to defining genres and modern algorithmic methods that leverage large language models. Particular attention is paid to the selection of metrics for text parameterization, including the level of formality, depth of technical analysis, methodological approach, target audience, application domain, type of research data, presence of empirical results, and methods of information visualization. The paper proposes a multi-level classification system that allows for a more precise hierarchical structuring of genre features.

The aim of this study is to examine the main metrics for parameterizing literary genre classification and to conduct a practical experiment on the classification of scientific papers in the field of "artificial intelligence." The scientific novelty of the article lies in the development and application of a comprehensive parameterization of literary genres based on clearly defined metrics, enabling the use of large language models for automated genre classification.

As part of the study, a practical experiment was conducted on genre classification using 10 academic papers in the field of artificial intelligence. The analysis was performed using GPT-4o and associated machine learning algorithms. The results confirmed the effectiveness of text parameterization based on predefined metrics and their application for automated classification. It was found that large language models exhibit high accuracy in identifying key textual characteristics but struggle with recognizing hybrid genres and providing explainable classification decisions.

The primary challenges of automated genre classification include blurred genre boundaries, the influence of training data on classification outcomes, the need to enhance the explainability of classification decisions, and the adaptation of models to the specifics of different genres. The paper suggests directions for further research, such as the integration of genre ontologies, improvements in text parameterization, and the development of algorithms capable of handling multi-level genre structures.

Thus, this study confirms the potential of large language models for automated literary genre classification based on text metrics. However, further refinement of classification algorithms and approaches to text parameterization is required to achieve higher accuracy and reliability.
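
The metric-based parameterization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the snake_case metric keys, the `GenreProfile` class, and the helper functions are hypothetical names introduced here, and the prompt wording is an assumption. The sketch only shows how the eight metrics listed in the abstract might be requested from a model as structured output and then checked for completeness. No model is called; the reply is hand-written.

```python
import json
from dataclasses import dataclass

# The eight metrics named in the abstract; the snake_case keys are assumptions.
METRICS = (
    "formality_level",
    "technical_depth",
    "methodological_approach",
    "target_audience",
    "application_domain",
    "research_data_type",
    "empirical_results_present",
    "visualization_methods",
)


@dataclass
class GenreProfile:
    """Metric values for one text, keyed by metric name (hypothetical type)."""
    values: dict

    def missing(self):
        """Return the metrics that the reply did not report."""
        return [m for m in METRICS if m not in self.values]


def build_prompt(text: str) -> str:
    """Build a prompt asking for one JSON object with a value per metric."""
    keys = ", ".join(METRICS)
    return (
        f"Describe the text below using these metrics: {keys}.\n"
        "Reply with a single JSON object whose keys are exactly those names.\n\n"
        f"TEXT:\n{text}"
    )


def parse_profile(reply: str) -> GenreProfile:
    """Parse a JSON reply, keeping only the known metrics."""
    data = json.loads(reply)
    return GenreProfile({k: v for k, v in data.items() if k in METRICS})


if __name__ == "__main__":
    # Hand-written stand-in for a model reply; not real experimental data.
    reply = '{"formality_level": "high", "target_audience": "researchers"}'
    profile = parse_profile(reply)
    print(profile.missing())
```

Keeping the metric list in one place means the prompt and the completeness check cannot drift apart; a multi-level scheme could, under the same assumptions, map a completed profile onto nodes of a genre ontology.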

Published

2025-03-27

How to Cite

ПАСІЧНИК, В., & ЯРОМИЧ, М. (2025). GENRE CLASSIFICATION OF LITERATURE BASED ON METRICS USING LARGE LANGUAGE MODELS. Scientific Works of Interregional Academy of Personnel Management. Philology, (1 (15)), 60-68. https://doi.org/10.32689/maup.philol.2025.1.11