Scientific article
doi 10.28995/2073-0101-2025-4-1187-1201
For citation
Belov, Ilya I. (2025). Application of large language models in document value assessment. 2018-2025, Herald of an Archivist, no. 4, pp. 1187-1201, doi 10.28995/2073-0101-2025-4-1187-1201
Belov, Ilya I., All-Russian Research Institute of Document Science and Archival Affairs, Moscow, Russia
Application of Large Language Models in Document Value Assessment. 2018-2025.
Abstract
This article examines the application of large language models (LLMs) to document value assessment, with the aim of improving current practice and identifying the benefits organizations can gain from using this tool to support decision-making. The novelty of the research lies in a practical experiment in adapting and applying a large language model to document value assessment, which makes it possible both to formulate theoretical propositions and to identify empirically the advantages, challenges, and risks of such technologies. The aim of the study is to analyze the potential of large language models in document value assessment and to present practical experience of fine-tuning and testing a large language model in this area. The study draws on the work of domestic and international scholars and specialists in documentation, archival science, and information science, which underlines the interdisciplinary nature of the problem. The article describes the experiment, in which a pre-trained large language model was fine-tuned for highly specialized document value assessment tasks and then tested on real data from an organization's documentary collection. It sets out the rationale for selecting a specific pre-trained large language model and the method chosen for tuning its parameters so that it can effectively perform one of the key tasks of document value assessment: selecting documents for permanent storage or for destruction. The preparation and labeling of training data that reflect document value assessment criteria are analyzed in detail. The author emphasizes not only the obvious benefits of large language models, such as faster processing of large bodies of documents and therefore less time spent on value assessment, but also the potential risks: information security issues that arise when such technologies are applied to an organization's documentary collection, technical limitations linked to equipment costs and to the technology's still imperfect grasp of context, and the lack of the necessary competencies and skills among document management specialists. The results demonstrate the clear practical potential of large language models in document value assessment, but also highlight the need for further research to develop a methodology for the secure integration of such technologies into archival practice.
Keywords
Information technology, document value assessment, large language model, artificial intelligence, document management.
About authors
Belov Ilya I., All-Russian Research Institute of Documentation and Archival Affairs, Department of Documentation, Research Fellow, Russian State Humanitarian University, Historical and Archival Institute, Department of Automated Document Management Systems, Assistant, Moscow, Russia, +7-913-608-74-28
ORCID 0009-0000-7922-2770
The article was received in the editorial office on 10.01.2025, recommended for publication on 20.09.2025.












