Digitizing language resources for low-resource languages requires a formal organization of how audio data are collected, described, quality-controlled, and published. In this context, the study aims to develop a data model for the audio module of the Tundra Nenets online dictionary (the Nenets-Russian and Russian-Nenets online dictionary), together with a decision support framework for selecting lexical units for recording, post-processing audio materials, and integrating them into the dictionary system. The empirical base includes corpus and dictionary resources, educational and thematic materials, previously created audio resources, and the results of fieldwork conducted in Naryan-Mar in December 2025. The methodological framework combines systems analysis, formalization of information flows, multicriteria prioritization of lexical items, and a reproducible workflow for processing audio materials. The study identifies the core entities of the audio module data model, the quality control framework, and the decision support framework for expanding the dictionary’s audio coverage. A list of 542 units was profiled by unit type, part of speech, theme, and microtheme; the paper also characterizes the composition of informants, the structure of audio materials, file naming conventions, and quality control statuses. The proposed solution can be applied to the development of digital dictionaries and speech resources for low-resource languages.
Shnyakov Pavel Yevgenyevich
Email: p.shnyakov@narfu.ru
Northern (Arctic) Federal University
Arkhangelsk, Russian Federation
Kokanova Elena Sergeevna
Candidate of Philological Sciences, Docent
Email: e.s.kokanova@narfu.ru
Northern (Arctic) Federal University
Arkhangelsk, Russian Federation