moitvivt

Моделирование, оптимизация и информационные технологии

Modeling, Optimization and Information Technology

2310-6018

Издательство

10.26102/2310-6018/2024.47.4.026

1748

Метод подготовки данных по научным публикациям для интеллектуальной поддержки принятия решений при оценке экспертности рецензентов

Method of preparing data on scientific publications for intelligent decision-making support in evaluating expertise of peer reviewers

0000-0003-3063-105X

Латыпова

Виктория Александровна

Latypova

Viktoriya Aleksandrovna

vikalaty@yandex.ru

Уфимский университет науки и технологий

Ufa University of Science and Technology

01 01 2026

1 1

2026

This work is licensed under a Creative Commons Attribution 4.0 International License

Одним из основных факторов при назначении рецензента является его экспертность по теме рукописи (наличие соответствующих публикаций). Поддержка принятия решений, базирующаяся на применении интеллектуального анализа данных наукометрических баз по научным публикациям, ускоряет и делает менее трудоемким процесс оценки экспертности рецензентов. Однако критическим пунктом в данном случае является корректность данных по научным публикациям, подвергающихся интеллектуальному анализу. В настоящий момент исследователи активно занимаются вопросом определения корректности данных наукометрических баз и способам ее обеспечения, осуществляя различные процедуры очистки в рамках подготовки данных. Тем не менее, в существующих работах не учитывается специфика задачи, для решения которой собираются данные по научным публикациям. Для решения данной проблемы в статье предлагается метод подготовки данных по научным публикациям для интеллектуальной поддержки принятия решений при оценке экспертности рецензентов, учитывающий особенности, связанные с необходимостью определения семантической близости текста данных по публикациям. Метод успешно апробирован при подготовке данных по научным публикациям членов редколлегии журнала «Системная инженерия и информационные технологии» с привлечением содержимого их профилей в наукометрических базах «РИНЦ» и «Академия Google».

One of the main factors in assigning a peer reviewer is the reviewer's expertise on the manuscript topic (the existence of relevant publications). Decision-making support based on mining scientometric database data on scientific publications makes the evaluation of reviewer expertise faster and less labor-intensive. However, the critical point here is the correctness of the publication data being mined. Researchers are currently studying how to determine the correctness of scientometric database data and how to ensure it, carrying out various cleaning procedures as part of data preparation. Yet existing works do not take into account the specifics of the task for which the publication data are gathered. To address this problem, the paper proposes a method of preparing data on scientific publications for intelligent decision-making support in evaluating the expertise of peer reviewers; the method accounts for features arising from the need to determine the semantic similarity of publication text data. The method was successfully tested in preparing data on the scientific publications of the editorial board members of the journal “Systems Engineering and Information Technologies”, drawing on the content of their profiles in the scientometric databases “RISC” and “Google Scholar”.

подготовка данных, поддержка принятия решений, интеллектуальный анализ данных, рецензент, научная публикация

data preparation, decision-making support, data mining, peer reviewer, scientific publication

Исследование выполнено без спонсорской поддержки.

The study was performed without external funding.

References

Sharifyanov N., Latypova V. A Method of Filling Missing Values in Data using Data Mining. In: 2023 IX International Conference on Information Technology and Nanotechnology (ITNT), 17–21 April 2023, Samara, Russian Federation. IEEE; 2023. pp. 1–5. https://doi.org/10.1109/ITNT57377.2023.10139280

Okafor N.U., Delaney D.T. Missing Data Imputation on IoT Sensor Networks: Implications for on-Site Sensor Calibration. IEEE Sensors Journal. 2021;21(20):22833–22845. https://doi.org/10.1109/JSEN.2021.3105442

McCombe N., Liu S., Ding X., Prasad G., Bucholc M., Finn D.P. Practical Strategies for Extreme Missing Data Imputation in Dementia Diagnosis. IEEE Journal of Biomedical and Health Informatics. 2022;26(2):818–827. https://doi.org/10.1109/JBHI.2021.3098511

Шарифьянов Н.В., Латыпова В.А. Формирование данных в фиксациях моделей нефтегазовых скважин на основе применения интеллектуального метода заполнения пропущенных значений. Моделирование, оптимизация и информационные технологии. 2023;11(2). https://doi.org/10.26102/2310-6018/2023.41.2.022

Hunko M., Tkachov V., Liashenko O., Rabčan J. Application Architecture For Obtaining Data From Scientometric Databases. In: 2022 IEEE 3rd KhPI Week on Advanced Technology (KhPIWeek), 03–07 October 2022, Kharkiv, Ukraine. IEEE; 2022. pp. 1–4. https://doi.org/10.1109/KhPIWeek57572.2022.9916398

Wan H., Zhang Y., Zhang J., Tang J. AMiner: Search and Mining of Academic Social Networks. Data Intelligence. 2019;1(1):58–76. https://doi.org/10.1162/dint_a_00006

Sauvayre R. Types of Errors Hiding in Google Scholar Data. Journal of Medical Internet Research. 2022;24(5). https://doi.org/10.2196/28354

Van Eck N.J., Waltman L. Accuracy of citation data in Web of Science and Scopus. ArXiv. URL: https://doi.org/10.48550/arXiv.1906.07011 [Accessed 10th August 2024].

Selivanova I.V., Kosyakov D.V., Guskov A.E. The Impact of Errors in the Scopus Database on the Research Assessment. Scientific and Technical Information Processing. 2019;46(3):204–212. https://doi.org/10.3103/S0147688219030109

Zhang J., Tang J. Name disambiguation in AMiner. Science China Information Sciences. 2020;64(4). https://doi.org/10.1007/s11432-019-9884-y

Zhang Y., Zhang F., Yao P., Tang J. Name Disambiguation in AMiner: Clustering, Maintenance, and Human in the Loop. In: KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 19–23 August 2018, London, United Kingdom. New York: Association for Computing Machinery; 2018. pp. 1002–1011. https://doi.org/10.1145/3219819.3219859

Müller M.-C., Reitz F., Roy N. Data sets for author name disambiguation: an empirical analysis and a new resource. Scientometrics. 2017;111(3):1467–1500. https://doi.org/10.1007/s11192-017-2363-5

Maddi A., Baudoin L. The quality of the web of science data: a longitudinal study on the completeness of authors-addresses links. Scientometrics. 2022;127(11):6279–6292. https://doi.org/10.1007/s11192-022-04525-0

Liu W., Hu G., Tang L. Missing author address information in Web of Science – An explorative study. Journal of Informetrics. 2018;12(3):985–997. https://doi.org/10.1016/j.joi.2018.07.008

Аксентьева М.С., Чебуков Д.Е. Влияние ошибок в списках литературы в базе данных Web of Science на цитируемость и импакт-фактор научных журналов. В сборнике: Научное издание международного уровня – 2019: стратегия и тактика управления и развития: Материалы 8-й Международной научно-практической конференции, 23–26 апреля 2019 года, Москва, Россия. Екатеринбург: Изд-во Урал. ун-та; 2019. С. 7–16. https://doi.org/10.24069/konf-23-26-04-2019.01

Cioffi A., Coppini S., Massari A., Moretti A., Peroni S., Santini C., Asadi N.S. Identifying and correcting invalid citations due to DOI errors in Crossref data. Scientometrics. 2022;127(6):3593–3612. https://doi.org/10.1007/s11192-022-04367-w

Rodrigues D., Lopes A.L., Batista F. Web of Science Citation Gaps: An Automatic Approach to Detect Indexed but Missing Citations. In: 12th Symposium on Languages, Applications and Technologies (SLATE 2023), 26–28 June 2023, Vila do Conde, Portugal. Schloss Dagstuhl – Leibniz-Zentrum für Informatik; 2023. pp. 5:1–5:11. https://doi.org/10.4230/OASIcs.SLATE.2023.5

Латыпова В.А. Метод поддержки принятия решений при многокритериальном выборе рецензентов с использованием интегральной оценки и методов обработки естественного языка в научном журнале. Моделирование, оптимизация и информационные технологии. 2023;11(4). https://doi.org/10.26102/2310-6018/2023.43.4.035

Schock C., Dumler J., Doepper F. Data Acquisition and Preparation – Enabling Data Analytics Projects within Production. Procedia CIRP. 2021;104:636–640. https://doi.org/10.1016/j.procir.2021.11.107

Гринёв А.В. Проблемы наукометрии и ее пригодность для управления научной деятельностью в современной России. Управленческие науки. 2024;14(1):117–132. https://doi.org/10.26794/2404-022X-2024-14-1-117-132

López-Cózar E.D., Orduna-Malea E., Martín-Martín A., Ayllón J.M. Google Scholar: The Big Data Bibliographic Tool. In: Research Analytics. Boosting University Productivity and Competitiveness through Scientometrics: Chapter 4. New York: Auerbach Publications; 2017. pp. 59–80. https://doi.org/10.1201/9781315155890-4

The author declares no conflict of interest.