<?xml version="1.0" encoding="UTF-8"?>
<article article-type="research-article" dtd-version="1.3" xml:lang="ru" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://metafora.rcsi.science/xsd_files/journal3.xsd">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">moitvivt</journal-id>
      <journal-title-group>
        <journal-title xml:lang="ru">Моделирование, оптимизация и информационные технологии</journal-title>
        <trans-title-group xml:lang="en">
          <trans-title>Modeling, Optimization and Information Technology</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2310-6018</issn>
      <publisher>
        <publisher-name>Издательство</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.26102/2310-6018/2025.50.3.035</article-id>
      <article-id pub-id-type="custom" custom-type="elpub">1993</article-id>
      <title-group>
        <article-title xml:lang="ru">Методы определения нетиповых объектов в музыкальном ряде</article-title>
        <trans-title-group xml:lang="en">
          <trans-title>Methods for detecting atypical objects in a musical sequence</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Котельников</surname>
              <given-names>Владимир Владимирович</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Kotelnikov</surname>
              <given-names>Vladimir Vladimirovich</given-names>
            </name>
          </name-alternatives>
          <email>vv.kotelnikov@inbox.ru</email>
          <xref ref-type="aff">aff-1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Ахлестин</surname>
              <given-names>Андрей Игоревич</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Ahlestin</surname>
              <given-names>Andrey Igorevich</given-names>
            </name>
          </name-alternatives>
          <email>ahlestin.and@yandex.ru</email>
          <xref ref-type="aff">aff-2</xref>
        </contrib>
        <contrib contrib-type="author">
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Паринова</surname>
              <given-names>Евгения Викторовна</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Parinova</surname>
              <given-names>Evgeniya Victorovna</given-names>
            </name>
          </name-alternatives>
          <email>ysahno86@gmail.com</email>
          <xref ref-type="aff">aff-3</xref>
        </contrib>
      </contrib-group>
      <aff-alternatives id="aff-1">
        <aff xml:lang="ru">Воронежский государственный технический университет</aff>
        <aff xml:lang="en">Voronezh State Technical University</aff>
      </aff-alternatives>
      <aff-alternatives id="aff-2">
        <aff xml:lang="ru">Воронежский государственный технический университет</aff>
        <aff xml:lang="en">Voronezh State Technical University</aff>
      </aff-alternatives>
      <aff-alternatives id="aff-3">
        <aff xml:lang="ru">Воронежский государственный технический университет</aff>
        <aff xml:lang="en">Voronezh State Technical University</aff>
      </aff-alternatives>
      <pub-date pub-type="epub">
        <day>01</day>
        <month>01</month>
        <year>2026</year>
      </pub-date>
      <volume>1</volume>
      <issue>1</issue>
      <elocation-id>10.26102/2310-6018/2025.50.3.035</elocation-id>
      <permissions>
        <copyright-statement>Copyright © Авторы, 2026</copyright-statement>
        <copyright-year>2026</copyright-year>
        <license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under a Creative Commons Attribution 4.0 International License</license-p>
        </license>
      </permissions>
      <self-uri xlink:href="https://moitvivt.ru/ru/journal/article?id=1993"/>
      <abstract xml:lang="ru">
        <p>В статье рассматриваются современные методы автоматического обнаружения нетиповых (аномальных) музыкальных событий в музыкальном ряде, таких как неожиданные смены гармонии, нехарактерные интервалы, ритмические сбои или нарушения музыкального стиля, которые позволяют автоматизировать данный процесс и оптимизировать время работы специалистов. Задача выявления аномалий актуальна в музыкальной аналитике, цифровой реставрации, генеративной музыке и адаптивных рекомендациях. В работе используются как традиционные признаки (Chroma Features, MFCC, Tempogram, RMS-energy, Spectral Contrast), так и современные методы анализа последовательностей (self-similarity matrices, latent space embeddings). В качестве исходных данных применялись разнообразные MIDI-корпусы и аудиозаписи различных жанров, приведенные к единому частотному и временному масштабу. Были опробованы методы обучения с учителем и без него, включая кластеризацию, автоэнкодеры, нейросетевые классификаторы и алгоритмы изоляции аномалий (isolation forests). Полученные результаты демонстрируют, что наибольшую эффективность показывает гибридный подход, сочетающий структурные музыкальные признаки с методами глубокого обучения. Новизна работы заключается в комплексном сравнении традиционных и нейросетевых подходов для разных типов аномалий на едином корпусе данных. Практическая апробация показала перспективность предлагаемого метода для систем автоматического мониторинга музыкального контента и повышения качества музыкальных рекомендаций. В дальнейшем планируется расширение исследования на мультимодальные музыкальные данные и обработку в режиме реального времени.</p>
      </abstract>
      <trans-abstract xml:lang="en">
        <p>The article explores modern methods for automatic detection of atypical (anomalous) musical events within a musical sequence, such as unexpected harmonic shifts, uncharacteristic intervals, rhythmic disruptions, or deviations from musical style, aimed at automating this process and optimizing specialists' working time. The task of anomaly detection is highly relevant in music analytics, digital restoration, generative music, and adaptive recommendation systems. The study employs both traditional features (Chroma Features, MFCC, Tempogram, RMS-energy, Spectral Contrast) and advanced sequence analysis techniques (self-similarity matrices, latent space embeddings). The source data consisted of diverse MIDI corpora and audio recordings from various genres, normalized to a unified frequency and temporal scale. Both supervised and unsupervised learning methods were tested, including clustering, autoencoders, neural network classifiers, and anomaly isolation algorithms (isolation forests). The results demonstrate that the most effective approach is a hybrid one that combines structural musical features with deep learning methods. The novelty of this research lies in a comprehensive comparison of traditional and neural network approaches for different types of anomalies on a unified dataset. Practical testing has shown the proposed method's potential for automatic music content monitoring systems and for improving the quality of music recommendations. Future work will extend the study to multimodal musical data and real-time processing.</p>
      </trans-abstract>
      <kwd-group xml:lang="ru">
        <kwd>музыкальный ряд</kwd>
        <kwd>аномалия</kwd>
        <kwd>темпограмма</kwd>
        <kwd>музыкальный стиль</kwd>
        <kwd>MFCC</kwd>
        <kwd>Chroma</kwd>
        <kwd>автоэнкодер</kwd>
        <kwd>обнаружение музыкальных аномалий</kwd>
      </kwd-group>
      <kwd-group xml:lang="en">
        <kwd>musical sequence</kwd>
        <kwd>anomaly</kwd>
        <kwd>tempogram</kwd>
        <kwd>musical style</kwd>
        <kwd>MFCC</kwd>
        <kwd>Chroma</kwd>
        <kwd>autoencoder</kwd>
        <kwd>music anomaly detection</kwd>
      </kwd-group>
      <funding-group>
        <funding-statement xml:lang="ru">Исследование выполнено без спонсорской поддержки.</funding-statement>
        <funding-statement xml:lang="en">The study was performed without external funding.</funding-statement>
      </funding-group>
    </article-meta>
  </front>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="cit1">
        <label>1</label>
        <mixed-citation xml:lang="en">Müller M. Fundamentals of Music Processing: Audio, Analysis, Algorithms, Applications. Cham: Springer; 2015. 487 p. https://doi.org/10.1007/978-3-319-21945-5</mixed-citation>
      </ref>
      <ref id="cit2">
        <label>2</label>
        <mixed-citation xml:lang="en">Tzanetakis G., Cook P. Musical Genre Classification of Audio Signals. IEEE Transactions on Speech and Audio Processing. 2002;10(5):293–302. https://doi.org/10.1109/TSA.2002.800560</mixed-citation>
      </ref>
      <ref id="cit3">
        <label>3</label>
        <mixed-citation xml:lang="en">Choi K., Fazekas G., Sandler M.B., Cho K. Convolutional Recurrent Neural Networks for Music Classification. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 05–09 March 2017, New Orleans, LA, USA. IEEE; 2017. P. 2392–2396. https://doi.org/10.1109/ICASSP.2017.7952585</mixed-citation>
      </ref>
      <ref id="cit4">
        <label>4</label>
        <mixed-citation xml:lang="en">Huang Yu-S., Yang Yi-H. Pop Music Transformer: Beat-Based Modeling and Generation of Expressive Pop Piano Compositions. In: MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 12–16 October 2020, Seattle, WA, USA. New York: Association for Computing Machinery; 2020. P. 1180–1188. https://doi.org/10.1145/3394171.3413671</mixed-citation>
      </ref>
      <ref id="cit5">
        <label>5</label>
        <mixed-citation xml:lang="en">Luo Yi-J., Su L. Learning Domain-Adaptive Latent Representations of Music Signals Using Variational Autoencoders. In: Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 23–27 September 2018, Paris, France. 2018. P. 653–660.</mixed-citation>
      </ref>
      <ref id="cit6">
        <label>6</label>
        <mixed-citation xml:lang="en">Foote J. Visualizing Music and Audio Using Self-Similarity. In: MULTIMEDIA '99: Proceedings of the 7th ACM International Conference on Multimedia (Part 1), 30 October – 5 November 1999, Orlando, FL, USA. New York: Association for Computing Machinery; 1999. P. 77–80. https://doi.org/10.1145/319463.319472</mixed-citation>
      </ref>
      <ref id="cit7">
        <label>7</label>
        <mixed-citation xml:lang="en">Peeters G., Angulo F. SSM-Net: Feature Learning for Music Structure Analysis Using a Self-Similarity-Matrix Based Loss. In: ISMIR 2022: Proceedings of the 23rd International Society for Music Information Retrieval Conference, 04–08 December 2022, Bengaluru, India. 2022. https://arxiv.org/abs/2211.08141</mixed-citation>
      </ref>
      <ref id="cit8">
        <label>8</label>
        <mixed-citation xml:lang="en">McFee B., Ellis D. Analyzing Song Structure with Spectral Clustering. In: ISMIR 2014: Proceedings of the 15th International Society for Music Information Retrieval Conference, 27–31 October 2014, Taipei, Taiwan. 2014. P. 405–410.</mixed-citation>
      </ref>
      <ref id="cit9">
        <label>9</label>
        <mixed-citation xml:lang="en">Sigtia S., Benetos E., Dixon S. An End-to-End Neural Network for Polyphonic Piano Music Transcription. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2016;24(5):927–939. https://doi.org/10.1109/TASLP.2016.2533858</mixed-citation>
      </ref>
      <ref id="cit10">
        <label>10</label>
        <mixed-citation xml:lang="en">Lattner S., Grachten M., Widmer G. Learning Transposition-Invariant Interval Features from Symbolic Music and Audio. In: Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 23–27 September 2018, Paris, France. 2018. https://doi.org/10.48550/arXiv.1806.08236</mixed-citation>
      </ref>
    </ref-list>
    <fn-group>
      <fn fn-type="conflict">
        <p>The authors declare no conflict of interest.</p>
      </fn>
    </fn-group>
  </back>
</article>