moitvivt

Modeling, Optimization and Information Technology

2310-6018

10.26102/2310-6018/2025.51.4.019

2060

Machine learning in web application security: current trends and prospects

Ledovskaya

Ekaterina Valerievna

ekvaled@mail.ru

MIREA - Russian Technological University

01 01 2026

1 1

2026

This work is licensed under a Creative Commons Attribution 4.0 International License

The rapid evolution and growing sophistication of cyber threats make the integration of machine learning methods into web application protection systems a critical need. This study presents a comprehensive analysis of current approaches to applying machine learning algorithms in Web Application Firewall (WAF) architectures, with a focus on improving the detection of zero-day attacks. The methodology is a comparative performance analysis of ensemble methods, deep learning, and transformer architectures on the standardized CSIC 2010 and CIC-IDS2017 datasets. The empirical basis comprised 2,847,372 HTTP requests analyzed with 14 machine learning algorithms between June and December 2024. The results show that hybrid LSTM-Transformer architectures perform best, achieving an accuracy of 98.73% for SQL injection detection and 97.84% for XSS attacks, which exceeds the performance of traditional signature-based methods by 23.7%. Feature engineering combined with Random Forest and Extreme Gradient Boosting raises the F1-score to 0.989 while reducing request processing time by a factor of 18 relative to rule-based algorithms. The practical significance of the research lies in the development of an adaptive WAF architecture that automatically adjusts detection parameters in real time in response to the evolving threat landscape. The theoretical contribution is the formalization of principles for integrating self-attention mechanisms into HTTP traffic analysis and the justification of optimal multi-head attention configurations for different types of web attacks.

machine learning, web application firewall, deep learning, transformer architectures, anomaly detection, cybersecurity, ensemble methods

The study was performed without external funding.

The author declares no conflicts of interest.