<?xml version="1.0" encoding="UTF-8"?>
<article article-type="research-article" dtd-version="1.3" xml:lang="ru" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://metafora.rcsi.science/xsd_files/journal3.xsd">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">moitvivt</journal-id>
      <journal-title-group>
        <journal-title xml:lang="ru">Моделирование, оптимизация и информационные технологии</journal-title>
        <trans-title-group xml:lang="en">
          <trans-title>Modeling, Optimization and Information Technology</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2310-6018</issn>
      <publisher>
        <publisher-name>Издательство</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.26102/2310-6018/2024.46.3.019</article-id>
      <article-id pub-id-type="custom" custom-type="elpub">1640</article-id>
      <title-group>
        <article-title xml:lang="ru">Особенности применения методов глубокого обучения для обнаружения небольших объектов на видео в условиях дождя</article-title>
        <trans-title-group xml:lang="en">
          <trans-title>Features of applying deep learning methods to detect small objects in video in rainy conditions</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0003-2866-4864</contrib-id>
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Штехин</surname>
              <given-names>Сергей Евгеньевич</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Shtekhin</surname>
              <given-names>Sergei Evgenievich</given-names>
            </name>
          </name-alternatives>
          <email>shs77@bk.ru</email>
          <xref ref-type="aff">aff-1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name-alternatives>
            <name name-style="eastern" xml:lang="ru">
              <surname>Стадник</surname>
              <given-names>Алексей Викторович</given-names>
            </name>
            <name name-style="western" xml:lang="en">
              <surname>Stadnik</surname>
              <given-names>Aleksei Viktorovich</given-names>
            </name>
          </name-alternatives>
          <email>i@lxstd.ru</email>
          <xref ref-type="aff">aff-2</xref>
        </contrib>
      </contrib-group>
      <aff-alternatives id="aff-1">
        <aff xml:lang="ru">«Отраслевой центр разработки и внедрения информационных систем» Сириус, филиал № 11</aff>
        <aff xml:lang="en">"Industry center for the development and implementation of information systems" Sirius, branch No. 11</aff>
      </aff-alternatives>
      <aff-alternatives id="aff-2">
        <aff xml:lang="ru">«Отраслевой центр разработки и внедрения информационных систем» Сириус, филиал № 11</aff>
        <aff xml:lang="en">"Industry center for the development and implementation of information systems" Sirius, branch No. 11</aff>
      </aff-alternatives>
      <pub-date pub-type="epub">
        <day>01</day>
        <month>01</month>
        <year>2026</year>
      </pub-date>
      <volume>1</volume>
      <issue>1</issue>
      <elocation-id>10.26102/2310-6018/2024.46.3.019</elocation-id>
      <permissions>
        <copyright-statement>Copyright © Авторы, 2026</copyright-statement>
        <copyright-year>2026</copyright-year>
        <license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under a Creative Commons Attribution 4.0 International License</license-p>
        </license>
      </permissions>
      <self-uri xlink:href="https://moitvivt.ru/ru/journal/article?id=1640"/>
      <abstract xml:lang="ru">
        <p>В данной работе рассматриваются методы детектирования объектов небольшого размера на видео при распознавании технологических операций ручного труда, которые проходят вне помещений, на открытом воздухе и подвержены влиянию погодных условий. Рассмотрены подходы к улучшению точности детектирования таких объектов при неблагоприятных погодных условиях, таких как дождь. В работе исследован двухэтапный подход. На первом этапе методами компьютерного зрения и глубокого обучения, такими как сверточные нейронные сети, производится выявление и классификация различных погодных условий на видео. На втором этапе, при обнаружении неблагоприятных погодных условий, исследуются различные методы глубокого обучения для фильтрации погодных условий на видео. Основное внимание уделено оценке влияния различных методов фильтрации на точность детектирования объектов небольшого размера. В работе рассмотрен вопрос применимости данного подхода для детектирования небольших инструментов на видеоданных при распознавании технологических операций ручного труда, выполняемых при ремонте и обслуживании железнодорожного пути. Полученные результаты могут быть полезны при исследовании трудовых процессов, происходящих вне помещений, в алгоритмах распознавания технологических операций ручного труда на видеоданных.</p>
      </abstract>
      <trans-abstract xml:lang="en">
        <p>This paper discusses methods for detecting small objects in video when recognizing manual labor operations that take place outdoors, in the open air, and are affected by weather conditions. Approaches to improving the accuracy of detecting such objects in adverse weather conditions, such as rain, are considered. A two-stage approach is explored. At the first stage, computer vision and deep learning methods, such as convolutional neural networks, are used to identify and classify various weather conditions in video. At the second stage, when adverse weather conditions are detected, various deep learning methods for filtering weather effects out of the video are studied. The main focus is on assessing the impact of these filtering methods on the accuracy of detecting small objects. The paper considers the applicability of this approach to detecting small tools in video data when recognizing manual labor operations performed during the repair and maintenance of railway track. The obtained results can be useful in the study of outdoor labor processes and in algorithms for recognizing manual labor operations in video data.</p>
      </trans-abstract>
      <kwd-group xml:lang="ru">
        <kwd>глубокое обучение</kwd>
        <kwd>трансформер</kwd>
        <kwd>детектирование объектов</kwd>
        <kwd>распознавание погодных условий на видео</kwd>
        <kwd>фильтрация погодных условий</kwd>
        <kwd>фильтрация шума на изображении</kwd>
        <kwd>нейронные сети</kwd>
        <kwd>технологические операции</kwd>
      </kwd-group>
      <kwd-group xml:lang="en">
        <kwd>deep learning</kwd>
        <kwd>transformer</kwd>
        <kwd>object detection</kwd>
        <kwd>recognition of weather conditions in video</kwd>
        <kwd>filtering of weather conditions</kwd>
        <kwd>filtering of noise in the image</kwd>
        <kwd>neural networks</kwd>
        <kwd>technological operations</kwd>
      </kwd-group>
      <funding-group>
        <funding-statement xml:lang="ru">Исследование выполнено без спонсорской поддержки.</funding-statement>
        <funding-statement xml:lang="en">The study was performed without external funding.</funding-statement>
      </funding-group>
    </article-meta>
  </front>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="cit1">
        <label>1</label>
        <mixed-citation xml:lang="ru">Штехин С.Е., Карачёв Д.К., Иванова Ю.К. Разработка алгоритма распознавания движений человека методами компьютерного зрения в задаче нормирования рабочего времени. Труды Института системного программирования РАН. 2020;32(1):121–136. https://doi.org/10.15514/ISPRAS-2020-32(1)-7</mixed-citation>
      </ref>
      <ref id="cit2">
        <label>2</label>
        <mixed-citation xml:lang="ru">Zou Zh., Chen K., Shi Zh., Guo Yu., Ye J. Object Detection in 20 Years: A Survey. Proceedings of the IEEE. 2023;111(3):257–276. https://doi.org/10.1109/jproc.2023.3238524</mixed-citation>
      </ref>
      <ref id="cit3">
        <label>3</label>
        <mixed-citation xml:lang="ru">Arkin E., Yadikar N., Xu X. et al. A survey: object detection methods from CNN to transformer. Multimedia Tools and Applications. 2023;82(14):21353–21383. https://doi.org/10.1007/s11042-022-13801-3</mixed-citation>
      </ref>
      <ref id="cit4">
        <label>4</label>
        <mixed-citation xml:lang="ru">Карачев Д.К., Штехин С.Е., Тарасян В.С., Смолин И.Ю., Исаков М.В. Использование переноса стиля как способ улучшения обобщающей способности нейросети в задаче детекции объектов. Труды Института системного программирования РАН. 2023;35(6):247–264. https://doi.org/10.15514/ISPRAS-2023-35(6)-16</mixed-citation>
      </ref>
      <ref id="cit5">
        <label>5</label>
        <mixed-citation xml:lang="ru">Liu Y., Sun P., Wergeles N., Shang Y. A survey and performance evaluation of deep learning methods for small object detection. Expert Systems with Applications. 2021;172:114602. https://doi.org/10.1016/j.eswa.2021.114602</mixed-citation>
      </ref>
      <ref id="cit6">
        <label>6</label>
        <mixed-citation xml:lang="ru">Царук В.Б. Выделение искажений, вносимых атмосферными осадками на видеоизображения. В сборнике: Актуальные проблемы авиации и космонавтики: Сборник материалов XIV Международной научно-практической конференции, посвященной Дню космонавтики: Том 2, 09–13 апреля 2018 года, Красноярск, Россия. 2018. С. 176–178.</mixed-citation>
      </ref>
      <ref id="cit7">
        <label>7</label>
        <mixed-citation xml:lang="ru">Ляхов П.А., Ионисян А.С., Лютова В.В., Оразаев А.Р. Обзор методов улучшения визуального качества изображений и видео в неблагоприятных погодных условиях. Современная наука и инновации. 2022;(4):8–24. https://doi.org/10.37493/2307-910X.2022.4.1</mixed-citation>
      </ref>
      <ref id="cit8">
        <label>8</label>
        <mixed-citation xml:lang="ru">Shtekhin S., Karachev D., Stadnik A. Study of Filtering the Weather Adverse Effects to Object Detection. Physics of Particles and Nuclei. 2024;55:329–333. https://doi.org/10.1134/S1063779624030766</mixed-citation>
      </ref>
      <ref id="cit9">
        <label>9</label>
        <mixed-citation xml:lang="ru">Hnewa M., Radha H. Object Detection Under Rainy Conditions for Autonomous Vehicles: A Review of State-of-the-Art and Emerging Techniques. IEEE Signal Processing Magazine. 2021;38(1):53–67. https://doi.org/10.1109/MSP.2020.2984801</mixed-citation>
      </ref>
      <ref id="cit10">
        <label>10</label>
        <mixed-citation xml:lang="ru">Deng J., Dong W., Socher R., Li L.-J., Li K., Li F.-F. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 20–25 June 2009, Miami, USA. IEEE; 2009. pp. 248–255. https://doi.org/10.1109/CVPR.2009.5206848</mixed-citation>
      </ref>
      <ref id="cit11">
        <label>11</label>
        <mixed-citation xml:lang="ru">Gbeminiyi O., Zenghui W. Multi-Class Weather Classification from Still Image Using Said Ensemble Method. In: 2019 Southern African Universities Power Engineering Conference/Robotics and Mechatronics/Pattern Recognition Association of South Africa (SAUPEC/RobMech/PRASA), 28–30 January 2019, Bloemfontein, South Africa. IEEE; 2019. pp. 135–140. https://doi.org/10.1109/RoboMech.2019.8704783</mixed-citation>
      </ref>
      <ref id="cit12">
        <label>12</label>
        <mixed-citation xml:lang="ru">He K., Zhang X., Ren S., Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27–30 June 2016, Las Vegas, USA. IEEE; 2016. pp. 770–778. https://doi.org/10.1109/CVPR.2016.90</mixed-citation>
      </ref>
      <ref id="cit13">
        <label>13</label>
        <mixed-citation xml:lang="ru">Chen X., Pan J., Dong J., Tang J. Towards Unified Deep Image Deraining: A Survey and A New Benchmark. URL: https://arxiv.org/pdf/2310.03535 [Accessed 30th July 2024].</mixed-citation>
      </ref>
      <ref id="cit14">
        <label>14</label>
        <mixed-citation xml:lang="ru">Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y. Generative adversarial networks. Communications of the ACM. 2020;63(11):139–144. https://doi.org/10.1145/3422622</mixed-citation>
      </ref>
      <ref id="cit15">
        <label>15</label>
        <mixed-citation xml:lang="ru">Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N. et al. Attention is All you Need. In: Advances in Neural Information Processing Systems 30 (NIPS 2017): 31st Conference on Neural Information Processing Systems (NIPS 2017), 4–9 December 2017, Long Beach, USA. Red Hook: Curran Associates; 2017. pp. 5998–6008.</mixed-citation>
      </ref>
      <ref id="cit16">
        <label>16</label>
        <mixed-citation xml:lang="ru">Yang W., Tan R.T., Feng J., Liu J., Guo Z., Yan S. Deep Joint Rain Detection and Removal from a Single Image. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21–26 July 2017, Honolulu, USA. IEEE; 2017. pp. 1685–1694. https://doi.org/10.1109/CVPR.2017.183</mixed-citation>
      </ref>
      <ref id="cit17">
        <label>17</label>
        <mixed-citation xml:lang="ru">Fu X., Huang J., Zeng D., Huang Y., Ding X., Paisley J. Removing Rain from Single Images via a Deep Detail Network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21–26 July 2017, Honolulu, USA. IEEE; 2017. pp. 1715–1723. https://doi.org/10.1109/CVPR.2017.186</mixed-citation>
      </ref>
      <ref id="cit18">
        <label>18</label>
        <mixed-citation xml:lang="ru">Li X., Wu J., Lin Z., Liu H., Zha H. Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining. In: Computer Vision – ECCV 2018: 15th European Conference: Proceedings: Part VII, 8–14 September 2018, Munich, Germany. Cham: Springer; 2018. pp. 262–277. https://doi.org/10.1007/978-3-030-01234-2_16</mixed-citation>
      </ref>
      <ref id="cit19">
        <label>19</label>
        <mixed-citation xml:lang="ru">Fu X., Qi Q., Zha Z.-J., Zhu Y., Ding X. Rain Streak Removal via Dual Graph Convolutional Network. Proceedings of the AAAI Conference on Artificial Intelligence. 2021;35(2):1352–1360. https://doi.org/10.1609/aaai.v35i2.16224</mixed-citation>
      </ref>
      <ref id="cit20">
        <label>20</label>
        <mixed-citation xml:lang="ru">Fu X., Xiao J., Zhu Y., Liu A., Wu F., Zha Z.-J. Continual Image Deraining With Hypergraph Convolutional Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2023;45(8):9534–9551. https://doi.org/10.1109/TPAMI.2023.3241756</mixed-citation>
      </ref>
      <ref id="cit21">
        <label>21</label>
        <mixed-citation xml:lang="ru">Qian R., Tan R.T., Yang W., Su J., Liu J. Attentive Generative Adversarial Network for Raindrop Removal from A Single Image. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 18–23 June 2018, Salt Lake City, USA. IEEE; 2018. pp. 2482–2491. https://doi.org/10.1109/CVPR.2018.00263</mixed-citation>
      </ref>
      <ref id="cit22">
        <label>22</label>
        <mixed-citation xml:lang="ru">Zhang H., Sindagi V., Patel V.M. Image De-Raining Using a Conditional Generative Adversarial Network. IEEE Transactions on Circuits and Systems for Video Technology. 2019;30(11):3943–3956. https://doi.org/10.1109/TCSVT.2019.2920407</mixed-citation>
      </ref>
      <ref id="cit23">
        <label>23</label>
        <mixed-citation xml:lang="ru">Li R., Cheong L.-F., Tan R.T. Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 15–20 June 2019, Long Beach, USA. IEEE; 2019. pp. 1633–1642. https://doi.org/10.1109/CVPR.2019.00173</mixed-citation>
      </ref>
      <ref id="cit24">
        <label>24</label>
        <mixed-citation xml:lang="ru">Pan J., Dong J., Liu Y., Zhang J., Ren J., Tang J. et al. Physics-Based Generative Adversarial Models for Image Restoration and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020;43(7):2449–2462. https://doi.org/10.1109/TPAMI.2020.2969348</mixed-citation>
      </ref>
      <ref id="cit25">
        <label>25</label>
        <mixed-citation xml:lang="ru">Ni S., Cao X., Yue T., Hu X. Controlling the Rain: from Removal to Rendering. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 20–25 June 2021, Nashville, USA. IEEE; 2021. pp. 6324–6333. https://doi.org/10.1109/CVPR46437.2021.00626</mixed-citation>
      </ref>
      <ref id="cit26">
        <label>26</label>
        <mixed-citation xml:lang="ru">Han K., Wang Y., Chen H., Chen X., Guo J., Liu Z. et al. A Survey on Vision Transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022;45(1):87–110. https://doi.org/10.1109/TPAMI.2022.3152247</mixed-citation>
      </ref>
      <ref id="cit27">
        <label>27</label>
        <mixed-citation xml:lang="ru">Xiao J., Fu X., Liu A., Wu F., Zha Z.-J. Image De-Raining Transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022;45(11):12978–12995. https://doi.org/10.1109/TPAMI.2022.3183612</mixed-citation>
      </ref>
      <ref id="cit28">
        <label>28</label>
        <mixed-citation xml:lang="ru">Chen H., Wang Y., Guo T., Xu C., Deng Y., Liu Z. et al. Pre-Trained Image Processing Transformer. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 20–25 June 2021, Nashville, USA. IEEE; 2021. pp. 12294–12305. https://doi.org/10.1109/CVPR46437.2021.01212</mixed-citation>
      </ref>
      <ref id="cit29">
        <label>29</label>
        <mixed-citation xml:lang="ru">Zamir S.W., Arora A., Khan S., Hayat M., Khan F.S., Yang M.-H. Restormer: Efficient Transformer for High-Resolution Image Restoration. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18–24 June 2022, New Orleans, USA. IEEE; 2022. pp. 5718–5729. https://doi.org/10.1109/CVPR52688.2022.00564</mixed-citation>
      </ref>
      <ref id="cit30">
        <label>30</label>
        <mixed-citation xml:lang="ru">Jiang K., Wang Z., Chen C., Wang Z., Cui L., Lin C.-W. Magic ELF: Image Deraining Meets Association Learning and Transformer. In: MM '22: The 30th ACM International Conference on Multimedia, 10–14 October 2022, Lisboa, Portugal. New York: Association for Computing Machinery; 2022. pp. 827–836. https://doi.org/10.1145/3503161.3547760</mixed-citation>
      </ref>
      <ref id="cit31">
        <label>31</label>
        <mixed-citation xml:lang="ru">Chen X., Pan J., Lu J., Fan Z., Li H. Hybrid CNN-Transformer Feature Fusion for Single Image Deraining. Proceedings of the AAAI Conference on Artificial Intelligence. 2023;37(1):378–386. https://doi.org/10.1609/aaai.v37i1.25111</mixed-citation>
      </ref>
      <ref id="cit32">
        <label>32</label>
        <mixed-citation xml:lang="ru">Wang C.-Y., Bochkovskiy A., Liao H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. URL: https://arxiv.org/abs/2207.02696v1 [Accessed 30th July 2024].</mixed-citation>
      </ref>
      <ref id="cit33">
        <label>33</label>
        <mixed-citation xml:lang="ru">Howard A., Sandler M., Chen B., Wang W., Chen L.-C., Tan M., Chu G., Vasudevan V. Searching for MobileNetV3. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 27 October 2019 – 02 November 2019, Seoul, Korea (South). IEEE; 2019. pp. 1314–1324. https://doi.org/10.1109/ICCV.2019.00140</mixed-citation>
      </ref>
      <ref id="cit34">
        <label>34</label>
        <mixed-citation xml:lang="ru">Simonyan K., Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition. URL: https://arxiv.org/abs/1409.1556 [Accessed 30th July 2024].</mixed-citation>
      </ref>
    </ref-list>
    <fn-group>
      <fn fn-type="conflict">
        <p>The authors declare that there are no conflicts of interest present.</p>
      </fn>
    </fn-group>
  </back>
</article>