moitvivt

Modeling, Optimization and Information Technology

2310-6018

Publisher

10.26102/2310-6018/2026.54.3.012

2211

Deep learning architectures for multiphase CT image segmentation

Samsonenko

Stanislav Vladimirovich

olessyarogok81@gmail.com aff-1

0000-0002-8664-9817

Kashirina

Irina Leonidovna

kash.irina@mail.ru aff-2

MIREA – Russian Technological University

MIREA – Russian Technological University; Voronezh State University

01 01 2026

1 1

2026

This work is licensed under a Creative Commons Attribution 4.0 International License

The article presents a systematic analysis of modern deep learning architectures for automatic segmentation of multiphase CT images. It examines the specific properties of multiphase data, chiefly spatial misalignment (offsets) between phases caused by patient motion and the differing patterns of contrast-agent uptake in pathological tissue across phases. These properties make direct adaptation of classical segmentation methods ineffective and call for specialized architectures. The article traces the evolution of approaches from baseline convolutional networks (U-Net, 3D U-Net, nnU-Net) and hybrid models that combine convolutions with transformers (TransUNet, UNETR) to specialized solutions. Particular attention is given to models with cross-attention between phases, such as PA-ResSeg, M3Net, and MULLET, which implicitly align features and adaptively fuse information from different phases without explicit image registration. The paper also compares strategies for fusing data from different phases (early fusion, late fusion, and cross-phase interaction) and discusses computational efficiency and the availability of open datasets. Key trends and promising directions are identified, including foundation models (MedSAM, VoxTell) and modality-agnostic learning. The authors conclude that further progress in multiphase CT segmentation depends on computationally efficient architectures that can be integrated into real clinical workflows to support diagnostic decisions.
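The fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the implementation of PA-ResSeg, M3Net, or MULLET: the function names, array shapes, and toy data are assumptions made for illustration, and the attention step omits the learned query/key/value projections that a real network would use.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def early_fusion(phases):
    """Early fusion: stack phases along the channel axis before any
    feature extraction. Each phase has shape (C, N); result is (P * C, N)."""
    return np.concatenate(phases, axis=0)


def late_fusion(phase_probs):
    """Late fusion: average per-phase foreground probabilities,
    each of shape (N,), produced by independent branches."""
    return np.mean(np.stack(phase_probs, axis=0), axis=0)


def cross_phase_attention(query_feats, context_feats):
    """Cross-phase interaction via scaled dot-product attention.

    query_feats:   (N, D) features of the phase being segmented
    context_feats: (M, D) features of another phase
    Returns the context features re-sampled onto the query positions,
    shape (N, D), and the attention weights, shape (N, M).
    """
    dim = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ context_feats, weights


# Toy data (assumed): 6 positions with scaled one-hot features.
# The "venous" phase is the "arterial" phase shifted by one position,
# a stand-in for inter-phase misalignment.
arterial = 5.0 * np.eye(6)
venous = np.roll(arterial, shift=1, axis=0)
aligned, weights = cross_phase_attention(arterial, venous)
```

In this toy case each arterial position attends almost entirely to the venous position shifted by one, so the shifted features are pulled back into place without an explicit registration step. Early fusion, by contrast, would concatenate the misaligned channels as they are, and late fusion would average predictions made on each phase separately.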

hybrid architectures; image segmentation; attention mechanisms; multiphase CT; feature fusion; medical imaging; deep learning; computer vision; PA-ResSeg; M3Net

The study was performed without external funding.

References

Wu L., Wang H., Chen Y., et al. Beyond radiologist-level liver lesion detection on multi-phase contrast-enhanced CT images by deep learning. iScience. 2023;26(11). https://doi.org/10.1016/j.isci.2023.108183

Rudenko A.V., Rudenko M.A., Kashirina I.L. Application of artificial neural networks for object detection in medical images. Modeling, Optimization and Information Technology. 2024;12(3). (In Russ.). https://doi.org/10.26102/2310-6018/2024.46.3.013

Xu Y., Cai M., Lin L., et al. PA-ResSeg: A phase attention residual network for liver tumor segmentation from multiphase CT images. Medical Physics. 2021;48(7):3752–3766. https://doi.org/10.1002/mp.14922

Ronneberger O., Fischer Ph., Brox Th. U-Net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: 18th International Conference: Proceedings: Part III, 05–09 October 2015, Munich, Germany. Cham: Springer; 2015. P. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28

Çiçek Ö., Abdulkadir A., Lienkamp S.S., Brox Th., Ronneberger O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016: 19th International Conference: Proceedings: Part II, 17–21 October 2016, Athens, Greece. Cham: Springer; 2016. P. 424–432. https://doi.org/10.1007/978-3-319-46723-8_49

Milletari F., Navab N., Ahmadi S.-A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), 25–28 October 2016, Stanford, CA, USA. IEEE; 2016. P. 565–571. https://doi.org/10.1109/3DV.2016.79

Isensee F., Jaeger P.F., Kohl S.A.A., Petersen J., Maier-Hein K.H. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nature Methods. 2021;18:203–211. https://doi.org/10.1038/s41592-020-01008-z

Kulikov A.A., Kashirina I.L., Savkina E.F. Segmentation of liver mass lesions on multiphase CT images using the nnU-Net framework. Modeling, Optimization and Information Technology. 2025;13(1). (In Russ.). https://doi.org/10.26102/2310-6018/2025.48.1.040

Dosovitskiy A., Beyer L., Kolesnikov A., et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In: 9th International Conference on Learning Representations, ICLR 2021, 03–07 May 2021, Virtual Event, Austria. 2021. https://doi.org/10.48550/arXiv.2010.11929

Zheng S., Lu J., Zhao H., et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 20–25 June 2021, Nashville, TN, USA. IEEE; 2021. P. 6877–6886. https://doi.org/10.1109/CVPR46437.2021.00681

Chen J., Lu Y., Yu Q., et al. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv. URL: https://doi.org/10.48550/arXiv.2102.04306 [Accessed 15th January 2026].

Hatamizadeh A., Tang Y., Nath V., et al. UNETR: Transformers for 3D medical image segmentation. In: Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 03–08 January 2022, Waikoloa, HI, USA. IEEE; 2022. P. 1748–1758. https://doi.org/10.1109/WACV51458.2022.00181

Hatamizadeh A., Nath V., Tang Y., et al. Swin UNETR: Swin transformers for semantic segmentation of brain tumors in MRI images. In: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: 7th International Workshop, BrainLes 2021, Held in Conjunction with MICCAI 2021: Revised Selected Papers: Part I, 27 September 2021, Virtual Event. Cham: Springer; 2022. P. 272–284. https://doi.org/10.1007/978-3-031-08999-2_22

Oktay O., Schlemper J., Folgoc L.L., et al. Attention U-Net: Learning where to look for the pancreas. arXiv. URL: https://doi.org/10.48550/arXiv.1804.03999 [Accessed 15th January 2026].

Alirr O.I. Dual attention U-Net for liver tumor segmentation in CT images. International Journal of Computers, Communications & Control. 2024;19(2).

Dostovalova A.M., Gorshenin A.K., Starichkova Yu.V., Arzamasov K.M. Comparative analysis of modifications of U-Net neural network architectures in the medical image segmentation task. Digital Diagnostics. 2024;5(4):833–853. (In Russ.). https://doi.org/10.17816/DD629866

Ma J., He Y., Li F., et al. Segment anything in medical images. Nature Communications. 2024;15. https://doi.org/10.1038/s41467-024-44824-z

Rokuss M., Langenberg M., Kirchhoff Y., et al. VoxTell: Free-text promptable universal 3D medical image segmentation. arXiv. URL: https://doi.org/10.48550/arXiv.2511.11450 [Accessed 15th January 2026].

Starichkova Yu.V., Pitinov A.V., Gazanova N.Sh. Development of a method for registration of multiphase CT images using affine transformations. In: Medelektronika–2024. Medical Electronics and New Medical Technologies: Proceedings of the XIV International Scientific and Technical Conference, 05–06 December 2024, Minsk, Belarus. Minsk; 2024. P. 237–240. (In Russ.).

Liu F., Cai J., Huo Y., et al. JSSR: A joint synthesis, segmentation, and registration system for 3D multi-modal image alignment of large-scale pathological CT scans. In: Computer Vision – ECCV 2020: 16th European Conference: Proceedings: Part XIII, 23–28 August 2020, Glasgow, UK. Cham: Springer; 2020. P. 257–274. https://doi.org/10.1007/978-3-030-58601-0_16

Zhou Y., Li Y., Zhang Zh., et al. Hyper-Pairing Network for Multi-Phase Pancreatic Ductal Adenocarcinoma Segmentation. arXiv. URL: https://doi.org/10.48550/arXiv.1905.00367 [Accessed 20th January 2026].

Qu T., Wang X., Fang Ch., et al. M3Net: A multi-scale multi-view framework for multi-phase pancreas segmentation based on cross-phase non-local attention. Medical Image Analysis. 2022;75. https://doi.org/10.1016/j.media.2021.102232

Lteif D., Appapogu D., Bargal S.A., Plummer B.A., Kolachalama V.B. Anatomy-guided, modality-agnostic segmentation of neuroimaging abnormalities. Human Brain Mapping. 2025;46(14). https://doi.org/10.1002/hbm.70329

Antonelli M., Reinke A., Bakas S., et al. The Medical Segmentation Decathlon. Nature Communications. 2022;13. https://doi.org/10.1038/s41467-022-30695-9

Wu X., Su H., Hua Y., et al. A multi-phase CT dataset for automated differential diagnosis of liver tumors. Scientific Data. 2026;13. https://doi.org/10.1038/s41597-025-06343-4

Bartnik K., Bartczak T., Krzyziński M., et al. WAW-TACE: A hepatocellular carcinoma multiphase CT dataset with segmentations, radiomics features, and clinical data. Radiology: Artificial Intelligence. 2024;6(6).

Elbatel M., Yi Q., Huang X., et al. Triphasic-aided Liver Lesion Segmentation in Non-contrast CT (TriALS) Challenge. In: Medical Image Computing and Computer Assisted Intervention 2025 (MICCAI). https://doi.org/10.5281/zenodo.15087646

The authors declare no conflicts of interest.