vox2vec: A Framework for Self-supervised Contrastive Learning of Voxel-Level Representations in Medical Images

Citations: 5
Authors
Goncharov, Mikhail [1 ]
Soboleva, Vera [2 ]
Kurmukov, Anvar [3 ]
Pisov, Maxim [4 ]
Belyaev, Mikhail [1 ,3 ]
Affiliations
[1] Skolkovo Inst Sci & Technol, Moscow, Russia
[2] Artificial Intelligence Res Inst AIRI, Moscow, Russia
[3] Inst Informat Transmiss Problems, Moscow, Russia
[4] IRA Labs, Moscow, Russia
Funding
Russian Science Foundation
Keywords
Contrastive Self-Supervised Representation Learning; Medical Image Segmentation;
DOI
10.1007/978-3-031-43907-0_58
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This paper introduces vox2vec, a contrastive method for self-supervised learning (SSL) of voxel-level representations. vox2vec representations are modeled by a Feature Pyramid Network (FPN): a voxel representation is a concatenation of the corresponding feature vectors from different pyramid levels. The FPN is pre-trained to produce similar representations for the same voxel in different augmented contexts and distinctive representations for different voxels. This results in unified multi-scale representations that capture both global semantics (e.g., body part) and local semantics (e.g., different small organs or healthy versus tumor tissue). We use vox2vec to pre-train an FPN on more than 6500 publicly available computed tomography images. We evaluate the pre-trained representations by attaching simple heads on top of them and training the resulting models for 22 segmentation tasks. We show that vox2vec outperforms existing medical imaging SSL techniques in three evaluation setups: linear and non-linear probing and end-to-end fine-tuning. Moreover, a non-linear head trained on top of the frozen vox2vec representations achieves competitive performance with the FPN trained from scratch while having 50 times fewer trainable parameters. The code is available at https://github.com/mishgon/vox2vec.
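The contrastive objective described above — pulling together the representations of the same voxel under different augmentations, and pushing apart those of different voxels — can be illustrated with a minimal numpy sketch. This is an assumption-laden toy version of a voxel-level InfoNCE loss, not the authors' implementation (which uses a PyTorch FPN; see the repository linked above); the array shapes, the two hypothetical pyramid levels, and the `temperature` value are all illustrative.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Unit-normalize feature vectors so the dot product is a cosine similarity.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def voxel_info_nce(feats_a, feats_b, temperature=0.1):
    """InfoNCE over N voxels: row i of feats_a and feats_b are two
    augmented views of the same voxel (a positive pair); all other
    rows act as negatives."""
    a = l2_normalize(feats_a)
    b = l2_normalize(feats_b)
    logits = a @ b.T / temperature                # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; minimize their negative log-prob.
    return -np.mean(np.diag(log_prob))

# Toy multi-scale voxel representations: concatenate feature vectors from
# two hypothetical pyramid levels (coarse = global context, fine = local
# detail) for each of 4 voxels, mirroring the concatenation in the abstract.
rng = np.random.default_rng(0)
coarse = rng.normal(size=(4, 8))
fine = rng.normal(size=(4, 16))
view_a = np.concatenate([coarse, fine], axis=1)
view_b = view_a + 0.05 * rng.normal(size=view_a.shape)  # simulated augmentation

loss = voxel_info_nce(view_a, view_b)
```

In the real method the two views come from overlapping augmented crops of the same CT volume, and the negatives are other voxels sampled within and across volumes; the loss shape, however, is the same softmax-over-similarities form shown here.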
Pages: 605-614 (10 pages)
Related Papers (50 in total)
  • [1] wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
    Baevski, Alexei
    Zhou, Henry
    Mohamed, Abdelrahman
    Auli, Michael
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [2] A Review of Predictive and Contrastive Self-supervised Learning for Medical Images
    Wang, Wei-Chien
    Ahn, Euijoon
    Feng, Dagan
    Kim, Jinman
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (04) : 483 - 513
  • [4] Uncertainty-Guided Voxel-Level Supervised Contrastive Learning for Semi-Supervised Medical Image Segmentation
    Hua, Yu
    Shu, Xin
    Wang, Zizhou
    Zhang, Lei
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2022, 32 (04)
  • [5] Self-supervised contrastive learning on agricultural images
    Guldenring, Ronja
    Nalpantidis, Lazaros
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2021, 191
  • [6] CCC-wav2vec 2.0: Clustering Aided Cross Contrastive Self-supervised Learning of Speech Representations
    Lodagala, Vasista Sai
    Ghosh, Sreyan
    Umesh, S.
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 1 - 8
  • [7] Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations
    Cho, Jaejin
    Pappagari, Raghavendra
    Zelasko, Piotr
    Velazquez, Laureano Moro
    Villalba, Jesus
    Dehak, Najim
    INTERSPEECH 2022, 2022, : 4028 - 4032
  • [8] Self-Supervised Visual Representations Learning by Contrastive Mask Prediction
    Zhao, Yucheng
    Wang, Guangting
    Luo, Chong
    Zeng, Wenjun
    Zha, Zheng-Jun
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 10140 - 10149
  • [9] Self-Supervised Voxel-Level Representation Rediscovers Subcellular Structures in Volume Electron Microscopy
    Han, Hongqing
    Dmitrieva, Mariia
    Sauer, Alexander
    Tam, Ka Ho
    Rittscher, Jens
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 1873 - 1882
  • [10] Contrastive self-supervised learning from 100 million medical images with optional supervision
    Ghesu, Florin C.
    Georgescu, Bogdan
    Mansoor, Awais
    Yoo, Youngjin
    Neumann, Dominik
    Patel, Pragneshkumar
    Vishwanath, Reddappagari Suryanarayana
    Balter, James M.
    Cao, Yue
    Grbic, Sasa
    Comaniciu, Dorin
    JOURNAL OF MEDICAL IMAGING, 2022, 9 (06)