Self-supervised Learning of Contextualized Local Visual Embeddings

Cited by: 0

Authors:

Silva, Thalles [1]

Pedrini, Helio [1]

Rivera, Adin Ramirez [2]

Affiliations:

[1] Univ Estadual Campinas, Inst Comp, Campinas, SP, Brazil

[2] Univ Oslo, Dept Informat, Oslo, Norway

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW | 2023

DOI:

10.1109/ICCVW60793.2023.00025

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present Contextualized Local Visual Embeddings (CLoVE), a self-supervised convolution-based method that learns representations suited for dense prediction tasks. CLoVE deviates from current methods and optimizes a single loss function that operates at the level of contextualized local embeddings learned from the output feature maps of convolutional neural network (CNN) encoders. To learn contextualized embeddings, CLoVE introduces a normalized multi-head self-attention layer that combines local features from different parts of an image based on similarity. We extensively benchmark CLoVE's pre-trained representations on multiple datasets. CLoVE reaches state-of-the-art performance for CNN-based architectures in 4 dense prediction downstream tasks: object detection, instance segmentation, keypoint detection, and dense pose estimation. Code: https://github.com/sthalles/CLoVE.
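The abstract describes combining local features from different image locations based on similarity. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation (see the linked repository for that). It makes several assumptions that are not in the abstract: "normalized" is read as L2-normalizing queries and keys so that attention scores are cosine similarities, there are no learned projections, each head simply sees a slice of the channels, and the names `contextualize`, `num_heads`, and `temperature` are hypothetical.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def contextualize(local_feats, num_heads=2, temperature=0.1):
    """Mix each local feature with all others, weighted by cosine similarity.

    local_feats: list of N vectors of length C, e.g. a CNN feature map of
    shape (H, W, C) flattened to N = H * W local features.
    Assumption: no learned projections; each head uses C / num_heads channels.
    """
    dim = len(local_feats[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    out = [[] for _ in local_feats]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        values = [f[lo:hi] for f in local_feats]
        # Unit-norm vectors serve as both queries and keys (assumption).
        keys = [l2_normalize(v) for v in values]
        for i, q in enumerate(keys):
            sims = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
            weights = softmax(sims)
            mixed = [
                sum(w * v[d] for w, v in zip(weights, values))
                for d in range(head_dim)
            ]
            out[i].extend(mixed)  # concatenate the heads' outputs
    return out


# Three local features with four channels: the first two are identical.
feats = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
embeddings = contextualize(feats)
```

In this toy input the two identical features attend almost entirely to each other and the dissimilar third feature attends mostly to itself, so each output stays close to its input. A lower `temperature` sharpens that behavior; a higher one averages more broadly across locations.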

Pages: 177-186

Page count: 10
