Improving Masked Autoencoders by Learning Where to Mask

Times cited: 0

Authors:

Chen, Haijian [1]

Zhang, Wendong [1]

Wang, Yunbo [1]

Yang, Xiaokang [1]

Affiliation:

[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432

Keywords:

Self-Supervised Learning; Masked Image Modeling;

DOI:

10.1007/978-981-99-8543-2_31

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Masked image modeling is a promising self-supervised learning method for visual data. It is typically built upon image patches with random masks, which largely ignores the variation of information density between them. The question is: Is there a better masking strategy than random sampling and how can we learn it? We empirically study this problem and initially find that introducing object-centric priors in mask sampling can significantly improve the learned representations. Inspired by this observation, we present AutoMAE, a fully differentiable framework that uses Gumbel-Softmax to interlink an adversarially trained mask generator and a mask-guided image modeling process. In this way, our approach can adaptively find patches with higher information density for different images, and further strike a balance between the information gain obtained from image reconstruction and its practical training difficulty. In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
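The abstract names Gumbel-Softmax as the link that keeps mask sampling differentiable. The following is a minimal sketch of the generic Gumbel-Softmax relaxation only, not the AutoMAE implementation, which this record does not describe. The function name, the per-patch logits, and the temperature value are all hypothetical choices made for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return one relaxed (soft) one-hot sample over `logits`.

    Generic Gumbel-Softmax: add Gumbel(0, 1) noise to every logit,
    divide by the temperature `tau`, then apply a softmax.
    This is an illustrative sketch, not code from the AutoMAE paper.
    """
    noisy = []
    for logit in logits:
        # Clamp u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # inverse-transform Gumbel(0, 1) sample
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical per-patch scores: the first patch is assumed to be the most informative.
patch_logits = [2.0, 0.0, 0.0, 0.0]
soft_sample = gumbel_softmax(patch_logits, tau=0.5, rng=random.Random(0))
chosen_patch = soft_sample.index(max(soft_sample))  # hard choice taken from the soft sample
```

A lower `tau` pushes each sample closer to a one-hot vector, while a higher `tau` spreads the weight more evenly across patches. In a deep learning framework the same computation would be written with tensors so that gradients can flow back into the logits.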


Pages: 377-390

Page count: 14

50 records in total; the first 10 are listed:

[1] How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
Zhang, Qi
Wang, Yifei
Wang, Yisen
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[2] Improving Visual Representations of Masked Autoencoders With Artifacts Suppression
Miao, Zhengwei
Luo, Hui
Liu, Dongxu
Zhang, Jianlin
[J]. IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2615 - 2619
[3] What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Li, Jintang
Wu, Ruofan
Sun, Wangbin
Chen, Liang
Tian, Sheng
Zhu, Liang
Meng, Changhua
Zheng, Zibin
Wang, Weiqiang
[J]. PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 1268 - 1279
[4] Enhancing Representation Learning of EEG Data with Masked Autoencoders
Zhou, Yifei
Liu, Sitong
[J]. AUGMENTED COGNITION, PT II, AC 2024, 2024, 14695 : 88 - 100
[5] Audiovisual Masked Autoencoders
Georgescu, Mariana-Iuliana
Fonseca, Eduardo
Ionescu, Radu Tudor
Lucic, Mario
Schmid, Cordelia
Arnab, Anurag
[J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16098 - 16108
[6] Siamese Masked Autoencoders
Gupta, Agrim
Wu, Jiajun
Deng, Jia
Li Fei-Fei
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[7] Gaussian Masked Autoencoders
Rajasegaran, Jathushan
Chen, Xinlei
Li, Ruilong
Feichtenhofer, Christoph
Malik, Jitendra
Ginosar, Shiry
[J]. arXiv,
[8] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
Bandara, Wele Gedara Chaminda
Patel, Naman
Gholami, Ali
Nikkhah, Mehdi
Agrawal, Motilal
Patel, Vishal M.
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 14507 - 14517
[9] SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders
Li, Gang
Zheng, Heliang
Liu, Daqing
Wang, Chaoyue
Su, Bing
Zheng, Changwen
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[10] Improving Reinforcement Learning Exploration by Autoencoders
Paczolay, Gabor
Harmati, Istvan
[J]. Periodica Polytechnica Electrical Engineering and Computer Science, 2024, 68 (04): : 335 - 343
