Improving Masked Autoencoders by Learning Where to Mask

Authors
Chen, Haijian [1]
Zhang, Wendong [1]
Wang, Yunbo [1]
Yang, Xiaokang [1]
Affiliations
[1] Shanghai Jiao Tong University, Shanghai, People's Republic of China
Keywords
Self-Supervised Learning; Masked Image Modeling
DOI
10.1007/978-981-99-8543-2_31
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Masked image modeling is a promising self-supervised learning method for visual data. It is typically built upon image patches with random masks, which largely ignores the variation of information density between them. The question is: Is there a better masking strategy than random sampling and how can we learn it? We empirically study this problem and initially find that introducing object-centric priors in mask sampling can significantly improve the learned representations. Inspired by this observation, we present AutoMAE, a fully differentiable framework that uses Gumbel-Softmax to interlink an adversarially trained mask generator and a mask-guided image modeling process. In this way, our approach can adaptively find patches with higher information density for different images, and further strike a balance between the information gain obtained from image reconstruction and its practical training difficulty. In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
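The abstract names Gumbel-Softmax as the link between the mask generator and the reconstruction model. The following is a minimal sketch of that general technique, not the authors' implementation: it assumes the generator emits one logit per patch, perturbs those logits with Gumbel noise, applies a temperature-scaled softmax, and masks the top-scoring patches. The function names, the 0.75 mask ratio, and the top-k selection step are illustrative assumptions that do not come from the record.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one Gumbel(0, 1) sample as -log(-log(U)), with U ~ Uniform(0, 1)."""
    # Clamp U away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sample_mask(patch_logits, mask_ratio=0.75, tau=1.0, rng=None):
    """Return (soft, hard) mask scores for a list of per-patch logits.

    soft: Gumbel-Softmax weights over patches (non-negative, sum to 1).
    hard: 0/1 list with round(mask_ratio * n) ones marking masked patches.
    Hypothetical helper: the selection rule is an assumption for illustration.
    """
    rng = rng or random.Random()
    # Perturb each logit with Gumbel noise, then divide by the temperature.
    perturbed = [(logit + gumbel_noise(rng)) / tau for logit in patch_logits]
    # Numerically stable softmax over the perturbed logits.
    peak = max(perturbed)
    exps = [math.exp(p - peak) for p in perturbed]
    total = sum(exps)
    soft = [e / total for e in exps]
    # Hard decision: mask the k patches with the largest soft weights.
    k = round(mask_ratio * len(patch_logits))
    order = sorted(range(len(soft)), key=lambda i: soft[i], reverse=True)
    chosen = set(order[:k])
    hard = [1 if i in chosen else 0 for i in range(len(soft))]
    return soft, hard


if __name__ == "__main__":
    # 16 patches (a 4x4 grid); higher logits stand in for "more informative".
    logits = [2.0 if i in (5, 6, 9, 10) else 0.0 for i in range(16)]
    soft, hard = sample_mask(logits, rng=random.Random(0))
    print(hard)
```

This pure-Python version only shows the sampling step. In an automatic-differentiation framework, the soft weights are what allow gradients from the reconstruction loss to reach the mask generator, typically with the hard mask used in the forward pass.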
Pages: 377-390
Number of pages: 14