Complementary Mask Self-Supervised Pre-training Based on Teacher-Student Network

Cited: 0
Authors
Ye, Shaoxiong [1 ]
Huang, Jing [1 ]
Zhu, Lifu [1 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan, Hubei, Peoples R China
Keywords
Pre-training model; Self-supervised; Masked image modeling; Contrastive learning; Encoder
DOI
10.1109/ACCTCS58815.2023.00082
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a complementary-mask self-supervised model based on a teacher-student network. The model contains a student network, a teacher network, and a mask prediction module. The student network is an encoder, and the teacher network consists of an encoder and a decoder. The teacher and student encoders learn image representations and share the same network structure and model parameters. Pre-training uses two pre-text tasks. First, the masked-patch representations produced by the teacher's decoder are fed to the mask prediction module, which predicts the actual image pixels. Second, we introduce a contrastive learning loss that compares the outputs of the teacher and student branches in representation space. We further propose a complementary masking mechanism to reduce the mismatch between upstream pre-training and downstream tasks in masked image modeling (MIM). Concretely, the same complete image is fed to both the teacher and the student network: the teacher's input is randomly masked, e.g., 75% of the patches, while the student masks the complementary remaining 25% of the image. The proposed model is pre-trained on COCO and other datasets, and downstream tasks are performed on four standard datasets. Comparisons with several recent self-supervised pre-trained models show that the proposed model learns better representations.
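
To make the complementary masking mechanism and the two pre-text tasks concrete, the following is a minimal PyTorch sketch, not the authors' code: the teacher branch masks a random 75% of the patches and reconstructs their pixels through a decoder and mask prediction head, the student branch masks the complementary 25%, and a contrastive loss aligns the pooled representations of the two branches. The tiny transformer sizes, the mask-token handling, and the InfoNCE-style form of the contrastive loss are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


def complementary_masks(num_patches, teacher_ratio=0.75):
    """Boolean masks (True = masked patch). The student mask is the exact
    complement of the teacher mask, so the two branches jointly cover the image."""
    num_masked = int(num_patches * teacher_ratio)
    perm = torch.randperm(num_patches)
    teacher_mask = torch.zeros(num_patches, dtype=torch.bool)
    teacher_mask[perm[:num_masked]] = True
    return teacher_mask, ~teacher_mask


class PatchEncoder(nn.Module):
    """Shared encoder used by both branches (the abstract states the teacher
    and student encoders share structure and parameters)."""

    def __init__(self, patch_dim=768, embed_dim=128):
        super().__init__()
        self.embed = nn.Linear(patch_dim, embed_dim)
        self.mask_token = nn.Parameter(torch.zeros(embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, patches, mask):
        x = self.embed(patches)
        # Replace the embeddings of masked patches with a learned mask token.
        x = torch.where(mask[None, :, None], self.mask_token.expand_as(x), x)
        return self.blocks(x)


class PixelDecoder(nn.Module):
    """Teacher-side decoder plus mask prediction head mapping encoded tokens
    back to raw patch pixels."""

    def __init__(self, patch_dim=768, embed_dim=128):
        super().__init__()
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=1)
        self.to_pixels = nn.Linear(embed_dim, patch_dim)

    def forward(self, tokens):
        return self.to_pixels(self.blocks(tokens))


def pretraining_losses(patches, encoder, decoder, temperature=0.1):
    """One training step combining the two pre-text tasks described above."""
    _, num_patches, _ = patches.shape
    t_mask, s_mask = complementary_masks(num_patches)

    t_tokens = encoder(patches, t_mask)   # teacher branch: ~75% of patches masked
    s_tokens = encoder(patches, s_mask)   # student branch: the complementary ~25%

    # Pre-task 1: reconstruct the pixels of the teacher's masked patches.
    recon = decoder(t_tokens)
    recon_loss = F.mse_loss(recon[:, t_mask], patches[:, t_mask])

    # Pre-task 2: contrastive alignment of pooled teacher/student features
    # (InfoNCE-style is an assumption; the abstract only says the two outputs
    # are compared in representation space).
    t_repr = F.normalize(t_tokens.mean(dim=1), dim=-1)
    s_repr = F.normalize(s_tokens.mean(dim=1), dim=-1)
    logits = t_repr @ s_repr.t() / temperature
    targets = torch.arange(logits.size(0))
    contrastive_loss = F.cross_entropy(logits, targets)

    return recon_loss + contrastive_loss


if __name__ == "__main__":
    # Toy usage: a batch of 2 images, each flattened into 196 patches of dim 768.
    patches = torch.randn(2, 196, 768)
    encoder, decoder = PatchEncoder(), PixelDecoder()
    print(float(pretraining_losses(patches, encoder, decoder)))
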
Pages: 199-206
Number of pages: 8
Related Papers
50 records in total
  • [31] An Adapter Based Pre-training for Efficient and Scalable Self-Supervised Speech Representation Learning. Kessler, Samuel; Thomas, Bethan; Karout, Salah. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 3179-3183
  • [32] Masked self-supervised pre-training model for EEG-based emotion recognition. Hu, Xinrong; Chen, Yu; Yan, Jinlin; Wu, Yuan; Ding, Lei; Xu, Jin; Cheng, Jun. Computational Intelligence, 2024, 40(03)
  • [33] Token Boosting for Robust Self-Supervised Visual Transformer Pre-training. Li, Tianjiao; Foo, Lin Geng; Hu, Ping; Shang, Xindi; Rahmani, Hossein; Yuan, Zehuan; Liu, Jun. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 24027-24038
  • [34] Joint Encoder-Decoder Self-Supervised Pre-training for ASR. Arunkumar, A.; Umesh, S. Interspeech 2022, 2022: 3418-3422
  • [35] Individualized Stress Mobile Sensing Using Self-Supervised Pre-Training. Islam, Tanvir; Washington, Peter. Applied Sciences-Basel, 2023, 13(21)
  • [36] Self-Supervised Pre-training for Protein Embeddings Using Tertiary Structures. Guo, Yuzhi; Wu, Jiaxiang; Ma, Hehuan; Huang, Junzhou. Thirty-Sixth AAAI Conference on Artificial Intelligence / Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence / Twelfth Symposium on Educational Advances in Artificial Intelligence, 2022: 6801-6809
  • [37] Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training. Huang, Sung-Feng; Chuang, Shun-Po; Liu, Da-Rong; Chen, Yi-Chen; Yang, Gene-Ping; Lee, Hung-yi. Interspeech 2021, 2021: 3056-3060
  • [38] Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training. Zhang, Bowen; Cao, Songjun; Zhang, Xiaoming; Zhang, Yike; Ma, Long; Shinozaki, Takahiro. Interspeech 2022, 2022: 2653-2657
  • [39] Self-Supervised Pre-training Model Based on Multi-view for MOOC Recommendation. Tian, R.; Cai, J.; Li, C.; Wang, J. Expert Systems with Applications, 2024, 252
  • [40] SslTransT: Self-supervised pre-training visual object tracking with Transformers. Cai, Yannan; Tan, Ke; Wei, Zhenzhong. Optics Communications, 2024, 557