Cross-Modal Knowledge Distillation with Dropout-Based Confidence

Cited by: 0
Authors
Cho, Won Ik [1 ]
Kim, Jeunghun [2 ,3 ]
Kim, Nam Soo [2 ,3 ]
Affiliations
[1] Samsung Adv Inst Technol, Samsung Elect, Suwon, South Korea
[2] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[3] Seoul Natl Univ, INMC, Seoul, South Korea
Keywords
dropout; uncertainty; cross-modal distillation; spoken language understanding
DOI
None available
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In cross-modal distillation, e.g., from a text-based inference module to a spoken language understanding module, it is difficult to determine the teacher's influence because the two modalities differ in nature and thus exhibit heterogeneous uncertainty. Although error-rate- and entropy-based schemes have been proposed to replace heuristic time-based scheduling, the confidence of the teacher's inference has not necessarily been taken into account when deciding the teacher's influence. In this paper, we propose a dropout-based confidence measure that determines the teacher's confidence and its influence on the student's loss. On the widely used spoken language understanding dataset Fluent Speech Commands, we show that our weight decision scheme improves performance in combination with conventional scheduling strategies, yielding up to a 20% relative error reduction over the model trained without distillation.
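The abstract does not specify how the dropout-based confidence is computed, but the standard way to turn dropout into an uncertainty estimate is Monte Carlo dropout: keep dropout active at inference, run several stochastic forward passes through the teacher, and treat the agreement of the averaged prediction as confidence. The sketch below, with hypothetical function names (`mc_dropout_confidence`, `distillation_loss`) and an illustrative entropy-based confidence and loss-weighting scheme not taken from the paper, shows how such a per-sample confidence could scale the teacher's influence on the student loss:

```python
import torch
import torch.nn.functional as F

def mc_dropout_confidence(teacher, x, n_passes=10):
    """Estimate teacher confidence via Monte Carlo dropout.

    Dropout stays active (teacher.train()) so each forward pass is
    stochastic; low entropy of the averaged prediction means the
    passes agree, i.e., the teacher is confident.
    """
    teacher.train()  # keep dropout layers sampling at inference
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(teacher(x), dim=-1) for _ in range(n_passes)]
        )
    mean_probs = probs.mean(dim=0)
    # Normalized predictive entropy in [0, 1]; confidence = 1 - entropy.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(mean_probs.size(-1))))
    confidence = 1.0 - entropy / max_entropy
    return mean_probs, confidence

def distillation_loss(student_logits, teacher_probs, confidence,
                      labels, alpha=0.5):
    """Blend cross-entropy and KD loss, with the teacher's per-sample
    weight scaled by its dropout-based confidence."""
    ce = F.cross_entropy(student_logits, labels, reduction="none")
    kd = F.kl_div(F.log_softmax(student_logits, dim=-1),
                  teacher_probs, reduction="none").sum(dim=-1)
    w = alpha * confidence  # confident teacher -> stronger influence
    return ((1.0 - w) * ce + w * kd).mean()
```

A confident teacher (low MC-dropout entropy) pushes more of the per-sample loss weight onto the distillation term; an uncertain one lets the ground-truth cross-entropy dominate, which is one plausible reading of "deciding the teacher's influence" from uncertainty.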
Pages: 653-657
Page count: 5
Related Papers (50 items)
  • [1] CROSS-MODAL KNOWLEDGE DISTILLATION FOR ACTION RECOGNITION
    Thoker, Fida Mohammad
    Gall, Juergen
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 6 - 10
  • [2] Acoustic NLOS Imaging with Cross-Modal Knowledge Distillation
    Shin, Ui-Hyeon
    Jang, Seungwoo
    Kim, Kwangsu
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 1405 - 1413
  • [3] Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval
    Li, Mingyong
    Wang, Hongya
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 183 - 191
  • [4] Latent Space Semantic Supervision Based on Knowledge Distillation for Cross-Modal Retrieval
    Zhang, Li
    Wu, Xiangqian
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 7154 - 7164
  • [5] Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation
    Takashima, Yuki
    Takashima, Ryoichi
    Tsunoda, Ryota
    Aihara, Ryo
    Takiguchi, Tetsuya
    Ariki, Yasuo
    Motoyama, Nobuaki
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [7] CKDH: CLIP-Based Knowledge Distillation Hashing for Cross-Modal Retrieval
    Li, Jiaxing
    Wong, Wai Keung
    Jiang, Lin
    Fang, Xiaozhao
    Xie, Shengli
    Xu, Yong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 6530 - 6541
  • [8] Semi-Supervised Knowledge Distillation for Cross-Modal Hashing
    Su, Mingyue
    Gu, Guanghua
    Ren, Xianlong
    Fu, Hao
    Zhao, Yao
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 662 - 675
  • [9] Cross-modal knowledge distillation for continuous sign language recognition
    Gao, Liqing
    Shi, Peng
    Hu, Lianyu
    Feng, Jichao
    Zhu, Lei
    Wan, Liang
    Feng, Wei
    [J]. NEURAL NETWORKS, 2024, 179
  • [10] Progressive Cross-modal Knowledge Distillation for Human Action Recognition
    Ni, Jianyuan
    Ngu, Anne H. H.
    Yan, Yan
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5903 - 5912