MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition

Cited by: 0
Authors
Li, Xinjie [1]
Xu, Huijuan [1]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The long-tailed video recognition problem is especially challenging, as videos tend to be long and untrimmed, and each video may contain multiple classes, causing frame-level class imbalance. Previous work tackles long-tailed video recognition only through frame-level sampling for class rebalancing, without distinguishing the frame-level feature representations of head and tail classes. To improve the frame-level feature representation of tail classes, we modulate the frame-level features with an auxiliary distillation loss that reduces the distribution distance between head and tail classes. Moreover, we design a mixture-of-experts framework with two different expert designs: the first expert uses an attention-based classification network to handle the original long-tailed distribution, while the second expert deals with the re-balanced distribution produced by class-balanced sampling. Notably, in the second expert, we specifically focus on the frames left unresolved by the first expert through a complementary frame selection module, which inherits the attention weights from the first expert and selects frames with low attention weights, and we also enhance the motion feature representation for these selected frames. To highlight the multi-label challenge in long-tailed video recognition, we create two additional benchmarks based on Charades and CharadesEgo videos with the multi-label property, called CharadesLT and CharadesEgoLT. Extensive experiments are conducted on the existing long-tailed video benchmark VideoLT and the two new benchmarks to verify the effectiveness of our proposed method, which achieves state-of-the-art performance. The code and proposed benchmarks are released at https://github.com/VisionLanguageLab/MEID.
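The following is a minimal PyTorch sketch of the two-expert pipeline described in the abstract, written under assumptions about tensor shapes and the exact loss form; the module names (AttentionExpert, complementary_frame_selection, head_tail_distill_loss), the equal-weight expert fusion, and the L2-to-prototype distillation term are illustrative stand-ins rather than the released MEID implementation (see the repository linked above for the authors' code).

# Minimal sketch (not the authors' code): a two-expert pipeline in the spirit of MEID,
# with a hypothetical frame-level distillation term. Shapes and loss form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionExpert(nn.Module):
    """Attention-pooled frame features -> multi-label logits."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.att = nn.Linear(feat_dim, 1)        # per-frame attention score
        self.cls = nn.Linear(feat_dim, num_classes)

    def forward(self, frames: torch.Tensor):
        # frames: (B, T, D) frame-level features
        att = torch.softmax(self.att(frames).squeeze(-1), dim=1)   # (B, T)
        pooled = torch.einsum("bt,btd->bd", att, frames)           # attention pooling
        return self.cls(pooled), att


def complementary_frame_selection(frames, att, k):
    """Pick the k frames Expert 1 attends to least, for Expert 2 to re-examine."""
    idx = att.topk(k, dim=1, largest=False).indices                # lowest attention weights
    return torch.gather(frames, 1, idx.unsqueeze(-1).expand(-1, -1, frames.size(-1)))


def head_tail_distill_loss(frame_feats, proto_ids, head_prototypes):
    """Hypothetical auxiliary loss: pull clip-level features toward head-class
    prototypes to shrink the head/tail feature-distribution gap."""
    clip_feat = frame_feats.mean(dim=1)                            # (B, D)
    target = head_prototypes[proto_ids]                            # (B, D), assignment assumed given
    return F.mse_loss(clip_feat, target)


if __name__ == "__main__":
    B, T, D, C, k = 4, 16, 256, 10, 4
    frames = torch.randn(B, T, D)
    expert1 = AttentionExpert(D, C)            # trained on the original long-tailed data
    expert2 = AttentionExpert(D, C)            # trained on class-balanced re-sampled data

    logits1, att = expert1(frames)
    low_att_frames = complementary_frame_selection(frames, att, k)
    logits2, _ = expert2(low_att_frames)
    final_logits = 0.5 * (logits1 + logits2)   # simple expert fusion (assumed)

    head_prototypes = torch.randn(C, D)        # placeholder head-class feature means
    proto_ids = torch.randint(0, C, (B,))      # placeholder prototype assignments
    aux = head_tail_distill_loss(frames, proto_ids, head_prototypes)
    print(final_logits.shape, aux.item())

In this sketch, the second expert only sees the frames that received the lowest attention from the first expert, mirroring the complementary frame selection idea; the auxiliary loss stands in for the paper's distillation objective that narrows the gap between head- and tail-class feature distributions.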
Pages: 1451-1459
Page count: 9