MEID: Mixture-of-Experts with Internal Distillation for Long-Tailed Video Recognition

Cited by: 0
Authors
Li, Xinjie [1]
Xu, Huijuan [1]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The long-tailed video recognition problem is especially challenging, as videos tend to be long and untrimmed, and each video may contain multiple classes, causing frame-level class imbalance. Previous work tackles long-tailed video recognition only through frame-level sampling for class rebalancing, without distinguishing the frame-level feature representations of head and tail classes. To improve the frame-level feature representation of tail classes, we modulate the frame-level features with an auxiliary distillation loss that reduces the distribution distance between head and tail classes. Moreover, we design a mixture-of-experts framework with two different expert designs: the first expert uses an attention-based classification network to handle the original long-tailed distribution, while the second expert deals with the re-balanced distribution produced by class-balanced sampling. Notably, in the second expert, we specifically focus on the frames left unresolved by the first expert through a complementary frame selection module, which inherits the attention weights from the first expert and selects frames with low attention weights, and we also enhance the motion feature representation for these selected frames. To highlight the multi-label challenge in long-tailed video recognition, we create two additional benchmarks based on Charades and CharadesEgo videos with the multi-label property, called CharadesLT and CharadesEgoLT. Extensive experiments are conducted on the existing long-tailed video benchmark VideoLT and the two new benchmarks to verify the effectiveness of our proposed method, which achieves state-of-the-art performance. The code and proposed benchmarks are released at https://github.com/VisionLanguageLab/MEID.
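The following is a minimal PyTorch sketch of the two-expert pipeline described in the abstract, written under assumptions about tensor shapes and the exact loss form; the module names (AttentionExpert, complementary_frame_selection, head_tail_distill_loss), the equal-weight expert fusion, and the L2-to-prototype distillation term are illustrative stand-ins rather than the released MEID implementation (see the repository linked above for the authors' code).

# Minimal sketch (not the authors' code): a two-expert pipeline in the spirit of MEID,
# with a hypothetical frame-level distillation term. Shapes and loss form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionExpert(nn.Module):
    """Attention-pooled frame features -> multi-label logits."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.att = nn.Linear(feat_dim, 1)        # per-frame attention score
        self.cls = nn.Linear(feat_dim, num_classes)

    def forward(self, frames: torch.Tensor):
        # frames: (B, T, D) frame-level features
        att = torch.softmax(self.att(frames).squeeze(-1), dim=1)   # (B, T)
        pooled = torch.einsum("bt,btd->bd", att, frames)           # attention pooling
        return self.cls(pooled), att


def complementary_frame_selection(frames, att, k):
    """Pick the k frames Expert 1 attends to least, for Expert 2 to re-examine."""
    idx = att.topk(k, dim=1, largest=False).indices                # lowest attention weights
    return torch.gather(frames, 1, idx.unsqueeze(-1).expand(-1, -1, frames.size(-1)))


def head_tail_distill_loss(frame_feats, proto_ids, head_prototypes):
    """Hypothetical auxiliary loss: pull clip-level features toward head-class
    prototypes to shrink the head/tail feature-distribution gap."""
    clip_feat = frame_feats.mean(dim=1)                            # (B, D)
    target = head_prototypes[proto_ids]                            # (B, D), assignment assumed given
    return F.mse_loss(clip_feat, target)


if __name__ == "__main__":
    B, T, D, C, k = 4, 16, 256, 10, 4
    frames = torch.randn(B, T, D)
    expert1 = AttentionExpert(D, C)            # trained on the original long-tailed data
    expert2 = AttentionExpert(D, C)            # trained on class-balanced re-sampled data

    logits1, att = expert1(frames)
    low_att_frames = complementary_frame_selection(frames, att, k)
    logits2, _ = expert2(low_att_frames)
    final_logits = 0.5 * (logits1 + logits2)   # simple expert fusion (assumed)

    head_prototypes = torch.randn(C, D)        # placeholder head-class feature means
    proto_ids = torch.randint(0, C, (B,))      # placeholder prototype assignments
    aux = head_tail_distill_loss(frames, proto_ids, head_prototypes)
    print(final_logits.shape, aux.item())

In this sketch, the second expert only sees the frames that received the lowest attention from the first expert, mirroring the complementary frame selection idea; the auxiliary loss stands in for the paper's distillation objective that narrows the gap between head- and tail-class feature distributions.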
Pages: 1451-1459
Page count: 9