50 items in total
- [21] Multi-Modal Information Fusion for News Story Segmentation in Broadcast Video. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, pp. 1417-1420
- [22] Video Visual Relation Detection via Multi-modal Feature Fusion. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, pp. 2657-2661
- [23] Language-guided Multi-Modal Fusion for Video Action Recognition. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, pp. 3151-3155
- [24] MMTF: Multi-Modal Temporal Fusion for Commonsense Video Question Answering. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, pp. 4659-4664
- [25] Fusion of Learned Multi-Modal Representations and Dense Trajectories for Emotional Analysis in Videos. 2015 13TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2015
- [26] Video Relation Detection with Trajectory-aware Multi-modal Features. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, pp. 4590-4594
- [29] MRCap: Multi-modal and Multi-level Relationship-based Dense Video Captioning. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, pp. 2615-2620