50 entries in total
- [42] Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition [J]. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 2023 : 5076 - 5084
- [44] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception [J]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023 : 18827 - 18836
- [46] Specialty may be better: A decoupling multi-modal fusion network for Audio-visual event localization [J]. 2023 International Joint Conference on Neural Networks, IJCNN, 2023
- [47] Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval [J]. 2022 IEEE International Symposium on Multimedia (ISM), 2022 : 1 - 9
- [50] Hierarchical cross-modal contextual attention network for visual grounding [J]. Multimedia Systems, 2023, 29 : 2073 - 2083