50 items in total
- [1] Temporal Cross-Modal Attention for Audio-Visual Event Localization [J]. Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering, 2022, 88(03): 263-268
- [3] Cross-modal Background Suppression for Audio-Visual Event Localization [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 19957-19966
- [4] Audio-Visual Event Localization based on Cross-Modal Interacting Guidance [J]. 2021 IEEE FOURTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE 2021), 2021: 104-107
- [5] Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 3893-3901
- [6] Cross-Modal Label Contrastive Learning for Unsupervised Audio-Visual Event Localization [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 215-222
- [7] Temporal and Cross-modal Attention for Audio-Visual Zero-Shot Learning [J]. COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680: 488-505
- [8] Audio-visual Speaker Recognition with a Cross-modal Discriminative Network [J]. INTERSPEECH 2020, 2020: 2242-2246
- [9] Deep Cross-Modal Audio-Visual Generation [J]. PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017: 349-357
- [10] Cross-modal prediction in audio-visual communication [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 2056-2059