50 entries in total
- [42] Audio-Visual Attention Networks for Emotion Recognition. AVSU'18: PROCEEDINGS OF THE 2018 WORKSHOP ON AUDIO-VISUAL SCENE UNDERSTANDING FOR IMMERSIVE MULTIMEDIA, 2018: 27-32
- [43] Video Coding Based on Audio-Visual Attention. ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009: 57-60
- [44] Localize to Binauralize: Audio Spatialization from Visual Sound Source Localization. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1910-1919
- [46] A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 2485-2494
- [47] Cross-Modal Attention Network for Temporal Inconsistent Audio-Visual Event Localization. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 279-286
- [48] Span-based Audio-Visual Localization. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 1252-1260
- [49] Audio-Visual Event Localization in Unconstrained Videos. COMPUTER VISION - ECCV 2018, PT II, 2018, 11206: 252-268
- [50] Scene recognition with audio-visual sensor fusion. Multisensor, Multisource Information Fusion: Architectures, Algorithms and Applications 2005, 2005, 5813: 201-210