50 items in total
- [2] Attention-Based Audio-Visual Fusion for Video Summarization [J]. NEURAL INFORMATION PROCESSING (ICONIP 2019), PT II, 2019, 11954 : 328 - 340
- [3] Noise-Tolerant Learning for Audio-Visual Action Recognition [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7761 - 7774
- [5] Attention Fusion for Audio-Visual Person Verification Using Multi-Scale Features [J]. 2020 15TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2020), 2020 : 281 - 285
- [6] Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition [J]. ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2018 : 111 - 115
- [7] A Deep Neural Network for Audio-Visual Person Recognition [J]. 2015 IEEE 7TH INTERNATIONAL CONFERENCE ON BIOMETRICS THEORY, APPLICATIONS AND SYSTEMS (BTAS 2015), 2015
- [8] Multi-Attention Audio-Visual Fusion Network for Audio Spatialization [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021 : 394 - 401
- [9] Noise-Tolerant Self-Supervised Learning for Audio-Visual Voice Activity Detection [J]. INTERSPEECH 2021, 2021 : 326 - 330
- [10] Fuzzy-Neural-Network Based Audio-Visual Fusion for Speech Recognition [J]. 2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC 2019), 2019 : 210 - 214