50 entries in total
- [23] Transfer Learning from Audio-Visual Grounding to Speech Recognition. INTERSPEECH 2019, 2019: 3242-3246
- [24] BAUM-2: A Multilingual Audio-Visual Affective Face Database. Multimedia Tools and Applications, 2015, 74: 7429-7459
- [26] Multimodal Learning Using 3D Audio-Visual Data for Audio-Visual Speech Recognition. 2017 International Conference on Asian Language Processing (IALP), 2017: 40-43
- [28] Learning Bimodal Structure in Audio-Visual Data. IEEE Transactions on Neural Networks, 2009, 20(12): 1898-1910
- [29] Adversarial Input Ablation for Audio-Visual Learning. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7742-7746
- [30] Audio-Visual Speech Inpainting with Deep Learning. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 6653-6657