共 50 条
- [1] Noise-Tolerant Self-Supervised Learning for Audio-Visual Voice Activity Detection [J]. INTERSPEECH 2021, 2021, : 326 - 330
- [2] AUDIO-VISUAL DEEP LEARNING FOR NOISE ROBUST SPEECH RECOGNITION [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7596 - 7599
- [3] LEARNING CONTEXTUALLY FUSED AUDIO-VISUAL REPRESENTATIONS FOR AUDIO-VISUAL SPEECH RECOGNITION [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1346 - 1350
- [4] SELF-SUPERVISED CONTRASTIVE LEARNING FOR AUDIO-VISUAL ACTION RECOGNITION [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1000 - 1004
- [5] NOISE-TOLERANT AUDIO-VISUAL ONLINE PERSON VERIFICATION USING AN ATTENTION-BASED NEURAL NETWORK FUSION [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3995 - 3999
- [7] DEEP MULTIMODAL LEARNING FOR AUDIO-VISUAL SPEECH RECOGNITION [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2130 - 2134
- [8] Multimodal Learning Using 3D Audio-Visual Data or Audio-Visual Speech Recognition [J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 40 - 43
- [9] Audio-visual speech recognition using deep learning [J]. APPLIED INTELLIGENCE, 2015, 42 (04) : 722 - 737