50 entries in total
- [1] Learning Contextually Fused Audio-Visual Representations for Audio-Visual Speech Recognition. 2022 IEEE International Conference on Image Processing (ICIP), 2022: 1346-1350
- [3] Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol 1: (Long Papers), 2022: 4491-4503
- [4] Deep Multimodal Learning for Audio-Visual Speech Recognition. 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015: 2130-2134
- [7] Modality Attention for End-to-End Audio-Visual Speech Recognition. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019: 6565-6569
- [9] An Audio-Visual Speech Recognition with a New Mandarin Audio-Visual Database. Int Conf on Cybernetics and Information Technologies, Systems and Applications/Int Conf on Computing, Communications and Control Technologies, Vol 1, 2007: 19-+
- [10] Audio-Visual Biometric Recognition via Joint Sparse Representations. 2016 23rd International Conference on Pattern Recognition (ICPR), 2016: 3031-3035