50 entries in total
- [41] Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 4491 - 4503
- [42] Multimodal English corpus for automatic speech recognition [J]. 2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013: 106 - 111
- [43] A corpus of audio-visual Lombard speech with frontal and profile views [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (06): EL523 - EL529
- [48] A Phone-Viseme Dynamic Bayesian Network for Audio-Visual Automatic Speech Recognition [J]. 19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008: 2597 - 2600
- [49] Speaker independent audio-visual continuous speech recognition [J]. IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002: A25 - A28
- [50] Audio-visual fuzzy fusion for robust speech recognition [J]. 2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013