50 items in total
- [2] AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. INTERSPEECH 2021, 2021: 1584-1588
- [3] Multilingual Audio-Visual Smartphone Dataset and Evaluation. IEEE ACCESS, 2021, 9: 153240-153257
- [4] Audio-Visual Event Localization in Unconstrained Videos. COMPUTER VISION - ECCV 2018, PT II, 2018, 11206: 252-268
- [5] Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos. 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020: 5671-5672
- [10] Learning Contextually Fused Audio-Visual Representations for Audio-Visual Speech Recognition. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022: 1346-1350