50 items in total
- [31] Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [32] Audio-visual event detection based on mining of semantic audio-visual labels STORAGE AND RETRIEVAL METHODS AND APPLICATIONS FOR MULTIMEDIA 2004, 2004, 5307 : 292 - 299
- [34] Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020 : 5671 - 5672
- [35] SCLAV: Supervised Cross-modal Contrastive Learning for Audio-Visual Coding PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 261 - 270
- [40] A Robust Audio-visual Speech Recognition Using Audio-visual Voice Activity Detection 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010 : 2702 - +