50 items in total
- [1] A Pre-trained Audio-Visual Transformer for Emotion Recognition. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 4698-4702
- [2] PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification. INTERSPEECH 2022, 2022: 1431-1435
- [4] Self-Supervised Learning for Audio-Visual Speaker Diarization. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 4367-4371
- [5] Speaker Diarization Based on Audio-Visual Integration for Smart Posterboard. 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014
- [7] AVA-AVD: Audio-Visual Speaker Diarization in the Wild. Proceedings of the 30th ACM International Conference on Multimedia, MM 2022, 2022: 3838-3847
- [9] Audio-Visual Speaker Diarization Using Fisher Linear Semi-Discriminant Analysis. Multimedia Tools and Applications, 2016, 75: 115-130
- [10] DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization. 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022