50 entries in total
- [21] Speaker position detection system using audio-visual information. FUJITSU SCIENTIFIC & TECHNICAL JOURNAL, 1999, 35 (02): 212-220
- [22] Audio-Visual Speech Synchronization Detection Using a Bimodal Linear Prediction Model. 2009 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPR WORKSHOPS 2009), VOLS 1 AND 2, 2009: 670-+
- [23] Dynamic visual features for audio-visual speaker verification. COMPUTER SPEECH AND LANGUAGE, 2010, 24 (02): 136-149
- [25] Performance enhancement for audio-visual speaker identification using dynamic facial muscle model. Medical and Biological Engineering and Computing, 2006, 44: 919-930
- [28] Multimodal SpeakerBeam: Single channel target speech extraction with audio-visual speaker clues. INTERSPEECH 2019, 2019: 2718-2722
- [29] Emotion Recognition with Pre-Trained Transformers Using Multimodal Signals. 2022 10TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2022
- [30] Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 4031-4041