50 entries in total
- [1] Multi-Speaker Tracking by Fusing Audio and Video Information [J]. 2021 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2021, : 321 - 325
- [2] Audio segmentation and speaker localization in meeting videos [J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS, 2006, : 1150 - +
- [3] Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries [J]. INTERSPEECH 2022, 2022, : 605 - 609
- [5] Speaker detection using multi-speaker audio files for both enrollment and test [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PROCEEDINGS: SPEECH II; INDUSTRY TECHNOLOGY TRACKS; DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS; NEURAL NETWORKS FOR SIGNAL PROCESSING, 2003, : 77 - 80
- [7] Exploiting the Complementarity of Audio and Visual Data in Multi-Speaker Tracking [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017, : 446 - 454
- [8] Multi-speaker DoA Estimation Using Audio and Visual Modality [J]. NEURAL PROCESSING LETTERS, 2023, 55 (07) : 8887 - 8901
- [9] Audio Visual Multi-Speaker Tracking with Improved GCF and PMBM Filter [J]. INTERSPEECH 2022, 2022, : 3704 - 3708
- [10] Audio-Visual Multi-Speaker Tracking Based On the GLMB Framework [J]. INTERSPEECH 2020, 2020, : 3082 - 3086