Speaker Diarization with Session-Level Speaker Embedding Refinement Using Graph Neural Networks

Cited by: 0

Authors:

Wang, Jixuan [1]

Xiao, Xiong [2]

Wu, Jian [2]

Ramamurthy, Ranjani [2]

Rudzicz, Frank [1]

Brudno, Michael [1]

Affiliations:

[1] University of Toronto, Canada

[2] Microsoft, United States

Source:

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | 2020 / Vol. 2020-May

Keywords:

Compendex;

DOI:

9054176

CLC number:

Subject classification code:

Abstract:

Graph neural networks - Speech recognition - Deep neural networks - Clustering algorithms - Matrix algebra

Pages: 7109-7113

50 entries in total

[31] DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
Wuerkaixi, Abudukelimu
Yan, Kunda
Zhang, You
Duan, Zhiyao
Zhang, Changshui
[J]. 2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2022
[32] Age-Invariant Speaker Embedding for Diarization of Cognitive Assessments
Xu, Sean Shensheng
Mak, Man-Wai
Wong, Ka Ho
Meng, Helen
Kwok, Timothy C. Y.
[J]. 2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021
[33] Graph-Embedding for Speaker Recognition
Karam, Zahi N.
Campbell, William M.
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2750 - +
[34] Speaker identification using neural networks
Pawar, RV
Kajave, PP
Mali, SN
[J]. ENFORMATIKA, VOL 7: IEC 2005 PROCEEDINGS, 2005, : 429 - 433
[35] Speaker Identification using Neural Networks
Pawar, R. V.
Kajave, P. P.
Mali, S. N.
[J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 7, 2005, 7 : 429 - 433
[36] A Comparison of Neural Network Feature Transforms for Speaker Diarization
Yella, Sree Harsha
Stolcke, Andreas
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3026 - 3030
[37] Speech refinement using Bi-LSTM and improved spectral clustering in speaker diarization
Gupta, Aishwarya
Purwar, Archana
[J]. Multimedia Tools and Applications, 2024, 83 : 54433 - 54448
[38] Speech refinement using Bi-LSTM and improved spectral clustering in speaker diarization
Gupta, Aishwarya
Purwar, Archana
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (18) : 54433 - 54448
[39] Multimodal Speaker Segmentation and Diarization using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks
Park, Tae Jin
Georgiou, Panayiotis
[J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1373 - 1377
[40] Speaker Diarization Using a priori Acoustic Information
Aronowitz, Hagai
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 944 - 947