Speaker Diarization with Session-Level Speaker Embedding Refinement Using Graph Neural Networks

Cited by: 0
Authors
Wang, Jixuan [1]
Xiao, Xiong [2]
Wu, Jian [2]
Ramamurthy, Ranjani [2]
Rudzicz, Frank [1]
Brudno, Michael [1]
Affiliations
[1] University of Toronto, Canada
[2] Microsoft, United States
Keywords
Compendex;
DOI
9054176
Abstract
Graph neural networks - Speech recognition - Deep neural networks - Clustering algorithms - Matrix algebra
Pages: 7109 - 7113
Related papers
50 in total
  • [31] DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
    Wuerkaixi, Abudukelimu
    Yan, Kunda
    Zhang, You
    Duan, Zhiyao
    Zhang, Changshui
    [J]. 2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2022,
  • [32] Age-Invariant Speaker Embedding for Diarization of Cognitive Assessments
    Xu, Sean Shensheng
    Mak, Man-Wai
    Wong, Ka Ho
    Meng, Helen
    Kwok, Timothy C. Y.
    [J]. 2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [33] Graph-Embedding for Speaker Recognition
    Karam, Zahi N.
    Campbell, William M.
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2750 - +
  • [34] Speaker identification using neural networks
    Pawar, RV
    Kajave, PP
    Mali, SN
    [J]. ENFORMATIKA, VOL 7: IEC 2005 PROCEEDINGS, 2005, : 429 - 433
  • [35] Speaker Identification using Neural Networks
    Pawar, R. V.
    Kajave, P. P.
    Mali, S. N.
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 7, 2005, 7 : 429 - 433
  • [36] A Comparison of Neural Network Feature Transforms for Speaker Diarization
    Yella, Sree Harsha
    Stolcke, Andreas
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3026 - 3030
  • [37] Speech refinement using Bi-LSTM and improved spectral clustering in speaker diarization
    Gupta, Aishwarya
    Purwar, Archana
    [J]. Multimedia Tools and Applications, 2024, 83 : 54433 - 54448
  • [38] Speech refinement using Bi-LSTM and improved spectral clustering in speaker diarization
    Gupta, Aishwarya
    Purwar, Archana
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (18) : 54433 - 54448
  • [39] Multimodal Speaker Segmentation and Diarization using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks
    Park, Tae Jin
    Georgiou, Panayiotis
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1373 - 1377
  • [40] Speaker Diarization Using a priori Acoustic Information
    Aronowitz, Hagai
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 944 - 947