Speaker Diarization for Meeting Room Audio

Authors
Sun, Hanwu [1]
Nwe, Tin Lay [1]
Ma, Bin [1]
Li, Haizhou [1]
Affiliation
[1] ASTAR, I2R, Singapore 138632, Singapore
Keywords
Multiple Distant Microphone; speaker diarization; time difference of arrival; speech activity detection; speaker clustering;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a speaker diarization system developed for the 2007 NIST Rich Transcription (RT07) Meeting Recognition Evaluation, addressing the Multiple Distant Microphone (MDM) task in meeting room scenarios. The system comprises three major modules: data preparation, initial speaker clustering, and cluster purification/merging. Data preparation consists of Wiener filtering and beamforming of the raw data, Time Difference of Arrival (TDOA) estimation, and speech activity detection. On the preprocessed data, a two-stage histogram quantization performs the initial speaker clustering. A modified purification strategy based on high-order GMM clustering is proposed, and the Bayesian Information Criterion (BIC) is applied for cluster merging. The system achieves a competitive overall diarization error rate (DER) of 8.31% on the RT07 MDM speaker diarization task.
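The abstract names BIC as the criterion for cluster merging but does not give the formulation used. As a rough illustration only, the sketch below computes the common textbook ΔBIC between two clusters, assuming each cluster is modeled by a single full-covariance Gaussian. The function name `delta_bic`, the `penalty_weight` parameter, and the small covariance regularizer are assumptions made for this example and are not taken from the paper, whose system uses GMM-based clustering.

```python
import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Textbook delta-BIC between two clusters of feature vectors.

    x, y: arrays of shape (n_frames, n_dims).
    Each cluster is modeled by ONE full-covariance Gaussian (an assumption
    for this sketch; the paper's own models are not specified here).
    Sign convention used: a negative value favors merging the clusters,
    a positive value favors keeping them separate.
    """
    z = np.vstack([x, y])
    n_x, n_y, n_z = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet_cov(a):
        # Maximum-likelihood covariance, lightly regularized for stability.
        cov = np.cov(a, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Log-likelihood gain from modeling the two clusters separately.
    gain = 0.5 * (n_z * logdet_cov(z)
                  - n_x * logdet_cov(x)
                  - n_y * logdet_cov(y))

    # Extra parameters of the two-model hypothesis: one mean vector
    # plus one symmetric covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n_z)

    return gain - penalty
```

Under this convention, a merging loop would repeatedly merge the cluster pair with the lowest ΔBIC and stop once no pair scores below zero. The `penalty_weight` value is a tuning choice in practice.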
Pages: 888-891
Page count: 4