SPEAKER DIARIZATION IN MEETING AUDIO

Cited by: 6
Authors
Nwe, Tin Lay [1]
Sun, Hanwu [1]
Li, Haizhou [1]
Rahardja, Susanto [1]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore 138632, Singapore
Keywords
Meetings; pattern classification; clustering methods; speech processing; modeling
DOI
10.1109/ICASSP.2009.4960523
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes a speaker diarization system evaluated on the NIST Rich Transcription 2007 (RT-07) Meeting Recognition evaluation data set for the Multiple Distant Microphone (MDM) task. The implementation has three components: initial clustering, non-speech removal, and cluster purification. Initial clusters are generated using Direction of Arrival (DOA) information and bootstrap clustering. The non-speech removal component uses multiple GMM models for speech/non-speech classification, together with a novel system fusion strategy that draws on information from the Receiver Operating Characteristic (ROC) curve. Finally, a consensus clustering approach combined with an iterative GMM clustering method is used for speaker cluster purification. The system achieves an overall diarization error rate (DER) of 10.81%.
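For readers unfamiliar with the DER figure quoted in the abstract, the following is a minimal sketch of how a diarization error rate is commonly computed: the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total scored speech time. The function name, parameter names, and the numbers are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not describe the scoring setup in detail.

```python
def diarization_error_rate(missed, false_alarm, speaker_error, total_speech):
    """Return DER as a fraction of total scored speech time.

    All arguments are durations in seconds. This is the usual
    time-weighted definition; the names are illustrative assumptions.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + speaker_error) / total_speech


# Hypothetical durations for illustration only; not results from the paper.
der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, speaker_error=50.0, total_speech=1000.0
)
print(f"DER = {der:.2%}")
```

Missed speech and false-alarm speech are the two error types that a non-speech removal stage affects, while the speaker-confusion term is the one that cluster purification targets.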
Pages: 4073-4076
Page count: 4