The AMI speaker diarization system for NIST RT06s meeting data

Cited by: 0
Authors
van Leeuwen, David A. [1]
Huijbregts, Marijn [2]
Affiliations
[1] TNO, Human Factors, POB 23, NL-3769 DE Soesterberg, Netherlands
[2] Univ Twente, Dept EEMCS, Human Media Interact, Enschede, Netherlands
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI system submitted for RT05s. For the conference room evaluation Single Distant Microphone condition, the SAD system performs well, with an error rate of 4.23%, and the 'HMM-BIC' SPKR system performs competitively, with an error rate of 37.2% including overlapping speech.
Pages: 371 / +
Page count: 3