Online Meeting Recognizer with Multichannel Speaker Diarization

Cited by: 0
Authors
Araki, Shoko [1]
Hori, Takaaki [1]
Fujimoto, Masakiyo [1]
Watanabe, Shinji [1]
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Nakamura, Atsushi [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Seika, Kyoto 6190237, Japan
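Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract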
We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to automatically estimate "who speaks when and what" in an online manner. In our system, "who speaks when" information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, "who speaks what" is estimated with our automatic speech recognition (ASR) system after suppressing reverberation, background noise, and interfering speakers' voices. In this paper, we focus particularly on the speaker diarization ("who speaks when" estimation) method, and we show that the speaker diarization information helps the ASR system reduce insertion errors.
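The abstract does not say how the DOA estimates are turned into speaker labels. As a rough illustration only, the sketch below assumes one DOA estimate per frame (or `None` for a silent frame) and groups the estimates online by angular proximity. The class `OnlineDoaDiarizer`, the function `diarize`, and the parameters `tolerance_deg` and `update_rate` are hypothetical names and values chosen for this example, not the authors' method. It also assumes at most one active speaker per frame, so it does not handle overlapping speech.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineDoaDiarizer:
    """Toy online diarizer: clusters per-frame DOA estimates into speakers."""
    tolerance_deg: float = 15.0  # hypothetical: max angular distance to join a cluster
    update_rate: float = 0.1     # hypothetical: smoothing factor for centroid updates
    centroids: list = field(default_factory=list)

    def assign(self, doa_deg):
        """Return a speaker index for one frame, or None for a silent frame."""
        if doa_deg is None:
            return None
        best, best_dist = None, self.tolerance_deg
        for idx, centre in enumerate(self.centroids):
            # Angular distance with wrap-around at 360 degrees.
            dist = abs((doa_deg - centre + 180.0) % 360.0 - 180.0)
            if dist <= best_dist:
                best, best_dist = idx, dist
        if best is None:
            # No existing direction is close enough: treat it as a new speaker.
            self.centroids.append(doa_deg % 360.0)
            return len(self.centroids) - 1
        # Move the matched centroid slightly towards the new observation.
        diff = (doa_deg - self.centroids[best] + 180.0) % 360.0 - 180.0
        self.centroids[best] = (self.centroids[best] + self.update_rate * diff) % 360.0
        return best


def diarize(frame_doas, frame_shift_s=0.1):
    """Turn per-frame DOAs into (speaker, start_s, end_s) segments."""
    diarizer = OnlineDoaDiarizer()
    segments = []
    current, start = None, 0
    # A trailing None closes the last open segment.
    for i, doa in enumerate(list(frame_doas) + [None]):
        speaker = diarizer.assign(doa)
        if speaker != current:
            if current is not None:
                segments.append((current, start * frame_shift_s, i * frame_shift_s))
            current, start = speaker, i
    return segments
```

For example, `diarize([30, 32, None, 150, 148, 31], frame_shift_s=0.5)` yields three segments: the speaker near 30 degrees, then the speaker near 150 degrees, then the first speaker again. Segments of this kind are the "who speaks when" information that a recognizer could use to ignore frames where the target speaker is inactive.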
Pages: 1697-1701
Page count: 5