Online Meeting Recognizer with Multichannel Speaker Diarization

Cited by: 0

Authors:

Araki, Shoko [1]

Hori, Takaaki [1]

Fujimoto, Masakiyo [1]

Watanabe, Shinji [1]

Yoshioka, Takuya [1]

Nakatani, Tomohiro [1]

Nakamura, Atsushi [1]

Affiliations:

[1] NTT Corp, NTT Commun Sci Labs, Seika, Kyoto 6190237, Japan

Source:

2010 CONFERENCE RECORD OF THE FORTY FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR) | 2010

Keywords:

DOI:

Not available

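Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code: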

0808 ; 0809 ;

Abstract:

We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to automatically estimate "who speaks when and what" in an online manner. In our system, "who speaks when" information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, "who speaks what" is estimated with our automatic speech recognition (ASR) system, after suppressing reverberation, background noise, and interfering speakers' voices. In this paper, we focus particularly on the speaker diarization ("who speaks when" estimation) method, and we show that the speaker diarization information helps the ASR system reduce insertion errors.
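The abstract does not say how DOA estimates are turned into speaker labels, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes that a per-frame DOA in degrees is already available (with `None` for silent frames) and assigns each frame to the nearest known direction, opening a new speaker when no direction is close enough. The names `diarize_online`, `angular_diff`, and the 20-degree `threshold_deg` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Speaker:
    doa_deg: float  # running mean of this speaker's direction of arrival
    count: int = 1  # number of frames assigned so far


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def diarize_online(frame_doas, threshold_deg=20.0):
    """Label each frame with a speaker index, one frame at a time.

    frame_doas: iterable of DOA estimates in degrees, None for silence.
    threshold_deg: hypothetical tolerance for matching a known direction.
    Returns a list of speaker indices (None for silent frames).
    """
    speakers = []
    labels = []
    for doa in frame_doas:
        if doa is None:
            labels.append(None)
            continue
        # Find the known speaker whose direction is closest to this frame.
        best_idx, best_diff = None, None
        for idx, spk in enumerate(speakers):
            diff = angular_diff(doa, spk.doa_deg)
            if best_diff is None or diff < best_diff:
                best_idx, best_diff = idx, diff
        if best_idx is not None and best_diff <= threshold_deg:
            # Update the running mean, using a signed wrap-around difference.
            spk = speakers[best_idx]
            spk.count += 1
            delta = ((doa - spk.doa_deg + 180.0) % 360.0) - 180.0
            spk.doa_deg = (spk.doa_deg + delta / spk.count) % 360.0
            labels.append(best_idx)
        else:
            # No direction is close enough: treat this as a new speaker.
            speakers.append(Speaker(doa_deg=doa % 360.0))
            labels.append(len(speakers) - 1)
    return labels


if __name__ == "__main__":
    print(diarize_online([30.0, 32.0, None, 150.0, 148.0, 31.0]))
    # prints [0, 0, None, 1, 1, 0]
```

Because each frame is labeled as soon as it arrives, the sketch runs online in the sense used in the abstract. A real system would also need the DOA estimator itself and some handling of overlapping speech, neither of which is covered here.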


Pages: 1697-1701

Number of pages: 5

