JOINTLY RECOGNIZING MULTI-SPEAKER CONVERSATIONS

被引：1

作者：

Ji, Gang ^{[1
]}

Bilmes, Jeff ^{[1
]}

机构：

[1] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA

来源：

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010年

关键词：

Speech recognition; multi-speaker; graphical models;

D O I：

10.1109/ICASSP.2010.5495041

中图分类号：

O42 [声学];

学科分类号：

070206 ; 082403 ;

摘要：

We suggest an approach to speech recognition where multiple sides of a conversation in a dialog or meeting are processed and decoded jointly rather than independently. We moreover introduce a practical implementation of this approach that demonstrates both language model perplexity and speech recognition word error rate improvements in conversational telephone speech. Specifically, we show that such benefits can be had if a n-gram language model, in addition to conditioning on immediately preceding words in an utterance, is also allowed to condition on the estimated dialog-act of the immediately preceding utterance of an alternate speaker.

引用

下载

页码：5110 / 5113

页数：4

共 50 条

[1] MULTI-SPEAKER CONVERSATIONS, CROSS-TALK, AND DIARIZATION FOR SPEAKER RECOGNITION
Sell, Gregory
McCree, Alan
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5425 - 5429
[2] SPEAKER RECOGNITION FOR MULTI-SPEAKER CONVERSATIONS USING X-VECTORS
Snyder, David
Garcia-Romero, Daniel
Sell, Gregory
McCree, Alan
Povey, Daniel
Khudanpur, Sanjeev
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5796 - 5800
[3] Speech Recognition and Multi-Speaker Diarization of Long Conversations
Mao, Huanru Henry
Li, Shuyang
McAuley, Julian
Cottrell, Garrison W.
INTERSPEECH 2020, 2020, : 691 - 695
[4] Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations
Zhang, Dong
Wu, Liangqing
Sun, Changlong
Li, Shoushan
Zhu, Qiaoming
Zhou, Guodong
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 5415 - 5421
[5] Improving Multi-Speaker Tacotron with Speaker Gating Mechanisms
Zhao, Wei
Xu, Li
He, Ting
PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7498 - 7503
[6] Multi-array multi-speaker tracking
Potamitis, I
Tremoulis, G
Fakotakis, N
TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 206 - 213
[7] A hybrid approach to speaker recognition in multi-speaker environment
Trivedi, J
Maitra, A
Mitra, SK
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 272 - 275
[8] Automatic speaker clustering from multi-speaker utterances
McLaughlin, J
Reynolds, D
Singer, E
O'Leary, GC
ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 817 - 820
[9] Automatic speaker clustering from multi-speaker utterances
MIT Lincoln Lab, Lexington, United States
ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, (817-820):
[10] Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech
Das, Rohan Kumar
Yang, Jichen
Li, Haizhou
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1630 - 1635

← 1 2 3 4 5 →