LOW-LATENCY SPEAKER-INDEPENDENT CONTINUOUS SPEECH SEPARATION

被引：0

作者：

Yoshioka, Takuya ^{[1
]}

Chen, Zhuo ^{[1
]}

Liu, Changliang ^{[1
]}

Xiao, Xiong ^{[1
]}

Erdogan, Hakan ^{[1
]}

Dimitriadis, Dimitrios ^{[1
]}

机构：

[1] Microsoft, One Microsoft Way, Redmond, WA 98052 USA

来源：

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019年

关键词：

Meeting transcription; continuous speech separation; speaker-independent speech separation; microphone arrays;

D O I：

暂无

中图分类号：

O42 [声学];

学科分类号：

070206 ; 082403 ;

摘要：

Speaker independent continuous speech separation (SI-CSS) is a task of converting a continuous audio stream, which may contain overlapping voices of unknown speakers, into a fixed number of continuous signals each of which contains no overlapping speech segment. A separated, or cleaned, version of each utterance is generated from one of SI-CSS's output channels nondeterministically without being split up and distributed to multiple channels. A typical application scenario is transcribing multi-party conversations, such as meetings, recorded with microphone arrays. The output signals can be simply sent to a speech recognition engine because they do not include speech overlaps. The previous SI-CSS method uses a neural network trained with permutation invariant training and a data-driven beamformer and thus requires much processing latency. This paper proposes a low-latency SI-CSS method whose performance is comparable to that of the previous method in a microphone array-based meeting transcription task. This is achieved (1) by using a new speech separation network architecture combined with a double buffering scheme and (2) by performing enhancement with a set of fixed beamformers followed by a neural post-filter.

引用

下载

页码：6980 / 6984

页数：5

共 50 条

[21] SKIM: SKIPPING MEMORY LSTM FOR LOW-LATENCY REAL-TIME CONTINUOUS SPEECH SEPARATION
Li, Chenda
Yang, Lei
Wang, Weiqin
Qian, Yanmin
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 681 - 685
[22] LOW-LATENCY SPEECH SEPARATION GUIDED DIARIZATION FOR TELEPHONE CONVERSATIONS
Morrone, Giovanni
Cornell, Samuele
Raj, Desh
Serafini, Luca
Zovato, Enrico
Brutti, Alessio
Squartini, Stefano
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 641 - 646
[23] On Speaker-Independent, Speaker-Dependent, and Speaker-Adaptive Speech Recognition
Huang, Xuedong
Lee, Kai-Fu
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (02): : 150 - 157
[24] Speaker adaptation techniques for speech recognition with a speaker-independent phonetic recognizer
Kim, WG
Jang, M
COMPUTATIONAL INTELLIGENCE AND SECURITY, PT 1, PROCEEDINGS, 2005, 3801 : 95 - 100
[25] CBLDNN-BASED SPEAKER-INDEPENDENT SPEECH SEPARATION VIA GENERATIVE ADVERSARIAL TRAINING
Li, Chenxing
Zhu, Lei
Xu, Shuang
Gao, Peng
Xu, Bo
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 711 - 715
[26] Arabic Speaker-Independent Continuous Automatic Speech Recognition Based on a Phonetically Rich and Balanced Speech Corpus
Abushariah, Mohammad
Ainon, Raja Noor
Zainuddin, Roziati
Elshafei, Moustafa
Khalifa, Othman
INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2012, 9 (01) : 84 - 93
[27] SPEAKER-INDEPENDENT DETECTION OF CHILD-DIRECTED SPEECH
Schuster, Sebastian
Pancoast, Stephanie
Ganjoo, Milind
Frank, Michael C.
Jurafsky, Dan
2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 366 - 371
[28] Biomimetic pattern recognition for speaker-independent speech recognition
Qin, H
Wang, SJ
Sun, H
PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS AND BRAIN, VOLS 1-3, 2005, : 1290 - 1294
[29] On Speaker-Independent Personality Perception and Prediction from Speech
Polzehl, Tim
Schoenenberg, Katrin
Moeller, Sebastian
Metze, Florian
Mohammadi, Gelareh
Vinciarelli, Alessandro
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 258 - 261
[30] Speaker-Independent Speech Recognition using Visual Features
Pooventhiran, G.
Sandeep, A.
Manthiravalli, K.
Harish, D.
Renuka, Karthika D.
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 616 - 620

← 1 2 3 4 5 →