SPEAKER-AWARE TARGET SPEAKER ENHANCEMENT BY JOINTLY LEARNING WITH SPEAKER EMBEDDING EXTRACTION

Cited by: 0

Authors:

Ji, Xuan [1]

Yu, Meng [2]

Zhang, Chunlei [2]

Su, Dan [1]

Yu, Tao [3]

Liu, Xiaoyu [4]

Yu, Dong [2]

Affiliations:

[1] Tencent AI Lab, Shenzhen, People's Republic of China

[2] Tencent AI Lab, Bellevue, WA USA

[3] Tencent IEG, Bellevue, WA USA

[4] Tencent IEG, Shenzhen, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

speaker-aware; target speech enhancement; speaker embedding; joint learning;

DOI:

10.1109/icassp40776.2020.9054311

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Deep learning based speech separation approaches have received great interest, among which the recent speaker-aware speech enhancement methods are promising for solving difficulties such as arbitrary source permutation and unknown number of sources. In this paper, we propose a novel training framework which jointly learns the speaker-conditioned target speaker extraction model and its associated speaker embedding model. The resulting unified model directly learns the appropriate speaker embedding for improved target speech enhancement. We demonstrate, on our large simulated noisy and far-field evaluation sets of overlapped speech signals, that our proposed approach significantly improves the speech enhancement performance compared to the baseline speaker-aware speech enhancement models.


Pages: 7294-7298

Page count: 5

50 records in total

[1] Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
Zhao, Zifeng
Gu, Rongzhi
Yang, Dongchao
Tian, Jinchuan
Zou, Yuexian
INTERSPEECH 2022, 2022, : 5318 - 5322
[2] Speaker-aware Deep Denoising Autoencoder with Embedded Speaker Identity for Speech Enhancement
Chuang, Fu-Kai
Wang, Syu-Siang
Hung, Jeih-weih
Tsao, Yu
Fang, Shih-Hau
INTERSPEECH 2019, 2019, : 3173 - 3177
[3] Speaker-aware neural network based beamformer for speaker extraction in speech mixtures
Zmolikova, Katerina
Delcroix, Marc
Kinoshita, Keisuke
Higuchi, Takuya
Ogawa, Atsunori
Nakatani, Tomohiro
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2655 - 2659
[4] Speaker-Aware Linear Discriminant Analysis in Speaker Verification
Zheng, Naijun
Wu, Xixin
Zhong, Jinghua
Liu, Xunying
Meng, Helen
INTERSPEECH 2020, 2020, : 3012 - 3016
[5] Speaker-Aware Speech Enhancement with Self-Attention
Lin, Ju
Van Wijngaarden, Adriaan J.
Smith, Melissa C.
Wang, Kuang-Ching
29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 486 - 490
[6] Low-Resource Speech Synthesis with Speaker-Aware Embedding
Yang, Li-Jen
Yeh, I-Ping
Chien, Jen-Tzung
2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 235 - 239
[7] SPEAKER-AWARE SPEECH-TRANSFORMER
Fan, Zhiyun
Li, Jie
Zhou, Shiyu
Xu, Bo
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 222 - 229
[8] Speaker-Aware Monaural Speech Separation
Xu, Jiahao
Hu, Kun
Xu, Chang
Duc Chung Tran
Wang, Zhiyong
INTERSPEECH 2020, 2020, : 1451 - 1455
[9] OPTIMIZATION OF SPEAKER-AWARE MULTICHANNEL SPEECH EXTRACTION WITH ASR CRITERION
Zmolikova, Katerina
Delcroix, Marc
Kinoshita, Keisuke
Higuchi, Takuya
Nakatani, Tomohiro
Cernocky, Jan
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6702 - 6706
[10] SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures
Zmolikova, Katerina
Delcroix, Marc
Kinoshita, Keisuke
Ochiai, Tsubasa
Nakatani, Tomohiro
Burget, Lukas
Cernocky, Jan
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2019, 13 (04) : 800 - 814
