SPEAKER-AWARE TARGET SPEAKER ENHANCEMENT BY JOINTLY LEARNING WITH SPEAKER EMBEDDING EXTRACTION

Authors
Ji, Xuan [1]
Yu, Meng [2]
Zhang, Chunlei [2]
Su, Dan [1]
Yu, Tao [3]
Liu, Xiaoyu [4]
Yu, Dong [2]
Affiliations
[1] Tencent AI Lab, Shenzhen, People's Republic of China
[2] Tencent AI Lab, Bellevue, WA, USA
[3] Tencent IEG, Bellevue, WA, USA
[4] Tencent IEG, Shenzhen, People's Republic of China
Keywords
speaker-aware; target speech enhancement; speaker embedding; joint learning
DOI
10.1109/icassp40776.2020.9054311
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Deep learning based speech separation approaches have received great interest, among which the recent speaker-aware speech enhancement methods are promising for solving difficulties such as arbitrary source permutation and unknown number of sources. In this paper, we propose a novel training framework which jointly learns the speaker-conditioned target speaker extraction model and its associated speaker embedding model. The resulting unified model directly learns the appropriate speaker embedding for improved target speech enhancement. We demonstrate, on our large simulated noisy and far-field evaluation sets of overlapped speech signals, that our proposed approach significantly improves the speech enhancement performance compared to the baseline speaker-aware speech enhancement models.
Pages: 7294-7298
Page count: 5