Autoencoder based Domain Adaptation for Speaker Recognition under Insufficient Channel Information

Cited by: 25
Authors
Shon, Suwon [1]
Mun, Seongkyu [2]
Kim, Wooil [3]
Ko, Hanseok [1]
Affiliations
[1] Korea Univ, Sch Elect Engn, Seoul, South Korea
[2] Korea Univ, Dept Visual Informat Proc, Seoul, South Korea
[3] Incheon Natl Univ, Dept Comp Sci & Engn, Incheon, South Korea
Funding
National Research Foundation, Singapore
Keywords
unsupervised domain adaptation; domain mismatch; speaker recognition; autoencoder; denoising autoencoder;
DOI
10.21437/Interspeech.2017-49
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In real-life conditions, mismatch between the development and test domains degrades speaker recognition performance. To address this, many researchers have explored domain adaptation approaches that use a matched in-domain dataset. However, adaptation is not effective if the dataset is insufficient to estimate the channel variability of the domain. In this paper, we explore the problem of performance degradation under such insufficient channel information. To exploit a limited in-domain dataset effectively, we propose an unsupervised domain adaptation approach, Autoencoder based Domain Adaptation (AEDA). The proposed approach combines an autoencoder with a denoising autoencoder to adapt a resource-rich development dataset to the test domain. The proposed technique is evaluated on the Domain Adaptation Challenge 13 experimental protocol, which is widely used in speaker recognition for domain-mismatched conditions. The results show significant improvements over the baselines and over results from prior studies.
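The abstract names a denoising autoencoder as one building block but gives no architectural details. The following is only a generic sketch of that building block, not the AEDA method itself: a one-hidden-layer denoising autoencoder is trained on a small synthetic "in-domain" set and then applied to a larger synthetic "development" set. All names (`train_dae`, `apply_dae`), the synthetic Gaussian data, and every hyperparameter are assumptions made for illustration.

```python
import numpy as np


def train_dae(x, hidden_dim=8, noise_std=0.1, lr=0.02, epochs=500, seed=0):
    """Train a one-hidden-layer denoising autoencoder (hypothetical helper).

    Uses full-batch gradient descent on the mean squared error between the
    reconstruction of a noise-corrupted input and the clean input.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, size=(d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    w2 = rng.normal(0.0, 0.1, size=(hidden_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)  # corrupt the input
        h = np.tanh(noisy @ w1 + b1)                          # encoder
        out = h @ w2 + b2                                     # linear decoder
        err = out - x                                         # target is the clean vector
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size                          # d(MSE)/d(out)
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)                 # backprop through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (noisy.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (w1, b1, w2, b2), losses


def apply_dae(params, x):
    """Pass vectors through the trained autoencoder (hypothetical helper)."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Synthetic stand-ins for speaker feature vectors (assumption, not real data).
rng = np.random.default_rng(1)
in_domain = rng.normal(2.0, 1.0, size=(200, 4))       # small in-domain set
out_of_domain = rng.normal(0.0, 1.0, size=(1000, 4))  # larger development set

params, losses = train_dae(in_domain)
adapted = apply_dae(params, out_of_domain)
print(f"reconstruction loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The sketch only illustrates the train-on-in-domain, apply-to-development pattern; how AEDA actually combines the autoencoder and the denoising autoencoder is described in the paper, not in this record.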
Pages: 1014-1018
Number of pages: 5