Exploring single channel speech separation for short-time text-dependent speaker verification

Cited by: 1
Authors
Han, Jiangyu [1 ]
Shi, Yan [1 ]
Long, Yanhua [1 ]
Liang, Jiaen [2 ]
Affiliations
[1] Shanghai Normal Univ, Key Innovat Grp Digital Humanities Resource & Res, Shanghai 200234, Peoples R China
[2] Unisound AI Technol Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Speaker verification; Text-dependent; Test speech extraction; Conv-TasNet;
DOI
10.1007/s10772-022-09959-8
CLC (Chinese Library Classification) number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract
Automatic speaker verification (ASV) has made great progress in recent years. However, ASV performance degrades significantly when the test speech is corrupted by interfering speakers, especially when multiple talkers speak at the same time. Although target speech extraction (TSE) has also attracted increasing attention, its extraction ability is constrained by the pre-saved anchor speech it requires for the target speaker. Existing TSE methods therefore cannot be used directly to extract the desired test speech in an ASV test trial, because the speaker identity of each test utterance is unknown. Based on the state-of-the-art single-channel speech separation technique Conv-TasNet, this paper designs a test speech extraction mechanism for building short-time text-dependent speaker verification systems. Instead of providing a pre-saved anchor speech for each training or test speaker, the desired test speech is extracted from a mixture by computing the pairwise dynamic time warping (DTW) distance between each Conv-TasNet output and the enrollment utterance of the speaker model in each test trial. The acoustic domain mismatch between ASV and TSE training data, as well as the behavior of speech separation in different stages of ASV system building, such as voiceprint enrollment, testing, and the PLDA backend, are investigated in detail. Experimental results show that the proposed test speech extraction mechanism brings a significant relative improvement (36.3%) in overlapped multi-talker speaker verification, and the benefits appear not only in the ASV test stage but also in target speaker modeling.
Pages: 261-268
Page count: 8
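
The abstract above describes selecting the desired test stream by computing DTW between each Conv-TasNet output and the enrollment utterance of the speaker model under test. A minimal sketch of that selection step is given below, assuming MFCC features, librosa's DTW implementation, and a path-length-normalized cost; the separator interface (`conv_tasnet.separate`), file names, and feature choices are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch of the test-speech selection step: separate the overlapped test
# utterance, then keep the separated stream whose DTW cost against the
# enrollment utterance of the claimed speaker is smallest.
import numpy as np
import librosa


def dtw_cost(x, y, sr=16000, n_mfcc=20):
    """Path-length-normalized DTW cost between two waveforms on MFCC features."""
    mx = librosa.feature.mfcc(y=x, sr=sr, n_mfcc=n_mfcc)
    my = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    D, wp = librosa.sequence.dtw(X=mx, Y=my, metric="euclidean")
    return D[-1, -1] / len(wp)  # normalize accumulated cost by warping-path length


def select_test_speech(separated_outputs, enrollment_wave, sr=16000):
    """Return the separated stream closest (in DTW cost) to the enrollment utterance."""
    costs = [dtw_cost(out, enrollment_wave, sr=sr) for out in separated_outputs]
    return separated_outputs[int(np.argmin(costs))], costs


# Assumed usage: `outputs` are the streams produced by a trained Conv-TasNet on
# the overlapped mixture; `enroll` is the enrollment waveform for this trial.
# outputs = conv_tasnet.separate(mixture)            # hypothetical separator call
# enroll, _ = librosa.load("enroll.wav", sr=16000)   # hypothetical file name
# test_speech, costs = select_test_speech(outputs, enroll)
```

The selected stream would then be scored against the speaker model in the usual ASV pipeline (e.g., embedding extraction followed by a PLDA backend), which is where the reported relative gains are measured.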