Toward Text-independent Cross-lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset

Cited by: 1
Authors
Wu, Yi-Chieh [1]
Liao, Wen-Hung [1]
Affiliations
[1] Natl Chengchi Univ, Dept Comp Sci, Taipei, Taiwan
Keywords
Speaker recognition; Acoustic features; Text-independent speaker identification; Cross-lingual dataset; VERIFICATION
DOI
10.1109/ICPR48806.2021.9412170
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Over 40% of the world's population is bilingual. Existing speaker identification and verification systems, however, assume that the same language is used in both the enrollment and recognition stages. In this work, we investigate the feasibility of employing multilingual speech for biometric applications. We establish a dataset containing audio recorded in English, Mandarin, and Taiwanese. Three acoustic features, namely i-vector, d-vector, and x-vector, are evaluated on both speaker verification (SV) and speaker identification (SI) tasks. Preliminary experimental results indicate that x-vector achieves the best overall performance. Additionally, the model trained on hybrid data achieves the highest accuracy, at the cost of extra data collection effort. In SI tasks, all models reach over 91% cross-lingual accuracy using 3-second audio. In SV tasks, the equal error rate (EER) in cross-lingual tests is at most 6.52%, observed for the model trained on the English corpus. These results suggest that cross-lingual speech is feasible for building text-independent speaker recognition systems.
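The abstract reports SV results as an equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER could be estimated from trial scores. It assumes cosine similarity between speaker embeddings as the scoring function, which the abstract does not specify; the function names `cosine_score` and `equal_error_rate` and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two embedding vectors (assumed scoring rule)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(genuine, impostor):
    """Estimate the EER from same-speaker and different-speaker trial scores.

    Every observed score is tried as a threshold; a trial is accepted when
    its score is at or above the threshold. Returns (eer, threshold) at the
    threshold where the two error rates are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: different-speaker trials at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

In a cross-lingual trial, the enrollment and test utterances would come from different languages (for example, English enrollment and Taiwanese test), and the scores fed to `equal_error_rate` would be computed over such pairs.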
Pages: 8515-8522
Number of pages: 8
Related papers
37 in total
  • [1] SpeakerNet for Cross-lingual Text-Independent Speaker Verification
    Habib, Hafsa
    Tauseef, Huma
    Fahiem, Muhammad Abuzar
    Farhan, Saima
    Usman, Ghousia
    [J]. ARCHIVES OF ACOUSTICS, 2020, 45 (04) : 573 - 583
  • [2] CROSS-LINGUAL TEXT-INDEPENDENT SPEAKER VERIFICATION USING UNSUPERVISED ADVERSARIAL DISCRIMINATIVE DOMAIN ADAPTATION
    Xia, Wei
    Huang, Jing
    Hansen, John H. L.
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5816 - 5820
  • [4] Text-independent speaker recognition using graph matching
    Hautamaki, Ville
    Kinnunen, Tomi
    Franti, Pasi
    [J]. PATTERN RECOGNITION LETTERS, 2008, 29 (09) : 1427 - 1432
  • [5] Text-independent speaker recognition using support vector machine
    Hou, FL
    Wang, BX
    [J]. 2001 INTERNATIONAL CONFERENCES ON INFO-TECH AND INFO-NET PROCEEDINGS, CONFERENCE A-G: INFO-TECH & INFO-NET: A KEY TO BETTER LIFE, 2001, : C402 - C407
  • [6] Text-independent Speaker Recognition Using Radial Basis Function Network
    Yakovenko, Anton
    Malychina, Galina
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2016, 2016, 9719 : 74 - 81
  • [7] CHANNEL ADVERSARIAL TRAINING FOR CROSS-CHANNEL TEXT-INDEPENDENT SPEAKER RECOGNITION
    Fang, Xin
    Zou, Liang
    Li, Jin
    Sun, Lei
    Ling, Zhen-Hua
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6221 - 6225
  • [8] Text-independent speaker recognition using probabilistic SVM with GMM adjustment
    Hou, FL
    Wang, BX
    [J]. 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 305 - 308
  • [9] Codebook design using DCT coder for text-independent speaker recognition
    Lung, SY
    [J]. Proceedings of the Sixth IASTED International Conference on Signal and Image Processing, 2004, : 261 - 263
  • [10] English to Hindi Cross-Lingual Text Summarizer using TextRank Algorithm
    Rawat, Sunita
    Kalambe, Kavita
    Jaywant, Sagarika
    Werulkar, Lakshita
    Barbate, Mukul
    Jaiswal, Tarrun
    [J]. INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2023, 14 (01) : 238 - 245