Robust Speaker Recognition Based on Multi-Stream Features

Authors
Wang, Ning [1]
Wang, Lei [1]
Affiliations
[1] Beijing University of Posts and Telecommunications, School of Information and Communication Engineering, Beijing, People's Republic of China
Keywords
speaker recognition; PNCC; modified SCF; system fusion
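DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract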
In this paper, we investigate the effect of the G.723.1 codec (6.3 kbps) on a speaker recognition system. To improve robustness to codec mismatch, we use Power Normalized Cepstral Coefficients (PNCC), a recently proposed robust acoustic feature, to improve the performance of the speaker verification system. We also propose a modified SCF speech feature to improve robustness under codec mismatch. We then propose a new method for improving the performance of an i-vector based speaker recognition system by combining PNCC with the modified SCF feature. Three types of fusion methods are introduced and compared. Experimental results on speech resynthesized through G.723.1 coding demonstrate the effectiveness of the proposed method: compared with a traditional speaker recognition system, the multi-stream feature based system improves the equal error rate (EER) by 72%.
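The abstract reports its result as an EER change after fusing two feature streams, without giving the fusion rules or the underlying scores. As a rough illustration only, the sketch below shows one common way to compute an EER from trial scores, one possible score-level fusion (a weighted sum of z-normalized scores, which is an assumption and not necessarily one of the paper's three fusion methods), and how a relative EER reduction would be calculated if "improved 72%" is read as a relative figure. All function names and parameters here are hypothetical.

```python
import numpy as np


def compute_eer(target_scores, nontarget_scores):
    """Approximate the equal error rate (EER) from verification scores.

    Sweeps every observed score as a decision threshold and returns the
    operating point where the false-accept and false-reject rates are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))
    # False-reject rate: target trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False-accept rate: non-target trials scoring at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)


def znorm(scores):
    """Shift and scale scores to zero mean and unit variance.

    Assumes the scores are not all identical (non-zero standard deviation).
    """
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()


def fuse_scores(stream_a, stream_b, weight_a=0.5):
    """Weighted-sum score fusion of two streams (an assumed fusion rule)."""
    return weight_a * znorm(stream_a) + (1.0 - weight_a) * znorm(stream_b)


def relative_eer_reduction(baseline_eer, new_eer):
    """Relative EER reduction, e.g. 0.72 means a 72% relative improvement."""
    return (baseline_eer - new_eer) / baseline_eer
```

Under this reading, a hypothetical baseline EER of 10% dropping to 2.8% would give `relative_eer_reduction(0.10, 0.028)` of about 0.72; these two values are made up for illustration and are not taken from the paper.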
Pages: 4