Fast speaker adaptation using non-negative matrix factorization

Cited by: 7
Authors
Duchateau, Jacques [1]
Leroy, Tobias [1]
Demuynck, Kris [1]
Van hamme, Hugo [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT, B-3001 Louvain, Belgium
Keywords
speech recognition; adaptive systems; speaker adaptation; matrix decomposition; non-negative matrix factorization;
DOI
10.1109/ICASSP.2008.4518598
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper describes a new method for fast speaker adaptation in large vocabulary recognition systems. As in most HMM-based recognizers, the observation densities are modeled as weighted sums of Gaussian densities. Instead of adapting the means of the Gaussian densities, as is typically done, the method adapts the weights of the Gaussian densities in the states. By applying non-negative matrix factorization (NMF) in the proposed method, very fast adaptation was achieved. Experiments on the Wall Street Journal benchmark recognition task show relative improvements between 5% and 15%, while the adaptation converges within 0.2 seconds. Analysis of the latent speakers found by NMF shows that they reflect the gender of the speaker most prominently, even when vocal tract length normalization is used, and that they reflect the speaker's age more clearly than the speaker's regional influences or dialect.
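The abstract does not give the update rules or the exact layout of the weight matrix, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each column of a matrix V holds one training speaker's Gaussian weights, that V is factorized as V ≈ WH with the standard multiplicative updates for the generalized KL divergence, and that a new speaker is adapted by estimating only a short activation vector while the basis W (the "latent speakers") stays fixed. All function names, the toy matrix, the occupancy counts, and the rank of 2 are hypothetical choices made for illustration.

```python
import math
import random

EPS = 1e-12  # guards against division by zero and log(0)

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def kl_divergence(v, approx):
    """Generalized KL divergence D(v || approx), summed over all entries."""
    total = 0.0
    for v_row, a_row in zip(v, approx):
        for x, y in zip(v_row, a_row):
            total += x * math.log((x + EPS) / (y + EPS)) - x + y
    return total

def update_h(v, w, h):
    """One multiplicative KL update of the activations H with the basis W fixed."""
    wh = matmul(w, h)
    ratio = [[x / (y + EPS) for x, y in zip(vr, ar)] for vr, ar in zip(v, wh)]
    numer = matmul(transpose(w), ratio)
    col_sums = [sum(col) for col in zip(*w)]  # one sum per latent speaker
    return [[h[k][s] * numer[k][s] / (col_sums[k] + EPS)
             for s in range(len(h[0]))] for k in range(len(h))]

def train_nmf(v, rank, iters=200, seed=0):
    """Factorize v (Gaussians x speakers) into w (Gaussians x rank) and h (rank x speakers)."""
    rng = random.Random(seed)
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in v]
    h = [[rng.uniform(0.1, 1.0) for _ in v[0]] for _ in range(rank)]
    for _ in range(iters):
        h = update_h(v, w, h)
        # The W update is the H update applied to the transposed problem V^T ~ H^T W^T.
        w = transpose(update_h(transpose(v), transpose(h), transpose(w)))
    return w, h

def adapt_weights(w, counts, iters=50):
    """Estimate adapted Gaussian weights for a new speaker from occupancy counts."""
    total = sum(counts)
    v = [[c / total] for c in counts]          # observed relative frequencies
    h = [[1.0 / len(w[0])] for _ in w[0]]      # uniform start over latent speakers
    for _ in range(iters):
        h = update_h(v, w, h)                  # only the activations are re-estimated
    adapted = [row[0] for row in matmul(w, h)]
    norm = sum(adapted)
    return [x / norm for x in adapted]         # renormalize so the weights sum to 1

# Toy data (hypothetical): 4 Gaussians, 4 training speakers, each column sums to 1.
train = [
    [0.70, 0.55, 0.25, 0.10],
    [0.10, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10],
    [0.10, 0.25, 0.55, 0.70],
]
w, h = train_nmf(train, rank=2)
new_weights = adapt_weights(w, counts=[12, 3, 2, 23])
print(new_weights)
```

In this sketch the adaptation step re-estimates only as many parameters as there are latent speakers, which is one plausible reading of why such a scheme could converge quickly on little data. A real recognizer would stack the weights of all HMM states and would need the paper's own formulation for the details.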
Pages: 4269-4272
Number of pages: 4