Using kernel PCA to improve eigenvoice speaker adaptation

Authors
Mak, B [1]
Kwok, JT [1]
Ho, S [1]
Affiliations
[1] Hong Kong University of Science and Technology, Department of Computer Science, Hong Kong, People's Republic of China
DOI
10.1109/ICMLC.2004.1378558
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Eigenvoice-based methods have been shown to be effective for fast speaker adaptation when only a small amount of adaptation data is available. Conventionally, these methods employ linear principal component analysis (PCA) to find the most important eigenvoices. Recently, in what we call kernel eigenvoice (KEV) speaker adaptation, we suggested using kernel PCA to compute the eigenvoices so as to exploit possible nonlinearity in the data. The major challenge is that, unlike in the standard eigenvoice (EV) method, an adapted speaker model found by KEV adaptation resides in the high-dimensional kernel-induced feature space. It is therefore not clear how to obtain the constituent Gaussians of the adapted model, which are needed to compute state observation likelihoods during the estimation of eigenvoice weights and during subsequent decoding. Our solution is to use composite kernels in such a way that state observation likelihoods can be computed using only kernel functions. In an evaluation on the TIDIGITS task using less than 10 s of adaptation speech, KEV speaker adaptation using composite Gaussian kernels outperforms a speaker-independent model and adapted models found by EV, MAP, or MLLR adaptation using 2.1 s and 4.1 s of speech.
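The kernel PCA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each speaker supervector is split into equal-sized slices (one per Gaussian), that the composite kernel is a plain sum of Gaussian kernels over those slices (the abstract does not say how the kernels are composed), and that the data are random placeholders. The function names, the width `beta`, and all sizes are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, beta):
    """k(a, b) = exp(-beta * ||a - b||^2) between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-beta * np.maximum(sq, 0.0))

def composite_kernel(X, Y, n_parts, beta):
    """Assumed composition: sum of Gaussian kernels over n_parts equal slices."""
    parts_x = np.split(X, n_parts, axis=1)
    parts_y = np.split(Y, n_parts, axis=1)
    return sum(gaussian_kernel(px, py, beta) for px, py in zip(parts_x, parts_y))

def kernel_pca(K, n_components):
    """Kernel PCA on a precomputed kernel matrix K of shape (n, n)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    Kc = H @ K @ H                                 # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    alphas = eigvecs / np.sqrt(eigvals)            # unit-norm eigenvectors in feature space
    return eigvals, alphas, Kc

# Placeholder data: 20 hypothetical speakers, 4 Gaussians x 3 dimensions each.
rng = np.random.default_rng(0)
supervectors = rng.normal(size=(20, 12))

K = composite_kernel(supervectors, supervectors, n_parts=4, beta=0.1)
eigvals, alphas, Kc = kernel_pca(K, n_components=3)
projections = Kc @ alphas   # coordinates of each training speaker on the leading components
```

Because every quantity here is obtained from kernel evaluations alone, the sketch never forms the feature-space vectors explicitly, which is the property the abstract relies on.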
Pages: 3062-3067
Number of pages: 6