Using kernel PCA to improve eigenvoice speaker adaptation

Cited by: 0

Authors:

Mak, B [1]

Kwok, JT [1]

Ho, S [1]

Affiliation:

[1] Hong Kong Univ Sci & Technol, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China

Source:

PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2004

Keywords:

DOI:

10.1109/ICMLC.2004.1378558

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Eigenvoice-based methods have been shown to be effective for fast speaker adaptation when only a small amount of adaptation data is available. Conventionally, these methods employ linear principal component analysis (PCA) to find the most important eigenvoices. Recently, in what we call kernel eigenvoice (KEV) speaker adaptation, we suggested using kernel PCA to compute the eigenvoices so as to exploit possible nonlinearity in the data. The major challenge is that, unlike in the standard eigenvoice (EV) method, an adapted speaker model found by KEV adaptation resides in the high-dimensional kernel-induced feature space, so it is not clear how to obtain the constituent Gaussians of the adapted model, which are needed to compute state observation likelihoods during the estimation of eigenvoice weights and subsequent decoding. Our solution is to use composite kernels in such a way that state observation likelihoods can be computed using only kernel functions. In an evaluation on the TIDIGITS task using less than 10 s of adaptation speech, KEV speaker adaptation using composite Gaussian kernels outperforms a speaker-independent model and adapted models found by EV, MAP, or MLLR adaptation using 2.1 s and 4.1 s of speech.
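To make the kernel PCA step in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes each speaker is represented by a supervector split into R constituent Gaussian mean vectors, and that the composite kernel is a direct sum of Gaussian kernels over those constituents; the abstract does not specify the exact form. The names `gaussian_kernel`, `composite_kernel`, `kernel_pca`, the width parameter `beta`, and the random toy data are all hypothetical. The sketch covers only the computation of kernel eigenvoices, not the estimation of eigenvoice weights or the likelihood computation.

```python
import numpy as np

def gaussian_kernel(a, b, beta):
    """Gaussian kernel exp(-beta * ||a - b||^2) between two vectors."""
    d = a - b
    return np.exp(-beta * np.dot(d, d))

def composite_kernel(x, y, beta):
    """Assumed composite kernel: sum of Gaussian kernels over constituents.

    x and y have shape (R, d): R constituent mean vectors of dimension d.
    """
    return sum(gaussian_kernel(xr, yr, beta) for xr, yr in zip(x, y))

def kernel_pca(supervectors, beta, n_components):
    """Kernel PCA on speaker supervectors using the composite kernel."""
    n = len(supervectors)
    # Gram matrix of pairwise composite-kernel values between speakers.
    K = np.array([[composite_kernel(supervectors[i], supervectors[j], beta)
                   for j in range(n)] for i in range(n)])
    # Center the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale coefficients so each feature-space eigenvector has unit norm.
    alphas = eigvecs / np.sqrt(eigvals)
    return eigvals, alphas, K_centered

# Toy data (hypothetical): 8 speakers, 3 Gaussians each, 2-dimensional means.
rng = np.random.default_rng(0)
speakers = rng.normal(size=(8, 3, 2))
eigvals, alphas, K_centered = kernel_pca(speakers, beta=0.5, n_components=3)

# Coordinates of each training speaker along the leading kernel eigenvoices.
projections = K_centered @ alphas
```

Because the eigenvoices live in the kernel-induced feature space, they are represented only implicitly through the coefficient matrix `alphas`; every quantity above is computed from kernel evaluations alone, which mirrors the constraint the abstract describes.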

Pages: 3062-3067

Page count: 6

