Speaker adaptation in the maximum a posteriori framework based on the probabilistic 2-mode analysis of training models

Author
Jeong, Yongwon [1]
Affiliation
[1] Pusan Natl Univ, Sch Elect Engn, Pusan 609735, South Korea
Keywords
Speech recognition; Speaker adaptation; Probabilistic tensor analysis; Tucker decomposition; Hidden Markov models; Likelihood
DOI
10.1186/1687-4722-2013-7
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this article, we describe a speaker adaptation method based on the probabilistic 2-mode analysis of training models. Probabilistic 2-mode analysis is a probabilistic extension of multilinear analysis. We apply probabilistic 2-mode analysis to speaker adaptation by representing each of the hidden Markov model mean vectors of training speakers as a matrix, and derive the speaker adaptation equation in the maximum a posteriori (MAP) framework. The resulting adaptation equation is similar to that of MAP linear regression adaptation. In the experiments, the adapted models based on probabilistic 2-mode analysis outperformed those based on Tucker decomposition, a representative multilinear decomposition technique, for small amounts of adaptation data, while maintaining good performance for large amounts of adaptation data.
Pages: 11