Comments on vocal tract length normalization equals linear transformation in cepstral space

Cited by: 4
Authors
Afify, Mohamed [1]
Siohan, Olivier
Affiliations
[1] TJ Watson Res Ctr, IBM, Yorktown Hts, NY 10598 USA
[2] Google, New York, NY 10011 USA
Keywords
maximum-likelihood linear regression; speaker adaptation; speech recognition; vocal tract length normalization
DOI
10.1109/TASL.2007.896653
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The bilinear transformation (BT) is used for vocal tract length normalization (VTLN) in speech recognition systems. We prove two properties of the bilinear mapping that motivated the band-diagonal transform proposed in M. Afify and O. Siohan ("Constrained maximum likelihood linear regression for speaker adaptation," in Proc. ICSLP, Beijing, China, Oct. 2000). This is in contrast to the statement in M. Pitz and H. Ney ("Vocal tract length normalization equals linear transformation in cepstral space," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 930-944, September 2005) that the transform of Afify and Siohan was motivated by empirical observations.
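For readers unfamiliar with the bilinear transformation as a frequency warp, the following is a minimal sketch. The abstract does not state the formula; the code assumes the commonly used first-order all-pass form, and the function name `bilinear_warp` and the parameter name `alpha` are illustrative choices, not taken from the paper. Sign conventions for the warping factor vary between authors.

```python
import math


def bilinear_warp(omega: float, alpha: float) -> float:
    """Warp a normalized frequency omega (radians, 0..pi).

    Assumed form (first-order all-pass map, |alpha| < 1):
        omega' = omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega)))
    alpha = 0 leaves the frequency axis unchanged.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    # The denominator is positive for |alpha| < 1, so atan2 equals atan here.
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    # Print the warped axis at a few points for a hypothetical warping factor.
    for k in range(5):
        w = k * math.pi / 4
        print(f"{w:.4f} -> {bilinear_warp(w, 0.1):.4f}")
```

Under this assumed form the endpoints 0 and pi map to themselves, so only the interior of the frequency axis is stretched or compressed.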
Pages: 1731-1732
Page count: 2
Related papers
6 items in total
  • [1] Vocal tract normalization equals linear transformation in cepstral space
    Pitz, M
    Ney, H
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (05) : 930 - 944
  • [2] Combining Vocal Tract Length Normalization With Hierarchical Linear Transformations
    Saheer, Lakshmi
    Yamagishi, Junichi
    Garner, Philip N.
    Dines, John
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 262 - 272
  • [3] COMBINING VOCAL TRACT LENGTH NORMALIZATION WITH HIERARCHIAL LINEAR TRANSFORMATIONS
    Saheer, Lakshmi
    Yamagishi, Junichi
    Garner, Philip N.
    Dines, John
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4493 - 4496
  • [4] A novel feature transformation for vocal tract length normalization in automatic speech recognition
    Claes, T
    Dologlou, I
    ten Bosch, L
    Van Compernolle, D
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (06) : 549 - 557
  • [5] Iterative MMSE Estimation of Vocal Tract Length Normalization Factors for Voice Transformation
    Erro, Daniel
    Navas, Eva
    Hernaez, Inma
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 86 - 89
  • [6] VOWEL NORMALIZATION BY LINEAR TRANSFORMATION OF EACH SPEAKERS ACOUSTIC SPACE
    HARSHMAN, RA
    PAPCUN, G
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1976, 59 : S71 - S71