Eliminating inter-speaker variability prior to discriminant transforms

Cited by: 0
Authors
Saon, G [1]
Padmanabhan, M [1]
Gopinath, R [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper shows the impact of applying speaker normalization techniques, such as vocal tract length normalization (VTLN) and speaker-adaptive training (SAT), prior to discriminant feature space transforms such as LDA. We demonstrate that removing the inter-speaker variability with speaker compensation methods results in improved discrimination, as measured by the LDA eigenvalues, and in improved classification accuracy, as measured by the word error rate. Experimental results on the SPINE (speech in noisy environments) database indicate a relative improvement of up to 5% over the standard case, where speaker adaptation (during testing and training) is applied after an LDA transform that is trained in a speaker-independent manner. We conjecture that performing linear discriminant analysis in a canonical (speaker-normalized) feature space is more effective than LDA in a speaker-independent space because the eigenvectors carve out a subspace of maximum intra-speaker phonetic separability, whereas in the speaker-independent case this subspace is also shaped by the inter-speaker variability. Indeed, we show that the more normalization is performed (first VTLN, then SAT), the higher the LDA eigenvalues become.
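The relationship between speaker normalization and LDA eigenvalues can be illustrated with a small synthetic sketch. It is not the authors' experiment: the data are random, the function names (`lda_eigenvalues`, `remove_speaker_means`, `make_toy_data`) are hypothetical, and per-speaker mean subtraction is used only as a simple stand-in for VTLN or SAT. The eigenvalues computed here are those of Sw^{-1} Sb, where Sw is the within-class scatter and Sb the between-class scatter, which is one common way to quantify LDA class separability.

```python
import numpy as np

def lda_eigenvalues(X, labels):
    """Eigenvalues of Sw^{-1} Sb, sorted in descending order."""
    d = X.shape[1]
    global_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        centered = Xc - mu_c
        Sw += centered.T @ centered
        diff = (mu_c - global_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    eigvals = np.linalg.eigvals(np.linalg.solve(Sw, Sb))
    return np.sort(eigvals.real)[::-1]

def remove_speaker_means(X, speakers):
    """Assumed stand-in for speaker normalization: subtract each speaker's mean."""
    Xn = X.copy()
    for s in np.unique(speakers):
        mask = speakers == s
        Xn[mask] -= X[mask].mean(axis=0)
    return Xn

def make_toy_data(n_classes=3, n_speakers=5, n_per_cell=40, dim=4, seed=0):
    """Synthetic features: class mean + speaker offset + Gaussian noise."""
    rng = np.random.default_rng(seed)
    class_means = rng.normal(0.0, 1.0, size=(n_classes, dim))
    speaker_offsets = rng.normal(0.0, 2.0, size=(n_speakers, dim))
    X, labels, speakers = [], [], []
    for c in range(n_classes):
        for s in range(n_speakers):
            noise = rng.normal(0.0, 1.0, size=(n_per_cell, dim))
            X.append(class_means[c] + speaker_offsets[s] + noise)
            labels += [c] * n_per_cell
            speakers += [s] * n_per_cell
    return np.vstack(X), np.array(labels), np.array(speakers)

X, labels, speakers = make_toy_data()
ev_si = lda_eigenvalues(X, labels)  # speaker-independent space
ev_norm = lda_eigenvalues(remove_speaker_means(X, speakers), labels)  # normalized space
print("speaker-independent space:", ev_si[:2])
print("speaker-normalized space: ", ev_norm[:2])
```

In this toy setup the speaker offsets inflate the within-class scatter while leaving the between-class scatter unchanged, so removing them should raise the leading eigenvalues, which mirrors the trend the abstract describes. With three classes, at most two eigenvalues are meaningfully nonzero.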
Pages: 73-76
Number of pages: 4