Unsupervised Acoustic-to-Articulatory Inversion with Variable Vocal Tract Anatomy

Authors
Sun, Yifan [1]
Huang, Qinlong
Wu, Xihong
Affiliations
[1] Peking Univ, Dept Machine Intelligence, Speech & Hearing Res Ctr, Beijing, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
acoustic-to-articulatory inversion; vocal tract anatomy; adaptation;
DOI
10.21437/Interspeech.2022-477
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Acoustic and articulatory variability across speakers has long limited the generalization performance of acoustic-to-articulatory inversion (AAI) methods. Speaker-independent AAI (SI-AAI) methods generally focus on transforming acoustic features but rarely consider direct matching in the articulatory space. Unsupervised AAI methods have the potential to generalize better, but they typically use a fixed morphological setting of a physical articulatory synthesizer even for different speakers, which may cause non-negligible articulatory compensation. In this paper, we propose to jointly estimate articulatory movements and vocal tract anatomy during the inversion of speech. An unsupervised AAI framework is employed in which the estimated vocal tract anatomy sets the configuration of a physical articulatory synthesizer, which in turn is driven by the estimated articulatory movements to imitate a given speech signal. Experiments show that estimating vocal tract anatomy brings both acoustic and articulatory benefits. Acoustically, the reconstruction quality is higher; articulatorily, the estimated articulatory movement trajectories better match the measured ones. Moreover, the estimated anatomy parameters cluster clearly by speaker, indicating successful decoupling of speaker characteristics from linguistic content.
Pages: 4656-4660
Page count: 5