Unsupervised Acoustic-to-Articulatory Inversion with Variable Vocal Tract Anatomy

Cited by: 0

Authors:

Sun, Yifan [1]

Huang, Qinlong

Wu, Xihong

Affiliations:

[1] Peking University, Department of Machine Intelligence, Speech and Hearing Research Center, Beijing, People's Republic of China

Source:

INTERSPEECH 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

acoustic-to-articulatory inversion; vocal tract anatomy; adaptation;

DOI:

10.21437/Interspeech.2022-477

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Acoustic and articulatory variability across speakers has long limited the generalization of acoustic-to-articulatory inversion (AAI) methods. Speaker-independent AAI (SI-AAI) methods generally focus on transforming acoustic features but rarely consider direct matching in the articulatory space. Unsupervised AAI methods have the potential to generalize better, but they typically use a fixed morphological setting of a physical articulatory synthesizer even for different speakers, which may cause non-negligible articulatory compensation. In this paper, we propose to jointly estimate articulatory movements and vocal tract anatomy during speech inversion. An unsupervised AAI framework is employed in which the estimated vocal tract anatomy sets the configuration of a physical articulatory synthesizer, which is in turn driven by the estimated articulatory movements to imitate a given utterance. Experiments show that estimating vocal tract anatomy brings both acoustic and articulatory benefits. Acoustically, the reconstruction quality is higher; articulatorily, the estimated articulatory trajectories better match the measured ones. Moreover, the estimated anatomy parameters cluster clearly by speaker, indicating successful decoupling of speaker characteristics from linguistic content.
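
The joint estimation idea in the abstract can be illustrated with a deliberately simplified analysis-by-synthesis sketch. It is not the paper's method: the "synthesizer" here is an assumed toy formant model (a uniform tube whose resonances are perturbed by one articulatory parameter per frame), the anatomy is reduced to a single vocal tract length, and the search is a plain grid search with a closed-form least-squares step. All names (`synthesize`, `invert_articulation`, `joint_inversion`, `WEIGHTS`) are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 35000.0               # cm/s, approximate value in warm air
ODD = np.array([1.0, 3.0, 5.0])        # quarter-wavelength resonance multipliers
WEIGHTS = np.array([0.3, -0.2, 0.1])   # assumed articulatory sensitivities (toy values)


def synthesize(length_cm, artic):
    """Toy synthesizer: formants (Hz) of a uniform tube of the given length,
    scaled per frame by one articulatory parameter. Returns shape (T, 3)."""
    base = ODD * SPEED_OF_SOUND / (4.0 * length_cm)
    artic = np.asarray(artic, dtype=float)
    return base * (1.0 + np.outer(artic, WEIGHTS))


def invert_articulation(formants, length_cm):
    """For a fixed anatomy, solve the per-frame articulation in closed form
    (the toy model is linear in the articulatory parameter)."""
    base = ODD * SPEED_OF_SOUND / (4.0 * length_cm)
    direction = base * WEIGHTS
    artic = (formants - base) @ direction / (direction @ direction)
    error = float(np.sum((synthesize(length_cm, artic) - formants) ** 2))
    return artic, error


def joint_inversion(formants, candidate_lengths):
    """Jointly estimate anatomy and articulation: try each candidate length,
    invert the articulation, and keep the best acoustic reconstruction."""
    best = None
    for length in candidate_lengths:
        artic, error = invert_articulation(formants, length)
        if best is None or error < best[2]:
            best = (float(length), artic, error)
    return best


# Simulated "speaker" with a vocal tract shorter than the fixed default.
rng = np.random.default_rng(0)
true_length = 14.5
true_artic = rng.uniform(-1.0, 1.0, size=50)
observed = synthesize(true_length, true_artic)

# Joint estimation over a grid of candidate lengths.
lengths = np.arange(12.0, 20.01, 0.25)
est_length, est_artic, joint_err = joint_inversion(observed, lengths)

# Baseline: anatomy fixed to an assumed default length of 17.5 cm.
fixed_artic, fixed_err = invert_articulation(observed, 17.5)

joint_artic_mae = float(np.mean(np.abs(est_artic - true_artic)))
fixed_artic_mae = float(np.mean(np.abs(fixed_artic - true_artic)))
print(est_length, joint_err, fixed_err, joint_artic_mae, fixed_artic_mae)
```

In this toy setting, fixing the length to the wrong value forces the articulatory parameter to absorb the anatomical mismatch, which is a crude analogue of the articulatory compensation the abstract mentions. Letting the anatomy vary removes that bias.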


Pages: 4656-4660

Page count: 5
