A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters

Cited by: 44

Authors:

Kello, CT [1]

Plaut, DC

Affiliations:

[1] George Mason Univ, Dept Psychol, Fairfax, VA 22030 USA

[2] Carnegie Mellon Univ, Dept Psychol, Ctr Neural Basis Cognit, Pittsburgh, PA 15213 USA

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2004 / Vol. 116 / No. 04

Keywords:

DOI:

10.1121/1.1715112

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Three neural network models were trained on the forward mapping from articulatory positions to acoustic outputs for a single speaker of the Edinburgh multi-channel articulatory speech database. The model parameters (i.e., connection weights) were learned via the backpropagation of error signals generated by the difference between the models' acoustic outputs and their acoustic targets. Efficacy of the trained models was assessed by subjecting the models' acoustic outputs to speech intelligibility tests. The results of these tests showed that enough phonetic information was captured by the models to support rates of word identification as high as 84%, approaching an identification rate of 92% for the actual target stimuli. These forward models could serve as one component of a data-driven articulatory synthesizer. The models also provide the first step toward building a model of spoken word acquisition and phonological development trained on real speech. © 2004 Acoustical Society of America.
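
The training procedure summarized in the abstract (a network maps articulatory positions to acoustic outputs, and connection weights are adjusted by backpropagating the output-target error) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' setup: the single tanh hidden layer, the layer sizes, the learning rate, and the synthetic data produced by the hypothetical helper `make_synthetic_frames` stand in for the real articulatory recordings and acoustic features.

```python
import numpy as np


def make_synthetic_frames(n_frames=200, n_artic=6, n_acoustic=4, seed=1):
    """Hypothetical stand-in data: random 'articulatory' frames and a smooth
    nonlinear function of them used as 'acoustic' targets."""
    rng = np.random.default_rng(seed)
    artic = rng.normal(size=(n_frames, n_artic))
    A = rng.normal(size=(n_artic, 8))
    B = rng.normal(scale=0.5, size=(8, n_acoustic))
    acoustic = np.tanh(artic @ A) @ B
    return artic, acoustic


def train_forward_model(artic, acoustic, hidden=16, lr=0.05, epochs=300, seed=0):
    """Fit a one-hidden-layer network (assumed architecture) with full-batch
    gradient descent. Returns the weights and the loss recorded at each epoch."""
    rng = np.random.default_rng(seed)
    n, n_in = artic.shape
    n_out = acoustic.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(epochs):
        # Forward pass: articulatory frame -> hidden units -> acoustic output.
        h = np.tanh(artic @ W1 + b1)
        out = h @ W2 + b2
        # Error signal: difference between model output and acoustic target.
        err = out - acoustic
        # Loss: half the summed squared error per frame, averaged over frames.
        losses.append(float(0.5 * np.mean(np.sum(err ** 2, axis=1))))
        # Backward pass: propagate the error signal to every weight.
        g_out = err / n
        g_W2 = h.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
        g_W1 = artic.T @ g_pre
        g_b1 = g_pre.sum(axis=0)
        # Gradient-descent update of the connection weights.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


def predict(weights, artic):
    """Run the trained forward model on articulatory frames."""
    W1, b1, W2, b2 = weights
    return np.tanh(artic @ W1 + b1) @ W2 + b2


if __name__ == "__main__":
    X, Y = make_synthetic_frames()
    weights, losses = train_forward_model(X, Y)
    print(f"loss at first epoch: {losses[0]:.4f}, at last epoch: {losses[-1]:.4f}")
```

In a real setting the synthetic frames would be replaced by time-aligned articulatory measurements and acoustic feature vectors, and the quality of the outputs would be judged by listening tests as in the abstract rather than by training loss alone.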


Pages: 2354 - 2364

Number of pages: 11

50 items in total

[21] Mapping between acoustic and articulatory gestures
Ananthakrishnan, G.
Engwall, Olov
SPEECH COMMUNICATION, 2011, 53 (04) : 567 - 589
[22] A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm
Ramsay, G
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 514 - 517
[23] Articulatory-acoustic vowel space: Associations between acoustic and perceptual measures of clear speech
Whitfield, Jason A.
Goberman, Alexander M.
INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY, 2017, 19 (02) : 184 - 194
[24] Learning how to speak: Imitation-based refinement of syllable production in an articulatory-acoustic model
Philippsen, Anja Kristina
Reinhart, Rene Felix
Wrede, Britta
FOURTH JOINT IEEE INTERNATIONAL CONFERENCES ON DEVELOPMENT AND LEARNING AND EPIGENETIC ROBOTICS (IEEE ICDL-EPIROB 2014), 2014, : 195 - 200
[25] Articulatory-acoustic relations in the production of alveolar and palatal lateral sounds in Brazilian Portuguese
Charles, Sherman
Lulich, Steven M.
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2019, 145 (06) : 3269 - 3288
[26] Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database
Felps, Daniel
Geng, Christian
Berger, Michael
Richmond, Korin
Gutierrez-Osuna, Ricardo
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1990 - +
[27] Articulatory-acoustic characteristics of the Baraba-Tatar phoneme a/(sic)/in a comparative aspect
Ryzhikova, T. R.
SIBIRSKII FILOLOGICHESKII ZHURNAL, 2019, (02) : 163 - 178
[28] Articulatory-Acoustic Analyses of Mandarin Words in Emotional Context Speech for Smart Campus
Ren, Guofeng
Zhang, Xueying
Duan, Shufei
IEEE ACCESS, 2018, 6 : 48418 - 48427
[29] Synthesizing 3D Acoustic-Articulatory Mapping Trajectories: Predicting Articulatory Movements by Long-Term Recurrent Convolutional Neural Network
Yu, Lingyun
Yu, Jun
Ling, Qiang
2018 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP), 2018,
[30] A Trajectory Mixture Density Network for the Acoustic-Articulatory Inversion Mapping
Richmond, Korin
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 577 - 580
