Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface

Cited by: 0

Authors:

Hueber, Thomas [1]

Benaroya, Elie-Laurent [2]

Denby, Bruce [2,3]

Chollet, Gerard [4]

Affiliations:

[1] UMR 5216 CNRS INP UJF U Stendhal, GIPSA Lab, Grenoble, France

[2] ESPCI Paristech, Sigma Lab, Paris, France

[3] Univ Paris 06, Paris, France

[4] Telecom ParisTech, LTCI CNRS, Paris, France

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

silent speech interface; GMM; HMM; ultrasound; video; multimodal; statistical mapping;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents recent developments on our "silent speech interface", which converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In our previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. Here we investigate statistical mapping techniques based on the joint modeling of visual and spectral features, using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM), respectively. The prediction of the voiced/unvoiced parameter from visual articulatory data is also investigated using an artificial neural network (ANN). A continuous speech database consisting of one hour of high-speed ultrasound and video sequences was specifically recorded to evaluate the proposed mapping techniques.

Pages: 600 / +

Page count: 2
