Eigentongue feature extraction for an ultrasound-based silent speech interface

Cited by: 0
Authors
Hueber, T. [1 ,3 ]
Aversano, G. [3 ]
Chollet, G. [3 ]
Denby, B. [1 ,2 ]
Dreyfus, G. [1 ]
Oussar, Y. [1 ]
Roussel, P. [1 ]
Stone, M. [4 ]
Affiliations
[1] Ecole Super Phys & Chim Ind Ville Paris, ESPCI Paristech, Elect Lab, 10 Rue Vauquelin, F-75231 Paris 05, France
[2] Univ Paris 06, F-75252 Paris, France
[3] Ecole Natl Super Telecommun Bretagne, Lab Traitement Commun Informat, F-75634 Paris, France
[4] Univ Maryland, Dental Sch, Vocal Tract Visualizat Lab, Baltimore, MD 21201 USA
Keywords
image processing; speech synthesis; neural network applications; communication systems; silent speech interface
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
The article compares two approaches to the description of ultrasound vocal tract images for application in a "silent speech interface": one based on tongue contour modeling, and a second, global coding approach in which images are projected onto a feature space of Eigentongues. A curvature-based lip profile feature extraction method is also presented. The extracted visual features are input to a neural network that learns the relation between the vocal tract configuration and the line spectrum frequencies (LSFs) contained in a one-hour speech corpus. An examination of the quality of the LSFs derived from the two approaches shows that the Eigentongue approach has a more efficient implementation and provides superior results under a normalized mean squared error criterion.
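To make the global coding idea concrete, the snippet below is a minimal sketch (not the authors' implementation) of an EigenTongue-style pipeline: flattened ultrasound frames are projected onto their leading principal components, the resulting coefficients are fed to a small neural network that regresses LSF vectors, and performance is scored with a normalized mean squared error. The image size, the number of retained EigenTongues, the network architecture, and the exact NMSE definition are all illustrative assumptions, and random arrays stand in for real ultrasound and audio data.

```python
# Sketch of an EigenTongue-style pipeline, assuming PCA over flattened frames,
# an MLP for the visual-to-LSF mapping, and a variance-normalized MSE score.
# All shapes and hyper-parameters are placeholders, not values from the paper.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Placeholder data: n_frames ultrasound images (h x w) and matching LSF vectors.
n_frames, h, w, n_lsf = 600, 32, 32, 12
frames = rng.random((n_frames, h, w))     # stand-in for real ultrasound frames
lsf = rng.random((n_frames, n_lsf))       # stand-in for LSFs from parallel audio

# --- EigenTongue extraction: PCA on flattened, mean-centred frames ---
X = frames.reshape(n_frames, -1)
mean_tongue = X.mean(axis=0)
Xc = X - mean_tongue
# Right singular vectors of the centred data matrix are the EigenTongues.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
n_eig = 30                                # number of EigenTongues kept (assumed)
eigentongues = Vt[:n_eig]                 # (n_eig, h*w)
features = Xc @ eigentongues.T            # projection coefficients per frame

# --- Visual features -> LSF mapping with a small neural network ---
split = int(0.8 * n_frames)
net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
net.fit(features[:split], lsf[:split])
pred = net.predict(features[split:])

# --- Normalized mean squared error (one plausible definition) ---
nmse = np.mean((pred - lsf[split:]) ** 2) / np.var(lsf[split:])
print(f"normalized MSE: {nmse:.3f}")
```

Keeping only a few tens of projection coefficients per frame is what makes this global coding cheaper than explicit contour tracking: each image is reduced to a short feature vector before the network ever sees it, which is consistent with the efficiency claim in the abstract.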
Pages: 1245+
Page count: 2
Related papers
50 records in total
  • [1] Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder
    Csapo, Tamas Gabor
    Al-Radhi, Mohammed Salah
    Nemeth, Geza
    Gosztolya, Gabor
    Grosz, Tamas
    Toth, Laszlo
    Marko, Alexandra
    [J]. INTERSPEECH 2019, 2019, : 894 - 898
  • [2] Silent vs Vocalized Articulation for a Portable Ultrasound-Based Silent Speech Interface
    Florescu, Victoria-M
    Crevier-Buchman, Lise
    Denby, Bruce
    Hueber, Thomas
    Colazo-Simon, Antonia
    Pillot-Loiseau, Claire
    Roussel, Pierre
    Gendrot, Cedric
    Quattrocchi, Sophie
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 450 - +
  • [3] Ultrasound-Based Silent Speech Interface Using Convolutional and Recurrent Neural Networks
    Moliner Juanpere, Eloi
    Csapo, Tamas Gabor
    [J]. ACTA ACUSTICA UNITED WITH ACUSTICA, 2019, 105 (04) : 587 - 590
  • [4] Ultrasound-Based Silent Speech Interface using Sequential Convolutional Auto-encoder
    Xu, Kele
    Wu, Yuxiang
    Gao, Zhifeng
    [J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2194 - 2195
  • [5] Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface
    Hueber, Thomas
    Benaroya, Elie-Laurent
    Denby, Bruce
    Chollet, Gerard
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 600 - +
  • [6] Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces
    Shandiz, Amin Honarmandi
    Toth, Laszlo
    Gosztolya, Gabor
    Marko, Alexandra
    Csapo, Tamas Gabor
    [J]. INTERSPEECH 2021, 2021, : 1932 - 1936
  • [7] Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces
    Toth, Laszlo
    Gosztolya, Gabor
    Grosz, Tamas
    Marko, Alexandra
    Csapo, Tamas Gabor
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3172 - 3176
  • [8] DNN-based Ultrasound-to-Speech Conversion for a Silent Speech Interface
    Csapo, Tamas Gabor
    Grosz, Tamas
    Gosztolya, Gabor
    Toth, Laszlo
    Marko, Alexandra
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3672 - 3676
  • [9] The Actualities and Prospects of Ultrasound-based Pattern Recognition in Crop Feature Extraction
    Yu, Shanshan
    Wu, Chongyou
    Wang, Suzhen
    Hu, Minjuan
    [J]. MECHANICAL, INDUSTRIAL, AND MANUFACTURING ENGINEERING, 2011, : 94 - 98
  • [10] Visuo-Phonetic Decoding using Multi-Stream and Context-Dependent Models for an Ultrasound-based Silent Speech Interface
    Hueber, Thomas
    Benaroya, Elie-Laurent
    Chollet, Gerard
    Denby, Bruce
    Dreyfus, Gerard
    Stone, Maureen
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 628 - +