Eigentongue feature extraction for an ultrasound-based silent speech interface

Cited by: 0
Authors
Hueber, T. [1 ,3 ]
Aversano, G. [3 ]
Chollet, G. [3 ]
Denby, B. [1 ,2 ]
Dreyfus, G. [1 ]
Oussar, Y. [1 ]
Roussel, P. [1 ]
Stone, M. [4 ]
Affiliations
[1] Ecole Super Phys & Chim Ind Ville Paris, ESPCI Paristech, Elect Lab, 10 Rue Vauquelin, F-75231 Paris 05, France
[2] Univ Paris 06, F-75252 Paris, France
[3] Ecole Natl Super Telecommun Bretagne, Lab Traitement Commun Informat, F-75634 Paris, France
[4] Univ Maryland, Dental Sch, Vocal Tract Visualizat Lab, Baltimore, MD 21201 USA
Keywords
image processing; speech synthesis; neural network applications; communication systems; silent speech interface
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
The article compares two approaches to the description of ultrasound vocal tract images for application in a "silent speech interface": one based on tongue contour modeling, and a second, global coding approach in which images are projected onto a feature space of Eigentongues. A curvature-based lip profile feature extraction method is also presented. The extracted visual features are input to a neural network that learns the relation between the vocal tract configuration and the line spectrum frequencies (LSFs) contained in a one-hour speech corpus. An examination of the quality of the LSFs derived from the two approaches shows that the Eigentongue approach has a more efficient implementation and provides superior results under a normalized mean squared error criterion.
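To make the global coding idea concrete, the snippet below is a minimal sketch (not the authors' implementation) of an EigenTongue-style pipeline: flattened ultrasound frames are projected onto their leading principal components, the resulting coefficients are fed to a small neural network that regresses LSF vectors, and performance is scored with a normalized mean squared error. The image size, the number of retained EigenTongues, the network architecture, and the exact NMSE definition are all illustrative assumptions, and random arrays stand in for real ultrasound and audio data.

```python
# Sketch of an EigenTongue-style pipeline, assuming PCA over flattened frames,
# an MLP for the visual-to-LSF mapping, and a variance-normalized MSE score.
# All shapes and hyper-parameters are placeholders, not values from the paper.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Placeholder data: n_frames ultrasound images (h x w) and matching LSF vectors.
n_frames, h, w, n_lsf = 600, 32, 32, 12
frames = rng.random((n_frames, h, w))     # stand-in for real ultrasound frames
lsf = rng.random((n_frames, n_lsf))       # stand-in for LSFs from parallel audio

# --- EigenTongue extraction: PCA on flattened, mean-centred frames ---
X = frames.reshape(n_frames, -1)
mean_tongue = X.mean(axis=0)
Xc = X - mean_tongue
# Right singular vectors of the centred data matrix are the EigenTongues.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
n_eig = 30                                # number of EigenTongues kept (assumed)
eigentongues = Vt[:n_eig]                 # (n_eig, h*w)
features = Xc @ eigentongues.T            # projection coefficients per frame

# --- Visual features -> LSF mapping with a small neural network ---
split = int(0.8 * n_frames)
net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
net.fit(features[:split], lsf[:split])
pred = net.predict(features[split:])

# --- Normalized mean squared error (one plausible definition) ---
nmse = np.mean((pred - lsf[split:]) ** 2) / np.var(lsf[split:])
print(f"normalized MSE: {nmse:.3f}")
```

Keeping only a few tens of projection coefficients per frame is what makes this global coding cheaper than explicit contour tracking: each image is reduced to a short feature vector before the network ever sees it, which is consistent with the efficiency claim in the abstract.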
Pages: 1245+
Page count: 2
Related papers
50 records in total
  • [1] Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder
    Csapo, Tamas Gabor
    Al-Radhi, Mohammed Salah
    Nemeth, Geza
    Gosztolya, Gabor
    Grosz, Tamas
    Toth, Laszlo
    Marko, Alexandra
    [J]. INTERSPEECH 2019, 2019, : 894 - 898
  • [2] Silent vs Vocalized Articulation for a Portable Ultrasound-Based Silent Speech Interface
    Florescu, Victoria-M
    Crevier-Buchman, Lise
    Denby, Bruce
    Hueber, Thomas
    Colazo-Simon, Antonia
    Pillot-Loiseau, Claire
    Roussel, Pierre
    Gendrot, Cedric
    Quattrocchi, Sophie
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 450 - +
  • [3] Ultrasound-Based Silent Speech Interface Using Convolutional and Recurrent Neural Networks
    Moliner Juanpere, Eloi
    Csapo, Tamas Gabor
    [J]. ACTA ACUSTICA UNITED WITH ACUSTICA, 2019, 105 (04) : 587 - 590
  • [4] Ultrasound-Based Silent Speech Interface using Sequential Convolutional Auto-encoder
    Xu, Kele
    Wu, Yuxiang
    Gao, Zhifeng
    [J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2194 - 2195
  • [5] Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface
    Hueber, Thomas
    Benaroya, Elie-Laurent
    Denby, Bruce
    Chollet, Gerard
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 600 - +
  • [6] Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces
    Shandiz, Amin Honarmandi
    Toth, Laszlo
    Gosztolya, Gabor
    Marko, Alexandra
    Csapo, Tamas Gabor
    [J]. INTERSPEECH 2021, 2021, : 1932 - 1936
  • [7] Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces
    Toth, Laszlo
    Gosztolya, Gabor
    Grosz, Tamas
    Marko, Alexandra
    Csapo, Tamas Gabor
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3172 - 3176
  • [8] DNN-based Ultrasound-to-Speech Conversion for a Silent Speech Interface
    Csapo, Tamas Gabor
    Grosz, Tamas
    Gosztolya, Gabor
    Toth, Laszlo
    Marko, Alexandra
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3672 - 3676
  • [9] The Actualities and Prospects of Ultrasound-based Pattern Recognition in Crop Feature Extraction
    Yu, Shanshan
    Wu, Chongyou
    Wang, Suzhen
    Hu, Minjuan
    [J]. MECHANICAL, INDUSTRIAL, AND MANUFACTURING ENGINEERING, 2011, : 94 - 98
  • [10] Visuo-Phonetic Decoding using Multi-Stream and Context-Dependent Models for an Ultrasound-based Silent Speech Interface
    Hueber, Thomas
    Benaroya, Elie-Laurent
    Chollet, Gerard
    Denby, Bruce
    Dreyfus, Gerard
    Stone, Maureen
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 628 - +