Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface

Cited by: 0
Authors
Hueber, Thomas [1]
Benaroya, Elie-Laurent [2]
Denby, Bruce [2, 3]
Chollet, Gerard [4]
Affiliations
[1] UMR 5216 CNRS INP UJF U Stendhal, GIPSA Lab, Grenoble, France
[2] ESPCI Paristech, Sigma Lab, Paris, France
[3] Univ Paris 06, Paris, France
[4] Telecom ParisTech, LTCI CNRS, Paris, France
Keywords
silent speech interface; GMM; HMM; ultrasound; video; multimodal; statistical mapping
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents recent developments in our "silent speech interface", which converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In our previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. Here we investigate statistical mapping techniques based on the joint modeling of visual and spectral features, using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). The prediction of the voiced/unvoiced parameter from visual articulatory data is also investigated using an artificial neural network (ANN). A continuous speech database consisting of one hour of high-speed ultrasound and video sequences was recorded specifically to evaluate the proposed mapping techniques.
Pages: 600 / +
Page count: 2