Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface

Cited by: 0
Authors
Hueber, Thomas [1]
Benaroya, Elie-Laurent [2]
Denby, Bruce [2, 3]
Chollet, Gerard [4]
Affiliations
[1] UMR 5216 CNRS INP UJF U Stendhal, GIPSA Lab, Grenoble, France
[2] ESPCI Paristech, Sigma Lab, Paris, France
[3] Univ Paris 06, Paris, France
[4] Telecom ParisTech, LTCY CNRS, Paris, France
Keywords
silent speech interface; GMM; HMM; ultrasound; video; multimodal; statistical mapping
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents recent developments in our "silent speech interface", which converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In our previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. Here we investigate statistical mapping techniques based on the joint modeling of visual and spectral features, using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM), respectively. The prediction of the voiced/unvoiced parameter from visual articulatory data is also investigated using an artificial neural network (ANN). A continuous speech database consisting of one hour of high-speed ultrasound and video sequences was recorded specifically to evaluate the proposed mapping techniques.
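The abstract does not give the details of the GMM-based mapping, so the following is only a minimal sketch of one common way to map between two feature streams with a joint GMM: the spectral vector is estimated as the conditional mean E[y | x] given a visual vector x. The function name `gmm_conditional_mean`, the argument layout, and the toy parameters are assumptions made for illustration; they are not taken from the paper, and the model is assumed to have been trained already.

```python
import numpy as np

def gmm_conditional_mean(x, weights, means, covs, dx):
    """Estimate E[y | x] under a joint GMM over z = [x, y].

    Hypothetical helper, not the authors' implementation.
    x: (dx,) visual feature vector
    weights: (K,) mixture weights
    means: (K, dx + dy) joint means
    covs: (K, dx + dy, dx + dy) joint covariances
    dx: number of visual (input) dimensions
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    n_components = weights.shape[0]
    log_resp = np.empty(n_components)
    cond_means = []
    for k in range(n_components):
        mu_x, mu_y = means[k, :dx], means[k, dx:]
        s_xx = covs[k, :dx, :dx]
        s_yx = covs[k, dx:, :dx]
        diff = x - mu_x
        solved = np.linalg.solve(s_xx, diff)  # s_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(s_xx)
        # log of weight_k * N(x; mu_x, s_xx)
        log_resp[k] = np.log(weights[k]) - 0.5 * (
            diff @ solved + logdet + dx * np.log(2.0 * np.pi)
        )
        # Per-component linear regression from x to y
        cond_means.append(mu_y + s_yx @ solved)

    # Normalise responsibilities in a numerically stable way
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    return resp @ np.array(cond_means)


# Toy usage with made-up parameters (1-D visual, 1-D spectral)
toy_weights = [0.5, 0.5]
toy_means = [[-1.0, -2.0], [1.0, 2.0]]
toy_covs = [np.eye(2), np.eye(2)]
estimate = gmm_conditional_mean([0.0], toy_weights, toy_means, toy_covs, dx=1)
```

With a single component this reduces to ordinary linear regression. With several components, the responsibilities blend the per-component regressions, which is what lets the mapping be non-linear overall.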
Pages: 600 / +
Page count: 2