Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface

Cited: 0
Authors
Hueber, Thomas [1 ]
Bailly, Gerard [1 ]
Denby, Bruce
Affiliations
[1] UJF, U Stendhal, INP, GIPSA Lab,CNRS,UMR 5216, Grenoble, France
Keywords
silent speech interface; handicap; HMM-based speech synthesis; audiovisual speech processing;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The article presents an HMM-based mapping approach for converting ultrasound and video images of the vocal tract into an audible speech signal, for a silent speech interface application. The proposed technique is based on the joint modeling of articulatory and spectral features, for each phonetic class, using Hidden Markov Models (HMM) and multivariate Gaussian distributions with full covariance matrices. The articulatory-to-acoustic mapping is achieved in two steps: 1) finding the most likely HMM state sequence from the articulatory observations; 2) inferring the spectral trajectories from both the decoded state sequence and the articulatory observations. The proposed technique is compared to our previous approach, in which only the decoded state sequence was used to infer the spectral trajectories, independently of the articulatory observations. Both objective and perceptual evaluations show that the new approach yields a better estimation of the spectral trajectories.
Pages: 722-725
Page count: 4
Related Papers
50 items total
  • [31] EchoWhisper: Exploring an Acoustic-based Silent Speech Interface for Smartphone Users
    Gao, Yang
    Jin, Yincheng
    Li, Jiyang
    Choi, Seokmin
    Jin, Zhanpeng
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2020, 4 (03):
  • [32] An acoustic model adaptation using HMM-based speech synthesis
    Tanaka, K
    Kuroiwa, S
    Tsuge, S
    Ren, F
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 368 - 373
  • [33] Phone-Based Filter Parameter Optimization for Robust Speech Recognition Using Likelihood Maximization
    Kouhi-Jelehkaran, Bahram
    Bakhshi, Hamidreza
    Razzazi, Farbod
    Amini, Sahar
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 557 - +
  • [34] Articulatory feature based continuous speech recognition using probabilistic lexical modeling
    Rasipuram, Ramya
    Magimai-Doss, Mathew
    COMPUTER SPEECH AND LANGUAGE, 2016, 36 : 233 - 259
  • [35] Deep Neural Network Based Acoustic-to-articulatory Inversion Using Phone Sequence Information
    Xie, Xurong
    Liu, Xunying
    Wang, Lan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1497 - 1501
  • [36] Ultrasonic Doppler Based Silent Speech Interface Using Perceptual Distance
    Lee, Ki-Seung
    APPLIED SCIENCES-BASEL, 2022, 12 (02):
  • [37] A study on rescoring using HMM-based detectors for continuous speech recognition
    Fu, Qiang
    Juang, Biing-Hwang
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 570 - 575
  • [38] Phone-based filter parameter optimization of filter and sum robust speech recognition using likelihood maximization
    Kouhi-Jelehkaran, Bahram
    Bakhshi, Hamidreza
    Razzazi, Farbod
    AEU-INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATIONS, 2010, 64 (12) : 1167 - 1172
  • [39] Small-vocabulary speech recognition using a silent speech interface based on magnetic sensing
    Hofe, Robin
    Ell, Stephen R.
    Fagan, Michael J.
    Gilbert, James M.
    Green, Phil D.
    Moore, Roger K.
    Rybchenko, Sergey I.
    SPEECH COMMUNICATION, 2013, 55 (01) : 22 - 32
  • [40] Acoustic and articulatory feature based speech rate estimation using a convolutional dense neural network
    Mannem, Renuka
    Mallela, Jhansi
    Illa, Aravind
    Ghosh, Prasanta Kumar
    INTERSPEECH 2019, 2019, : 929 - 933