Continuous Articulatory-to-Acoustic Mapping using Phone-based Trajectory HMM for a Silent Speech Interface

Cited: 0
Authors
Hueber, Thomas [1 ]
Bailly, Gerard [1 ]
Denby, Bruce
Affiliations
[1] UJF, U Stendhal, INP, GIPSA Lab,CNRS,UMR 5216, Grenoble, France
Keywords
silent speech interface; handicap; HMM-based speech synthesis; audiovisual speech processing;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The article presents an HMM-based mapping approach for converting ultrasound and video images of the vocal tract into an audible speech signal, for a silent speech interface application. The proposed technique is based on the joint modeling of articulatory and spectral features, for each phonetic class, using Hidden Markov Models (HMM) and multivariate Gaussian distributions with full covariance matrices. The articulatory-to-acoustic mapping is achieved in two steps: 1) finding the most likely HMM state sequence from the articulatory observations; 2) inferring the spectral trajectories from both the decoded state sequence and the articulatory observations. The proposed technique is compared to our previous approach, in which only the decoded state sequence was used to infer the spectral trajectories, independently of the articulatory observations. Both objective and perceptual evaluations show that the new approach yields a better estimation of the spectral trajectories.
Pages: 722-725
Page count: 4
Related Papers
50 items total
  • [31] EchoWhisper: Exploring an Acoustic-based Silent Speech Interface for Smartphone Users
    Gao, Yang
    Jin, Yincheng
    Li, Jiyang
    Choi, Seokmin
    Jin, Zhanpeng
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2020, 4 (03):
  • [32] An acoustic model adaptation using HMM-based speech synthesis
    Tanaka, K
    Kuroiwa, S
    Tsuge, S
    Ren, F
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 368 - 373
  • [33] Phone-Based Filter Parameter Optimization for Robust Speech Recognition Using Likelihood Maximization
    Kouhi-Jelehkaran, Bahram
    Bakhshi, Hamidreza
    Razzazi, Farbod
    Amini, Sahar
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 557 - +
  • [34] Articulatory feature based continuous speech recognition using probabilistic lexical modeling
    Rasipuram, Ramya
    Magimai-Doss, Mathew
    COMPUTER SPEECH AND LANGUAGE, 2016, 36 : 233 - 259
  • [35] Deep Neural Network Based Acoustic-to-articulatory Inversion Using Phone Sequence Information
    Xie, Xurong
    Liu, Xunying
    Wang, Lan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1497 - 1501
  • [36] Ultrasonic Doppler Based Silent Speech Interface Using Perceptual Distance
    Lee, Ki-Seung
    APPLIED SCIENCES-BASEL, 2022, 12 (02):
  • [37] A study on rescoring using HMM-based detectors for continuous speech recognition
    Fu, Qiang
    Juang, Biing-Hwang
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2, 2007, : 570 - 575
  • [38] Phone-based filter parameter optimization of filter and sum robust speech recognition using likelihood maximization
    Kouhi-Jelehkaran, Bahram
    Bakhshi, Hamidreza
    Razzazi, Farbod
    AEU-INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATIONS, 2010, 64 (12) : 1167 - 1172
  • [39] Small-vocabulary speech recognition using a silent speech interface based on magnetic sensing
    Hofe, Robin
    Ell, Stephen R.
    Fagan, Michael J.
    Gilbert, James M.
    Green, Phil D.
    Moore, Roger K.
    Rybchenko, Sergey I.
    SPEECH COMMUNICATION, 2013, 55 (01) : 22 - 32
  • [40] Acoustic and articulatory feature based speech rate estimation using a convolutional dense neural network
    Mannem, Renuka
    Mallela, Jhansi
    Illa, Aravind
    Ghosh, Prasanta Kumar
    INTERSPEECH 2019, 2019, : 929 - 933