Ultrasound-Based Silent Speech Interface Using Convolutional and Recurrent Neural Networks

Cited by: 8

Authors:

Moliner Juanpere, Eloi [1]

Csapo, Tamas Gabor [2,3]

Affiliations:

[1] UPC Barcelona Sch Telecommun Engn ETSETB, Barcelona, Spain

[2] Budapest Univ Technol & Econ, Dept Telecommun & Media Informat, Budapest, Hungary

[3] MTA ELTE Lendulet Lingual Articulat Res Grp, Budapest, Hungary

Source:

ACTA ACUSTICA UNITED WITH ACUSTICA | 2019 / Vol. 105 / Issue 04

Keywords:

TONGUE;

DOI:

10.3813/AAA.919339

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206; 082403;

Abstract:

A Silent Speech Interface (SSI) is a technology that aims to synthesize speech from articulatory motion. A Deep Neural Network based SSI is proposed that uses ultrasound images of the tongue as input signals and the spectral coefficients of a vocoder as target parameters. Several deep learning models are presented and discussed, including a baseline feed-forward network and a combination of Convolutional and Recurrent Neural Networks. A pre-processing step using a Deep Convolutional AutoEncoder was also studied. According to the experimental results, an architecture based on a CNN and bidirectional LSTM layers achieved the best objective and subjective results. © 2019 The Author(s). Published by S. Hirzel Verlag · EAA.

Pages: 587-590

Page count: 4

50 items in total

[1] Ultrasound-Based Silent Speech Interface using Sequential Convolutional Auto-encoder
Xu, Kele
Wu, Yuxiang
Gao, Zhifeng
[J]. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019: 2194-2195
[2] Ultrasound-based Silent Speech Interface Built on a Continuous Vocoder
Csapo, Tamas Gabor
Al-Radhi, Mohammed Salah
Nemeth, Geza
Gosztolya, Gabor
Grosz, Tamas
Toth, Laszlo
Marko, Alexandra
[J]. INTERSPEECH 2019, 2019: 894-898
[3] Eigentongue feature extraction for an ultrasound-based silent speech interface
Hueber, T.
Aversano, G.
Chollet, G.
Denby, B.
Dreyfus, G.
Oussar, Y.
Roussel, P.
Stone, M.
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007: 1245-+
[4] Silent vs Vocalized Articulation for a Portable Ultrasound-Based Silent Speech Interface
Florescu, Victoria-M
Crevier-Buchman, Lise
Denby, Bruce
Hueber, Thomas
Colazo-Simon, Antonia
Pillot-Loiseau, Claire
Roussel, Pierre
Gendrot, Cedric
Quattrocchi, Sophie
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010: 450-+
[5] Neural Speaker Embeddings for Ultrasound-based Silent Speech Interfaces
Shandiz, Amin Honarmandi
Toth, Laszlo
Gosztolya, Gabor
Marko, Alexandra
Csapo, Tamas Gabor
[J]. INTERSPEECH 2021, 2021: 1932-1936
[6] Statistical Mapping between Articulatory and Acoustic Data for an Ultrasound-based Silent Speech Interface
Hueber, Thomas
Benaroya, Elie-Laurent
Denby, Bruce
Chollet, Gerard
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 600-+
[7] Ultrasound-Based Detection of Lung Abnormalities Using Single Shot Detection Convolutional Neural Networks
Kulhare, Sourabh
Zheng, Xinliang
Mehanian, Courosh
Gregory, Cynthia
Zhu, Meihua
Gregory, Kenton
Xie, Hua
Jones, James McAndrew
Wilson, Benjamin
[J]. SIMULATION, IMAGE PROCESSING, AND ULTRASOUND SYSTEMS FOR ASSISTED DIAGNOSIS AND NAVIGATION, 2018, 11042: 65-73
[8] Speech Emotion Recognition using Convolutional and Recurrent Neural Networks
Lim, Wootaek
Jang, Daeyoung
Lee, Taejin
[J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016
[9] Speech Emotion Recognition using Convolutional Recurrent Neural Networks and Spectrograms
Qamhan, Mustafa A.
Meftah, Ali H.
Selouani, Sid-Ahmed
Alotaibi, Yousef A.
Zakariah, Mohammed
Seddiq, Yasser Mohammad
[J]. 2020 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2020
[10] SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks
Kimura, Naoki
Kono, Michinari
Rekimoto, Jun
[J]. CHI 2019: PROCEEDINGS OF THE 2019 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2019
