Deep Neural Network Frontend for Continuous EMG-based Speech Recognition

Cited by: 19

Authors:

Wand, Michael [1]

Schmidhuber, Jurgen

Affiliations:

[1] USI, Ist Dalle Molle Studi Intelligenza Artificiale, Swiss AI Lab IDSIA, Manno Lugano, Switzerland

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

Silent Speech Interface; Deep Neural Networks; Electromyography; EMG-based Speech Recognition

DOI:

10.21437/Interspeech.2016-340

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We report on a Deep Neural Network frontend for a continuous speech recognizer based on Surface Electromyography (EMG). Speech data is obtained by facial electrodes capturing the electric activity generated by the articulatory muscles, thus allowing speech processing without making use of the acoustic signal. The electromyographic signal is preprocessed and fed into the neural network, which is trained on framewise targets; the output layer activations are further processed by a Hidden Markov sequence classifier. We show that such a neural network frontend can be trained on EMG data and yields substantial improvements over previous systems, despite the fact that the available amount of data is very small, just amounting to a few tens of sentences: on the EMG-UKA corpus, we obtain average evaluation set Word Error Rate improvements of more than 32% relative on context-independent phone models and 13% relative on versatile Bundled Phonetic feature (BDPF) models, compared to a conventional system using Gaussian Mixture Models. In particular, on simple context-independent phone models, the new system yields results which are almost as good as with BDPF models, which were specifically designed to cope with small amounts of training data.
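The abstract reports its gains as relative Word Error Rate (WER) improvements, that is, the WER reduction divided by the baseline WER. The short sketch below shows that arithmetic only. The function name `relative_wer_improvement` and both WER values are hypothetical placeholders chosen for illustration; they are not figures from the paper.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER improvement of a new system over a baseline.

    Both arguments are error rates on the same scale (e.g. fractions in [0, 1]).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only (NOT results from the paper).
baseline = 0.50  # assumed WER of a GMM-based baseline
new = 0.34       # assumed WER of a DNN-frontend system

improvement = relative_wer_improvement(baseline, new)
print(f"Relative WER improvement: {improvement:.1%}")
```

Under these assumed values, an absolute drop of 16 percentage points corresponds to a 32% relative improvement, which is why relative and absolute figures should not be compared directly.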


Pages: 3032-3036

Page count: 5

50 items in total

[1] Pattern Learning with Deep Neural Networks in EMG-based Speech Recognition
Wand, Michael
Schultz, Tanja
[J]. 2014 36TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2014, : 4200 - 4203
[2] Modeling coarticulation in EMG-based continuous speech recognition
Schultz, Tanja
Wand, Michael
[J]. SPEECH COMMUNICATION, 2010, 52 (04) : 341 - 353
[3] SESSION-INDEPENDENT EMG-BASED SPEECH RECOGNITION
Wand, Michael
Schultz, Tanja
[J]. BIOSIGNALS 2011, 2011, : 295 - 300
[4] ANALYSIS OF PHONE CONFUSION IN EMG-BASED SPEECH RECOGNITION
Wand, Michael
Schultz, Tanja
[J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 757 - 760
[5] EMG-Based Continuous Motion Decoding of Upper Limb with Spiking Neural Network
Du, Yuwei
Jin, Jing
Wang, Qiang
Fan, Jianyin
[J]. 2022 IEEE INTERNATIONAL INSTRUMENTATION AND MEASUREMENT TECHNOLOGY CONFERENCE (I2MTC 2022), 2022,
[6] A Spectral Mapping Method for EMG-based Recognition of Silent Speech
Janke, Matthias
Wand, Michael
Schultz, Tanja
[J]. BIO-INSPIRED HUMAN-MACHINE INTERFACES AND HEALTHCARE APPLICATIONS, 2010, : 22 - 31
[7] Impact of Different Feedback Mechanisms in EMG-based Speech Recognition
Herff, Christian
Janke, Matthias
Wand, Michael
Schultz, Tanja
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2224 - 2227
[8] Three Steps of Neuron Network Classification for EMG-based Thai Tones Speech Recognition
Srisuwan, Niyawadee
Phukpattaranont, Pornchai
Limsakul, Chusak
[J]. 2013 10TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON), 2013,
[9] Tackling Speaking Mode Varieties in EMG-Based Speech Recognition
Wand, Michael
Janke, Matthias
Schultz, Tanja
[J]. IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2014, 61 (10) : 2515 - 2526
[10] Multi-stream HMM for EMG-based speech recognition
Manabe, H
Zhang, Z
[J]. PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 4389 - 4392
