Relevance-Weighted-Reconstruction of Articulatory Features in Deep-Neural-Network-Based Acoustic-to-Articulatory Mapping

Cited: 0
Authors
Canevari, Claudia [1 ]
Badino, Leonardo [1 ]
Fadiga, Luciano [1 ]
Metta, Giorgio [1 ]
Affiliations
[1] Ist Italiano Tecnol, RBCS, Genoa, Italy
Keywords
Acoustic-to-Articulatory Mapping; critical articulators; Deep Neural Networks; phone recognition;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a strategy for learning Deep-Neural-Network (DNN)-based Acoustic-to-Articulatory Mapping (AAM) functions in which the contribution of each articulatory feature (AF) to the global reconstruction error is weighted by its relevance. We first empirically confirm previous findings that the more crucial an articulator is for the production of a given phone, the less variable it is. We then compute the relevance of an articulatory feature as a function of its frame-wise variance given the acoustic evidence, estimated with a Mixture Density Network (MDN). Finally, we combine acoustic and recovered articulatory features in a hybrid DNN-HMM phone recognizer. Tested on the MOCHA-TIMIT corpus, articulatory features reconstructed by a standardly trained DNN yield an 8.4% relative phone error reduction with respect to a recognizer that uses MFCCs alone, whereas reconstructing the articulatory features with relevance weighting increases the relative phone error reduction to 10.9%.
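The abstract does not give the exact form of the relevance function, only that it depends on the frame-wise variance of each articulatory feature and that low variance signals a critical articulator. A minimal sketch of such a relevance-weighted reconstruction error, assuming an illustrative inverse-variance weighting (the paper's actual function may differ) and treating the MDN-estimated variances as given inputs:

```python
import numpy as np

def relevance_weights(var, eps=1e-6):
    """Illustrative relevance from frame-wise variance: low-variance
    (critical) articulators get higher weight. Normalized per frame so
    weights average to 1 across features."""
    w = 1.0 / (var + eps)
    return w / w.mean(axis=-1, keepdims=True)

def weighted_reconstruction_error(y_true, y_pred, var):
    """Per-feature squared error scaled by relevance, averaged over all
    frames and features. y_true, y_pred, var: (frames, features)."""
    w = relevance_weights(var)
    return float(np.mean(((y_true - y_pred) ** 2) * w))
```

Under this weighting, a reconstruction error on a low-variance (critical) articulator is penalized more heavily than the same error on a high-variance one, which is the effect the proposed training strategy aims for.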
Pages: 1296-1300
Page count: 5