Relevance-Weighted-Reconstruction of Articulatory Features in Deep-Neural-Network-Based Acoustic-to-Articulatory Mapping

Cited by: 0
Authors
Canevari, Claudia [1]
Badino, Leonardo [1]
Fadiga, Luciano [1]
Metta, Giorgio [1]
Affiliations
[1] Ist Italiano Tecnol, RBCS, Genoa, Italy
Keywords
Acoustic-to-Articulatory Mapping; critical articulators; Deep Neural Networks; phone recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a strategy for learning Deep-Neural-Network (DNN)-based Acoustic-to-Articulatory Mapping (AAM) functions in which the contribution of each articulatory feature (AF) to the global reconstruction error is weighted by its relevance. We first show empirically that the more crucial an articulator is for the production of a given phone, the less variable it is, confirming previous findings. We then compute the relevance of an articulatory feature as a function of its frame-wise variance, conditioned on the acoustic evidence and estimated through a Mixture Density Network (MDN). Finally, we combine acoustic and recovered articulatory features in a hybrid DNN-HMM phone recognizer. Tested on the MOCHA-TIMIT corpus, articulatory features reconstructed by a standardly trained DNN lead to an 8.4% relative phone error reduction (with respect to a recognizer that uses only MFCCs), whereas articulatory features reconstructed with their relevance taken into account raise the relative phone error reduction to 10.9%.
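The idea of weighting each articulatory feature's contribution to the reconstruction error can be illustrated with a minimal sketch. The abstract states only that relevance is "a function of" the frame-wise variance; the normalized inverse-variance mapping below, the function names, and the toy numbers are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def relevance_weights(variance, eps=1e-8):
    """Turn per-frame variances (frames x features) into relevance weights.

    Assumption: relevance is the inverse of the variance, normalized so
    that the weights of each frame sum to 1. Lower variance gives a
    larger weight, matching the claim that critical articulators are
    less variable.
    """
    inverse = 1.0 / (variance + eps)
    return inverse / inverse.sum(axis=1, keepdims=True)


def weighted_reconstruction_error(target, predicted, weights):
    """Mean over frames of the relevance-weighted squared error."""
    squared_error = (target - predicted) ** 2
    return float(np.mean(np.sum(weights * squared_error, axis=1)))


# Toy data: 2 frames, 2 articulatory features (hypothetical values).
variance = np.array([[0.1, 0.4],
                     [0.5, 0.5]])
target = np.zeros((2, 2))
predicted = np.array([[1.0, 1.0],
                      [2.0, 0.0]])

weights = relevance_weights(variance)
loss = weighted_reconstruction_error(target, predicted, weights)
print(weights)
print(loss)
```

In the first toy frame the low-variance feature receives the larger weight, so an error on it costs more than the same error on the high-variance feature. In an actual training setup the variances would come from a model such as the MDN mentioned in the abstract rather than from fixed arrays.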
Pages: 1296-1300
Page count: 5