Acoustic-to-articulatory Speech Inversion with Multi-task Learning

Cited by: 0
Authors
Siriwardena, Yashish M. [1]
Sivaraman, Ganesh [2]
Espy-Wilson, Carol [1]
Affiliations
[1] University of Maryland, College Park, MD 20742, USA
[2] Pindrop, Atlanta, GA, USA
Source
Interspeech 2022
Funding
National Science Foundation (USA)
Keywords
acoustic-to-articulatory speech inversion; multi-task learning; acoustic-to-phoneme mapping; biGRNNs; neural network; movements
DOI
10.21437/Interspeech.2022-11164
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Multi-task learning (MTL) frameworks have proven effective in diverse speech-related tasks such as automatic speech recognition (ASR) and speech emotion recognition. This paper proposes an MTL framework that performs acoustic-to-articulatory speech inversion while simultaneously learning an acoustic-to-phoneme mapping as a shared task. We use the Haskins Production Rate Comparison (HPRC) database, which contains both electromagnetic articulography (EMA) data and the corresponding phonetic transcriptions. System performance was measured by computing the correlation between the estimated and actual tract variables (TVs) from the acoustic-to-articulatory speech inversion task. The proposed MTL-based bidirectional gated recurrent neural network (biGRNN) model learns to map the input acoustic features to nine TVs and outperforms a baseline model trained only on acoustic-to-articulatory inversion.
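The evaluation metric named in the abstract, the correlation between estimated and actual TVs, can be illustrated with a minimal sketch. It assumes a Pearson correlation computed separately for each of the nine TVs over time frames; the abstract does not state the exact correlation variant or how scores are averaged. The function names and the synthetic data below are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def per_tv_correlation(estimated, actual):
    """Correlation per tract variable.

    Both inputs are lists of frames; each frame is a list of TV values.
    Returns one correlation score per TV (column).
    """
    est_cols = list(zip(*estimated))  # transpose: one sequence per TV
    act_cols = list(zip(*actual))
    return [pearson(e, a) for e, a in zip(est_cols, act_cols)]


# Synthetic stand-in data (assumption): 200 frames, nine TVs.
random.seed(0)
actual = [[random.gauss(0.0, 1.0) for _ in range(9)] for _ in range(200)]
# "Estimated" trajectories are the actual ones plus noise.
estimated = [[v + random.gauss(0.0, 0.5) for v in frame] for frame in actual]

scores = per_tv_correlation(estimated, actual)
print("per-TV correlation:", [round(s, 3) for s in scores])
print("mean correlation:", round(sum(scores) / len(scores), 3))
```

A single summary figure, such as the mean over the nine TVs, is one common way to compare a multi-task model against a single-task baseline; whether the paper reports it this way is not stated in this record.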
Pages: 5020-5024
Page count: 5
Related papers
50 in total
  • [1] Multi-corpus Acoustic-to-articulatory Speech Inversion
    Seneviratne, Nadee
    Sivaraman, Ganesh
    Espy-Wilson, Carol
    [J]. INTERSPEECH 2019, 2019, : 859 - 863
  • [2] A COMPARATIVE STUDY OF ACOUSTIC-TO-ARTICULATORY INVERSION FOR NEUTRAL AND WHISPERED SPEECH
    Illa, Aravind
    Meenakshi, Nisha G.
    Ghosh, Prasanta Kumar
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5075 - 5079
  • [3] ACOUSTIC-TO-ARTICULATORY INVERSION BASED ON SPEECH DECOMPOSITION AND AUXILIARY FEATURE
    Wang, Jianrong
    Liu, Jinyu
    Zhao, Longxuan
    Wang, Shanyu
    Yu, Ruiguo
    Liu, Li
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4808 - 4812
  • [4] The impact of cross language on acoustic-to-articulatory inversion and its influence on articulatory speech synthesis
    Illa, Aravind
    Nair, Aanish
    Ghosh, Prasanta Kumar
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8267 - 8271
  • [5] Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models
    Shahrebabaki, Abdolreza Sabzi
    Salvi, Giampiero
    Svendsen, Torbjorn
    Siniscalchi, Sabato Marco
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 135 - 147
  • [6] Jerk Minimization for Acoustic-To-Articulatory Inversion
    Rajpal, Avni
    Patil, Hemant A.
    [J]. 9th ISCA Speech Synthesis Workshop, SSW 2016, 2016, : 82 - 87
  • [7] Multi-task Learning for Acoustic Modeling Using Articulatory Attributes
    Lee, Yueh-Ting
    Chen, Xuan-Bo
    Lee, Hung-Shin
    Jang, Jyh-Shing Roger
    Wang, Hsin-Min
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 855 - 861
  • [8] Formant Trajectories for Acoustic-to-Articulatory Inversion
    Ozbek, I. Yuecel
    Hasegawa-Johnson, Mark
    Demirekler, Muebeccel
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2783 - +
  • [9] ACOUSTIC-TO-ARTICULATORY INVERSION FOR DYSARTHRIC SPEECH BY USING CROSS-CORPUS ACOUSTIC-ARTICULATORY DATA
    Maharana, Sarthak Kumar
    Illa, Aravind
    Mannem, Renuka
    Belur, Yamini
    Shetty, Preetie
    Kumar, Veeramani Preethish
    Vengalil, Seena
    Polavarapu, Kiran
    Atchayaram, Nalini
    Ghosh, Prasanta Kumar
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6458 - 6462
  • [10] Analysis of acoustic-to-articulatory speech inversion across different accents and languages
    Sivaraman, Ganesh
    Espy-Wilson, Carol
    Wieling, Martijn
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 974 - 978