Acoustic-to-articulatory Speech Inversion with Multi-task Learning

Cited by: 0

Authors:

Siriwardena, Yashish M. [1]

Sivaraman, Ganesh [2]

Espy-Wilson, Carol [1]

Affiliations:

[1] University of Maryland, College Park, MD 20742, USA

[2] Pindrop, Atlanta, GA USA

Source:

INTERSPEECH 2022 | 2022

Funding:

U.S. National Science Foundation

Keywords:

acoustic-to-articulatory speech inversion; multi-task learning; acoustic-to-phoneme mapping; biGRNNs; NEURAL-NETWORK; MOVEMENTS;

DOI:

10.21437/Interspeech.2022-11164

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Multi-task learning (MTL) frameworks have proven effective in diverse speech-related tasks such as automatic speech recognition (ASR) and speech emotion recognition. This paper proposes an MTL framework that performs acoustic-to-articulatory speech inversion while simultaneously learning an acoustic-to-phoneme mapping as a shared task. We use the Haskins Production Rate Comparison (HPRC) database, which contains both electromagnetic articulography (EMA) data and the corresponding phonetic transcriptions. System performance was measured by computing the correlation between the estimated and actual tract variables (TVs) from the acoustic-to-articulatory speech inversion task. The proposed MTL-based Bidirectional Gated Recurrent Neural Network (RNN) model learns to map the input acoustic features to nine TVs and outperforms the baseline model trained to perform only acoustic-to-articulatory inversion.
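
The evaluation described in the abstract, correlation between estimated and actual TVs, can be illustrated with a minimal sketch. It assumes the Pearson product-moment correlation, computed per TV and then averaged; the abstract says only "correlation", so the exact metric and the averaging are assumptions. The function names and the TV labels `tv_a` and `tv_b` are hypothetical placeholders, not names from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of frame values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def tv_correlations(actual, estimated):
    """Return per-TV correlations and their mean.

    Both arguments map a TV label to its per-frame trajectory.
    Averaging across TVs is an assumption made for illustration.
    """
    scores = {name: pearson(actual[name], estimated[name]) for name in actual}
    return scores, sum(scores.values()) / len(scores)


# Toy trajectories with hypothetical TV labels (not data from the paper).
actual = {"tv_a": [1.0, 2.0, 3.0], "tv_b": [1.0, 2.0, 3.0]}
estimated = {"tv_a": [2.0, 4.0, 6.0], "tv_b": [3.0, 2.0, 1.0]}
per_tv, mean_score = tv_correlations(actual, estimated)
```

In this toy case `tv_a` is tracked perfectly up to scale and `tv_b` is inverted, so the two scores cancel in the mean. That is one reason per-TV scores are worth inspecting alongside any average.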


Pages: 5020-5024

Page count: 5

