Contribution of modulation spectral features for cross-lingual speech emotion recognition under noisy reverberant conditions

Cited by: 0

Authors:

Guo, Taiyang [1]

Li, Sixia [1]

Kidani, Shunsuke [1]

Okada, Shogo [1]

Unoki, Masashi [1]

Affiliations:

[1] Japan Adv Inst Sci & Technol, 1-1 Asahidai, Nomi, Ishikawa 9231292, Japan

Source:

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023

Funding:

Japan Society for the Promotion of Science

Keywords:

DOI:

10.1109/APSIPAASC58517.2023.10317449

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Handling multiple languages under noisy reverberant conditions has become increasingly important for speech emotion recognition (SER). Previous studies found that modulation spectral features (MSFs) are robust to noisy reverberant conditions for SER. However, those studies mainly focused on specific languages, so the universality of MSFs across languages is still unclear. To address this issue, we compared four feature sets for SER on four languages under 12 noisy reverberant conditions: MSFs, hand-crafted features, Wav2Vec2.0-based features, and MSFs combined with hand-crafted features. Intra-lingual results showed that MSFs combined with hand-crafted features performed best under most conditions for all languages. Inter-lingual results showed that MSFs performed best under most conditions for the test languages, except when training on a tonal language and testing on the others. The results demonstrate that MSFs are robust for multilingual SER under noisy reverberant conditions and suggest that MSFs are potentially language-independent features for nontonal languages.

Pages: 2221-2227

Number of pages: 7
