Testing acoustic voice quality classification across languages and speech styles

Cited by: 1

Authors:

Braun, Bettina [1]

Dehe, Nicole [1]

Einfeldt, Marieke [1]

Wochner, Daniela [1]

Zahner-Ritter, Katharina [2]

Affiliations:

[1] Univ Konstanz, Dept Linguist, Constance, Germany

[2] Univ Trier, Dept 2, Phonet, Trier, Germany

Source:

INTERSPEECH 2021 | 2021

Keywords:

voice quality; phonation type; acoustic measures; random forest; cross-linguistic generalization; infant-directed speech; German; Chinese; Icelandic; perception; emotion; breathy; female

DOI:

10.21437/Interspeech.2021-315

Chinese Library Classification (CLC):

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification codes:

100104; 100213

Abstract:

Many studies relate acoustic voice quality measures to perceptual classification. We extend this line of research by training a classifier on a balanced set of perceptually annotated voice quality categories with high inter-rater agreement, and by testing it on speech samples from a different language and from a different speech style. Annotations were done on continuous speech from different laboratory settings. In Experiment 1, we trained a random forest on Standard Chinese and German recordings labelled as modal, breathy, or glottalized. The model reached an accuracy of 78.7% on unseen data from the same sample (the most important variables were harmonics-to-noise ratio, cepstral-peak prominence, and H1-A2). This model was then used to classify data from a different language (Icelandic, Experiment 2) and from a different speech style (German infant-directed speech (IDS), Experiment 3). Cross-linguistic generalizability was high for Icelandic (78.6% accuracy), but accuracy was lower for German IDS (71.7%). Accuracy on recordings of adult-directed speech from the same speakers as in Experiment 3 (77%, Experiment 4) suggests that it is the special speech style of IDS, rather than the recording setting, that led to lower performance. Results are discussed in terms of efficiency of coding and generalizability across languages and speech styles.
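The accuracy figures in the abstract are proportions of samples whose predicted voice quality label matches the perceptual annotation. As a minimal sketch of that evaluation step only (not the authors' code), the following computes overall accuracy and per-category recall for the three labels. The function names and the toy label lists are hypothetical and are not data from the paper.

```python
from collections import Counter

# The three perceptual categories named in the abstract.
LABELS = ("modal", "breathy", "glottalized")


def accuracy(gold, predicted):
    """Share of samples whose predicted label equals the annotated label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def per_class_recall(gold, predicted, labels=LABELS):
    """For each category, the share of its annotated samples that were recovered."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: hits[label] / totals[label] for label in labels if totals[label]}


# Hypothetical toy annotations and classifier outputs (illustration only).
gold = ["modal", "modal", "breathy", "breathy", "glottalized", "glottalized"]
pred = ["modal", "breathy", "breathy", "breathy", "glottalized", "modal"]

print(f"accuracy: {accuracy(gold, pred):.3f}")
print(per_class_recall(gold, pred))
```

Reporting per-category recall next to overall accuracy shows whether a drop on a new language or speech style is concentrated in one voice quality category.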


Pages: 3920-3924

Number of pages: 5
