Comparing accuracy in voice-based assessments of biological speaker traits across speech types

Cited by: 0

Authors:

Sorokowski, Piotr [1]

Groyecka-Bernard, Agata [1]

Frackowiak, Tomasz [1]

Kobylarek, Aleksander [2]

Kupczyk, Piotr [1]

Sorokowska, Agnieszka [1]

Misiak, Michal [1,3]

Oleszkiewicz, Anna [1,4]

Bugaj, Katarzyna [1]

Wlodarczyk, Malgorzata [1]

Pisanski, Katarzyna [1,5,6]

Affiliations:

[1] Univ Wroclaw, Inst Psychol, Wroclaw, Poland

[2] Univ Wroclaw, Inst Pedag, Wroclaw, Poland

[3] Univ Wroclaw, Being Human Lab, Wroclaw, Poland

[4] Tech Univ Dresden, Interdisciplinary Ctr Smell & Taste, Dept Otorhinolaryngol, Dresden, Germany

[5] Univ Lyon 2, Ctr Natl Rech Sci, CNRS, Lab Dynam Langage, Lyon, France

[6] Univ St Etienne, ENES Bioacoust Res Lab, CRNL, CNRS,Inserm, St Etienne, France

Source:

SCIENTIFIC REPORTS | 2023 / Vol. 13 / Issue 01

Keywords:

MENS VOICES; PITCH; ATTRACTIVENESS; PARAMETERS; PERCEPTION; WOMEN; SIZE; OZ

DOI:

10.1038/s41598-023-49596-y

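Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]

Discipline classification codes:

07; 0710; 09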

Abstract:

Nonverbal acoustic parameters of the human voice provide cues to a vocaliser's sex, age, and body size that are relevant in human social and sexual communication, and also increasingly so for computer-based voice recognition and synthesis technologies. While studies have shown some capacity in human listeners to gauge these biological traits from unseen speakers, it remains unknown whether speech complexity improves accuracy. Here, in over 200 vocalisers and 1500 listeners of both sexes, we test whether voice-based assessments of sex, age, height and weight vary from isolated vowels and words, to sequences of vowels and words, to full sentences or paragraphs. We show that while listeners judge sex and especially age more accurately as speech complexity increases, accuracy remains high across speech types, even for a single vowel sound. In contrast, the actual heights and weights of vocalisers explain comparatively less variance in listeners' assessments of body size, which do not vary systematically by speech type. Our results thus show that while more complex speech can improve listeners' biological assessments, the gain is ecologically small, as listeners already show an impressive capacity to gauge speaker traits from extremely short bouts of standardised speech, likely owing to within-speaker stability in underlying nonverbal vocal parameters such as voice pitch. We discuss the methodological, technological, and social implications of these results.


Pages: 9

