Speaker discrimination in humans and machines: Effects of speaking style variability

Cited by: 1
Authors
Afshan, Amber [1]
Kreiman, Jody [2, 3]
Alwan, Abeer [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect & Comp Engn, Los Angeles, CA 90024 USA
[2] Univ Calif Los Angeles, Dept Head & Neck Surg, Los Angeles, CA 90024 USA
[3] Univ Calif Los Angeles, Dept Linguist, Los Angeles, CA 90024 USA
Source
Keywords
speaker perception; speaking style; automatic speaker verification; human assisted speaker discrimination
DOI
10.21437/Interspeech.2020-3004
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Does speaking style variation affect humans' ability to distinguish individuals from their voices? How do humans compare with automatic systems designed to discriminate between voices? In this paper, we attempt to answer these questions by comparing human and machine speaker discrimination performance for read speech versus casual conversations. Thirty listeners were asked to perform a same-versus-different speaker task. Their performance was compared to a state-of-the-art x-vector/PLDA-based automatic speaker verification system. Results showed that both humans and machines performed better with style-matched stimuli, and human performance was better when listeners were native speakers of American English. Native listeners performed better than machines in the style-matched conditions (EERs of 6.96% versus 14.35% for read speech, and 15.12% versus 19.87% for conversations), but for style-mismatched conditions, there was no significant difference between native listeners and machines. In all conditions, fusing human responses with machine results showed improvements compared to each alone, suggesting that humans and machines have different approaches to speaker discrimination tasks. Differences in the approaches were further confirmed by examining results for individual speakers, which showed that the perception of distinct and confused speakers differed between human listeners and machines.
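The abstract reports performance as equal error rates (EERs). As a rough illustration of that metric only, and not the authors' evaluation code, the sketch below estimates an EER from two lists of trial scores. The function name `equal_error_rate`, the score convention (higher means "more likely the same speaker"), and the toy scores are all assumptions made for this example.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over the observed scores.

    target_scores: scores for same-speaker trials (assumed: higher = more similar).
    nontarget_scores: scores for different-speaker trials.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # Miss: a same-speaker trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a different-speaker trial scored at or above the threshold.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (miss + false_alarm) / 2
    return eer


# Toy scores, invented for illustration only.
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
print(f"EER = {equal_error_rate(same_speaker, different_speaker):.2%}")
```

With finitely many trials the miss and false-alarm rates rarely cross exactly, so this sketch reports their average at the closest threshold; evaluation toolkits often interpolate instead.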
Pages: 3136-3140
Page count: 5
Related papers
50 in total
  • [21] On the speaker discriminatory power asymmetry regarding acoustic-phonetic parameters and the impact of speaking style
    Cavalcanti, Julio Cesar
    Eriksson, Anders
    Barbosa, Plinio A.
    [J]. FRONTIERS IN PSYCHOLOGY, 2023, 14
  • [22] Effect of Language, Speaking Style and Speaker on Long-term F0 Estimation
    Arantes, Pablo
    Eriksson, Anders
    Gutzeit, Suska
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 3897-3901
  • [23] GRIMM, JACOB AS A SPEAKER IN THE PAULSKIRCHE - CHANGES IN THE STYLE OF SPEAKING FAVORED BY MEMBERS OF GERMAN LEGISLATIVE BODIES
    ERBEN, J
    [J]. ZEITSCHRIFT FUR DEUTSCHE PHILOLOGIE, 1986, 105 (01): 100-113
  • [24] ACUTE DRUG EFFECTS ON SPEAKING IN ISOLATED HUMANS
    HIGGINS, ST
    STITZER, ML
    OLEARY, DK
    [J]. PHARMACOLOGY BIOCHEMISTRY AND BEHAVIOR, 1985, 22 (06): 1082
  • [25] Speech kinematic variability in adults who stutter is influenced by treatment and speaking style
    Loucks, Torrey M.
    Pelczarski, Kristin M.
    Lomheim, Holly
    Aalto, Daniel
    [J]. JOURNAL OF COMMUNICATION DISORDERS, 2022, 96
  • [26] A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds
    Kriengwatana, Buddhamas
    Escudero, Paola
    Kerkhoven, Anne H.
    ten Cate, Carel
    [J]. FRONTIERS IN PSYCHOLOGY, 2015, 6
  • [27] Effects of speaking style on speech intelligibility for Mandarin-speaking cochlear implant users
    Li, Yongxin
    Zhang, Guoping
    Kang, Hou-yong
    Liu, Sha
    Han, Deming
    Fu, Qian-Jie
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2011, 129 (06): EL242-EL247
  • [28] Humans as Feature Extractors: Combining Prosody and Personality Perception for Improved Speaking Style Recognition
    Mohammadi, Gelareh
    Vinciarelli, Alessandro
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011: 363-366
  • [29] Speaker discrimination performance for easy versus hard voices in style-matched and -mismatched speech
    Afshan, Amber
    Kreiman, Jody
    Alwan, Abeer
    [J]. Journal of the Acoustical Society of America, 2022, 151 (02): 1393-1403
  • [30] Speaker discrimination performance for "easy" versus "hard" voices in style-matched and -mismatched speech
    Afshan, Amber
    Kreiman, Jody
    Alwan, Abeer
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2022, 151 (02): 1393-1403