The effect of speech pathology on automatic speaker verification: a large-scale study

Cited by: 1
Authors
Tayebi Arasteh, Soroosh [1, 2, 3]
Weise, Tobias [1, 2]
Schuster, Maria [4 ]
Noeth, Elmar [1 ]
Maier, Andreas [1 ]
Yang, Seung Hee [2 ]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg, Pattern Recognit Lab, D-91058 Erlangen, Germany
[2] Friedrich Alexander Univ Erlangen Nurnberg, Speech & Language Proc Lab, D-91054 Erlangen, Germany
[3] Univ Hosp RWTH Aachen, Dept Diagnost & Intervent Radiol, D-52074 Aachen, Germany
[4] Ludwig Maximilians Univ Munchen, Dept Otorhinolaryngol Head & Neck Surg, D-80333 Munich, Germany
Keywords
RECOGNITION; VOICE; ANONYMIZATION; ASR
DOI
10.1038/s41598-023-47711-7
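Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09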
Abstract
Navigating the challenges of data-driven speech processing, one of the primary hurdles is accessing reliable pathological speech data. While public datasets appear to offer solutions, they come with inherent risks of potential unintended exposure of patient health information via re-identification attacks. Using a comprehensive real-world pathological speech corpus, with over n = 3800 test subjects spanning various age groups and speech disorders, we employed a deep-learning-driven automatic speaker verification (ASV) approach. This resulted in a notable mean equal error rate (EER) of 0.89 ± 0.06%, outstripping traditional benchmarks. Our comprehensive assessments demonstrate that pathological speech overall faces heightened privacy breach risks compared to healthy speech. Specifically, adults with dysphonia are at heightened re-identification risks, whereas conditions like dysarthria yield results comparable to those of healthy speakers. Crucially, speech intelligibility does not influence the ASV system's performance metrics. In pediatric cases, particularly those with cleft lip and palate, the recording environment plays a decisive role in re-identification. Merging data across pathological types led to a marked EER decrease, suggesting the potential benefits of pathological diversity in ASV, accompanied by a logarithmic boost in ASV effectiveness. In essence, this research sheds light on the dynamics between pathological speech and speaker verification, emphasizing its crucial role in safeguarding patient confidentiality in our increasingly digitized healthcare era.
Pages: 14
Related papers
50 in total
  • [1] The effect of speech pathology on automatic speaker verification: a large-scale study
    Soroosh Tayebi Arasteh
    Tobias Weise
    Maria Schuster
    Elmar Noeth
    Andreas Maier
    Seung Hee Yang
    [J]. Scientific Reports, 13
  • [2] LARGE-SCALE SELF-SUPERVISED SPEECH REPRESENTATION LEARNING FOR AUTOMATIC SPEAKER VERIFICATION
    Chen, Zhengyang
    Chen, Sanyuan
    Wu, Yu
    Qian, Yao
    Wang, Chengyi
    Liu, Shujie
    Qian, Yanmin
    Zeng, Michael
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6147 - 6151
  • [3] Voxceleb: Large-scale speaker verification in the wild
    Nagrani, Arsha
    Chung, Joon Son
    Xie, Weidi
    Zisserman, Andrew
    [J]. COMPUTER SPEECH AND LANGUAGE, 2020, 60
  • [4] SCHEME FOR SPEECH PROCESSING IN AUTOMATIC SPEAKER VERIFICATION
    DAS, SK
    MOHN, WS
    [J]. IEEE TRANSACTIONS ON AUDIO AND ELECTROACOUSTICS, 1971, AU19 (01): : 32 - &
  • [5] ASSESSMENT OF AUTOMATIC SPEAKER VERIFICATION ON LOSSY TRANSCODED SPEECH
    Polacky, Jozef
    Jarina, Roman
    Chmulik, Michal
    [J]. 2016 4TH INTERNATIONAL WORKSHOP ON BIOMETRICS AND FORENSICS (IWBF), 2016,
  • [6] LARGE-SCALE SPEAKER IDENTIFICATION
    Schmidt, Ludwig
    Sharifi, Matthew
    Moreno, Ignacio Lopez
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [7] Study on Speaker Verification on Emotional Speech
    Wu, Wei
    Zheng, Thomas Fang
    Xu, Ming-Xing
    Bao, Huan-Jun
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2102 - 2105
  • [8] AN INVESTIGATION OF MONOTONIC TRANSDUCERS FOR LARGE-SCALE AUTOMATIC SPEECH RECOGNITION
    Moritz, Niko
    Seide, Frank
    Le, Duc
    Mahadeokar, Jay
    Fuegen, Christian
    [J]. 2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 324 - 330
  • [9] Automatic Speech Recognition of Vietnamese for a New Large-Scale Corpus
    Tran, Linh Thi Thuc
    Kim, Han-Gyu
    La, Hoang Minh
    Pham, Su Van
    [J]. ELECTRONICS, 2024, 13 (05)
  • [10] Speech Recognition with Large-Scale Speaker-Class-Based Acoustic Modeling
    Konno, Kazuki
    Kato, Masaharu
    Kosaka, Tetsuo
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,