The effect of speech pathology on automatic speaker verification: a large-scale study

Cited by: 0

Authors:

Soroosh Tayebi Arasteh

Tobias Weise

Maria Schuster

Elmar Noeth

Andreas Maier

Seung Hee Yang

Affiliations:

[1] Friedrich-Alexander-Universität Erlangen-Nürnberg, Pattern Recognition Lab

[2] Friedrich-Alexander-Universität Erlangen-Nürnberg, Speech & Language Processing Lab

[3] University Hospital RWTH Aachen, Department of Diagnostic and Interventional Radiology

[4] Ludwig-Maximilians-Universität München, Department of Otorhinolaryngology, Head and Neck Surgery

Source:

Scientific Reports, 2023, Volume 13, Issue 1

Abstract:

Navigating the challenges of data-driven speech processing, one of the primary hurdles is accessing reliable pathological speech data. While public datasets appear to offer solutions, they come with inherent risks of potential unintended exposure of patient health information via re-identification attacks. Using a comprehensive real-world pathological speech corpus, with over n = 3800 test subjects spanning various age groups and speech disorders, we employed a deep-learning-driven automatic speaker verification (ASV) approach. This resulted in a notable mean equal error rate (EER) of 0.89 ± 0.06%, outstripping traditional benchmarks. Our comprehensive assessments demonstrate that pathological speech overall faces heightened privacy breach risks compared to healthy speech. Specifically, adults with dysphonia are at heightened re-identification risks, whereas conditions like dysarthria yield results comparable to those of healthy speakers. Crucially, speech intelligibility does not influence the ASV system’s performance metrics. In pediatric cases, particularly those with cleft lip and palate, the recording environment plays a decisive role in re-identification. Merging data across pathological types led to a marked EER decrease, suggesting the potential benefits of pathological diversity in ASV, accompanied by a logarithmic boost in ASV effectiveness.
In essence, this research sheds light on the dynamics between pathological speech and speaker verification, emphasizing its crucial role in safeguarding patient confidentiality in our increasingly digitized healthcare era.
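
For readers unfamiliar with the equal error rate (EER) quoted in the abstract, the following is a minimal sketch of how such a figure can be computed from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep, and the toy scores are all assumptions made for illustration. The EER is the operating point at which the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected).

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration only; a trial is accepted
    when its similarity score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False acceptance rate: share of impostor trials accepted at t.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: share of genuine trials rejected at t.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up): one genuine trial scores below one impostor trial.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
print(f"EER = {equal_error_rate(genuine, impostor):.2%}")
```

With finite score lists the two error rates rarely cross exactly, so this sketch reports the average of the two at the closest point; production toolkits typically interpolate along the ROC curve instead.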
