Automated Detection of Voice Disorder in the Saarbrucken Voice Database: Effects of Pathology Subset and Audio Materials

Cited by: 13
Authors
Huckvale, Mark [1]
Buciuleac, Catinca [1]
Affiliations
[1] UCL, Speech Hearing & Phonet Sci, London, England
Source
Interspeech 2021
Keywords
voice disorders; machine learning; health applications
DOI
10.21437/Interspeech.2021-1507
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
The Saarbrucken Voice Database (SVD) contains speech and simultaneous electroglottography recordings of 1002 speakers exhibiting a wide range of voice disorders, together with recordings of 851 controls. Previous studies have used this database to build systems for automated detection of voice disorders and for differential diagnosis. These studies have varied considerably in the subset of pathologies tested, the audio materials analyzed, the cross-validation method used, and the performance metric reported. This variation has made it hard to determine the most promising approaches to detecting voice disorders. In this study we reimplement three recently published systems that were trained to detect pathology using the SVD, and we compare their performance on the same pathologies and the same audio materials, using a common cross-validation protocol and performance metric. We show that under this approach there is much less difference in performance across systems than was reported in their original publications. We also show that voice disorder detection based on a short phrase gives similar performance to detection based on a sequence of vowels of different pitch. Our evaluation protocol may be useful for future studies on voice disorder detection with the SVD.
Pages: 1399-1403
Page count: 5