Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations

Authors
Zhou, Xinhui [1 ]
Garcia-Romero, Daniel [1 ]
Mesgarani, Nima [1 ]
Stone, Maureen
Espy-Wilson, Carol [1 ]
Shamma, Shihab [1 ]
Affiliations
[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
Keywords
Oral, head and neck cancer; speech pathology; speech intelligibility; spectro-temporal modulation; support vector machine (SVM); Gaussian mixture model (GMM)
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Oral, head and neck cancer represents 3% of all cancers in the United States and is the sixth most common cancer worldwide. Depending on tumor size, location, and staging, patients are treated by radical surgery, radiation therapy, chemotherapy, or a combination of these treatments. As a result, the anatomical structures they use for speech are impaired, which reduces their speech intelligibility. As part of the INTERSPEECH 2012 Speaker Trait Challenge Pathology Sub-Challenge, this study explored the use of auditory-inspired spectro-temporal modulation features for automatic intelligibility assessment of such pathologic speech. The averaged spectro-temporal modulations of speech labeled as either intelligible or non-intelligible in the challenge database were analyzed, and the modulation amplitude peaks of non-intelligible speech were found to shift toward smaller rates and scales. Variants of spectro-temporal modulation features were tested with SVM and GMM classifiers on the challenge task, and the resulting performance on both the development and test datasets was comparable to the baseline.
Pages: 542-545
Number of pages: 4
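
The abstract's observation that modulation peaks shift toward smaller rates and scales can be illustrated with a minimal sketch. It is not the authors' auditory model: it assumes a plain 2D Fourier transform of a log-frequency spectrogram as a stand-in for rate-scale analysis, and the function names, the 24 channels per octave, and the 100 frames per second are all hypothetical choices made for illustration. Synthetic ripples take the place of real speech.

```python
import numpy as np

def ripple(rate_hz, scale_cpo, n_chan=120, n_frames=200,
           frame_rate_hz=100.0, chans_per_octave=24.0):
    """Synthetic spectrogram (channels x frames) holding one moving ripple."""
    x = np.arange(n_chan)[:, None] / chans_per_octave   # position in octaves
    t = np.arange(n_frames)[None, :] / frame_rate_hz    # time in seconds
    return np.cos(2 * np.pi * (rate_hz * t + scale_cpo * x))

def peak_rate_scale(spec, frame_rate_hz=100.0, chans_per_octave=24.0):
    """Return (rate in Hz, scale in cycles/octave) of the strongest modulation.

    Assumption: a 2D FFT approximates the rate-scale decomposition; the
    cortical filter bank of an auditory model is not implemented here.
    """
    mag = np.abs(np.fft.fft2(spec - spec.mean()))
    scales = np.fft.fftfreq(spec.shape[0], d=1.0 / chans_per_octave)
    rates = np.fft.fftfreq(spec.shape[1], d=1.0 / frame_rate_hz)
    i, j = np.unravel_index(np.argmax(mag), mag.shape)
    return abs(rates[j]), abs(scales[i])

# Two illustrative inputs: the second has slower, coarser modulations.
fast = peak_rate_scale(ripple(rate_hz=4.0, scale_cpo=1.0))
slow = peak_rate_scale(ripple(rate_hz=2.0, scale_cpo=0.4))
print("peak (rate, scale):", fast, "vs", slow)
```

In a real pipeline the input would be an auditory spectrogram of the recording, and the averaged rate-scale magnitudes, rather than a single peak, would feed a classifier such as an SVM or GMM.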