Automatic prediction of intelligible speaking rate for individuals with ALS from speech acoustic and articulatory samples

Cited: 65
Authors
Wang, Jun [1 ,2 ]
Kothalkar, Prasanna V. [1 ]
Kim, Myungjong [1 ]
Bandini, Andrea [3 ]
Cao, Beiming [1 ]
Yunusova, Yana [3 ]
Campbell, Thomas F. [2 ]
Heitzman, Daragh [4 ]
Green, Jordan R. [5 ]
Affiliations
[1] Univ Texas Dallas, Dept Bioengn, Speech Disorders & Technol Lab, BSB 13-302, 800 W Campbell Rd, Richardson, TX 75080 USA
[2] Univ Texas Dallas, Callier Ctr Commun Disorders, Richardson, TX 75083 USA
[3] Univ Toronto, Dept Speech Language Pathol, Toronto, ON, Canada
[4] MDA ALS Ctr, Houston, TX USA
[5] MGH Inst Hlth Profess, Dept Commun Sci & Disorders, Boston, MA USA
Funding
National Institutes of Health (NIH), USA;
Keywords
amyotrophic lateral sclerosis; dysarthria; speech kinematics; intelligible speaking rate; machine learning; support vector machine; AMYOTROPHIC-LATERAL-SCLEROSIS; PARKINSONS-DISEASE; TUTORIAL; BULBAR; TONGUE;
DOI
10.1080/17549507.2018.1508499
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification codes
100104 ; 100213 ;
Abstract
Purpose: This research aimed to automatically predict intelligible speaking rate for individuals with amyotrophic lateral sclerosis (ALS) from speech acoustic and articulatory samples. Method: Twelve participants with ALS and two healthy control participants produced a total of 1831 phrases. The NDI Wave system was used to record tongue and lip movements synchronously with the acoustic signal. A machine learning algorithm (a support vector machine) was used to predict intelligible speaking rate (speech intelligibility x speaking rate) from acoustic and articulatory features of the recorded samples. Result: Acoustic, lip movement, and tongue movement information used separately yielded R^2 values of 0.652, 0.660, and 0.678 and root mean squared errors (RMSE) of 41.096, 41.166, and 39.855 words per minute (WPM) between predicted and actual values, respectively. Combining acoustic, lip, and tongue information produced the highest R^2 (0.712) and the lowest RMSE (37.562 WPM). Conclusion: The results showed that the proposed analyses predicted the intelligible speaking rate of the participants with reasonably high accuracy from the acoustic and/or articulatory features of a single short speech sample. With further development, these analyses may be well suited for clinical applications that require automatic prediction of speech severity.
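As a rough illustration of the prediction pipeline summarised in the abstract, the following minimal Python sketch regresses intelligible speaking rate (in WPM) on per-phrase feature vectors with a support vector machine and reports R^2 and RMSE. It is not the authors' implementation; the file names, feature contents, cross-validation setup, and hyperparameters are assumptions introduced here for illustration only.

# Minimal sketch (not the paper's code): SVM regression of intelligible speaking
# rate from acoustic and/or articulatory (lip, tongue) feature vectors.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import r2_score, mean_squared_error

# X: one row per phrase, columns = acoustic + lip + tongue movement features
# y: intelligible speaking rate (intelligibility x speaking rate, in WPM)
X = np.load("features.npy")            # hypothetical feature matrix
y = np.load("intelligible_wpm.npy")    # hypothetical target values

# Standardise features, then fit an RBF-kernel support vector regressor;
# hyperparameters are placeholders, not values from the study.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
y_pred = cross_val_predict(model, X, y, cv=5)

print(f"R^2  = {r2_score(y, y_pred):.3f}")
print(f"RMSE = {mean_squared_error(y, y_pred) ** 0.5:.3f} WPM")

In this setup, comparing models trained on acoustic features only, lip features only, tongue features only, and their concatenation mirrors the feature-set comparison reported in the abstract.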
Pages: 669-679
Page count: 11