Alzheimer Disease Recognition Using Speech-Based Embeddings From Pre-Trained Models

Cited by: 16
Authors
Gauder, Lara [1, 2]
Pepino, Leonardo [1, 2]
Ferrer, Luciana [1 ]
Riera, Pablo [1 ]
Affiliations
[1] CONICET UBA, Inst Invest Ciencias Computac ICC, Buenos Aires, DF, Argentina
[2] Univ Buenos Aires UBA, Fac Ciencias Exactas & Nat, Dept Computac, Buenos Aires, DF, Argentina
Source
Keywords
computational paralinguistics; ADreSSo challenge; Alzheimer's Disease recognition
DOI
10.21437/Interspeech.2021-753
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
This paper describes our submission to the ADreSSo Challenge, which focuses on the problem of automatic recognition of Alzheimer's Disease (AD) from speech. The audio samples contain speech from the subjects describing a picture with the guidance of an experimenter. Our approach to the problem is based on the use of embeddings extracted from different pre-trained models (TRILL, Allosaurus, and wav2vec 2.0), which were trained to solve different speech tasks. These features are modeled with a neural network that takes short segments of speech as input, generating an AD score per segment. The final score for an audio file is given by the average over all segments in the file. We include ablation results to show the performance of different feature types individually and in combination, a study of the effect of the segment size, and an analysis of statistical significance. Our results on the test data for the challenge reach an accuracy of 78.9%, outperforming both the acoustic and linguistic baselines provided by the organizers.
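The segment-level scoring and file-level averaging described in the abstract can be illustrated with a minimal sketch. Everything beyond "score each segment, then average" is an assumption made for illustration: the function names, the segment length and hop (in frames), the random stand-in features, the toy logistic scorer that takes the place of the trained neural network, and the 0.5 decision threshold are all hypothetical and not taken from the paper.

```python
import numpy as np

def split_into_segments(features, segment_len, hop):
    """Slice a (frames, dims) array into windows of segment_len frames.

    segment_len and hop are illustrative values, not the paper's settings.
    """
    n_frames = features.shape[0]
    if n_frames <= segment_len:
        # Short file: treat the whole recording as a single segment.
        return [features]
    starts = range(0, n_frames - segment_len + 1, hop)
    return [features[s:s + segment_len] for s in starts]

def file_score(features, segment_scorer, segment_len=100, hop=50):
    """Score every segment, then average the scores over the file."""
    segments = split_into_segments(features, segment_len, hop)
    scores = np.array([segment_scorer(seg) for seg in segments])
    return float(scores.mean())

def toy_scorer(segment):
    # Hypothetical stand-in for the trained network: a logistic function
    # of the mean feature value, returning a score in (0, 1).
    return 1.0 / (1.0 + np.exp(-segment.mean()))

# Random features stand in for embeddings from a pre-trained model.
rng = np.random.default_rng(0)
features = rng.normal(size=(430, 16))

score = file_score(features, toy_scorer)
# The 0.5 threshold is an assumption for this sketch.
label = "AD" if score >= 0.5 else "non-AD"
print(f"file-level score: {score:.3f} -> {label}")
```

Averaging over segments means every part of the recording contributes equally to the decision, and it lets a model trained on short inputs handle files of any length.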
Pages: 3795-3799
Number of pages: 5