THE IN-THE-WILD SPEECH MEDICAL CORPUS

Cited by: 4
Authors
Correia, Joana [1, 2]
Teixeira, Francisco [2 ]
Botelho, Catarina [2 ]
Trancoso, Isabel [2 ]
Raj, Bhiksha [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Lisbon, INESC ID, Lisbon, Portugal
Keywords
Speech-affecting diseases; pathological speech; in-the-wild; i-vectors; x-vectors; Parkinson's disease
DOI
10.1109/ICASSP39728.2021.9414230
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Automatic detection of speech-affecting (SA) diseases has received significant attention, particularly in clinical scenarios. However, the same task in in-the-wild conditions is often neglected, in part due to the lack of appropriate datasets. In this work, we present the in-the-Wild Speech Medical (WSM) Corpus, a collection of in-the-wild videos featuring subjects potentially affected by an SA disease, specifically depression or Parkinson's disease. The WSM Corpus contains a total of 928 videos and over 131 hours of speech. Each video is accompanied by crowdsourced annotations for the perceived age and gender and the self-reported health status of the speaker. The WSM Corpus is balanced over all the labels. We give a detailed description of the collection and annotation processes of the WSM Corpus. Furthermore, we present several baseline systems for the detection of SA diseases using speech alone, thus motivating the use of this type of in-the-wild data in paralinguistic audiovisual tasks.
Pages: 6973-6977
Page count: 5