Multi-Scale Spatial and Temporal Speech Associations to Swallowing for Dysphagia Screening

Cited by: 5
Authors
He, Fei [1]
Hu, Xiaoyi [2, 3]
Zhu, Ce [1]
Li, Ying [2, 3]
Liu, Yipeng [1]
Affiliations
[1] Univ Elect Sci & Technol China UESTC, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
[2] Sichuan Univ, Ctr Gerontol & Geriatr, Natl Clin Res Ctr Geriatr, Chengdu 610041, Peoples R China
[3] Sichuan Univ, West China Hosp, Chengdu 610041, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Speech processing; Vibrations; Spectrogram; Trajectory; Pipelines; Hospitals; Dysphagia; multi-scale speech analysis; quantitative feature selection; spatial spectrogram contours; throat signal; AUTOMATIC DETECTION; VOICE; SCHIZOPHRENIA; DYSARTHRIA;
DOI
10.1109/TASLP.2022.3203235
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Dysphagia is a common symptom of many neurological diseases. It often occurs in older adults and increases the risk of aspiration pneumonia. Existing diagnostic systems for dysphagia are either invasive or require patients to swallow liquids, which makes them costly and harmful to patients. In this work, we propose an early screening system for dysphagia based on two kinds of throat signals: vowels and sentences. For the vowels, two new speech feature sets are developed: PET (pitch/energy trajectory) and FS-Conts (full spectrogram contours). PET focuses on the prominent resonance energy of speech to track pitch and energy fluctuations, which reflect the stability of the vocal cords during speech production. FS-Conts emphasizes the spatial details of the formants using three-dimensional contours. For the sentences, three categories of speech features are proposed: LSSDL (log symmetric spectral difference level), C-coes (crucial energy coefficients), and LDF (local dynamic features). Together, these three features explore the speech representations of dysphagia from global variations to local associations. LSSDL highlights global spectral differences in the frequency region of interest, while C-coes and LDF locate local speech differences in specific frequency regions and time intervals. In addition, a new feature selection algorithm is developed to search for distinguishing features. In the experiments, an SVM classifier is adopted and the dysphagia detection accuracy reaches 95.07%. Comparative experiments indicate that the proposed system performs better than existing methods.
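The idea of tracking pitch and energy fluctuations in a sustained vowel can be illustrated with a minimal sketch. This is not the authors' PET feature set, whose exact definition is not given in the abstract. It is a generic frame-wise autocorrelation pitch estimate and a log-energy contour, and the function names, frame sizes, and pitch search range are all assumptions chosen for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (hypothetical helper)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def pitch_energy_trajectory(x, sr, frame_len=1024, hop=256,
                            fmin=60.0, fmax=400.0):
    """Return per-frame pitch (Hz) and log energy (dB).

    Assumed, generic method: the autocorrelation peak inside the
    [fmin, fmax] lag range gives the pitch, and the mean square of the
    windowed frame gives the energy.
    """
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)

    lag_min = int(sr / fmax)  # shortest period considered, in samples
    lag_max = int(sr / fmin)  # longest period considered, in samples
    pitch_hz = np.empty(len(frames))
    for i, frame in enumerate(frames):
        # Keep non-negative lags only; index 0 is lag 0.
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
        pitch_hz[i] = sr / lag
    return pitch_hz, energy_db


# Synthetic stand-in for a sustained vowel: a steady 150 Hz tone.
sr = 16000
t = np.arange(sr) / sr
vowel = 0.5 * np.sin(2 * np.pi * 150.0 * t)
pitch_hz, energy_db = pitch_energy_trajectory(vowel, sr)

# Simple stability summaries: smaller spread means a steadier voice.
pitch_spread = float(np.std(pitch_hz))
energy_spread = float(np.std(energy_db))
```

On a steady tone both spreads stay close to zero. An unstable voice would be expected to show larger values, which is the kind of cue a trajectory-based feature could pass to a classifier such as an SVM.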
Pages: 2888-2899
Number of pages: 12