End-to-End Deep Learning Framework for Speech Paralinguistics Detection Based on Perception Aware Spectrum

Cited by: 13
Authors
Cai, Danwei [1 ,2 ]
Ni, Zhidong [1 ,2 ]
Liu, Wenbo [1 ]
Cai, Weicheng [1 ]
Li, Gang [3 ]
Li, Ming [1 ,2 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou, Guangdong, Peoples R China
[2] SYSU CMU Shunde Int Joint Res Inst, Shunde, Guangdong, Peoples R China
[3] Jiangsu Jinling Sci & Technol Grp Ltd, Nanjing, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
computational paralinguistics; speech under cold; deep learning; perception aware spectrum; NEURAL-NETWORKS;
DOI
10.21437/Interspeech.2017-1445
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose an end-to-end deep learning framework to detect speech paralinguistics using perception aware spectrum as input. Existing studies show that speech under cold has distinct variations of energy distribution on low frequency components compared with speech under the 'healthy' condition. This motivates us to use perception aware spectrum as the input to an end-to-end learning framework with a small-scale dataset. In this work, we try both the Constant Q Transform (CQT) spectrum and the Gammatone spectrum in different end-to-end deep learning networks, where both spectra closely mimic human speech perception and transform it into 2D images. Experimental results show the effectiveness of the proposed perception aware spectrum with end-to-end deep learning approach on the Interspeech 2017 Computational Paralinguistics Cold sub-Challenge. The final fusion result of our proposed method is 8% better than that of the provided baseline in terms of UAR.
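For readers unfamiliar with the metric named in the abstract, UAR (unweighted average recall) is the mean of the per-class recalls, so a small class counts as much as a large one. The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name and the toy labels ("C" for cold, "NC" for not cold) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally regardless of size."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (assumption): imbalanced, as is typical for such tasks.
y_true = ["C", "C", "NC", "NC", "NC", "NC", "NC", "NC"]
y_pred = ["C", "NC", "NC", "NC", "NC", "NC", "NC", "C"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On this toy data the recall is 1/2 for "C" and 5/6 for "NC", so UAR is lower than plain accuracy, which is dominated by the majority class.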
Pages: 3452 - 3456
Number of pages: 5
Related papers
50 in total
  • [1] Spectrum Monitoring Based on End-to-End Learning by Deep Learning
    Mahdiyeh Rahmani
    Reza Ghazizadeh
    [J]. International Journal of Wireless Information Networks, 2022, 29 : 180 - 192
  • [2] Spectrum Monitoring Based on End-to-End Learning by Deep Learning
    Rahmani, Mahdiyeh
    Ghazizadeh, Reza
    [J]. INTERNATIONAL JOURNAL OF WIRELESS INFORMATION NETWORKS, 2022, 29 (02) : 180 - 192
  • [3] FluentNet: End-to-End Detection of Stuttered Speech Disfluencies With Deep Learning
    Kourkounakis, Tedd
    Hajavi, Amirhossein
    Etemad, Ali
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2986 - 2999
  • [4] An End-to-End Deep Learning Framework with Speech Emotion Recognition of Atypical Individuals
    Tang, Dengke
    Zeng, Junlin
    Li, Ming
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 162 - 166
  • [5] A framework for end-to-end deep learning-based anomaly detection in transportation networks
    Davis, Neema
    Raina, Gaurav
    Jagannathan, Krishna
    [J]. TRANSPORTATION RESEARCH INTERDISCIPLINARY PERSPECTIVES, 2020, 5
  • [6] An End-to-End Deep Learning Framework for Fault Detection in Marine Machinery
    Rigas, Spyros
    Tzouveli, Paraskevi
    Kollias, Stefanos
    [J]. SENSORS, 2024, 24 (16)
  • [7] MINTZAI: End-to-end Deep Learning for Speech Translation
    Etchegoyhen, Thierry
    Arzelus, Haritz
    Gete, Harritxu
    Alvarez, Aitor
    Hernaez, Inma
    Navas, Eva
    Gonzalez-Docasal, Ander
    Osacar, Jaime
    Benites, Edson
    Ellakuria, Igor
    Calonge, Eusebi
    Martin, Maite
    [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2020, (65): : 97 - 100
  • [8] Deep Learning-Based End-to-End Carrier Signal Detection in Broadband Power Spectrum
    Huang, Hao
    Wang, Peng
    Wang, Jiao
    Li, Jianqing
    [J]. ELECTRONICS, 2022, 11 (12)
  • [9] A novel system with an end-to-end framework for mouse scratching detection based on deep learning techniques
    Peng, J. P.
    Hsu, B.
    Lin, Y.
    Tseng, V. S.
    Lee, C.
    [J]. JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2023, 143 (05) : S36 - S36
  • [10] A Novel End-to-End Deep Learning Framework for Chip Packaging Defect Detection
    Zhou, Siyi
    Yao, Shunhua
    Shen, Tao
    Wang, Qingwang
    [J]. SENSORS, 2024, 24 (17)