Stress detection using non-semantic speech representation

Cited by: 6
Authors
Kejriwal, Jay [1, 2]
Benus, Stefan [1, 3]
Trnka, Marian [1]
Affiliations
[1] Slovak Acad Sci, Inst Informat, Bratislava, Slovakia
[2] Slovak Tech Univ, Fac Informat & Informat Technol, Bratislava, Slovakia
[3] Constantine Philosopher Univ, Nitra, Slovakia
Source
2022 32ND INTERNATIONAL CONFERENCE RADIOELEKTRONIKA (RADIOELEKTRONIKA) | 2022
Keywords
stress detection; speech; classification; x-vectors; TRILL vector; MFCC feature; PLP feature; LLD feature;
DOI
10.1109/RADIOELEKTRONIKA54537.2022.9764916
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In today's world, stress has become a prominent cause of many ailments. Automatic detection of stress from speech using state-of-the-art machine learning algorithms can facilitate early detection and prevention of stress. Artificial intelligence agents involved in affective computing and human-machine spoken interaction (HMI) might benefit from the capacity to identify human stress automatically. Although several methods have been established for stress detection, it is still unclear which auditory features should be used to train a deep neural network (DNN) model. In this study, we investigate the performance of traditional and modern auditory features for stress classification using the StressDat database. The StressDat database is a collection of acted speech recordings in Slovak, realizing sentences within stress-prone situations at three different levels of stress. The performance of traditional auditory features such as Mel-Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) is compared with that of modern non-semantic speech representations such as x-vectors and TRIpLet Loss network (TRILL) vectors. As a benchmark, low-level descriptor (LLD) auditory features are extracted using the OpenSMILE toolkit. We evaluate the performance of four automatic classification algorithms: support vector machine (SVM), multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM). The results reveal that a CNN trained on TRILL vectors provides the highest accuracy (81.86%).
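The comparison described in the abstract (several feature sets, each scored by three-class classification accuracy) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than StressDat recordings, the feature-set names `well_separated` and `noisy` are hypothetical stand-ins for real representations such as MFCC or TRILL vectors, and a nearest-centroid classifier is swapped in for the SVM, MLP, CNN, and LSTM models evaluated in the paper. It shows only the shape of such an evaluation, not the authors' method or results.

```python
import random

STRESS_LEVELS = (0, 1, 2)  # three stress levels, as stated in the abstract


def make_synthetic(noise, dim, per_class, rng):
    """Hypothetical stand-in for extracted features: one vector per utterance."""
    X, y = [], []
    for level in STRESS_LEVELS:
        for _ in range(per_class):
            X.append([level + rng.uniform(-noise, noise) for _ in range(dim)])
            y.append(level)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid 'training': average the vectors of each class."""
    centroids = {}
    for level in set(y):
        rows = [x for x, label in zip(X, y) if label == level]
        centroids[level] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda level: sqdist(centroids[level]))


def accuracy(y_true, y_pred):
    """Fraction of utterances assigned the correct stress level."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def evaluate(X, y, rng, test_fraction=0.25):
    """Shuffle, hold out a test split, fit on the rest, report test accuracy."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
    preds = [predict(centroids, X[i]) for i in test]
    return accuracy([y[i] for i in test], preds)


def compare_feature_sets(seed=0):
    """Score each (synthetic, hypothetical) feature set with the same classifier."""
    rng = random.Random(seed)
    feature_sets = {
        "well_separated": make_synthetic(0.1, 8, 40, rng),
        "noisy": make_synthetic(2.0, 8, 40, rng),
    }
    return {name: evaluate(X, y, rng) for name, (X, y) in feature_sets.items()}


if __name__ == "__main__":
    for name, acc in compare_feature_sets().items():
        print(f"{name}: {acc:.2%}")
```

In a real replication, `make_synthetic` would be replaced by actual feature extraction from the recordings, and the classifier by the models named in the abstract; the accuracy bookkeeping would stay the same.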
Pages: 133-137
Page count: 5