TEO-based speaker stress assessment using hybrid classification and tracking schemes

Cited by: 4
Authors
Hansen, John H.L. [1]
Ruzanski, Evan [1]
Bořil, Hynek [1]
Meyerhoff, James [1]
Affiliations
[1] Center for Robust Speech Systems (CRSS), University of Texas at Dallas, 800 West Campbell Rd, EC33, Richardson, TX 75080-3021, United States
Keywords
Stress assessment from speech; FLETC Corpus; TEO operator
DOI
10.1007/s10772-012-9165-1
Abstract
Speaker variability is known to have an adverse impact on speech systems that process linguistic content, such as speech and language recognition. However, changes in speech production due to stress and emotion have a similarly detrimental effect on speaker recognition, because they introduce a mismatch with speaker models that are typically trained on modal speech. The focus of this study is the analysis of stress-induced variations in speech and the design of an automatic stress level assessment scheme that could be used to direct stress-dependent acoustic models or normalization strategies. Current stress detection methods typically make a binary decision on whether or not the speaker is under stress. In reality, the amount of stress in individuals varies and can change gradually. Using speech and biometric data collected in a real-world law enforcement training scenario with variable stress levels, this study considers two methods for stress level assessment. The first approach uses a nearest neighbor clustering scheme at the vowel token and sentence levels to classify speech data into three levels of stress. The second approach employs Euclidean distance metrics within the multi-dimensional feature space to provide real-time stress level tracking capability. Evaluations on audio data confirmed by biometric readings show both methods to be effective in assessing stress level within a speaker (average accuracy of 55.6% in a 3-way classification task). In addition, the impact of high-level stress on in-set speaker recognition is evaluated and shown to reduce accuracy from 91.7% (low/mid stress) to 21.4% (high stress).
Pages: 295-311
Page count: 16
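
Illustrative sketch
The abstract names two ingredients: the Teager Energy Operator (TEO) and Euclidean distances in a feature space for assigning one of three stress levels. The following is a minimal Python sketch of those two ideas only, not the authors' implementation. The discrete TEO formula is the standard one, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The function names, the 2-D features, and the centroid values are hypothetical and chosen purely for illustration. A nearest-centroid rule stands in here for the nearest neighbor clustering and tracking schemes the paper describes.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the two end samples lack a neighbor.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_level(feature, centroids):
    """Return the stress label whose centroid is closest to `feature`."""
    return min(centroids, key=lambda label: euclidean(feature, centroids[label]))


# Hypothetical centroids in a made-up 2-D feature space (assumption, not from the paper).
CENTROIDS = {"low": (0.0, 0.0), "mid": (1.0, 1.0), "high": (2.0, 2.0)}


def track(features, centroids=CENTROIDS):
    """Assign a stress label to each feature vector in time order."""
    return [nearest_level(f, centroids) for f in features]


if __name__ == "__main__":
    # For a unit-amplitude sinusoid with digital frequency w, the TEO output
    # is the constant sin(w)^2.
    tone = [math.cos(0.3 * n) for n in range(50)]
    print(teager_energy(tone)[:3])
    print(track([(0.1, 0.2), (0.9, 1.2), (2.5, 1.8)]))
```

In a real system the feature vectors would be derived from TEO-based measurements on vowel tokens or sentences, and the reference points would be learned from labeled speech rather than fixed by hand.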