TEO-based speaker stress assessment using hybrid classification and tracking schemes

Cited by: 4

Authors:

Hansen, John H.L. [1]

Ruzanski, Evan [1]

Bořil, Hynek [1]

Meyerhoff, James [1]

Affiliation:

[1] Center for Robust Speech Systems (CRSS), University of Texas at Dallas, 800 West Campbell Rd, EC33, Richardson, TX 75080-3021, United States

Source:

International Journal of Speech Technology | 2012 / Vol. 15 / No. 3

Keywords:

Stress assessment from speech; FLETC Corpus; TEO operator;

DOI:

10.1007/s10772-012-9165-1

Abstract:

Speaker variability is known to have an adverse impact on speech systems that process linguistic content, such as speech and language recognition. However, changes in speech production due to stress and emotion have a similarly detrimental effect on speaker recognition, because they introduce a mismatch with speaker models that are typically trained on modal speech. This study focuses on the analysis of stress-induced variations in speech and on the design of an automatic stress level assessment scheme that could be used to direct stress-dependent acoustic models or normalization strategies. Current stress detection methods typically make a binary decision on whether or not the speaker is under stress. In reality, the amount of stress in individuals varies and can change gradually. Using speech and biometric data collected in a real-world law enforcement training scenario with variable stress levels, this study considers two methods for stress level assessment. The first approach uses a nearest neighbor clustering scheme at the vowel token and sentence levels to classify speech data into three levels of stress. The second approach employs Euclidean distance metrics within the multi-dimensional feature space to provide real-time stress level tracking. Evaluations on audio data confirmed by biometric readings show both methods to be effective in assessing stress level within a speaker (average accuracy of 55.6% in a 3-way classification task). In addition, the impact of high-level stress on in-set speaker recognition is evaluated and shown to reduce accuracy from 91.7% (low/mid stress) to 21.4% (high stress).
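The two ingredients named in the title and abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a simple nearest-centroid rule using Euclidean distance. The function names, the two-dimensional feature vectors, and the centroid values are hypothetical and chosen only for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def nearest_level(feature, centroids):
    """Return the label of the centroid closest to `feature` (Euclidean).

    `centroids` maps a stress-level label to a reference feature vector.
    Both the labels and the vectors are assumptions for this sketch.
    """
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))


# For a pure tone A*cos(w*n), the operator gives the constant A^2 * sin(w)^2.
tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
energy = teager_energy(tone)

# Hypothetical per-level reference points in a 2-D feature space.
centroids = {"low": (0.0, 0.0), "mid": (1.0, 1.0), "high": (2.0, 2.0)}
level = nearest_level((0.9, 1.2), centroids)
```

Applying `nearest_level` to each incoming frame or token gives a running estimate of the stress level, which is the general idea behind distance-based tracking; the paper's actual features and decision rules are not specified in the abstract.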

Pages: 295-311

Number of pages: 16
