Deep Neural Networks with Batch Speaker Normalization for Intoxicated Speech Detection

Cited by: 0

Authors:

Wang, Weiqing [1]

Wu, Haiwei [1,2]

Li, Ming [1]

Affiliations:

[1] Duke Kunshan Univ, Data Sci Res Ctr, Kunshan, Peoples R China

[2] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou, Peoples R China

Source:

2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019

Keywords:

intoxicated speech detection; Convolutional Neural Network; computational paralinguistics; ALCOHOL-INTOXICATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Alcohol intoxication affects people both physically and psychologically, and it also changes their speech. Detecting the intoxicated state from speech, however, is a challenging task. In this paper, we first implement a baseline model with the ComParE feature set and then explore the influence of speaker information on the intoxication detection task. We also apply a ResNet18-based model to this task. The model contains three parts: a representation learning sub-network built on an 18-layer Deep Residual Neural Network (ResNet), a global average pooling (GAP) layer, and a classifier of two fully connected layers. Since speaker z-normalization cannot be performed on variable-length feature input, we employ batch z-normalization to train the proposed model, which achieves an improvement similar to that of applying speaker normalization to the baseline method. Experimental results show that speaker normalization on the baseline model and batch z-normalization on the ResNet18-based model provide 4.9% and 3.8% improvement, respectively, indicating that speaker normalization can improve the performance of both the baseline model and the proposed model.
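The two normalization schemes named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how batches are formed or which statistics are used, so the sketch assumes that speaker z-normalization standardizes fixed-length utterance-level features per speaker, and that batch z-normalization standardizes variable-length frame-level features with statistics pooled over all frames in a batch. The function names, the `eps` parameter, and the toy shapes are hypothetical.

```python
import numpy as np


def speaker_z_normalize(features, speaker_ids, eps=1e-8):
    """Standardize fixed-length utterance features separately for each speaker.

    features: array of shape (num_utterances, num_dims)
    speaker_ids: one speaker label per utterance
    """
    features = np.asarray(features, dtype=np.float64)
    speaker_ids = np.asarray(speaker_ids)
    normalized = np.empty_like(features)
    for speaker in np.unique(speaker_ids):
        mask = speaker_ids == speaker
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)
        normalized[mask] = (features[mask] - mean) / (std + eps)
    return normalized


def batch_z_normalize(batch, eps=1e-8):
    """Standardize variable-length inputs with statistics pooled over the batch.

    batch: list of arrays, each of shape (num_frames_i, num_dims).
    Assumption: mean and standard deviation are computed per feature
    dimension over every frame of every utterance in the batch.
    """
    frames = np.concatenate(batch, axis=0)
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    return [(utterance - mean) / (std + eps) for utterance in batch]


# Toy data: six utterance-level feature vectors from two hypothetical speakers.
rng = np.random.default_rng(0)
utterance_features = rng.normal(loc=5.0, scale=2.0, size=(6, 4))
speakers = ["spk_a", "spk_a", "spk_a", "spk_b", "spk_b", "spk_b"]
normalized_utterances = speaker_z_normalize(utterance_features, speakers)

# Toy data: a batch of three utterances with different numbers of frames.
batch = [rng.normal(size=(num_frames, 4)) for num_frames in (50, 80, 65)]
normalized_batch = batch_z_normalize(batch)
```

Under these assumptions, pooled batch statistics stand in for per-speaker statistics when the inputs cannot be stacked into a single fixed-size matrix per speaker.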

Pages: 1323-1327

Page count: 5


