AUC OPTIMIZATION FOR DEEP LEARNING BASED VOICE ACTIVITY DETECTION

Authors
Fan, Zi-Chen [1]
Bai, Zhongxin
Zhang, Xiao-Lei
Rahardja, Susanto
Chen, Jingdong
Affiliations
[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
AUC; deep neural networks; voice activity detection;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Voice activity detection (VAD) based on deep neural networks (DNNs) has demonstrated good performance in adverse acoustic environments. Current DNN-based VAD optimizes a surrogate function, e.g., minimum cross-entropy or minimum squared error, at a given decision threshold. However, VAD usually works on the fly with a dynamic decision threshold, and the receiver operating characteristic (ROC) curve is a global evaluation metric that reflects the performance of VAD at all possible decision thresholds. In this paper, we propose to optimize the area under the ROC curve (AUC) with a DNN, which maximizes the performance of VAD in terms of the ROC curve. Experimental results show that optimizing AUC with a DNN yields higher performance than the common method of minimizing the squared error with a DNN.
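To make the threshold-free nature of AUC concrete, the following is a minimal sketch in plain Python. It is an illustration only and not the method of the paper: the abstract does not state which loss the authors use, so the pairwise squared-hinge surrogate here is an assumption, and the function names `empirical_auc` and `pairwise_surrogate_loss` and the `margin` parameter are hypothetical. The first function computes the empirical AUC as the fraction of (speech, non-speech) frame pairs that the scores rank correctly; the second shows one common kind of relaxation that a network could minimize in place of the non-differentiable AUC.

```python
def _split_by_label(scores, labels):
    """Separate scores of speech frames (label 1) from non-speech frames (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")
    return pos, neg


def empirical_auc(scores, labels):
    """Fraction of (speech, non-speech) pairs ranked correctly; ties count as 0.5."""
    pos, neg = _split_by_label(scores, labels)
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_surrogate_loss(scores, labels, margin=1.0):
    """Assumed relaxation: mean squared hinge on pairwise score differences.

    The loss is zero once every speech frame outscores every non-speech
    frame by at least `margin`; no decision threshold appears anywhere.
    """
    pos, neg = _split_by_label(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            total += max(0.0, margin - (p - n)) ** 2
    return total / (len(pos) * len(neg))
```

For example, with scores `[0.9, 0.4, 0.5, 0.2]` and labels `[1, 1, 0, 0]`, three of the four speech/non-speech pairs are ordered correctly, so `empirical_auc` returns 0.75 regardless of where a decision threshold would be placed. The double loop is quadratic in the number of frames, so a practical training setup would typically evaluate such a loss on mini-batches.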
Pages: 6760-6764
Number of pages: 5