AUC OPTIMIZATION FOR DEEP LEARNING BASED VOICE ACTIVITY DETECTION

Authors
Fan, Zi-Chen [1]
Bai, Zhongxin
Zhang, Xiao-Lei
Rahardja, Susanto
Chen, Jingdong
Affiliations
[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
AUC; deep neural networks; voice activity detection;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Voice activity detection (VAD) based on deep neural networks (DNNs) has demonstrated good performance in adverse acoustic environments. Current DNN-based VAD optimizes a surrogate function, e.g. minimum cross-entropy or minimum squared error, at a given decision threshold. However, VAD usually works on the fly with a dynamic decision threshold, and the ROC curve is a global evaluation metric that reflects the performance of VAD at all possible decision thresholds. In this paper, we propose to optimize the area under the ROC curve (AUC) with a DNN, which can maximize the performance of VAD in terms of the ROC curve. Experimental results show that optimizing AUC with a DNN yields higher performance than the common method of minimizing the squared error with a DNN.
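To make the threshold-free objective concrete, the following is a minimal sketch rather than the authors' method. It assumes frame-level scores and binary speech/non-speech labels, computes the empirical AUC as the fraction of correctly ranked (speech, non-speech) frame pairs, and pairs it with a squared-hinge pairwise loss as one possible differentiable surrogate. The abstract does not say which surrogate the paper uses, so the function names, the `margin` parameter, and the choice of hinge are all illustrative assumptions.

```python
import numpy as np

def empirical_auc(scores, labels):
    """AUC as the Wilcoxon-Mann-Whitney statistic: the fraction of
    (speech, non-speech) frame pairs ranked correctly; ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every pairwise score gap
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def pairwise_auc_loss(scores, labels, margin=1.0):
    """Hypothetical surrogate (an assumption, not taken from the paper):
    squared hinge on each pairwise gap, zero once every speech frame
    outscores every non-speech frame by at least `margin`."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))

# Toy example: three speech frames (label 1), three non-speech frames (label 0).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(empirical_auc(scores, labels))      # 8 of 9 pairs are ranked correctly
print(pairwise_auc_loss(scores, labels))
```

Neither quantity involves a decision threshold: both depend only on how speech frames are ranked relative to non-speech frames, which is why an objective of this kind targets the whole ROC curve rather than a single operating point.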
Pages: 6760 - 6764
Page count: 5
Related papers
50 in total
  • [41] Overview of Voice Conversion Methods Based on Deep Learning
    Walczyna, Tomasz
    Piotrowski, Zbigniew
    APPLIED SCIENCES-BASEL, 2023, 13 (05):
  • [42] Monophonic Singing Voice Separation Based on Deep Learning
    Wang, Yutian
    Zhang, Zhao
    Wang, Zheng
    Cai, JuanJuan
    Wang, Hui
    2019 2ND IEEE CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2019), 2019, : 491 - 495
  • [43] Voice Recognition Based on Adaptive MFCC and Deep Learning
    Bae, Hyan-Soo
    Lee, Ho-Jin
    Lee, Suk-Gyu
    PROCEEDINGS OF THE 2016 IEEE 11TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2016, : 1542 - 1546
  • [44] Deep learning approaches for video-based anomalous activity detection
    Karishma Pawar
    Vahida Attar
    World Wide Web, 2019, 22 : 571 - 601
  • [45] Deep learning approaches for video-based anomalous activity detection
    Pawar, Karishma
    Attar, Vahida
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02): : 571 - 601
  • [46] Voice Activity Detection Method Based on MFPH
    Wu, Xin-Zhong
    Xia, Ling-Xiang
    Zhang, Xu
    Zhou, Cheng
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2019, 42 (02): : 83 - 89
  • [47] Spectrum Energy Based Voice Activity Detection
    Pang, Jing
    2017 IEEE 7TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE IEEE CCWC-2017, 2017,
  • [48] Voice activity detection based on facial movement
    Joosten, Bart
    Postma, Eric
    Krahmer, Emiel
    JOURNAL ON MULTIMODAL USER INTERFACES, 2015, 9 (03) : 183 - 193
  • [49] A Voice Activity Detection System Based on FPGA
    Jung, Junhee
    Jin, Seunghun
    Kim, Dongkyun
    Kim, Hyung Soon
    Choi, Jong Suk
    Jeon, Jae Wook
    INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2010), 2010, : 2304 - 2308
  • [50] Voice activity detection based on minimum statistics
    Kara, F
    Islam, T
    Palaz, H
    PROCEEDINGS OF THE IEEE 12TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, 2004, : 556 - 559