Noise robust voice activity detection using joint phase and magnitude based feature enhancement

Authors
Khomdet Phapatanaburi
Longbiao Wang
Zeyan Oo
Weifeng Li
Seiichi Nakagawa
Masahiro Iwahashi
Affiliations
[1] Nagaoka University of Technology, Tianjin Key Laboratory of Cognitive Computing and Application
[2] School of Computer Science and Technology, Graduate School at Shenzhen
[3] Tianjin University
[4] Tsinghua University
[5] Toyohashi University of Technology
Keywords
Deep neural network (DNN); Phase information; Noise-robust VAD; Feature enhancement
DOI
Not available
Abstract
Recently, deep neural network (DNN)-based feature enhancement has been proposed for many speech applications, and DNN-enhanced features have achieved higher performance than raw features. However, most conventional DNN training discards phase information. In this paper, we propose two methods for noise-robust voice activity detection (VAD): a DNN-based joint phase- and magnitude-based feature (JPMF) enhancement (JPMF with DNN), and a noise-aware training (NAT)-DNN-based JPMF enhancement (JPMF with NAT-DNN). To further improve the performance of the proposed feature enhancement, the scores of the proposed phase- and magnitude-based features are also combined. Specifically, mel-frequency cepstral coefficients (MFCCs) are used as the magnitude feature and the mel-frequency delta phase (MFDP) as the phase feature. The experimental results show that the proposed feature enhancement significantly outperforms conventional magnitude-based feature enhancement. The proposed JPMF with NAT-DNN method achieves the best relative equal error rate (EER) when compared with individual magnitude- and phase-based DNN speech enhancement. Moreover, the combined score of the enhanced MFCC and MFDP using JPMF with NAT-DNN further improves VAD performance.
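The abstract mentions two steps that are easy to illustrate: combining the scores of the magnitude- and phase-based features, and evaluating VAD with the equal error rate. The following is only a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say how the scores are combined, so a weighted linear fusion is assumed here; the function names and the weight `alpha` are hypothetical. The EER is approximated by sweeping the observed scores as thresholds.

```python
import numpy as np


def combine_scores(mag_scores, phase_scores, alpha=0.5):
    """Fuse two per-frame score streams with a weighted sum.

    ASSUMPTION: linear fusion and the weight `alpha` are illustrative;
    the abstract does not specify the combination rule.
    """
    mag = np.asarray(mag_scores, dtype=float)
    phase = np.asarray(phase_scores, dtype=float)
    return alpha * mag + (1.0 - alpha) * phase


def equal_error_rate(scores, labels):
    """Approximate the EER of per-frame speech scores.

    `labels` is 1 for speech frames and 0 for non-speech frames and must
    contain both classes. Each observed score is tried as a threshold;
    the result is the mean of the false-acceptance and false-rejection
    rates at the threshold where the two are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        decisions = scores >= threshold
        far = np.mean(decisions[~labels])  # non-speech accepted as speech
        frr = np.mean(~decisions[labels])  # speech rejected as non-speech
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Toy scores for four frames (two speech, two non-speech).
    labels = [1, 1, 0, 0]
    mfcc_scores = [0.9, 0.3, 0.6, 0.1]  # hypothetical magnitude-based scores
    mfdp_scores = [0.7, 0.8, 0.2, 0.3]  # hypothetical phase-based scores
    fused = combine_scores(mfcc_scores, mfdp_scores, alpha=0.5)
    print("EER (MFCC only):", equal_error_rate(mfcc_scores, labels))
    print("EER (fused):    ", equal_error_rate(fused, labels))
```

In practice the fusion weight would be tuned on held-out data, and a finer threshold sweep or interpolation would give a smoother EER estimate than this discrete approximation.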
Pages: 845 - 859
Number of pages: 14
Related papers
50 in total
  • [1] Noise robust voice activity detection using joint phase and magnitude based feature enhancement
    Phapatanaburi, Khomdet
    Wang, Longbiao
    Oo, Zeyan
    Li, Weifeng
    Nakagawa, Seiichi
    Iwahashi, Masahiro
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2017, 8 (06) : 845 - 859
  • [2] Robust Voice Activity Detection Using Feature Combination
    Haghani, Sahar Khaksar
    Ahadi, Seyed Mohammad
    2013 21ST IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2013,
  • [3] Robust voice activity detection based on noise eigenspace
    Ying, Dongwen
    Shi, Yu
    Lu, Xugang
    Dang, Jianwu
    Soong, Frank
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2007, 28 (06) : 413 - 423
  • [4] On Noise Robust Voice Activity Detection
    Dekens, Tomas
    Verhelst, Werner
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2660 - 2663
  • [5] A robust voice activity detection based on noise eigenspace projection
    Ying, Dongwen
    Shi, Yu
    Soong, Frank
    Dang, Jianwu
    Lu, Xugang
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 76 - +
  • [6] Noise robust model-based Voice Activity Detection
    de la Torre, Angel
    Ramirez, Javier
    Benitez, Carmen
    Segura, Jose C.
    Garcia, Luz
    Rubio, Antonio J.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1954 - 1957
  • [7] Robust Voice Activity Detection Feature Design Based on Spectral Kurtosis
    Zhang Shuyin
    Guo Ying
    Zhang Qun
    PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON EDUCATION TECHNOLOGY AND COMPUTER SCIENCE, VOL III, 2009, : 269 - 272
  • [8] Robust Voice Activity Detection Based on Complementary BLSTM Enhancement Stage
    Shahryary, Iman
    Seyedin, Sanaz
    Ahadi, Seyed Mohammad
    2020 28TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2020, : 1608 - 1612
  • [9] Noise Robust Voice Activity Detection Based on Switching Kalman Filter
    Fujimoto, Masakiyo
    Ishizuka, Kentaro
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 965 - 968
  • [10] Noise robust voice activity detection based on switching Kalman filter
    Fujimoto, Masakiyo
    Ishizuka, Kentaro
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (03) : 467 - 477