A Low Complexity Long Short-Term Memory Based Voice Activity Detection

Cited by: 0
Authors
Yang, Ruiting [1]
Liu, Jie [1]
Deng, Xiang [1]
Zheng, Zhuochao [1]
Affiliations
[1] Harman Int, Shenzhen, Peoples R China
Keywords
Voice activity detection; long short-term memory; Gammatone cepstral coefficients; spectral features
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Voice Activity Detection (VAD) plays an important role in audio processing, but it remains a common challenge when a voice signal is corrupted by strong and transient noise. In this paper, an accurate and causal VAD module using a long short-term memory (LSTM) deep neural network is proposed. A set of features including Gammatone cepstral coefficients (GTCC) and selected spectral features is used. Its low-complexity structure allows it to be easily implemented in speech processing algorithms and applications. After careful pre-processing and labeling of the collected training data into speech and non-speech classes, and training of the LSTM network, experiments show that the proposed VAD effectively distinguishes speech from different types of noisy backgrounds. Its robustness to changes, including varying frame length, moving speech sources, and speech in different languages, is further investigated.
Pages: 6