Voice Activity Detection for Children's Read Speech Recognition in Noisy Conditions

Authors
Pasad, Ankita [1]
Sabu, Kamini [1]
Rao, Preeti [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Bombay, Maharashtra, India
Keywords
TERM SIGNAL VARIABILITY; ROBUST; FEATURES;
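DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809
Abstract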
Recordings of read-aloud stories by children in a school setting can be used to provide an assessment of reading skills via automatic speech recognition (ASR). ASR, however, is known to be highly susceptible to background noise. The unusual variety of foreground (breath release, mic pops, etc.) and background (children playing, distinct background talker, wind, etc.) non-speech sounds makes this application particularly challenging. Motivated by the observation on real-world data that close to 50% of the recorded audio comprises purely non-speech activity, we investigate robust approaches to voice activity detection to eliminate non-speech segments to the extent possible prior to ASR. We have exploited energy-based and harmonicity-based features coupled with suitable temporal smoothing constraints in a two-pass noise preprocessing system. A discussion of the voice activity detection performance of the system is presented with reference to the characteristics of the noise types.
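As a rough illustration of the kind of processing the abstract describes, the following minimal sketch shows a single-pass, energy-only detector followed by temporal smoothing. It is not the authors' system: the function names, the 10 dB margin, the use of the quietest frame as the noise floor, and the three-frame minimum run length are all assumptions made for this example, and the harmonicity features and second pass mentioned in the abstract are not covered.

```python
import math


def frame_energies_db(samples, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(energies_db, margin_db=10.0):
    """Mark a frame as speech if it exceeds the quietest frame by margin_db.

    Using the minimum frame energy as the noise floor is an assumption
    made for this sketch.
    """
    floor = min(energies_db)
    return [e > floor + margin_db for e in energies_db]


def flip_short_runs(decisions, value, min_run):
    """Invert every run of `value` that is shorter than min_run frames."""
    out = list(decisions)
    i, n = 0, len(out)
    while i < n:
        if out[i] == value:
            j = i
            while j < n and out[j] == value:
                j += 1
            if j - i < min_run:
                for k in range(i, j):
                    out[k] = not value
            i = j
        else:
            i += 1
    return out


def smooth(decisions, min_run=3):
    """Temporal smoothing: fill short gaps, then drop short bursts.

    Short runs at the very start or end of the recording are flipped too.
    """
    filled = flip_short_runs(decisions, False, min_run)
    return flip_short_runs(filled, True, min_run)
```

In this sketch an isolated one-frame burst (for example a mic pop) is discarded by the smoothing step, while a one-frame dip inside a longer speech region is bridged. A fixed energy margin is unlikely to separate speech from loud background talkers, which is presumably why the abstract pairs energy with harmonicity-based features.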
Pages: 6