Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

Authors
Sarkar, Eklavya [1, 2]
Prasad, RaviShankar [1 ]
Doss, Mathew Magimai [1 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Source
INTERSPEECH 2022 | 2022
Funding
Swiss National Science Foundation;
Keywords
Voice activity detection; zero-frequency filtering; speech analysis; signal processing; NOISE;
DOI
10.21437/Interspeech.2022-10535
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Voice activity detection (VAD) is an important pre-processing step for speech technology applications. The task consists of deriving segment boundaries of audio signals which contain voicing information. In recent years, it has been shown that voice source and vocal tract system information can be extracted using zero-frequency filtering (ZFF) without making any explicit model assumptions about the speech signal. This paper investigates the potential of zero-frequency filtering for jointly modeling voice source and vocal tract system information, and proposes two approaches for VAD. The first approach demarcates voiced regions using a composite signal composed of different zero-frequency filtered signals. The second approach feeds the composite signal as input to the rVAD algorithm. These approaches are compared with other supervised and unsupervised VAD methods in the literature, and are evaluated on the Aurora2 database, across a range of SNRs (20 to -5 dB). Our studies show that the proposed ZFF-based methods perform comparably to state-of-the-art VAD methods and are more invariant to added degradation and different channel characteristics.
Pages: 4626 - 4630
Number of pages: 5
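
For readers unfamiliar with the zero-frequency filtering named in the abstract, the following is a minimal sketch of the commonly described formulation: difference the signal, pass it through resonators with poles at zero frequency, then subtract a local mean to remove the resulting trend. It is not the authors' composite-signal method or their code. The function names, the 15 ms window, and the three trend-removal passes are assumptions chosen for illustration.

```python
from itertools import accumulate


def remove_local_mean(x, half_window):
    """Subtract the mean over a centered window of 2*half_window + 1 samples.

    Windows are truncated at the signal edges, so samples close to either
    end of the result are not reliable.
    """
    prefix = [0.0] + list(accumulate(x))  # prefix[i] = sum of x[:i]
    n = len(x)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out.append(x[i] - (prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def zero_frequency_filter(signal, sample_rate, window_ms=15.0, passes=3):
    """Return a zero-frequency filtered copy of `signal` (a list of floats).

    `window_ms` and `passes` are illustrative defaults (assumptions), not
    values taken from the paper.
    """
    if not signal:
        return []
    # 1. First-order difference removes any constant offset.
    diffed = [0.0] + [b - a for a, b in zip(signal, signal[1:])]
    # 2. Two cascaded zero-frequency resonators put four poles at z = 1,
    #    which is equivalent to taking a running sum four times.
    y = diffed
    for _ in range(4):
        y = list(accumulate(y))
    # 3. Repeated local-mean subtraction removes the polynomial trend
    #    introduced by the running sums.
    half_window = max(1, int(round(window_ms * 1e-3 * sample_rate / 2)))
    for _ in range(passes):
        y = remove_local_mean(y, half_window)
    return y


def positive_zero_crossings(y):
    """Indices where the signal goes from negative to non-negative."""
    return [i for i in range(1, len(y)) if y[i - 1] < 0 <= y[i]]
```

On a synthetic impulse train, `positive_zero_crossings(zero_frequency_filter(x, fs))` yields roughly one crossing per excitation period away from the signal edges. A window of about one to two pitch periods is the usual choice, so the 15 ms default suits a pitch near 100 Hz and would need adjusting for other voices.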