Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

Cited by: 0

Authors:

Sarkar, Eklavya [1,2]

Prasad, RaviShankar [1]

Doss, Mathew Magimai [1]

Affiliations:

[1] Idiap Research Institute, Martigny, Switzerland

[2] Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

Source:

INTERSPEECH 2022 | 2022

Funding:

Swiss National Science Foundation

Keywords:

Voice activity detection; zero-frequency filtering; speech analysis; signal processing; noise

DOI:

10.21437/Interspeech.2022-10535

Chinese Library Classification (CLC):

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Voice activity detection (VAD) is an important pre-processing step for speech technology applications. The task consists of deriving segment boundaries of audio signals which contain voicing information. In recent years, it has been shown that voice source and vocal tract system information can be extracted using zero-frequency filtering (ZFF) without making any explicit model assumptions about the speech signal. This paper investigates the potential of zero-frequency filtering for jointly modeling voice source and vocal tract system information, and proposes two approaches for VAD. The first approach demarcates voiced regions using a composite signal composed of different zero-frequency filtered signals. The second approach feeds the composite signal as input to the rVAD algorithm. These approaches are compared with other supervised and unsupervised VAD methods in the literature, and are evaluated on the Aurora2 database across a range of SNRs (20 to -5 dB). Our studies show that the proposed ZFF-based methods perform comparably to state-of-the-art VAD methods and are more invariant to added degradation and different channel characteristics.

Pages: 4626-4630

Page count: 5
