A voice activity detection algorithm in spectro-temporal domain using sparse representation

Cited by: 0

Authors:

Mohadese Eshaghi

Farbod Razzazi

Alireza Behrad

Affiliations:

[1] Islamic Azad University, Department of Electrical and Computer Engineering

[2] Shahed University, Electrical and Electronic Engineering Department

Source:

International Journal of Machine Learning and Cybernetics | 2019 / Vol. 10

Keywords:

Speech processing; Voice activity detector; VAD; Spectro-temporal domain representation; Sparse representation

DOI:

Not available

Abstract:

This paper describes a new algorithm for voice activity detection (VAD) based on sparse representation in the spectro-temporal domain. The audio classification algorithm uses multi-scale spectro-temporal modulation features extracted with an auditory cortex model. The key concept in sparse representation is that any speech fragment can be represented as a linear combination of a small number of exemplar speech tokens. The algorithm first transforms the speech into the spectro-temporal domain, decomposing it into auditory-based features at multiple scales of temporal and spectral resolution. Next, each frame is divided into several sub-cubes in the new domain. Finally, the algorithm detects speech in the signal using the sparse representation of each frame's sub-cubes in this domain. Simulation results illustrate the effectiveness of the new VAD algorithm: the achieved performance is 90.11% and 91.75% at −5 dB SNR in white noise and car noise, respectively, outperforming most state-of-the-art VAD algorithms.
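The sparse-representation step in the abstract can be illustrated with a small sketch. The abstract does not specify the solver, the dictionaries, or how sub-cube decisions are combined, so everything here is an assumption: the greedy orthogonal matching pursuit solver, the residual-comparison rule, the majority vote over sub-cubes, the random toy dictionaries, and the hypothetical names `omp`, `classify_subcube`, and `frame_is_speech`. Real sub-cubes would come from the auditory-model features, which are not reproduced here.

```python
import numpy as np


def omp(dictionary, signal, n_nonzero):
    """Orthogonal matching pursuit: code `signal` with at most `n_nonzero` atoms."""
    residual = signal.astype(float).copy()
    support, sol = [], None
    for _ in range(n_nonzero):
        # Pick the atom (column) most correlated with the current residual.
        idx = int(np.argmax(np.abs(dictionary.T @ residual)))
        if idx in support:  # no new atom is selected; stop early
            break
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    coeffs = np.zeros(dictionary.shape[1])
    if support:
        coeffs[support] = sol
    return coeffs


def classify_subcube(vec, speech_atoms, noise_atoms, n_nonzero=3):
    """Assumed rule: label a flattened sub-cube as speech if the speech atoms
    of its sparse code reconstruct it better than the noise atoms do."""
    n_speech = speech_atoms.shape[1]
    joint = np.hstack([speech_atoms, noise_atoms])
    coeffs = omp(joint, vec, n_nonzero)
    speech_err = np.linalg.norm(vec - speech_atoms @ coeffs[:n_speech])
    noise_err = np.linalg.norm(vec - noise_atoms @ coeffs[n_speech:])
    return bool(speech_err < noise_err)


def frame_is_speech(subcubes, speech_atoms, noise_atoms):
    """Assumed fusion rule: majority vote over the sub-cubes of one frame."""
    votes = [classify_subcube(v, speech_atoms, noise_atoms) for v in subcubes]
    return sum(votes) > len(votes) / 2


# Toy data only: orthonormal random atoms stand in for learned exemplars.
rng = np.random.default_rng(0)
atoms, _ = np.linalg.qr(rng.standard_normal((64, 40)))
speech_atoms, noise_atoms = atoms[:, :20], atoms[:, 20:]

speech_like = speech_atoms[:, 2] + 0.5 * speech_atoms[:, 7]
noise_like = noise_atoms[:, 3] - 0.4 * noise_atoms[:, 11]
print(classify_subcube(speech_like, speech_atoms, noise_atoms))
print(classify_subcube(noise_like, speech_atoms, noise_atoms))
```

The toy dictionaries are orthonormal so that the greedy solver recovers the generating atoms exactly. With learned, correlated exemplars the sparse code is only approximate, and the sparsity level `n_nonzero` becomes a tuning parameter.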

Pages: 1791-1803

Page count: 12
