A voice activity detection algorithm in spectro-temporal domain using sparse representation

Cited by: 0
Authors
Mohadese Eshaghi
Farbod Razzazi
Alireza Behrad
Affiliations
[1] Islamic Azad University, Department of Electrical and Computer Engineering
[2] Shahed University, Electrical and Electronic Engineering Department
Keywords
Speech processing; Voice activity detector; VAD; Spectro-temporal domain representation; Sparse representation
DOI
Not available
Abstract
This paper describes a new algorithm for voice activity detection (VAD) based on sparse representation in the spectro-temporal domain. The audio classification algorithm relies on multi-scale spectro-temporal modulation features extracted using an auditory cortex model. The key concept in sparse representation is that any speech fragment can be represented as a linear combination of a small number of exemplar speech tokens. The algorithm first transforms the speech into the spectro-temporal domain, decomposing it into auditory-based features with multiple scales of temporal and spectral resolution. Next, each frame is divided into several sub-cubes in the new domain. Finally, the algorithm detects speech in the signal using the sparse representation of the sub-cubes of each frame in this domain. Simulation results illustrate the effectiveness of the new VAD algorithm. The achieved performance is 90.11% and 91.75% at −5 dB SNR in white noise and car noise, respectively, outperforming most state-of-the-art VAD algorithms.
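The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of sparse-representation classification, not the authors' method. It assumes each sub-cube has already been flattened into a feature vector, that two dictionaries of exemplar atoms (`D_speech`, `D_noise`) are available, that the sparse code is found by a simple orthogonal matching pursuit, and that a frame is labeled by majority vote over its sub-cubes. All function names, the sparsity level `k`, the voting rule, and the synthetic orthonormal dictionaries are hypothetical choices made for illustration.

```python
import numpy as np


def omp(D, x, k, tol=1e-10):
    """Greedy orthogonal matching pursuit: approximate x with at most k columns of D."""
    residual = x.astype(float).copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break  # x is already well explained
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom helps
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coeffs[:] = 0.0
        coeffs[support] = sol
        residual = x - D @ coeffs
    return coeffs


def classify_subcube(x, D_speech, D_noise, k=3):
    """Return True if the speech atoms reconstruct x better than the noise atoms."""
    D = np.hstack([D_speech, D_noise])
    a = omp(D, x, k)
    n_s = D_speech.shape[1]
    r_speech = np.linalg.norm(x - D_speech @ a[:n_s])
    r_noise = np.linalg.norm(x - D_noise @ a[n_s:])
    return bool(r_speech < r_noise)


def frame_is_speech(subcubes, D_speech, D_noise, k=3):
    """Assumed decision rule: majority vote over the sub-cubes of one frame."""
    votes = [classify_subcube(x, D_speech, D_noise, k) for x in subcubes]
    return sum(votes) > len(votes) / 2


# Synthetic demo data (not from the paper): orthonormal atoms split into two dictionaries.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((64, 40)))
D_speech, D_noise = Q[:, :20], Q[:, 20:]
speech_like = 1.0 * D_speech[:, 3] + 0.8 * D_speech[:, 7]
noise_like = 1.0 * D_noise[:, 2] + 0.8 * D_noise[:, 11]
```

With these synthetic inputs, `classify_subcube(speech_like, D_speech, D_noise)` selects only speech atoms and `classify_subcube(noise_like, D_speech, D_noise)` selects only noise atoms. Real spectro-temporal features and learned dictionaries would be far less cleanly separable than this toy case.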
Pages: 1791-1803
Page count: 12