A Voice Activity Detection Algorithm Using Sparse Non-negative Matrix Factorization-based Model Learning in Spectro-Temporal Domain

Cited by: 0
Authors
Mavaddati, S. [1]
Affiliation
[1] Univ Mazandaran, Fac Engn & Technol, Babolsar, Iran
Source
INTERNATIONAL JOURNAL OF ENGINEERING | 2023 / Vol. 36 / No. 08
Keywords
Voice Activity Detector; Spectro-temporal Domain; Spectro-temporal Sparse Structured Principal Component Analysis; Sparse Non-negative Matrix Factorization; RECOGNITION; NOISE
DOI
10.5829/ije.2023.36.08b.08
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Voice activity detectors separate the silence and speech segments of a speech signal so that different background noise signals can be eliminated. This paper proposes a novel voice activity detector that uses spectro-temporal features extracted from the auditory model of the speech signal. After the scale, rate, and frequency features are extracted from this feature space, a sparse structured principal component analysis algorithm is used to retain the basic components of these features and reduce the dimension of the learning data. These feature vectors are then used to learn the models with the sparse non-negative matrix factorization algorithm. The model learning procedure represents each feature vector with a suitable sparsity rate based on the selected atoms. Voice activity detection is performed by computing, for each input frame, the energy of its sparse representation over the composite model. If the calculated energy exceeds a specified threshold, the input frame has a structure similar to the atoms of the learned models, and the frame is concluded to have voice content. The proposed detector was compared with baseline methods and classifiers in this processing field. The results, investigated in the presence of stationary, non-stationary, and periodic noises, show that the proposed method based on model learning with spectro-temporal features can correctly detect silence/speech activity.
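The decision rule described in the abstract (sparse-code each frame over a learned non-negative model, then threshold the energy of the code) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `sparse_codes` and `detect_voice`, the L1-penalized multiplicative update, the use of the squared L2 norm of the coefficient vector as "energy", the threshold value, and the random dictionary `W` standing in for the learned composite model. The spectro-temporal feature extraction and the sparse structured PCA step are omitted.

```python
import numpy as np


def sparse_codes(V, W, sparsity=0.1, n_iter=200, eps=1e-9):
    """Non-negative sparse codes H such that V ~= W @ H, with W held fixed.

    Uses the standard multiplicative update for the objective
    0.5 * ||V - W H||_F^2 + sparsity * sum(H)  (an assumed formulation).
    """
    H = np.ones((W.shape[1], V.shape[1]))
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H


def detect_voice(V, W, threshold, **kwargs):
    """Flag each frame (column of V) whose sparse-code energy exceeds threshold."""
    H = sparse_codes(V, W, **kwargs)
    energy = np.sum(H ** 2, axis=0)  # assumed definition of "energy"
    return energy > threshold, energy


# Toy data only: a random unit-norm dictionary stands in for the learned model.
rng = np.random.default_rng(0)
W = rng.random((20, 8))
W /= np.linalg.norm(W, axis=0)

voiced = W @ (1.0 + 4.0 * rng.random((8, 5)))  # frames built from the atoms
silent = 0.01 * rng.random((20, 4))            # low-level residual frames
frames = np.hstack([voiced, silent])

decisions, energy = detect_voice(frames, W, threshold=0.05)
print(decisions)
```

In this toy setup the frames synthesized from the dictionary atoms keep sizeable coefficients, while the low-level frames are driven to near-zero codes by the sparsity penalty, so a single threshold separates them. With real data the threshold and sparsity weight would need to be tuned.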
Pages: 1478-1488
Page count: 11
Related papers
50 in total
  • [31] Non-negative wavelet matrix factorization-based bearing fault intelligent classification method
    Dong, Zhilin
    Zhao, Dezun
    Cui, Lingli
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2023, 34 (11)
  • [32] An improved non-negative matrix factorization algorithm based on genetic algorithm
    Zhou, Sheng
    Yu, Zhi
    Wang, Can
    PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ELECTRONIC TECHNOLOGY, 2015, 6 : 395 - 398
  • [33] Mining node attributes for link prediction with a non-negative matrix factorization-based approach
    Zhao, Zhili
    Hu, Ahui
    Zhang, Nana
    Xie, Jiquan
    Du, Zihao
    Wan, Li
    Yan, Ruiyi
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [34] EXPLICIT BEAT STRUCTURE MODELING FOR NON-NEGATIVE MATRIX FACTORIZATION-BASED MULTIPITCH ANALYSIS
    Ochiai, Kazuki
    Kameoka, Hirokazu
    Sagayama, Shigeki
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 133 - 136
  • [35] Improvement of non-negative matrix factorization-based reverberation suppression for bistatic active sonar
    Lee, Seokjin
    Lee, Yongon
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2022, 41 (04): : 468 - 479
  • [36] Group Sparse Non-negative Matrix Factorization for Multi-Manifold Learning
    Liu, Xiangyang
    Lu, Hongtao
    Gu, Hua
    PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2011, 2011,
  • [37] Learning quantifiable associations via principal sparse non-negative matrix factorization
    Hu, Chenyong
    Zhang, Benyu
    Wang, Yongji
    Yan, Shuicheng
    Chen, Zheng
    Wang, Qing
    Yang, Qiang
    INTELLIGENT DATA ANALYSIS, 2005, 9 (06) : 603 - 620
  • [38] Learning sparse representations by non-negative matrix factorization and sequential cone programming
    Heiler, Matthias
    Schnoerr, Christoph
    JOURNAL OF MACHINE LEARNING RESEARCH, 2006, 7 : 1385 - 1407
  • [39] ACTIVITY-MAPPING NON-NEGATIVE MATRIX FACTORIZATION FOR EXEMPLAR-BASED VOICE CONVERSION
    Aihara, Ryo
    Takiguchi, Tetsuya
    Ariki, Yasuo
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4899 - 4903
  • [40] Voice Activity Detection Via Noise Reducing Using Non-Negative Sparse Coding
    Teng, Peng
    Jia, Yunde
    IEEE SIGNAL PROCESSING LETTERS, 2013, 20 (05) : 475 - 478