ROBUST FEATURE CLUSTERING FOR UNSUPERVISED SPEECH ACTIVITY DETECTION

Authors
Dubey, Harishchandra [1]
Sangwan, Abhijeet [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, Robust Speech Technol Lab, Ctr Robust Speech Syst, Richardson, TX 75080 USA
Keywords
Clustering; Hartigan dip test; NIST OpenSAD; NIST OpenSAT; speech activity detection; zero-resource speech processing; unsupervised learning; SYSTEM
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In certain applications, such as zero-resource speech processing or very-low-resource speech-language systems, it might not be feasible to collect speech activity detection (SAD) annotations. However, state-of-the-art supervised SAD techniques based on neural networks or other machine learning methods require annotated training data matched to the target domain. This paper establishes a clustering approach for fully unsupervised SAD, useful for cases where SAD annotations are not available. The proposed approach leverages the Hartigan dip test in a recursive strategy for segmenting the feature space into prominent modes. The statistical dip is invariant to distortions, which lends robustness to the proposed method. We evaluate the method on NIST OpenSAD 2015 and NIST OpenSAT 2017 public safety communications data. The results show that the proposed approach outperforms the two-component GMM baseline.
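For orientation, the two-component GMM baseline named in the abstract could look roughly like the sketch below. It is not taken from the paper and does not implement the proposed dip-test clustering. It assumes one scalar feature per frame (for example, log energy) and assumes that the mixture component with the higher mean corresponds to speech. The function names `fit_two_component_gmm` and `label_frames` are hypothetical.

```python
import math
import random


def _log_gauss(v, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (v - mean) ** 2 / var)


def _responsibilities(v, weights, means, variances):
    """Posterior probability of each component for one value (log-sum-exp)."""
    logs = [math.log(w) + _log_gauss(v, m, s)
            for w, m, s in zip(weights, means, variances)]
    top = max(logs)
    exps = [math.exp(x - top) for x in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_two_component_gmm(values, n_iter=50):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    n = len(values)
    lo, hi = min(values), max(values)
    grand_mean = sum(values) / n
    grand_var = max(sum((v - grand_mean) ** 2 for v in values) / n, 1e-6)
    weights = [0.5, 0.5]
    # Start the two means at the quarter points of the observed range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    variances = [grand_var, grand_var]
    for _ in range(n_iter):
        # E-step: soft assignment of every frame to both components.
        resp = [_responsibilities(v, weights, means, variances) for v in values]
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = max(nk / n, 1e-12)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk,
                1e-6,
            )
    return weights, means, variances


def label_frames(values, n_iter=50):
    """Return True for frames assigned to the higher-mean component.

    Assumption (not from the paper): the higher-mean component is speech.
    """
    weights, means, variances = fit_two_component_gmm(values, n_iter)
    speech = 0 if means[0] > means[1] else 1
    return [_responsibilities(v, weights, means, variances)[speech] > 0.5
            for v in values]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-in for per-frame features: low values mimic non-speech.
    frames = ([random.gauss(-5.0, 1.0) for _ in range(300)]
              + [random.gauss(5.0, 1.0) for _ in range(200)])
    labels = label_frames(frames)
    print("frames labelled as speech:", sum(labels), "of", len(labels))
```

A baseline of this kind fixes the number of clusters at two in advance, whereas the abstract describes recursively splitting the feature space into however many prominent modes the dip test supports.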
Pages: 2726-2730
Page count: 5