Using Spectral Fluctuation of Speech in multi-feature HMM-based voice activity detection

Authors
Espi, Miquel [1]
Miyabe, Shigeki [1]
Nishimoto, Takuya [1]
Ono, Nobutaka [1]
Sagayama, Shigeki [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138654, Japan
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Observation of the speech spectrum shows that speech has a characteristic spectral fluctuation pattern along both time and frequency. In this paper, we exploit this property in a multi-feature approach to voice activity detection (VAD). The effect of separating this speech-specific spectral fluctuation using multi-stage HPSS (Harmonic-Percussive Sound Separation) has been analyzed relative to conventional features in voice activity detection, where it reduces frame-wise detection error by up to 78%, depending on the SNR conditions and noise type. The multi-feature approach has been tested using Hidden Markov Models to model the feature stream as a sequence, and it outperformed standard and similar VAD proposals in utterance-based tests intended for automatic speech recognition.
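The abstract does not give the HMM topology, features, or parameters, so the following is only a minimal sketch of the general idea of modeling frame-wise evidence as a sequence. It assumes a two-state (non-speech/speech) HMM decoded with the Viterbi algorithm. The function name `viterbi_vad`, the per-frame speech probabilities, and the `p_stay` self-transition value are all hypothetical and are not taken from the paper.

```python
import math


def viterbi_vad(speech_prob, p_stay=0.9):
    """Label each frame as non-speech (0) or speech (1) with a two-state HMM.

    speech_prob: per-frame probability of speech, assumed to come from some
        frame-wise classifier (hypothetical input, not the paper's features).
    p_stay: assumed probability of staying in the same state between frames.
    """
    if not speech_prob:
        return []
    eps = 1e-12
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def emit(p):
        # Log-likelihood of the frame under (non-speech, speech).
        p = min(max(p, eps), 1.0 - eps)
        return (math.log(1.0 - p), math.log(p))

    # Uniform prior over the two states for the first frame.
    score = [math.log(0.5) + e for e in emit(speech_prob[0])]
    backptr = []
    for p in speech_prob[1:]:
        e = emit(p)
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                new_score.append(stay + e[s])
                ptr.append(s)
            else:
                new_score.append(switch + e[s])
                ptr.append(1 - s)
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # One low-confidence frame inside a speech burst.
    print(viterbi_vad([0.1, 0.1, 0.9, 0.2, 0.9, 0.9, 0.1, 0.1]))
```

Compared with thresholding each frame independently, the transition penalty discourages isolated label flips, which is one reason sequence models are commonly used for utterance-level segmentation.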
Pages: 2624-2627
Number of pages: 4