Speech/non-speech segmentation based on phoneme recognition features

Cited by: 8
Authors
Zibert, Janez [1]
Pavesic, Nikola [1]
Mihelic, France [1]
Affiliations
[1] Univ Ljubljana, Fac Elect Engn, Ljubljana 1000, Slovenia
DOI
10.1155/ASP/2006/90495
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
This work assesses different approaches for speech and non-speech segmentation of audio data and proposes a new, high-level representation of audio signals based on phoneme recognition features suitable for speech/non-speech discrimination tasks. Unlike previous model-based approaches, where speech and non-speech classes were usually modeled by several models, we develop a representation where just one model per class is used in the segmentation process. For this purpose, four measures based on consonant-vowel pairs obtained from different phoneme speech recognizers are introduced and applied in two different segmentation-classification frameworks. The segmentation systems were evaluated on different broadcast news databases. The evaluation results indicate that the proposed phoneme recognition features are better than the standard mel-frequency cepstral coefficients and posterior probability-based features (entropy and dynamism). The proposed features proved to be more robust and less sensitive to different training and unforeseen conditions. Additional experiments with fusion models based on cepstral and the proposed phoneme recognition features produced the highest scores overall, which indicates that the most suitable method for speech/non-speech segmentation is a combination of low-level acoustic features and high-level recognition features.
Pages: 13
Related papers
50 in total
  • [1] Speech/Non-Speech Segmentation Based on Phoneme Recognition Features
    Janez Žibert
    Nikola Pavešić
    France Mihelič
    [J]. EURASIP Journal on Advances in Signal Processing, 2006
  • [2] Speech/Non-Speech Segments Detection Based On Chaotic and Prosodic Features
    Shafiee, Soheil
    Almasganj, Farshad
    Jafari, Ayyoob
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 111 - 114
  • [3] Speech Based Features Applied to the Detection of Non-speech Audio Events
    Vozarikova, Eva
    Cizmar, Anton
    [J]. 12TH INTERNATIONAL CONFERENCE ON RESEARCH IN TELECOMMUNICATION TECHNOLOGIES (RTT 2010), 2010, : 125 - 128
  • [4] Robust speech detection based on phoneme recognition features
    Mihelic, France
    Zibert, Janez
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2006, 4188 : 455 - 462
  • [5] Pattern recognition of non-speech audio
    Aucouturier, Jean-Julien
    Daudet, Laurent
    [J]. PATTERN RECOGNITION LETTERS, 2010, 31 (12) : 1487 - 1488
  • [6] Language Model Based Non-speech Recognition Method
    Zhang, Qinglin
    Chen, Jianfeng
    Bai, Jisheng
    [J]. CONFERENCE PROCEEDINGS OF 2019 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2019), 2019,
  • [7] Speaker Non-speech Event Recognition with Standard Speech Datasets
    Rajnoha, J.
    [J]. ACTA POLYTECHNICA, 2007, 47 (4-5) : 107 - 111
  • [8] Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms
    Maganti, Hari Krishna
    Motlicek, Petr
    Gatica-Perez, Daniel
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1037 - +
  • [9] Call Analysis with Classification Using Speech and Non-Speech Features
    Ju, Yun-Cheng
    Wang, Ye-Yi
    Acero, Alex
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1902 - 1905
  • [10] Phoneme segmentation of speech
    Ziolko, Bartosz
    Manandhar, Suresh
    Wilson, Richard C.
    [J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS, 2006, : 282 - +