Speech and Phoneme Segmentation Under Noisy Environment Through Spectrogram Image Analysis

Cited by: 0
Authors
Costa, D. C. [1 ]
Lopes, G. A. M. [1 ]
Mello, C. A. B. [1 ]
Viana, H. O. [1 ]
Affiliations
[1] Univ Fed Pernambuco, Ctr Informat, Recife, PE, Brazil
Keywords
speech segmentation; phoneme segmentation; spectrogram; image analysis; FEATURES; MODELS;
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a new algorithm for speech segmentation based on image analysis of the spectrogram of the signal. The algorithm works in two passes: the first segments the sound in search of the speech signal, and the segmented speech is then fed back into the algorithm for phoneme segmentation. For evaluation, the algorithm was applied to TIMIT speech signals and correctly segmented the speech in every tested signal, including signals under real-world noise.
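The abstract does not give the details of the image analysis, so the following is only a minimal sketch of the two-pass idea, not the authors' method. It assumes a Hann-windowed STFT as the spectrogram, a relative energy threshold for the first pass, and a spectral-shape change (flux) threshold for the second pass. All function names, parameters, and threshold values are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (freq_bins, n_frames).
    Assumes len(signal) >= frame_len."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def speech_regions(spec, rel_threshold=0.1):
    """First pass (assumed criterion): runs of frames whose summed
    magnitude exceeds a fraction of the loudest frame.
    Returns (start, end) frame pairs, end exclusive."""
    energy = spec.sum(axis=0)
    mask = energy > rel_threshold * energy.max()
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]

def candidate_boundaries(spec, start, end, flux_threshold=0.5):
    """Second pass (assumed criterion): frames inside a region where the
    normalized spectral shape changes sharply from the previous frame."""
    region = spec[:, start:end]
    # Normalize each column so the measure reflects shape, not loudness.
    shape = region / (region.sum(axis=0, keepdims=True) + 1e-12)
    flux = np.abs(np.diff(shape, axis=1)).sum(axis=0)  # values in [0, 2]
    return [start + int(j) + 1 for j in np.flatnonzero(flux > flux_threshold)]

def segment(signal):
    """Run both passes; returns (start, end, boundaries) per region."""
    spec = spectrogram(signal)
    return [(s, e, candidate_boundaries(spec, s, e))
            for s, e in speech_regions(spec)]
```

Calling `segment` on a one-dimensional NumPy array of samples returns frame indices; multiplying by `hop` and dividing by the sampling rate converts them to seconds. Fixed thresholds like these would likely need adapting for real noisy recordings, which is the case the paper targets.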
Pages: 1017-1022
Number of pages: 6