A Fast Method for High-Resolution Voiced/Unvoiced Detection and Glottal Closure/Opening Instant Estimation of Speech

Cited by: 30
Authors
Koutrouvelis, Andreas I. [1]
Kafentzis, George P. [2]
Gaubitch, Nikolay D. [3]
Heusdens, Richard [4]
Affiliations
[1] Delft Univ Technol, Microelect Dept, NL-2628 CD Delft, Netherlands
[2] Univ Crete, Dept Comp Sci, Iraklion 73000, Greece
[3] Delft Univ Technol, Dept Comp Sci, NL-2628 CD Delft, Netherlands
[4] Delft Univ Technol, Fac Elect Engn Math & Comp Sci, NL-2628 CD Delft, Netherlands
Keywords
Glottal closure instants (GCIs); glottal opening instants (GOIs); pitch estimation; speech analysis; voiced/unvoiced detection (VUD); LINEAR PREDICTION; EPOCH EXTRACTION; WAVE-FORM; CLOSURE INSTANTS; CLASSIFICATION; EXCITATION; RECOGNITION; ALGORITHM; QUALITY;
DOI
10.1109/TASLP.2015.2506263
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
We propose a fast speech analysis method which simultaneously performs high-resolution voiced/unvoiced detection (VUD) and accurate estimation of glottal closure and glottal opening instants (GCIs and GOIs, respectively). The proposed algorithm exploits the structure of the glottal flow derivative in order to estimate GCIs and GOIs only in voiced speech using simple time-domain criteria. We compare our method with well-known GCI/GOI methods, namely, the dynamic programming projected phase-slope algorithm (DYPSA), the yet another GCI/GOI algorithm (YAGA) and the speech event detection using the residual excitation and a mean-based signal (SEDREAMS). Furthermore, we examine the performance of the aforementioned methods when combined with state-of-the-art VUD algorithms, namely, the robust algorithm for pitch tracking (RAPT) and the summation of residual harmonics (SRH). Experiments conducted on the APLAWD and SAM databases show that the proposed algorithm outperforms the state-of-the-art combinations of VUD and GCI/GOI algorithms with respect to almost all evaluation criteria for clean speech. Experiments on speech contaminated with several noise types (white Gaussian, babble, and car-interior) are also presented and discussed. The proposed algorithm outperforms the state-of-the-art combinations in most evaluation criteria for signal-to-noise ratio greater than 10 dB.
Pages: 316-328
Page count: 13