Significance of Joint Features Derived from the Modified Group Delay Function in Speech Processing

Cited by: 0
Authors
Rajesh M. Hegde
Hema A. Murthy
V. R. R. Gadde
Affiliations
[1] University of California San Diego,Department of Electrical and Computer Engineering
[2] Indian Institute of Technology Madras,Department of Computer Science and Engineering
[3] SRI International,STAR Lab
Keywords
Acoustics; Speech Recognition; Group Delay; Conventional Group; Resonant Structure
DOI
Not available
Abstract
This paper investigates the significance of combining cepstral features derived from the modified group delay function with features derived from the short-time spectral magnitude, such as the MFCC. The conventional group delay function fails to capture the resonant structure and the dynamic range of the speech spectrum, primarily because pitch periodicity effects introduce spikes. The group delay function is modified to suppress these spikes and to restore the dynamic range of the speech spectrum. The cepstral features derived from the modified group delay function are called the modified group delay feature (MODGDF). The complementarity and robustness of the MODGDF relative to the MFCC are analyzed using spectral reconstruction techniques. The combination of several spectral magnitude-based features with the MODGDF, using feature fusion and likelihood combination, is described. These features are then used for three speech processing tasks, namely syllable, speaker, and language recognition. Results indicate that combining the MODGDF with the MFCC at the feature level gives significant improvements for speech recognition tasks in noise. Combining the MODGDF with the spectral magnitude-based features improves recognition performance significantly, by up to 11%, while combining any two features derived from the spectral magnitude does not give any significant improvement.
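The abstract does not give the formulas, so the following is only a minimal sketch of one common formulation of the modified group delay function, not the authors' exact implementation. It assumes the usual DFT identity for group delay, with the squared magnitude in the denominator replaced by a cepstrally smoothed spectrum raised to the power 2γ and the result compressed with an exponent α. The function names, the parameter values (`alpha`, `gamma`, `lifter`, `n_fft`, `n_ceps`), the DCT-II step, and the concatenation used for feature-level fusion are all illustrative assumptions.

```python
import numpy as np

def modified_group_delay(frame, alpha=0.4, gamma=0.9, lifter=8, n_fft=512):
    """Modified group delay spectrum of one windowed frame (sketch).

    alpha, gamma and lifter are illustrative values, not taken from the abstract.
    """
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)        # spectrum of x[n]
    Y = np.fft.rfft(n * frame, n_fft)    # spectrum of n * x[n]

    # Cepstral smoothing: keep only the low-quefrency part of the real cepstrum.
    log_mag = np.log(np.abs(X) + 1e-10)
    cep = np.fft.irfft(log_mag, n_fft)
    keep = np.zeros(n_fft)
    keep[:lifter] = 1.0
    if lifter > 1:
        keep[-(lifter - 1):] = 1.0       # symmetric counterpart
    smooth = np.exp(np.fft.rfft(cep * keep).real)

    # Group delay numerator over the smoothed spectrum instead of |X|^2,
    # which suppresses the spikes caused by zeros near the unit circle.
    tau = (X.real * Y.real + X.imag * Y.imag) / (smooth ** (2.0 * gamma))

    # Compress the dynamic range while preserving the sign.
    return np.sign(tau) * np.abs(tau) ** alpha

def modgd_cepstra(tau, n_ceps=13):
    """DCT-II of the modified group delay spectrum (assumed cepstral step)."""
    K = len(tau)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), np.arange(K) + 0.5) / K)
    return basis @ tau

def fuse_features(a, b):
    """Feature-level fusion, assumed here to be plain concatenation."""
    return np.concatenate([a, b], axis=-1)
```

A frame would typically be windowed (for example with `np.hamming`) before being passed to `modified_group_delay`. The fused vector would then combine `modgd_cepstra(...)` with a magnitude-based feature such as the MFCC computed elsewhere.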