Significance of Joint Features Derived from the Modified Group Delay Function in Speech Processing

Authors
Rajesh M. Hegde
Hema A. Murthy
V. R. R. Gadde
Affiliations
[1] University of California San Diego, Department of Electrical and Computer Engineering
[2] Indian Institute of Technology Madras, Department of Computer Science and Engineering
[3] SRI International, STAR Lab
Keywords
Acoustics; Speech Recognition; Group Delay; Conventional Group; Resonant Structure
Abstract
This paper investigates the significance of combining cepstral features derived from the modified group delay function with features derived from the short-time spectral magnitude, such as the MFCC. The conventional group delay function fails to capture the resonant structure and the dynamic range of the speech spectrum, primarily because pitch periodicity effects introduce spikes. The group delay function is modified to suppress these spikes and to restore the dynamic range of the speech spectrum. Cepstral features derived from the modified group delay function are called the modified group delay feature (MODGDF). The complementarity and robustness of the MODGDF relative to the MFCC are also analyzed using spectral reconstruction techniques. The combination of several spectral magnitude-based features with the MODGDF, using feature fusion and likelihood combination, is described. These features are then used for three speech processing tasks: syllable, speaker, and language recognition. Results indicate that combining the MODGDF with the MFCC at the feature level gives significant improvements for speech recognition tasks in noise. Combining the MODGDF with the spectral magnitude-based features increases recognition performance significantly, by up to 11%, while combining any two features derived from the spectral magnitude does not give any significant improvement.
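
The abstract does not state the formulas involved, so the following is only a minimal sketch of how a group delay function and a spike-suppressed variant might be computed for one frame. It assumes the form commonly given for the modified group delay function: the numerator of the group delay is divided by a cepstrally smoothed magnitude spectrum raised to 2*gamma, and the result is compressed with an exponent alpha. The function names, the naive DFT helper, and the values of `alpha`, `gamma`, and `lifter` are illustrative assumptions, not the authors' settings.

```python
import cmath
import math


def dft(x, n_fft):
    """Naive O(N^2) DFT of a real sequence, implicitly zero-padded to n_fft bins."""
    return [sum(v * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, v in enumerate(x))
            for k in range(n_fft)]


def group_delay(x, n_fft=64, eps=1e-12):
    """Conventional group delay: (X_R*Y_R + X_I*Y_I) / |X|^2, with Y = DFT(n*x[n])."""
    X = dft(x, n_fft)
    Y = dft([n * v for n, v in enumerate(x)], n_fft)
    return [(a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
            for a, b in zip(X, Y)]


def smoothed_spectrum(X, lifter, eps=1e-12):
    """Cepstrally smoothed magnitude: keep only low-quefrency cepstral terms."""
    n_fft = len(X)
    log_mag = [math.log(max(abs(a), eps)) for a in X]
    # log|X| is real and even for real input, so cosine sums suffice.
    cep = [sum(log_mag[k] * math.cos(2 * math.pi * k * n / n_fft)
               for k in range(n_fft)) / n_fft
           for n in range(n_fft)]
    # Zero the high-quefrency part, preserving the symmetric low-quefrency ends.
    cep = [c if (n < lifter or n > n_fft - lifter) else 0.0
           for n, c in enumerate(cep)]
    return [math.exp(sum(cep[n] * math.cos(2 * math.pi * k * n / n_fft)
                         for n in range(n_fft)))
            for k in range(n_fft)]


def modified_group_delay(x, n_fft=64, alpha=0.4, gamma=0.9, lifter=8):
    """Assumed form: tau = num / S^(2*gamma), output = sign(tau) * |tau|^alpha."""
    X = dft(x, n_fft)
    Y = dft([n * v for n, v in enumerate(x)], n_fft)
    S = smoothed_spectrum(X, lifter)
    out = []
    for a, b, s in zip(X, Y, S):
        tau = (a.real * b.real + a.imag * b.imag) / (s ** (2 * gamma))
        out.append(math.copysign(abs(tau) ** alpha, tau))
    return out


# Sanity check: an impulse delayed by 3 samples has a flat group delay of 3.
frame = [0.0] * 8
frame[3] = 1.0
gd = group_delay(frame, n_fft=16)
mgd = modified_group_delay(frame, n_fft=16, lifter=4)
```

Cepstral features in the spirit of the MODGDF would then be obtained by applying a discrete cosine transform to the output of `modified_group_delay`; that step is omitted here to keep the sketch short.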