SPECTRAL SMOOTHING BY VARIATIONAL MODE DECOMPOSITION AND ITS EFFECT ON NOISE AND PITCH ROBUSTNESS OF ASR SYSTEM

Cited by: 0
Authors
Yadav, Ishwar Chandra [1]
Shahnawazuddin, S. [1]
Govind, D. [2]
Pradhan, Gayadhar [1]
Affiliations
[1] NIT Patna, Dept Elect & Commun Engn, Patna, Bihar, India
[2] Amrita Univ, Ctr Computat Engn & Networking, Coimbatore, Tamil Nadu, India
Keywords
Speech recognition; ambient noise; pitch mismatch; spectral smoothing; VMD; SPEECH; CHILDRENS
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a novel front-end speech parameterization technique that is robust to ambient noise and pitch variations. In the proposed technique, the short-time magnitude spectrum obtained by the discrete Fourier transform is first decomposed into several components using variational mode decomposition (VMD). To smooth the spectrum sufficiently, the higher-order components are discarded, and the smoothed spectrum is reconstructed from the first two modes only. The Mel-frequency cepstral coefficients computed from the VMD-smoothed spectra are observed to be less affected by ambient noise and pitch variations. To validate this, an automatic speech recognition system is developed on clean speech from adult speakers and evaluated under noisy test conditions. Experimental evaluations are also performed on a second test set consisting of children's speech, which simulates large pitch differences. The experimental evaluations and the signal-domain analyses presented in this paper support these claims.
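The smoothing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `vmd` and `vmd_smooth_spectrum` are hypothetical, the VMD loop is a simplified variant that omits the Lagrangian multiplier update (equivalent to a zero dual step), and the number of modes (3) and bandwidth penalty `alpha` (2000) are assumed values that the abstract does not state. The toy spectrum is deliberately built from components that separate cleanly. On real speech frames, which modes carry the envelope depends on these settings.

```python
import numpy as np


def vmd(signal, num_modes=3, alpha=2000.0, tol=1e-7, max_iter=500):
    """Simplified VMD sketch (no Lagrangian multiplier update).

    Returns (modes, omega), both sorted by ascending centre frequency.
    num_modes and alpha are assumed values, not taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    half = n // 2
    # Mirror-extend both ends to soften boundary effects (length 2n).
    ext = np.concatenate([x[:half][::-1], x, x[half:][::-1]])
    t = ext.size
    freqs = np.arange(t) / t - 0.5          # centred frequency axis
    pos = slice(t // 2, t)                  # non-negative frequencies
    f_hat = np.fft.fftshift(np.fft.fft(ext))
    f_hat[: t // 2] = 0.0                   # keep the one-sided spectrum
    omega = 0.5 * np.arange(num_modes) / num_modes  # uniform initial centres
    u_hat = np.zeros((num_modes, t), dtype=complex)
    for _ in range(max_iter):
        previous = u_hat.copy()
        for k in range(num_modes):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter update of mode k around its current centre.
            u_hat[k] = (f_hat - others) / (1.0 + alpha * (freqs - omega[k]) ** 2)
            # Move the centre to the power-weighted mean frequency of the mode.
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.dot(freqs[pos], power) / (power.sum() + 1e-300)
        if np.sum(np.abs(u_hat - previous) ** 2) / t < tol:
            break
    # Rebuild the negative frequencies by Hermitian symmetry, then invert.
    j = np.arange(1, t // 2)
    u_hat[:, t // 2 - j] = np.conj(u_hat[:, t // 2 + j])
    modes = np.real(np.fft.ifft(np.fft.ifftshift(u_hat, axes=1), axis=1))
    modes = modes[:, half:half + n]         # drop the mirrored padding
    order = np.argsort(omega)
    return modes[order], omega[order]


def vmd_smooth_spectrum(magnitude, num_modes=3, keep=2, **vmd_kwargs):
    """Reconstruct a spectrum from its `keep` lowest-frequency VMD modes."""
    modes, _ = vmd(magnitude, num_modes=num_modes, **vmd_kwargs)
    return modes[:keep].sum(axis=0)


# Toy "magnitude spectrum": a slow envelope plus a fast ripple that stands in
# for pitch harmonics. The values are illustrative, not from the paper.
bins = np.arange(128)
envelope = 2.0 + np.cos(2 * np.pi * 4 * (bins + 0.5) / 256)
spectrum = envelope + 0.6 * np.cos(2 * np.pi * 40 * (bins + 0.5) / 256)
smoothed = vmd_smooth_spectrum(spectrum)
```

In a full front end, `vmd_smooth_spectrum` would be applied to the magnitude spectrum of each short-time frame before the Mel filterbank and cepstral steps. A small positive floor on the smoothed values may be needed before taking logarithms, since a sum of modes is not guaranteed to stay non-negative.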
Pages: 5629-5633
Page count: 5