Speech-Signal-Based Frequency Warping

Cited by: 13
Authors
Paliwal, Kuldip [1]
Shannon, Benjamin [1]
Lyons, James [1]
Wojcicki, Kamil [1]
Affiliations
[1] Griffith Univ, Signal Proc Lab, Nathan, Qld 4111, Australia
Keywords
Bark scale; mel scale; robust automatic speech recognition (ASR); speech-signal-based frequency cepstral coefficient (SFCC); speech-signal-based frequency warping
DOI
10.1109/LSP.2009.2014096
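Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract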
The speech signal is used for transmission of linguistic information. High-energy portions of the speech spectrum have higher signal-to-noise ratios than the low-energy portions. As a result, these regions are more robust to noise. Since the speech signal is known to be very robust to noise, it is expected that the high-energy regions of the speech spectrum carry the majority of the linguistic information. This letter derives a frequency warping function directly from the speech signal by sampling the frequency axis nonuniformly, with the high-energy regions sampled more densely than the low-energy regions. To achieve this, an ensemble average short-time power spectrum is computed from a large speech corpus. The speech-signal-based frequency warping is obtained by considering equal-area portions of the log spectrum. The proposed frequency warping is shown to be similar to the frequency scales obtained through psychoacoustic experiments, namely the mel and Bark scales. The warping is then used in filterbank design for automatic speech recognition experiments. The results of these experiments show that cepstral features based on the proposed warping achieve performance under clean conditions comparable to that of mel-frequency cepstral coefficients, while outperforming them under noisy conditions.
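The equal-area idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the frame length and hop, the Hamming window, and in particular the choice to offset the log spectrum so its minimum is zero (so that every area is non-negative) are assumptions made only for illustration. The abstract does not specify these details.

```python
import numpy as np

def average_power_spectrum(signal, frame_len=256, hop=128):
    """Average short-time power spectrum over all frames of a 1-D signal.

    Hypothetical helper: frame length, hop and window are assumed values.
    """
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)

def equal_area_warping(avg_power, sample_rate, n_bands):
    """Split the log spectrum into n_bands portions of equal area.

    Returns the frequency grid, the normalised warping curve (0..1),
    and the n_bands + 1 band-edge frequencies in Hz.
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, len(avg_power))
    log_spec = np.log(avg_power + 1e-12)
    # Assumption: shift so the minimum is zero, keeping all areas non-negative.
    log_spec = log_spec - log_spec.min()
    # Cumulative area under the log spectrum (trapezoidal rule).
    steps = 0.5 * (log_spec[1:] + log_spec[:-1]) * np.diff(freqs)
    area = np.concatenate([[0.0], np.cumsum(steps)])
    warp = area / area[-1]
    # Band edges are the frequencies at which the warp crosses k / n_bands.
    targets = np.linspace(0.0, 1.0, n_bands + 1)
    edges = np.interp(targets, warp, freqs)
    return freqs, warp, edges

# Usage with a synthetic low-pass-shaped spectrum standing in for a corpus average.
sample_rate = 8000
grid = np.linspace(0.0, sample_rate / 2.0, 129)
synthetic_power = 1.0 / (1.0 + (grid / 500.0) ** 2)
freqs, warp, edges = equal_area_warping(synthetic_power, sample_rate, n_bands=20)
```

With a spectrum whose energy falls off with frequency, the resulting band edges are packed more tightly at low frequencies, which is the qualitative behaviour the abstract compares to the mel and Bark scales. In a filterbank design, these edges could serve as the boundaries of the individual filters.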
Pages: 319-322
Page count: 4