Gammatone-Filterbank Based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR

Cited by: 0

Authors:

Shahnawazuddin, Syed [1]

Ankita [1]

Kumar, Avinash [2]

Kathania, Hemant Kumar [2]

Affiliations:

[1] Natl Inst Technol Patna, Patna, Bihar, India

[2] Natl Inst Technol Sikkim, Ravangla, India

Source:

SPEECH AND COMPUTER, SPECOM 2023, PT I | 2023 / Vol. 14338

Keywords:

Children's ASR; Zero-resource ASR; Spectral smoothing; Gamma-tone-filterbank; VMD; SPEECH; RECOGNITION;

DOI:

10.1007/978-3-031-48309-7_40

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The work presented in this paper focuses on the zero-resource children's speech recognition task. In such tasks, adults' speech data is used for learning the acoustic models. However, this leads to severe acoustic mismatch and hence poor recognition rates. One of the main mismatch factors is that pitch values are higher in children's speech. To mitigate the ill effects of pitch-induced acoustic mismatch, two front-end speech parameterization techniques are proposed in this study. The proposed approaches employ spectral smoothing based on either pitch-adaptive cepstral truncation or variational mode decomposition. Furthermore, a Gammatone filterbank is used for warping the spectra to the ERB scale. Consequently, the cepstral coefficients exhibit lower variance than those obtained using a Mel filterbank. The proposed features are therefore observed to be very effective, resulting in a relative reduction in word error rate of nearly 17% over the baseline.

Pages: 494-505

Page count: 12

