Multiple Feature Extraction for RNN-based Assamese Speech Recognition for Speech to Text Conversion Application

Cited by: 0

Authors:

Dutta, Krishna [1]

Sarma, Kandarpa Kumar [1]

Affiliations:

[1] Gauhati Univ, Dept ECE, Gauhati 781014, Assam, India

Source:

PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, DEVICES AND INTELLIGENT SYSTEMS (CODIS) | 2012

Keywords:

Moving Average Filter; LPC; MFCC; RNN;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work proposes a prototype model for speech recognition in the Assamese language using Linear Predictive Coding (LPC) and Mel-frequency cepstral coefficient (MFCC) features. The speech recognizer is part of a speech-to-text conversion system. The LPC and MFCC features are extracted by two different Recurrent Neural Networks (RNNs), which are used to recognize the vocal extract of Assamese, a major language in the north-eastern part of India. The decision block is designed as a combined framework of RNN blocks to extract the features. With this combined architecture, the system achieves a 10% gain in recognition rate over the case in which the individual architectures are used.


Pages: 600-603

Number of pages: 4

