Syllable based Hindi speech recognition

Cited by: 7

Authors:

Bhatt, Shobha [1]

Jain, Anurag [1]

Dev, Amita [2]

Affiliations:

[1] Guru Gobind Singh Indraprastha Univ, Univ Sch Informat & Commun Technol, Sect 16 C, New Delhi 110078, India

[2] Indira Gandhi Delhi Tech Univ Women, Dept Informat Technol, New Delhi 110006, India

Source:

JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES | 2020 / Vol. 41 / No. 06

Keywords:

Speech recognition; Syllable; Acoustic model; HMM; PLP; Hindi speech;

DOI:

10.1080/02522667.2020.1809091

Chinese Library Classification (CLC) number:

G25 [Library science, librarianship]; G35 [Information science, information work];

Discipline classification code:

1205 ; 120501 ;

Abstract:

This paper uses the syllable as the acoustic unit for developing a continuous Hindi speech recognition system. The syllable is a larger acoustic unit that overcomes contextual effects and requires fewer training samples than triphone-based and word-based models. Other acoustic units have drawbacks: phoneme-based models suffer from contextual influences, while context-dependent triphone models suffer from the non-availability of triphone patterns and need large memory storage for their numerous models. Earlier research on Hindi speech recognition used word, phoneme, and context-dependent models. The authors propose a syllable-based system in this study because of the advantages of syllable units, such as longer acoustic units, fast decoding, reduced contextual effects, and fewer irregularities due to phonemes. Hindi is widely spoken in India and in other parts of the world. The experiments were performed on continuous Hindi speech using the widely known Hidden Markov Model (HMM) with perceptual linear predictive (PLP) coefficients. The results show that using syllables improved system performance by 27% over phoneme-based models and by 20% over triphone-based models. The findings indicate that selecting an appropriate acoustic unit for Hindi may improve the performance of a speech recognition system. The study also provides useful insights for developing a syllable-based pronunciation dictionary that may be used in speech recognition, speaker identification, and text-to-speech conversion systems.
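
To make the comparison of acoustic units in the abstract concrete, the following is a minimal Python sketch of how one word expands into phoneme, triphone, and syllable units in a pronunciation dictionary. The romanized words, their phone and syllable splits, and the names PHONE_LEXICON, SYLLABLE_LEXICON, to_triphones, and unit_inventory are illustrative assumptions; none of them come from the paper, whose actual dictionary and syllabification rules are not given in this record.

```python
from typing import Dict, List, Set

# Hypothetical, hand-written lexicon entries (romanized); NOT taken from the paper.
PHONE_LEXICON: Dict[str, List[str]] = {
    "namaste": ["n", "a", "m", "a", "s", "t", "e"],
    "kamal": ["k", "a", "m", "a", "l"],
}
# Assumed syllable splits for the same words, also purely illustrative.
SYLLABLE_LEXICON: Dict[str, List[str]] = {
    "namaste": ["na", "mas", "te"],
    "kamal": ["ka", "mal"],
}


def to_triphones(phones: List[str]) -> List[str]:
    """Expand a phone sequence into word-internal triphones labelled left-phone+right."""
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        units.append(left + phone + right)
    return units


def unit_inventory(lexicon: Dict[str, List[str]]) -> Set[str]:
    """Collect the distinct acoustic units that would each need a trained model."""
    return {unit for units in lexicon.values() for unit in units}


TRIPHONE_LEXICON = {word: to_triphones(p) for word, p in PHONE_LEXICON.items()}

if __name__ == "__main__":
    for name, lex in [("phoneme", PHONE_LEXICON),
                      ("triphone", TRIPHONE_LEXICON),
                      ("syllable", SYLLABLE_LEXICON)]:
        print(name, "units per word:", {w: len(u) for w, u in lex.items()},
              "| distinct models:", len(unit_inventory(lex)))
```

Even in this toy lexicon, the triphone expansion yields the most distinct models, while the syllable entries cover each word with the fewest units. This only illustrates the trade-off the abstract describes and says nothing about recognition accuracy.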


Pages: 1333-1351

Page count: 19
