Printed Text Recognition using BLSTM and MDLSTM for Indian languages

Cited by: 0

Authors:

Chavan, Vishal [1]

Malage, Abhijit [1]

Mehrotra, Kapil [1]

Gupta, Manish Kumar [1]

Affiliations:

[1] C-DAC, Pune, Maharashtra, India

Source:

2017 FOURTH INTERNATIONAL CONFERENCE ON IMAGE INFORMATION PROCESSING (ICIIP) | 2017

Keywords:

Recurrent Neural Network; Optical Character Recognition; Bidirectional LSTM; Multidimensional LSTM; OCR SYSTEM;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

In this paper, we evaluate the recognition performance of the BLSTM (Bidirectional LSTM) and MDLSTM (two-dimensional LSTM) neural network architectures on printed documents. We also compare the performance of the two architectures with Tesseract on the same test bed. We demonstrate our experiments on 7 Indian languages: Hindi, Marathi, Tamil, Kannada, Malayalam, Bangla and Gurumukhi. The input to both architectures is segmented text lines. The data set contains approximately 5000 pages for each language, which are divided into training, validation and test sets. Histogram of Gradients features are extracted at line level and fed into the BLSTM network, whereas the MDLSTM processes the 2D image (raw pixels) of each line. The level and number of hidden layers in both architectures are empirically selected and kept the same for all languages. The output CTC layer contains one label for each Unicode character present in the evaluated languages, plus one blank label. The input layer is fully connected to the hidden layers, which are fully connected to themselves and to the output layer. The validated results show that MDLSTM outperforms both BLSTM and Tesseract for all the languages included in our experiments.
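
The abstract's description of the CTC output layer (one label per Unicode character plus a blank) can be illustrated with a small sketch. This is not the authors' code: the function names, the choice of index 0 for the blank, and the toy transcriptions are assumptions made for illustration, and the decoder shown is plain best-path (greedy) CTC decoding, which the abstract does not specify.

```python
BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def build_label_set(transcriptions):
    """Map each distinct Unicode code point in the transcriptions to an index >= 1."""
    chars = sorted({ch for line in transcriptions for ch in line})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def ctc_greedy_decode(frame_labels, index_to_char):
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    decoded = []
    previous = BLANK
    for label in frame_labels:
        if label != previous and label != BLANK:
            decoded.append(index_to_char[label])
        previous = label
    return "".join(decoded)


# Toy ground-truth lines (hypothetical data, not from the paper's corpus).
lines = ["नमस्ते", "मराठी"]
char_to_index = build_label_set(lines)
index_to_char = {i: ch for ch, i in char_to_index.items()}

# Size of the output layer: one unit per Unicode character, plus the blank.
num_outputs = len(char_to_index) + 1
print(num_outputs)
```

Because the label set is built from code points rather than visual glyphs, a conjunct or a consonant with a vowel sign is emitted as several labels in sequence, which is consistent with counting Unicode characters in the output layer.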


Pages: 345-350

Page count: 6
