A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application

Cited by: 6
Authors
Liu, Yiyi [1]
Wang, Yuxin [1]
Shi, Hongjian [1]
Affiliations
[1] Beijing Normal Univ Hong Kong Baptist Univ United, Guangdong Prov Key Lab Interdisciplinary Res & App, Zhuhai 519087, Peoples R China
Source
SYMMETRY-BASEL | 2023 / Vol. 15 / Issue 04
Keywords
CRNN; DBNet; OCR; Retinex;
DOI
10.3390/sym15040849
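Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code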
07 ; 0710 ; 09 ;
Abstract
Optical character recognition (OCR) is the process of acquiring text and layout information through analysis and recognition of text data image files. It is also a process to identify the geometric location and orientation of the texts and their symmetrical behavior. It usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, license plates, etc. Unlike traditional document category photographs, it is a challenging task to use computer technology to locate and read text information in natural scenes. Imaging sequence recognition is a longstanding subject of research in the field of computer vision. Great progress has been made in this field; however, most models struggled to recognize text in images of complex scenes with high accuracy. This paper proposes a new pattern of text recognition based on the convolutional recurrent neural network (CRNN) as a solution to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed experimental analysis of the proposed algorithm, and carried out simulation on complex scene image data based on existing literature data and also on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that our proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. This model can solve the problem that CRNN cannot identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model with higher accuracy across a wider variety of application scenarios.
Pages: 17
Related papers
50 in total
  • [41] An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition
    Shi, Baoguang
    Bai, Xiang
    Yao, Cong
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (11) : 2298 - 2304
  • [43] Inception recurrent convolutional neural network for object recognition
    Alom, Md Zahangir
    Hasan, Mahmudul
    Yakopcic, Chris
    Taha, Tarek M.
    Asari, Vijayan K.
    MACHINE VISION AND APPLICATIONS, 2021, 32 (01)
  • [44] Scene Text Extraction using Convolutional Neural Network with Amended MSER
    Yegnaraman, Aparna
    Valli, S.
    JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2021, 80 (09) : 817 - 827
  • [45] Heterogeneous Network Based Semi-supervised Learning for Scene Text Recognition
    Jiang, Qianyi
    Song, Qi
    Li, Nan
    Zhang, Rui
    Wei, Xiaolin
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 64 - 78
  • [46] Acoustic Scene Recognition Based on Convolutional Neural Networks
    Sun, Fengjiao
    Wang, Mingjiang
    Xu, Qihang
    Xuan, Xiaogung
    Zhang, Xin
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP 2019), 2019 : 122 - 126
  • [47] Recurrent neural network learning for text routing
    Wermter, S
    Arevian, G
    Panchev, C
    NINTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS (ICANN99), VOLS 1 AND 2, 1999, (470) : 898 - 903
  • [48] A Machine Learning Approach to Hypothesis Decoding in Scene Text Recognition
    Libovicky, Jindrich
    Neumann, Lukas
    Pecina, Pavel
    Matas, Jiri
    COMPUTER VISION - ACCV 2014 WORKSHOPS, PT II, 2015, 9009 : 169 - 180
  • [49] Deep neural network with attention model for scene text recognition
    Li, Shuohao
    Tang, Min
    Guo, Qiang
    Lei, Jun
    Zhang, Jun
    IET COMPUTER VISION, 2017, 11 (07) : 605 - 612
  • [50] Performance Comparison of Text-based Sentiment Analysis using Recurrent Neural Network and Convolutional Neural Network
    Purnamasari, Prima Dewi
    Taqiyuddin, Muhammad
    Ratna, Anak Agung Putri
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND INFORMATION PROCESSING (ICCIP 2017), 2017 : 19 - 23