Accurate, data-efficient, unconstrained text recognition with convolutional neural networks

Cited by: 58
Authors
Yousef, Mohamed [1 ]
Hussain, Khaled F. [1 ]
Mohammed, Usama S. [2 ]
Affiliations
[1] Assiut Univ, Fac Comp & Informat, Comp Sci Dept, Asyut 71515, Egypt
[2] Assiut Univ, Elect Engn Dept, Fac Engn, Asyut 71515, Egypt
Keywords
Text recognition; Optical character recognition; Handwriting recognition; CAPTCHA Solving; License plate recognition; Convolutional neural network; Deep learning; SCENE TEXT; LSTM;
DOI
10.1016/j.patcog.2020.107482
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unconstrained text recognition is an important computer vision task, featuring a wide variety of different sub-tasks, each with its own set of challenges. One of the biggest promises of deep neural networks has been the convergence and automation of feature extractors from input raw signals, allowing for the highest possible performance with minimum required domain knowledge. To this end, we propose a data-efficient, end-to-end neural network model for generic, unconstrained text recognition. In our proposed architecture we strive for simplicity and efficiency without sacrificing recognition accuracy. Our proposed architecture is a fully convolutional network without any recurrent connections trained with the CTC loss function. Thus it operates on arbitrary input sizes and produces strings of arbitrary length in a very efficient and parallelizable manner. We show the generality and superiority of our proposed text recognition architecture by achieving state-of-the-art results on seven public benchmark datasets, covering a wide spectrum of text recognition tasks, namely: Handwriting Recognition, CAPTCHA recognition, OCR, License Plate Recognition, and Scene Text Recognition. Our proposed architecture has won the ICFHR2018 Competition on Automated Text Recognition on a READ Dataset. (C) 2020 Published by Elsevier Ltd.
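The abstract states that the network is trained with the CTC loss and produces strings of arbitrary length. As a rough illustration of how a variable-length string is usually read out of per-frame CTC outputs, here is a minimal sketch of greedy (best-path) decoding. It is not the authors' implementation: the function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy alphabet are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding for one sequence.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: list mapping label index to character (entry at `blank` is unused).
    """
    # 1. Pick the highest-scoring label in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats of the same label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; what remains is the output string.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical toy input: 7 frames over the labels {blank, "a", "b"}.
toy_alphabet = ["-", "a", "b"]
toy_scores = [
    [0.1, 0.8, 0.1],  # a
    [0.2, 0.7, 0.1],  # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.1, 0.8],  # b
    [0.2, 0.1, 0.7],  # b (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
]
decoded = ctc_greedy_decode(toy_scores, toy_alphabet)
```

The blank symbol is what lets the output contain doubled characters: the two runs of "a" above stay separate only because a blank frame sits between them. The number of frames is independent of the length of the decoded string, which is why a model with this kind of output can handle inputs of arbitrary width.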
Pages: 12
Related papers
50 in total
  • [31] Convolutional Neural Networks for Text Hashing
    Xu, Jiaming
    Wang, Peng
    Tian, Guanhua
    Xu, Bo
    Zhao, Jun
    Wang, Fangyuan
    Hao, Hongwei
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 1369 - 1375
  • [32] Text normalization with convolutional neural networks
    Yolchuyeva, Sevinj
    Nemeth, Geza
    Gyires-Toth, Balint
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (03) : 589 - 600
  • [33] Text detection with convolutional neural networks
    Delakis, Manolis
    Garcia, Christophe
    VISAPP 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2008, : 290 - 294
  • [34] Ligature Recognition in Urdu Caption Text using Deep Convolutional Neural Networks
    Hayat, Umar
    Aatif, Muhammad
    Zeeshan, Osama
    Siddiqi, Imran
    2018 14TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES (ICET), 2018,
  • [35] On the improvement of handwritten text line recognition with octave convolutional recurrent neural networks
    Castro, Dayvid
    Zanchettin, Cleber
    Amaral, Luis A. Nunes
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2024, 27 (4) : 567 - 581
  • [36] An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition
    Alshawi, Adil Abdullah Abdulhussein
    Tanha, Jafar
    Balafar, Mohammad Ali
    IEEE ACCESS, 2024, 12 : 8123 - 8134
  • [37] Data-Efficient Inference of Nonlinear Oscillator Networks
    Singhal, Bharat
    Vu, Minh
    Zeng, Shen
    Li, Jr-Shin
    IFAC PAPERSONLINE, 2023, 56 (02) : 10089 - 10094
  • [38] An adaptive threshold mechanism for accurate and efficient deep spiking convolutional neural networks
    Chen, Yunhua
    Mai, Yingchao
    Feng, Ren
    Xiao, Jinsheng
    NEUROCOMPUTING, 2022, 469 : 189 - 197
  • [39] Scale-Equivariant Unrolled Neural Networks for Data-Efficient Accelerated MRI Reconstruction
    Gunel, Beliz
    Sahiner, Arda
    Desai, Arjun D.
    Chaudhari, Akshay S.
    Vasanawala, Shreyas
    Pilanci, Mert
    Pauly, John
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VI, 2022, 13436 : 737 - 747
  • [40] Investigating data representation for efficient and reliable Convolutional Neural Networks
    Ruospo, Annachiara
    Sanchez, Ernesto
    Traiola, Marcello
    O'Connor, Ian
    Bosio, Alberto
    MICROPROCESSORS AND MICROSYSTEMS, 2021, 86