Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

Cited by: 164
Authors
Li, Hui [1, 2]
Wang, Peng [1, 2, 3]
Shen, Chunhua [1, 2]
Affiliations
[1] Univ Adelaide, Adelaide, SA, Australia
[2] Australian Ctr Robot Vis, Brisbane, Qld, Australia
[3] Northwestern Polytech Univ, Xian, Shaanxi, Peoples R China
DOI
10.1109/ICCV.2017.560
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we jointly address the problem of text detection and recognition in natural scene images based on convolutional recurrent neural networks. We propose a unified network that simultaneously localizes and recognizes text with a single forward pass, avoiding intermediate processes such as image cropping, feature re-calculation, word separation, and character grouping. In contrast to existing approaches that consider text detection and recognition as two distinct tasks and tackle them one by one, the proposed framework settles these two tasks concurrently. The whole framework can be trained end-to-end, requiring only images, ground-truth bounding boxes and text labels. The convolutional features are calculated only once and shared by both detection and recognition, which saves processing time. Through multi-task training, the learned features become more informative, which improves the overall performance. Our proposed method has achieved competitive performance on several benchmark datasets.
Pages: 5248 - 5256
Number of pages: 9
Related papers
50 in total
  • [1] End-to-End Text Recognition with Convolutional Neural Networks
    Wang, Tao
    Wu, David J.
    Coates, Adam
    Ng, Andrew Y.
    [J]. 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3304 - 3308
  • [2] Towards Unconstrained End-to-End Text Spotting
    Qin, Siyang
    Bissacco, Alessandro
    Raptis, Michalis
    Fujii, Yasuhisa
    Xiao, Ying
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 4703 - 4713
  • [3] Towards End-to-End Text Spotting in Natural Scenes
    Wang, Peng
    Li, Hui
    Shen, Chunhua
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 7266 - 7281
  • [4] Towards End-to-End Speech Recognition with Recurrent Neural Networks
    Graves, Alex
    Jaitly, Navdeep
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 1764 - 1772
  • [5] Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks
    Zhang, Ying
    Pezeshki, Mohammad
    Brakel, Philemon
    Zhang, Saizheng
    Laurent, Cesar
    Bengio, Yoshua
    Courville, Aaron
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 410 - 414
  • [6] Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting
    Qiao, Liang
    Tang, Sanli
    Cheng, Zhanzhan
    Xu, Yunlu
    Niu, Yi
    Pu, Shiliang
    Wu, Fei
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11899 - 11907
  • [7] Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks
    Jo, Junho
    Koo, Hyung Il
    Soh, Jae Woong
    Cho, Nam Ik
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (43-44) : 32137 - 32150
  • [8] Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks
    Zhang, Wei
    Zhai, Minghao
    Huang, Zilong
    Liu, Chen
    Li, Wei
    Cao, Yi
    [J]. INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PART VI, 2019, 11745 : 332 - 341
  • [9] Towards end-to-end likelihood-free inference with convolutional neural networks
    Radev, Stefan T.
    Mertens, Ulf K.
    Voss, Andreas
    Koethe, Ullrich
    [J]. BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY, 2020, 73 (01) : 23 - 43