Reading Text in the Wild with Convolutional Neural Networks

Cited by: 2
Authors
Max Jaderberg
Karen Simonyan
Andrea Vedaldi
Andrew Zisserman
Affiliation
[1] University of Oxford, Department of Engineering Science
Source
International Journal of Computer Vision
Keywords
Text spotting; Text recognition; Text detection; Deep learning; Convolutional neural networks; Synthetic data; Text retrieval
DOI
Not available
Abstract
In this work we present an end-to-end system for text spotting (localising and recognising text in natural scene images) and text-based image retrieval. The system is based on a region proposal mechanism for detection and deep convolutional neural networks for recognition. Our pipeline uses a novel combination of complementary proposal generation techniques to ensure high recall, followed by a fast filtering stage to improve precision. For the recognition and ranking of proposals, we train very large convolutional neural networks to perform word recognition on the whole proposal region at once, departing from the character-classifier-based systems of the past. These networks are trained solely on data produced by a synthetic text generation engine, requiring no human-labelled data. Analysing the stages of our pipeline, we show state-of-the-art performance throughout. We perform rigorous experiments across a number of standard end-to-end text spotting benchmarks and text-based image retrieval datasets, showing a large improvement over all previous methods. Finally, we demonstrate a real-world application of our text spotting system that makes thousands of hours of news footage instantly searchable via a text query.
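The stages named in the abstract (pooled proposals, a filtering step, whole-word recognition, then ranking) can be pictured with a small sketch. This is only an illustration of the control flow under stated assumptions, not the authors' implementation: the box format, the `spot_text` function, the `Detection` type, the threshold value, and the stub proposers, filter, and recogniser are all hypothetical stand-ins for the real components.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical box format: (x, y, width, height).
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    box: Box
    word: str
    score: float


def spot_text(
    image: object,
    proposers: Iterable[Callable[[object], List[Box]]],
    is_text: Callable[[object, Box], float],
    recognise: Callable[[object, Box], Tuple[str, float]],
    filter_threshold: float = 0.5,  # assumed value, for illustration only
) -> List[Detection]:
    """Toy pipeline: proposals -> filtering -> word recognition -> ranking."""
    # 1. Pool boxes from several proposal generators to favour recall,
    #    skipping exact duplicates.
    boxes: List[Box] = []
    for propose in proposers:
        for box in propose(image):
            if box not in boxes:
                boxes.append(box)

    # 2. Filtering stage: drop regions the text/no-text scorer rejects.
    kept = [b for b in boxes if is_text(image, b) >= filter_threshold]

    # 3. Recognise one word per surviving region, then rank by score.
    detections = [Detection(b, *recognise(image, b)) for b in kept]
    return sorted(detections, key=lambda d: d.score, reverse=True)


# Stub components standing in for real proposal generators and networks.
image = None  # placeholder; the stubs below ignore the image
text_scores = {(0, 0, 40, 10): 0.9, (5, 50, 30, 12): 0.7, (90, 90, 8, 8): 0.1}
words = {(0, 0, 40, 10): ("news", 0.6), (5, 50, 30, 12): ("oxford", 0.95)}

results = spot_text(
    image,
    proposers=[
        lambda img: [(0, 0, 40, 10), (5, 50, 30, 12)],
        lambda img: [(0, 0, 40, 10), (90, 90, 8, 8)],
    ],
    is_text=lambda img, b: text_scores[b],
    recognise=lambda img, b: words[b],
)
for d in results:
    print(d.word, d.score, d.box)
```

In this toy run the duplicate box is pooled once, the low-scoring box is filtered out before recognition, and the remaining words come back ordered by recognition score.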
Pages: 1-20
Page count: 19
Citation
  • Jaderberg, Max; Simonyan, Karen; Vedaldi, Andrea; Zisserman, Andrew. Reading Text in the Wild with Convolutional Neural Networks [J]. International Journal of Computer Vision, 2016, 116(01): 1-20