Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Citations: 0
Authors
Chen, Jingye [1 ]
Yu, Haiyang [1 ]
Ma, Jianqi [2 ]
Li, Bin [1 ]
Xue, Xiangyang [1 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Peoples R China
Source
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022
Funding
National Natural Science Foundation of China;
Keywords
RECOGNITION; NETWORK;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the last decade, the rise of deep learning has driven rapid progress in scene text recognition. However, recognizing low-resolution scene text images remains a challenge. Although some super-resolution methods have been proposed to tackle this problem, they usually treat text images as general images and ignore the fact that the visual quality of strokes (the atomic units of text) plays an essential role in text recognition. According to Gestalt Psychology, humans are capable of composing partial details into the most similar objects, guided by prior knowledge. Likewise, when humans observe a low-resolution text image, they inherently use partial stroke-level details to recover the appearance of whole characters. Inspired by Gestalt Psychology, we put forward a Stroke-Aware Scene Text Image Super-Resolution method containing a Stroke-Focused Module (SFM) that concentrates on the stroke-level internal structures of characters in text images. Specifically, we attempt to design rules for decomposing English characters and digits at the stroke level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues, with the purpose of controlling the consistency between the generated super-resolution image and the high-resolution ground truth. Extensive experimental results validate that the proposed method can indeed generate more distinguishable images on Text-Zoom and on the manually constructed Chinese character dataset Degraded-IC13. Furthermore, since the proposed SFM is only used to provide stroke-level guidance during training, it brings no time overhead during the test phase.
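The consistency idea in the abstract can be illustrated with a minimal sketch. It assumes that a frozen, pre-trained recognizer has already produced one 2-D attention map per decoding step for both the super-resolved (SR) and the high-resolution (HR) image, and that the gap between them is measured with a mean absolute (L1) difference. The abstract does not state the actual loss, so the L1 choice and the names `l1_distance` and `stroke_attention_loss` are hypothetical, not the authors' implementation.

```python
from typing import List

# One 2-D attention map (rows of weights) per decoding step.
AttentionMap = List[List[float]]


def l1_distance(a: AttentionMap, b: AttentionMap) -> float:
    """Mean absolute difference between two equally sized 2-D maps."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("attention maps must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            count += 1
    return total / count


def stroke_attention_loss(
    sr_maps: List[AttentionMap], hr_maps: List[AttentionMap]
) -> float:
    """Average L1 gap between per-step attention maps of SR and HR images.

    Assumption: both lists come from the same frozen recognizer and are
    aligned step by step (one map per predicted stroke).
    """
    if not sr_maps or len(sr_maps) != len(hr_maps):
        raise ValueError("need the same non-zero number of decoding steps")
    gaps = [l1_distance(s, h) for s, h in zip(sr_maps, hr_maps)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    hr = [[[1.0, 0.0], [0.0, 0.0]]]
    sr = [[[0.0, 0.0], [0.0, 1.0]]]
    print(stroke_attention_loss(sr, hr))
```

Because such a term would only be evaluated while training the super-resolution network, dropping it at inference leaves the test-time cost unchanged, which matches the abstract's claim that the SFM adds no overhead during the test phase.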
Pages: 285-293
Number of pages: 9