Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Cited by: 0
Authors
Chen, Jingye [1 ]
Yu, Haiyang [1 ]
Ma, Jianqi [2 ]
Li, Bin [1 ]
Xue, Xiangyang [1 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Peoples R China
Source
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022
Funding
National Natural Science Foundation of China;
Keywords
RECOGNITION; NETWORK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the last decade, the blossom of deep learning has witnessed the rapid development of scene text recognition. However, the recognition of low-resolution scene text images remains a challenge. Even though some super-resolution methods have been proposed to tackle this problem, they usually treat text images as general images while ignoring the fact that the visual quality of strokes (the atomic unit of text) plays an essential role for text recognition. According to Gestalt Psychology, humans are capable of composing parts of details into the most similar objects guided by prior knowledge. Likewise, when humans observe a low-resolution text image, they will inherently use partial stroke-level details to recover the appearance of holistic characters. Inspired by Gestalt Psychology, we put forward a Stroke-Aware Scene Text Image Super-Resolution method containing a Stroke-Focused Module (SFM) to concentrate on stroke-level internal structures of characters in text images. Specifically, we attempt to design rules for decomposing English characters and digits at stroke-level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues with the purpose of controlling the consistency between the generated super-resolution image and high-resolution ground truth. The extensive experimental results validate that the proposed method can indeed generate more distinguishable images on Text-Zoom and manually constructed Chinese character dataset Degraded-IC13. Furthermore, since the proposed SFM is only used to provide stroke-level guidance when training, it will not bring any time overhead during the test phase.
Pages: 285-293
Page count: 9