A Multi-scale Deformable Convolution Network Model for Text Recognition

Authors
Cheng, Lang [1]
Yan, Junhong [1]
Chen, Minghui [1]
Lu, Yuanwen [1]
Li, Yunhong [1]
Hu, Lei [1]
Affiliation
[1] Jiangxi Normal Univ, Sch Comp & Informat Engn, Nanchang 330022, Jiangxi, Peoples R China
Keywords
Text recognition; Multi-scale feature extraction; Deformable convolution; Receptive field;
DOI
10.1117/12.2623370
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Natural scene text recognition has been one of the most challenging tasks in recent years. Compared with traditional document text, natural scene text varies in shape and orientation, so the accuracy of scene text recognition still needs to be improved. To locate the text region better and identify the text content more accurately, we present a multi-scale deformable convolution network model for text recognition. The input image is first corrected for irregular text by a rectification network, and a ResNet with an FPN structure is used as the backbone network to achieve multi-scale feature extraction. In addition, the Add (element-wise addition) feature fusion method is adopted to reduce the loss of feature information and strengthen feature extraction in the text area. A deformable convolution block is introduced in the deep convolution layers to improve the deformation modeling ability of convolution and expand the receptive field. The prediction module adopts the Transformer, abandoning the step-by-step sequential dependence inherent in RNNs, to enable parallel computation and shorten the path length between long-range dependencies. To evaluate the effectiveness of the proposed method, we trained our model on a mixture of two data sets, MJSynth and SynthText, and tested it on several regular and irregular data sets. The experimental results demonstrate that this method performs well in irregular scene text recognition, especially on CUTE80.
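The abstract credits deformable convolution with better deformation modeling and a larger receptive field. As a rough illustration of that general technique, and not the authors' implementation, the following pure-Python sketch computes a single 3x3 deformable convolution response on a single-channel image. The names `bilinear`, `deform_conv_at`, and the fixed `offsets` grids are hypothetical; in a trained network the offsets would be predicted by an additional convolution layer rather than set by hand.

```python
import math

def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); pixels outside count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the neighbor.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total

def deform_conv_at(img, kernel, offsets, cy, cx):
    """3x3 deformable convolution response at output position (cy, cx).

    offsets[i][j] is an assumed (dy, dx) pair for kernel tap (i, j).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position plus the per-tap offset.
            sy = cy + (i - 1) + dy
            sx = cx + (j - 1) + dx
            out += kernel[i][j] * bilinear(img, sy, sx)
    return out

# Toy 5x5 image whose value grows by 1 per column and 5 per row.
img = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]

zero = [[(0.0, 0.0)] * 3 for _ in range(3)]      # ordinary convolution
shifted = [[(0.0, 0.5)] * 3 for _ in range(3)]   # every tap moved half a pixel right

print(deform_conv_at(img, kernel, zero, 2, 2))     # 108.0
print(deform_conv_at(img, kernel, shifted, 2, 2))  # 112.5
```

With all offsets at zero the result matches an ordinary 3x3 convolution; non-zero offsets let each kernel tap sample away from the rigid grid, which is how the sampling pattern can follow curved or slanted text and cover a wider area.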
Pages: 9