Robust Document Image Dewarping Method using Text-lines and Line Segments

Cited by: 22
Authors
Kil, Taeho [1 ,2 ]
Seo, Wonkyo [1 ,2 ]
Koo, Hyung Il [3 ]
Cho, Nam Ik [1 ,2 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Seoul Natl Univ, INMC, Seoul, South Korea
[3] Ajou Univ, Dept Elect & Comp Engn, Suwon, South Korea
Source
2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017
Keywords
ALGORITHM;
DOI
10.1109/ICDAR.2017.146
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Conventional text-line based document dewarping methods have problems handling complex layouts and/or very few text-lines. When there are few aligned text-lines in the image, this usually means that photos, graphics and/or tables take up a large portion of the input instead. Hence, for robust document dewarping, we propose to use line segments in the image in addition to the aligned text-lines. Based on the assumption and observation that many of the line segments are horizontally or vertically aligned in well-rectified images, we encode this property into the cost function in addition to the text-line alignment cost. By minimizing the function, we can obtain transformation parameters for camera pose, page curve, etc., which are used for document rectification. Considering that there are many outliers in line segment directions and missed text-lines in some cases, the overall algorithm is designed in an iterative manner. At each step, we remove text components and line segments that are not well aligned, and then minimize the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts.
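The alternation described in the abstract (minimize an alignment cost, discard poorly aligned elements, minimize again) can be illustrated with a much smaller toy problem. The sketch below is not the authors' method: it assumes the only unknown is a single in-plane rotation, uses only line segment directions (no text-line term, camera pose, or page curve), and replaces the paper's optimizer with a plain grid search. The names `misalignment`, `fit_skew`, `iterative_fit`, and the tolerance `tol` are hypothetical and chosen for illustration.

```python
import math

def misalignment(angle, theta):
    """Angular distance (radians) between a segment direction and the
    nearest horizontal or vertical axis after undoing a rotation theta."""
    d = (angle - theta) % (math.pi / 2)
    return min(d, math.pi / 2 - d)

def fit_skew(angles, steps=900):
    """Grid-search theta in [-pi/4, pi/4) that minimizes the sum of
    squared misalignments (a stand-in for the real cost minimization)."""
    best_theta, best_cost = 0.0, float("inf")
    for i in range(steps):
        theta = -math.pi / 4 + (math.pi / 2) * i / steps
        cost = sum(misalignment(a, theta) ** 2 for a in angles)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta

def iterative_fit(angles, rounds=5, tol=math.radians(10)):
    """Alternate between fitting theta and dropping segments whose
    misalignment exceeds tol (assumed threshold, not from the paper)."""
    inliers = list(angles)
    theta = fit_skew(inliers)
    for _ in range(rounds):
        kept = [a for a in angles if misalignment(a, theta) <= tol]
        if not kept or kept == inliers:
            break  # nothing left to fit, or the inlier set has stabilized
        inliers = kept
        theta = fit_skew(inliers)
    return theta, inliers

# Synthetic directions in degrees: eight segments consistent with a
# 10-degree skew (horizontal or vertical in the rectified page), plus
# two outliers at 35 and 50 degrees.
segments = [math.radians(a) for a in
            [10, 10.5, 9.5, 100, 99.5, 10.2, 190, 9.8, 35, 50]]
theta, inliers = iterative_fit(segments)
```

In this toy setting the first fit is pulled away from the true skew by the two outliers; once they are dropped, the refit settles close to 10 degrees. The real method optimizes many more parameters, but the remove-then-reminimize loop has the same shape.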
Pages: 865-870
Page count: 6