A Robust Technique for Character String Extraction from Complex Document Images

Cited by: 0
Author
Chen, Yen-Lin [1]
Affiliation
[1] Univ E Asia, Dept Comp Sci & Informat Engn, Taichung 41354, Taiwan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A new technique for segmenting and extracting character strings from various real-life complex document images is proposed in this study. The proposed text extraction technique first decomposes the document image into distinct object planes to extract and separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. Then, a text extraction procedure is applied to the resultant planes to extract character strings with different characteristics in the corresponding planes. The document image is processed regionally and adaptively according to its local features, and thus detailed characteristics of the extracted textual objects can be well preserved, especially small characters with thin strokes. In the experimental results and comparisons with the existing technique, the proposed approach demonstrates its effectiveness and advantages in extracting character strings with various illuminations, sizes, and font styles from various types of complex document images.
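The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: process the image region by region, separate pixels into planes, then extract candidate text components from each plane. It is not the author's algorithm. Block-mean thresholding into two planes and connected-component bounding boxes are stand-in techniques chosen for illustration, and the names `split_into_planes`, `connected_boxes`, `extract_candidates`, `block`, and `max_size` are all hypothetical.

```python
from collections import deque


def split_into_planes(image, block=4):
    """Assumed stand-in for plane decomposition: within each block, pixels
    darker than the block mean go to the dark plane, the rest to the bright
    plane, so the split adapts to local intensity."""
    h, w = len(image), len(image[0])
    dark = [[0] * w for _ in range(h)]
    bright = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            values = [image[y][x] for y in ys for x in xs]
            mean = sum(values) / len(values)
            for y in ys:
                for x in xs:
                    if image[y][x] < mean:
                        dark[y][x] = 1
                    else:
                        bright[y][x] = 1
    return dark, bright


def connected_boxes(plane):
    """Bounding boxes (top, left, bottom, right) of 4-connected components."""
    h, w = len(plane), len(plane[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not plane[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and plane[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def extract_candidates(image, block=4, max_size=4):
    """Keep only components small enough to be character-like (assumed rule)."""
    dark, bright = split_into_planes(image, block)
    result = {}
    for name, plane in (("dark", dark), ("bright", bright)):
        result[name] = [
            box for box in connected_boxes(plane)
            if box[2] - box[0] + 1 <= max_size and box[3] - box[1] + 1 <= max_size
        ]
    return result
```

Because each block is thresholded against its own mean, a faint stroke in one region is separated from its background just as a high-contrast stroke in another region is, which is the kind of local adaptivity the abstract refers to. A real system would need more planes and a proper text-line grouping step.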
Pages: 1742-1750
Number of pages: 9