An automatic histogram detection and information extraction from document images

Cited by: 0
Authors
P. H. Anagha
A. Baskar
Affiliation
[1] Amrita Vishwa Vidyapeetham, Dept of Computer Science and Engineering, Amrita School of Engineering
Keywords
Histogram; Hough line detector; Morphological operator; Information extraction
DOI
Not available
Abstract
The histogram is an important data chart that commonly appears in scientific documents. This paper proposes an automatic histogram detection and information extraction method based on the Hough line detector and morphological operators. The proposed system comprises three steps: pre-processing, axis detection, and chart pattern extraction. In the pre-processing step, the RGB image of a histogram is converted into a binary image. In the axis detection step, the horizontal axis, the vertical axis, and the title of the histogram are extracted. The Hough line detector is applied to detect horizontal and vertical lines in the image. From the set of detected vertical lines, the line whose two endpoints share the minimum x-coordinate is taken as the vertical axis. Similarly, from the set of detected horizontal lines, the line whose two endpoints share the maximum y-coordinate is taken as the horizontal axis. Based on the dimensions of the two axes, rectangular regions containing the horizontal axis values and label, the vertical axis values and label, and the title are extracted. In the final chart pattern extraction step, morphological operations are used to identify the frequency of the data represented in the histogram. Verification and validation tests of the proposed system yielded promising results, indicating that the approach is efficient for extracting histogram information.
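
The axis selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that line segments have already been detected (for example by a probabilistic Hough transform, which typically yields segments as endpoint pairs) and that image coordinates have y increasing downward, so the maximum y corresponds to the bottom of the chart. The function name `pick_axes`, the `(x1, y1, x2, y2)` tuple layout, and the pixel tolerance `tol` are hypothetical choices made for this example; the abstract does not specify how near-vertical lines or ties are handled.

```python
from typing import List, Optional, Tuple

# Hypothetical segment layout: (x1, y1, x2, y2) in image coordinates,
# with the origin at the top-left corner and y increasing downward.
Segment = Tuple[int, int, int, int]


def pick_axes(
    segments: List[Segment], tol: int = 2
) -> Tuple[Optional[Segment], Optional[Segment]]:
    """Return (vertical_axis, horizontal_axis) chosen from detected segments.

    A segment counts as vertical when its endpoints share the same x
    (within tol pixels) and as horizontal when they share the same y.
    """
    vertical = [s for s in segments if abs(s[0] - s[2]) <= tol]
    horizontal = [s for s in segments if abs(s[1] - s[3]) <= tol]

    # Vertical axis: the vertical segment with the minimum x (leftmost).
    v_axis = min(vertical, key=lambda s: min(s[0], s[2]), default=None)
    # Horizontal axis: the horizontal segment with the maximum y (lowest).
    h_axis = max(horizontal, key=lambda s: max(s[1], s[3]), default=None)
    return v_axis, h_axis


# Illustrative input: two axes plus the outline of a single bar.
segments = [
    (50, 20, 50, 300),     # left axis
    (50, 300, 400, 300),   # bottom axis
    (120, 180, 120, 300),  # left edge of a bar
    (120, 180, 160, 180),  # top edge of a bar
    (160, 180, 160, 300),  # right edge of a bar
]
v_axis, h_axis = pick_axes(segments)
print("vertical axis:", v_axis)
print("horizontal axis:", h_axis)
```

In this sketch the bar edges are also vertical or horizontal segments, which is why the minimum-x and maximum-y criteria are needed to separate the axes from the bars. Once both axes are known, their endpoints bound the plot area, and the regions to the left of, below, and above it are the natural places to look for the axis values, labels, and title.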
Pages: 77-85
Page count: 8