An automatic histogram detection and information extraction from document images

Cited by: 0

Authors:

P. H. Anagha

A. Baskar

Affiliations:

[1] Amrita Vishwa Vidyapeetham, Dept. of Computer Science and Engineering, Amrita School of Engineering

Source:

International Journal of Speech Technology | 2021 / Vol. 24

Keywords:

Histogram; Hough line detector; Morphological operator; Information; Extraction;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

A histogram is an important type of data chart that commonly appears in scientific documents. This paper proposes an automatic histogram detection and information extraction method based on the Hough line detector and morphological operators. The proposed system comprises three steps: pre-processing, axis detection, and chart pattern extraction. In the pre-processing step, the RGB image of a histogram is converted into a binary image. In the axis detection step, the horizontal axis, vertical axis, and title of the histogram are extracted. Here the Hough line detector is applied to detect horizontal and vertical lines in the image. Among the detected vertical lines, the line whose two endpoints share the minimum x-coordinate is taken as the vertical axis. Similarly, among the detected horizontal lines, the line whose two endpoints share the maximum y-coordinate is taken as the horizontal axis. Based on the dimensions of the horizontal and vertical axes, a rectangular region containing the horizontal-axis values and label, the vertical-axis values and label, and the title is extracted. In the final chart pattern extraction step, morphological operations are used to identify the frequency of the data present in the histogram. Verification and validation tests of the proposed system yielded promising results, indicating that the approach is efficient for extracting histogram information.
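
The axis selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the line segments have already been produced by some Hough-based detector (for example OpenCV's `cv2.HoughLinesP`, which is not called here) and are given as `(x1, y1, x2, y2)` tuples in image coordinates, where y grows downward. The function name `select_axes`, the `Segment` alias, and the tolerance `tol` are hypothetical names introduced only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical segment format: (x1, y1, x2, y2) in image coordinates,
# with the origin at the top-left corner and y increasing downward.
Segment = Tuple[int, int, int, int]


def select_axes(
    segments: List[Segment], tol: int = 2
) -> Tuple[Optional[Segment], Optional[Segment]]:
    """Pick candidate axes from detected line segments.

    Vertical axis: the vertical segment with the smallest x-coordinate.
    Horizontal axis: the horizontal segment with the largest y-coordinate.
    `tol` (an assumed parameter) is the pixel slack allowed when deciding
    whether a segment counts as vertical or horizontal.
    """
    # A segment is vertical when both endpoints share (almost) the same x.
    vertical = [
        s for s in segments
        if abs(s[0] - s[2]) <= tol and abs(s[1] - s[3]) > tol
    ]
    # A segment is horizontal when both endpoints share (almost) the same y.
    horizontal = [
        s for s in segments
        if abs(s[1] - s[3]) <= tol and abs(s[0] - s[2]) > tol
    ]
    v_axis = min(vertical, key=lambda s: min(s[0], s[2]), default=None)
    h_axis = max(horizontal, key=lambda s: max(s[1], s[3]), default=None)
    return v_axis, h_axis


# Made-up segments standing in for Hough detector output.
segments = [
    (50, 20, 50, 300),     # left-most vertical line
    (120, 180, 120, 300),  # side of a bar
    (50, 300, 400, 300),   # lowest horizontal line
    (120, 180, 160, 180),  # top of a bar
]
v_axis, h_axis = select_axes(segments)
print("vertical axis:", v_axis)
print("horizontal axis:", h_axis)
```

In this toy input the left-most vertical segment and the lowest horizontal segment are selected, which matches the rule stated in the abstract; bar edges are rejected because they lie to the right of, or above, the chosen lines.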


Pages: 77-85

Page count: 8

Related records: 50 in total

[1] An automatic histogram detection and information extraction from document images
Anagha, P. H.
Baskar, A.
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (01) : 77 - 85
[2] Automatic name extraction from degraded document images
Laurence Likforman-Sulem
Pascal Vaillant
Aliette de Bodard de la Jacopière
Pattern Analysis and Applications, 2006, 9 : 211 - 227
[3] Automatic keyword extraction from historical document images
Terasawa, K
Nagasaki, T
Kawashima, T
DOCUMENT ANALYSIS SYSTEMS VII, PROCEEDINGS, 2006, 3872 : 413 - 424
[4] Automatic name extraction from degraded document images
Likforman-Sulem, Laurence
Vaillant, Pascal
de la Jacopiere, Aliette de Bodard
PATTERN ANALYSIS AND APPLICATIONS, 2006, 9 (2-3) : 211 - 227
[5] Automatic Extraction of Text and Non-text Information Directly from Compressed Document Images
Javed, Mohammed
Nagabhushan, P.
Chaudhuri, Bidyut B.
PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS 2016), 2017, 552 : 38 - 46
[6] Automatic Table Detection and Retention from Scanned Document Images via Analysis of Structural Information
Ranka, Varsha
Patil, Shubham
Patni, Shubham
Raut, Tushar
Mehrotra, Kapil
Gupta, Manish Kumar
2017 FOURTH INTERNATIONAL CONFERENCE ON IMAGE INFORMATION PROCESSING (ICIIP), 2017, : 244 - 249
[7] Automatic table detection in document images
Gatos, B
Danatsas, D
Pratikakis, I
Perantonis, SJ
PATTERN RECOGNITION AND DATA MINING, PT 1, PROCEEDINGS, 2005, 3686 : 609 - 618
[8] Text Extraction from Document Images using Edge Information
Grover, Sachin
Arora, Kushal
Mitra, Suman K.
2009 ANNUAL IEEE INDIA CONFERENCE (INDICON 2009), 2009, : 582 - +
[9] VERSATILE TECHNIQUE FOR AUTOMATIC EXTRACTION OF INFORMATION FROM RECONNAISSANCE IMAGES
TISDALE, GE
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 1971, AES7 (04) : 738 - &
[10] Quadratic spline wavelet approach to automatic extraction of baselines from document images
Tang, YY
Yang, LH
Liu, JM
PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 693 - 696
