A simple text/graphic separation method for document image segmentation

Cited by: 0

Authors:

Zirari, F. [1]

Ennaji, A. [1]

Nicolas, S. [1]

Mammass, D. [2]

Affiliations:

[1] Univ Rouen, LITIS Lab, Rouen, France

[2] Ibn Zohr Univ, IRF SIC Lab, Agadir, Morocco

Source:

2013 ACS INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2013

Keywords:

text/non-text separating; connected components; graph; structural analysis; document image;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR). When segmentation is poor, an OCR classification engine produces garbage characters because non-text elements are present. This paper presents a method for separating the textual and non-textual components of document images using graph-based modeling and structural analysis. The method is fast and efficient, and it adequately separates the graphical and textual parts of a document. We evaluated it on two well-known datasets: the UW-III dataset and the ICDAR 2009 page segmentation competition dataset. Comparisons with two state-of-the-art methods show that our method performs better on this task.

Pages: 4
