Text and non-text separation in offline document images: a survey

Cited by: 0
Authors
Showmik Bhowmik
Ram Sarkar
Mita Nasipuri
David Doermann
Affiliations
[1] Jadavpur University
[2] University of Maryland, Institute for Advanced Computer Studies
Keywords
Text/non-text separation; Segmentation; Offline document images; Engineering drawing; Map; Unconstrained handwritten document; Newspaper; Journal; Magazine; Check; Form; Survey;
DOI
Not available
Abstract
Separation of text and non-text is an essential processing step for any document analysis system. A clear understanding of the state of the art in text/non-text separation is therefore important for the development of efficient document processing systems. This paper first summarizes the technical challenges of performing text/non-text separation. It then categorizes offline document images into classes according to the nature of the challenges each poses, in an attempt to provide insight into the various techniques presented in the literature. The pros and cons of these techniques are explained wherever possible. Along with evaluation protocols and benchmark databases, the paper also presents a performance comparison of different methods. Finally, it highlights future research challenges and directions in this domain.
Pages: 1-20
Number of pages: 20
Related papers
50 in total
  • [1] Text and non-text separation in offline document images: a survey
    Bhowmik, Showmik
    Sarkar, Ram
    Nasipuri, Mita
    Doermann, David
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2018, 21 (1-2) : 1 - 20
  • [2] Text/non-text classification of connected components in document images
    Julca-Aguilar, Frank D.
    Maia, Ana L. L. M.
    Hirata, Nina S. T.
    2017 30TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2017 : 450 - 455
  • [3] A Novel Method for Text and Non-Text Segmentation in Document Images
    Deivalakshmi, S.
    Palanisamy, P.
    Vishwanathan, Gayatri
    2013 INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND SIGNAL PROCESSING (ICCSP), 2013 : 255 - 259
  • [4] Text and Non-text Separation in Handwritten Document Images Using Local Binary Pattern Operator
    Bhowmik, Showmik
    Sarkar, Ram
    Nasipuri, Mita
    PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND COMMUNICATION, 2017, 458 : 507 - 515
  • [5] Separation of Text and Non-text in Document Layout Analysis using a Recursive Filter
    Tuan-Anh Tran
    Na, In-Seop
    Kim, Soo-Hyung
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2015, 9 (10) : 4072 - 4091
  • [6] Text/Non-Text Separation from Handwritten Document Images Using LBP Based Features: An Empirical Study
    Ghosh, Sourav
    Lahiri, Dibyadwati
    Bhowmik, Showmik
    Kavallieratou, Ergina
    Sarkar, Ram
    JOURNAL OF IMAGING, 2018, 4 (04)
  • [7] Automatic Extraction of Text and Non-text Information Directly from Compressed Document Images
    Javed, Mohammed
    Nagabhushan, P.
    Chaudhuri, Bidyut B.
    PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS (HIS 2016), 2017, 552 : 38 - 46
  • [8] A Chinese Document Layout Analysis Based on Non-text Images
    Fu Xiaoling
    Li Xiaofeng
    2009 INTERNATIONAL FORUM ON COMPUTER SCIENCE-TECHNOLOGY AND APPLICATIONS, VOL 1, PROCEEDINGS, 2009 : 326 - 328
  • [9] Automatic Discrimination of Text and Non-Text Natural Images
    Zhang, Chengquan
    Yao, Cong
    Shi, Baoguang
    Bai, Xiang
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015 : 886 - 890
  • [10] Offline Text and Non-text Segmentation for Hand-Drawn Diagrams
    Pravalpruk, Buntita
    Dailey, Matthew M.
    PRICAI 2016: TRENDS IN ARTIFICIAL INTELLIGENCE, 2016, 9810 : 380 - 392