Separation of Text from Non-Text Doodles of Poet Rabindranath Tagore's Manuscripts

Cited by: 0
Authors
Chaudhuri, B. B. [1]
Borah, Samarjeet [1]
Saraf, Ankita [1]
Goyal, Alisha [1]
Kumari, Alka [1]
Affiliations
[1] Indian Stat Inst, CVPR Unit, Kolkata 700108, India
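Keywords
Text; Non-text Doodles; Rabindranath Tagore; Connected Components; Pixels; Stroke Width; Extraction; Segmentation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract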
The growing popularity of internet facilities has made it convenient and fast to mine large collections of both historical and contemporary handwritten documents, which has led to continuous research and development in information retrieval algorithms. In such handwritten documents, graphics and images are combined with text and often overlap one another. This paper presents a technique for separating textual data from non-textual information. The technique is based on previously published work and is applied to manuscripts of the poet Rabindranath Tagore. The approach generates connected components as basic primitives and tries to classify each one as text or non-text by comparing the total number of pixels in the component with the number of its boundary pixels. A window is then generated, and further separation is done on the basis of the stroke width computed for each window. The paper also contains a brief review of some previously published work.
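The component-level step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-connected labelling, the 4-neighbour definition of a boundary pixel, the function names, and the 0.8 threshold are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected foreground components of a binary image.

    img is a list of rows of 0/1 values; each component is a set of (row, col).
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            comp = set()
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                comp.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(comp)
    return components


def boundary_ratio(comp):
    """Fraction of component pixels with at least one 4-neighbour outside it."""
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    boundary = sum(
        1 for (y, x) in comp
        if any((y + dy, x + dx) not in comp for dy, dx in neighbours)
    )
    return boundary / len(comp)


def classify(comp, threshold=0.8):
    # ASSUMPTION: thin pen strokes consist almost entirely of boundary pixels,
    # while filled doodles have many interior pixels. The threshold value is
    # hypothetical and would need tuning on real manuscript scans.
    return "text" if boundary_ratio(comp) >= threshold else "non-text"
```

Under these assumptions, a one-pixel-wide stroke has a boundary ratio of 1.0 and is labelled text, while a filled 5 x 5 block has 16 boundary pixels out of 25 (ratio 0.64) and is labelled non-text. The window-based stroke-width stage mentioned in the abstract is not covered by this sketch.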
Pages: 165-169
Page count: 5