Separation of Text from Non-Text Doodles of Poet Rabindranath Tagore's Manuscripts

Cited by: 0
Authors
Chaudhuri, B. B. [1]
Borah, Samarjeet [1]
Saraf, Ankita [1]
Goyal, Alisha [1]
Kumari, Alka [1]
Affiliation
[1] Indian Stat Inst, CVPR Unit, Kolkata 700108, India
Keywords
Text; Non-text Doodles; Rabindranath Tagore; Connected Components; Pixels; Stroke Width; Extraction; Segmentation
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growing availability of internet facilities has made it faster and more convenient to mine large collections of historical and contemporary handwritten documents, which has driven continued research and development in information retrieval algorithms. In such handwritten documents, graphics and images are combined with text and often overlap one another. This paper presents a technique for separating textual data from non-textual information. The technique builds on previously published work and is applied to the manuscripts of the poet Rabindranath Tagore. The approach generates connected components as basic primitives and classifies each one as text or non-text by comparing the total number of pixels in the component with the number of its boundary pixels. A window is then generated, and further separation is done on the basis of the stroke width computed for each window. The paper also includes a brief review of previously published work.
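The component-level test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the direction of the comparison, the threshold, the connectivity, or how stroke width is measured, so the function names, the 8-connected labelling, the 4-neighbour boundary test, the `threshold=0.5` default, and the run-length stroke-width estimate are all assumptions made for illustration. The sketch assumes thin pen strokes consist mostly of boundary pixels, while filled doodle regions do not.

```python
from collections import deque

def connected_components(img):
    """Return 8-connected foreground components of a binary image (list of rows)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                comps.append(pixels)
    return comps

def boundary_ratio(img, pixels):
    """Fraction of a component's pixels that touch background (4-neighbourhood)."""
    h, w = len(img), len(img[0])

    def is_boundary(y, x):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            # Pixels on the image edge count as boundary pixels.
            if not (0 <= ny < h and 0 <= nx < w) or not img[ny][nx]:
                return True
        return False

    return sum(1 for y, x in pixels if is_boundary(y, x)) / len(pixels)

def classify(img, threshold=0.5):
    """Assumed rule: mostly-boundary components are 'text', the rest 'non-text'."""
    return [("text" if boundary_ratio(img, p) >= threshold else "non-text", p)
            for p in connected_components(img)]

def stroke_width(window):
    """Assumed estimate: median length of horizontal foreground runs in a window."""
    runs = []
    for row in window:
        run = 0
        for value in row:
            if value:
                run += 1
            elif run:
                runs.append(run)
                run = 0
        if run:
            runs.append(run)
    if not runs:
        return 0
    runs.sort()
    return runs[len(runs) // 2]
```

On a toy image containing one filled 8 x 8 block and one single-pixel-wide vertical stroke, `classify` would label the block as non-text and the stroke as text under these assumptions. Real manuscripts would need a threshold tuned to the scan resolution and pen thickness.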
Pages: 165-169
Number of pages: 5