ON SEGMENTATION OF TOUCHING CHARACTERS AND OVERLAPPING LINES IN DEGRADED PRINTED GURMUKHI SCRIPT

Cited by: 9
Authors
Jindal, Manish Kumar [1]
Lehal, Gurpreet Singh [2]
Sharma, Rajendra Kumar [3]
Affiliations
[1] Panjab Univ, Reg Ctr, Dept Comp Sci & Applicat, Muktsar 152026, Punjab, India
[2] Punjabi Univ, Dept Comp Sci, Patiala 147002, Punjab, India
[3] Thapar Univ, Sch Math & Comp Applicat, Patiala 147002, Punjab, India
Keywords
Gurmukhi script; touching characters; horizontally overlapping lines; top zone; character segmentation
DOI
10.1142/S0219467809003460
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Character segmentation plays a very important role in a text recognition system. The simple technique of using the inter-character gap for segmentation works for cleanly printed documents, but it fails to give satisfactory results if the input text contains touching characters. In this paper, we propose two algorithms to segment touching characters and one algorithm to segment overlapping lines in degraded printed Gurmukhi documents. Various categories of touching characters in different zones are identified, and a solution is proposed for each. The solution methodology makes extensive use of the structural properties of Gurmukhi script. The algorithm proposed for segmenting horizontally overlapping lines uses a heuristic based on the height of a character. The problem of multiple horizontally overlapping lines may occur in a number of situations, such as printed newspapers, old magazines and books. The similarity among Indian scripts allows these algorithms to be used for solving segmentation problems in other Indian languages as well.
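To illustrate the baseline the abstract refers to, the following is a minimal sketch of inter-character gap segmentation using a vertical projection profile. It is not the paper's algorithm: the function name `segment_by_gaps`, the binary list-of-rows image representation (1 = ink, 0 = background) and the toy inputs are all assumptions made for illustration. It shows why the gap approach splits cleanly separated characters but returns a single segment when two characters touch.

```python
from typing import List, Tuple


def segment_by_gaps(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of runs that contain ink.

    `image` is a hypothetical binary image: a list of equal-length rows,
    where 1 marks an ink pixel and 0 marks background.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection profile: count of ink pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of ink columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column ends the run
            start = None
    if start is not None:
        segments.append((start, width))    # run reaches the right edge
    return segments


# Two toy "characters" separated by one empty column.
separated = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
]
# The same shapes joined by a single row of ink: no empty column remains.
touching = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
]

print(segment_by_gaps(separated))  # [(0, 2), (3, 5)]
print(segment_by_gaps(touching))   # [(0, 5)]
```

In the touching case every column contains at least one ink pixel, so the profile never drops to zero and the two shapes come back as one segment. That is the failure mode that motivates dedicated touching-character algorithms.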
Pages: 321-353
Number of pages: 33
Related papers
50 in total
  • [11] Dewarping Machine Printed Documents of Gurmukhi Script
    Sharma, Dharam Veer
    Wadhwa, Shilpi
    INFORMATION SYSTEMS FOR INDIAN LANGUAGES, 2011, 139 : 117 - 123
  • [12] SEGMENTATION OF TOUCHING CHARACTER PRINTED LANNA SCRIPT USING JUNCTION POINT
    Kosarat, Rujipan
    Hiransakolwong, Nualsawat
    JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2018, 13 (10) : 3331 - 3343
  • [13] Segmentation of touching characters in printed Korean/English document recognition
    Kim, JH
    Kim, KK
    Chien, SI
    Choi, HM
    INFORMATION INTELLIGENCE AND SYSTEMS, VOLS 1-4, 1996, : 438 - 443
  • [14] Challenges in Segmentation of Text in Handwritten Gurmukhi Script
    Rajiv, K. Sharma
    Amardeep, S. Dhiman
    INFORMATION PROCESSING AND MANAGEMENT, 2010, 70 : 388 - +
  • [15] A Clustering Strategy for Touching Characters in Korean and English Printed Text Segmentation
    Wahyono
    Jo, Kang-Hyun
    2012 9TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS AND AMBIENT INTELLIGENCE (URAI), 2012, : 23 - 25
  • [16] A study of different kinds of degradation in printed Gurmukhi script
    Jindal, M. K.
    Sharma, R. K.
    Lehal, G. S.
    ICCTA 2007: INTERNATIONAL CONFERENCE ON COMPUTING: THEORY AND APPLICATIONS, PROCEEDINGS, 2007, : 538 - +
  • [17] Separation of Machine Printed Roman and Gurmukhi Script Words
    Sharma, Dharamveer
    INFORMATION PROCESSING AND MANAGEMENT, 2010, 70 : 485 - 490
  • [18] Touching character segmentation of Devanagari script
    Babu, Subith
    Jangid, Mahesh
    7TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT 2016), 2016,
  • [19] Segmentation of Printed Bangla Characters Using Structural Properties of Bangla Script
    Chowdhury, Mohammad Isbal Sakib
    Dey, Barnali
    Rahman, Md. Saifur
    PROCEEDINGS OF ICECE 2008, VOLS 1 AND 2, 2008, : 639 - 643
  • [20] Efficient Segmentation of Printed Tamil Script into Characters Using Projection and Structure
    Thangairulappan, Kathirvalavakumar
    Mohan, Karthigaiselvi
    2017 FOURTH INTERNATIONAL CONFERENCE ON IMAGE INFORMATION PROCESSING (ICIIP), 2017, : 484 - 489