Segmenting Characters from Malayalam Handwritten Documents

Cited by: 0
Authors
Hashrin, C. P. [1]
Jossy, Amal [1]
Sudhakaran, K. [1]
Thushara, A. [1]
John, Ansamma [1]
Affiliations
[1] TKM Coll Engn, Dept Comp Sci & Engn, Kollam, Kerala, India
Keywords
OCR; segmentation; recognition
DOI
10.1109/iciict1.2019.8741416
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Construction of an Optical Character Recognition (OCR) model for handwritten documents poses many challenges, the most prominent of them being dataset collection, character segmentation and classification. This paper focuses on the segmentation part, and presents a novel approach to segment individual characters from Malayalam handwritten documents. It is a three-stage approach where morphological operations, contour analysis, and bounding box detection are used to extract individual lines from the document, words from each line, and then characters from each word. An additional masking method is performed to tackle the overlapping of bounding boxes due to skewed lines and the presence of diacritics. The segmented characters can either be used to create datasets or fed to OCR models.
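The three-stage idea in the abstract (lines, then words, then characters) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. It assumes a binarized page given as a grid of 0/1 values, uses a flood-fill over 8-connected pixels as a stand-in for contour analysis, and picks the kernel widths `line_kw` and `word_kw` purely for illustration. All function names are hypothetical. The masking step for overlapping bounding boxes is not reproduced here.

```python
from collections import deque


def to_binary(rows):
    """Convert strings of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def dilate(img, kh, kw):
    """Binary dilation with a kh x kw rectangular kernel centred on each ink pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - kh // 2), min(h, y + kh // 2 + 1)):
                    for xx in range(max(0, x - kw // 2), min(w, x + kw // 2 + 1)):
                        out[yy][xx] = 1
    return out


def bounding_boxes(img):
    """Inclusive (top, left, bottom, right) boxes of 8-connected ink blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(y, x)])
                t, l, b, r = y, x, y, x
                while queue:
                    cy, cx = queue.popleft()
                    t, l, b, r = min(t, cy), min(l, cx), max(b, cy), max(r, cx)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((t, l, b, r))
    return boxes


def crop(img, box):
    """Cut the inclusive box out of the grid."""
    t, l, b, r = box
    return [row[l:r + 1] for row in img[t:b + 1]]


def segment(page, line_kw=15, word_kw=3):
    """Return lines -> words -> character boxes, all in page coordinates.

    A wide horizontal dilation fuses each text line into one blob, a narrower
    one fuses each word, and the undilated word image yields the characters.
    """
    result = []
    for lt, ll, lb, lr in sorted(bounding_boxes(dilate(page, 1, line_kw))):
        line_img = crop(page, (lt, ll, lb, lr))
        word_boxes = sorted(bounding_boxes(dilate(line_img, 1, word_kw)),
                            key=lambda box: box[1])
        words = []
        for wt, wl, wb, wr in word_boxes:
            word_img = crop(line_img, (wt, wl, wb, wr))
            chars = sorted(bounding_boxes(word_img), key=lambda box: box[1])
            words.append([(lt + wt + t, ll + wl + l, lt + wt + b, ll + wl + r)
                          for t, l, b, r in chars])
        result.append(words)
    return result


if __name__ == "__main__":
    demo = to_binary([
        "##.#....##",
        "##.#....##",
        "..........",
        "#..#......",
        "#..#......",
    ])
    for line_no, words in enumerate(segment(demo), start=1):
        print(line_no, words)
```

In this sketch a diacritic detached from its base glyph would come out as a separate character box, and skewed lines would produce overlapping line boxes. Those are the cases the abstract says the additional masking method is meant to handle.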
Pages: 6