Segmenting Characters from Malayalam Handwritten Documents

Authors
Hashrin, C. P. [1]
Jossy, Amal [1]
Sudhakaran, K. [1]
Thushara, A. [1]
John, Ansamma [1]
Affiliation
[1] TKM College of Engineering, Department of Computer Science and Engineering, Kollam, Kerala, India
Keywords
OCR; segmentation; recognition
DOI
10.1109/iciict1.2019.8741416
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Building an Optical Character Recognition (OCR) model for handwritten documents poses many challenges, the most prominent being dataset collection, character segmentation, and classification. This paper focuses on segmentation and presents a novel approach to segmenting individual characters from handwritten Malayalam documents. The approach has three stages, in which morphological operations, contour analysis, and bounding box detection are used to extract individual lines from the document, then words from each line, and then characters from each word. An additional masking step handles bounding boxes that overlap because of skewed lines and diacritics. The segmented characters can be used either to create datasets or as input to OCR models.
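The abstract gives no implementation details, so the following is only a minimal sketch of the line, word, character cascade it describes, not the authors' method. It assumes a binarized page stored as a list of 0/1 rows, uses rectangular dilation as the morphological operation, and substitutes plain connected-component labeling for contour analysis. The names `dilate`, `bounding_boxes`, `regions`, and `segment`, and the `line_reach` and `word_reach` parameters, are hypothetical, and the masking step for overlapping boxes is not included.

```python
from collections import deque

def dilate(img, kh, kw):
    """Binary dilation with a (2*kh+1) x (2*kw+1) rectangular structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - kh), min(h, y + kh + 1)):
                    for xx in range(max(0, x - kw), min(w, x + kw + 1)):
                        out[yy][xx] = 1
    return out

def bounding_boxes(img):
    """Inclusive (top, left, bottom, right) boxes of 8-connected components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue = deque([(y, x)])
                t, l, b, r = y, x, y, x
                while queue:  # breadth-first flood fill over one component
                    cy, cx = queue.popleft()
                    t, l, b, r = min(t, cy), min(l, cx), max(b, cy), max(r, cx)
                    for ny in range(max(0, cy - 1), min(h, cy + 2)):
                        for nx in range(max(0, cx - 1), min(w, cx + 2)):
                            if img[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((t, l, b, r))
    return boxes

def regions(page, box, kh, kw, axis):
    """Dilate the crop of `page` inside `box`, then return blob boxes in page
    coordinates, sorted by top (axis=0) or left (axis=1)."""
    t, l, b, r = box
    crop = [row[l:r + 1] for row in page[t:b + 1]]
    found = bounding_boxes(dilate(crop, kh, kw))
    found = [(t + bt, l + bl, t + bb, l + br) for bt, bl, bb, br in found]
    return sorted(found, key=lambda bx: bx[axis])

def segment(page, line_reach, word_reach):
    """Return character boxes nested as lines -> words -> characters.
    line_reach / word_reach are assumed horizontal dilation half-widths."""
    full = (0, 0, len(page) - 1, len(page[0]) - 1)
    result = []
    for line in regions(page, full, 0, line_reach, 0):       # stage 1: lines
        words = []
        for word in regions(page, line, 0, word_reach, 1):   # stage 2: words
            words.append(regions(page, word, 0, 0, 1))       # stage 3: characters
        result.append(words)
    return result

# Toy page: each "character" is a 1-pixel-wide, 2-pixel-tall stroke.
page = [[0] * 16 for _ in range(7)]
for top, col in [(0, 0), (0, 2), (0, 8), (5, 1)]:
    page[top][col] = page[top + 1][col] = 1
chars = segment(page, line_reach=16, word_reach=1)
```

On the toy page, a wide horizontal dilation merges each text row into one blob, a narrow dilation merges only closely spaced strokes into words, and the final pass without dilation isolates single strokes. Real handwriting would need reach values tuned to the script and scan resolution, and touching or overlapping characters would still require something like the masking step the abstract mentions.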
Pages: 6