Multi-skew detection of Indian script documents

Cited by: 15
Authors
Pal, U [1]
Mitra, M [1]
Chaudhuri, BB [1]
Affiliation
[1] Indian Stat Inst, Comp Vis & Pattern Recognit Unit, Kolkata 35, W Bengal, India
DOI
10.1109/ICDAR.2001.953801
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There are many documents where text lines are not parallel to each other, i.e. these lines have different inclinations with respect to the horizontal (multi-skew documents). For the OCR of such a document we have to estimate the skew angle of individual text lines, because a single rotation cannot de-skew all text lines of the document. In this paper, we describe a robust technique for multi-skew angle detection from Indian documents containing the most popular Indian scripts, Devnagari and Bangla. Most characters in these scripts have horizontal lines at the top, called headlines. The character headlines usually connect to one another in a word, so the word appears as a single component. In the proposed method, the connected components are at first labeled and selected. The upper envelopes of the selected components are found by column-wise scanning from the top of each component. Portions of the upper envelope satisfying the properties of a digital straight line are detected. They are then clustered into groups belonging to single text lines. Estimates from these individual clusters give the skew angle of each text line. The proposed multi-skew detection technique has an accuracy of about 98.3%.
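The envelope step described in the abstract can be illustrated with a minimal Python sketch. It covers only two parts: column-wise scanning from the top of one already-isolated component, and a skew estimate from the resulting envelope. The abstract does not give the digital-straight-line test or the clustering rule, so an ordinary least-squares line fit is swapped in here as a stand-in. The function names, the 0/1 list-of-lists image format, and the synthetic component are all assumptions made for illustration, not the authors' implementation.

```python
import math


def upper_envelope(component):
    """For each column, return the row of the first foreground pixel met
    when scanning from the top, or None if the column has no foreground.

    `component` is a hypothetical binary image of one connected component,
    given as a list of rows containing 0 (background) or 1 (foreground).
    """
    rows = len(component)
    cols = len(component[0]) if rows else 0
    envelope = []
    for x in range(cols):
        top = None
        for y in range(rows):
            if component[y][x]:
                top = y
                break
        envelope.append(top)
    return envelope


def skew_angle_degrees(envelope):
    """Estimate the skew from a least-squares line through the envelope.

    This fit is an assumption standing in for the digital-straight-line
    detection named in the abstract. Image rows grow downward, so the sign
    is flipped: a positive result means the line rises to the right.
    """
    pts = [(x, y) for x, y in enumerate(envelope) if y is not None]
    if len(pts) < 2:
        raise ValueError("need at least two envelope points")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -math.degrees(math.atan(sxy / sxx))


# Synthetic word-like component: a 3-pixel-thick headline that drops one
# row every four columns, i.e. a line falling to the right.
rows, cols = 10, 20
component = [[0] * cols for _ in range(rows)]
for x in range(cols):
    top = 2 + x // 4
    for y in range(top, top + 3):
        component[y][x] = 1

envelope = upper_envelope(component)
angle = skew_angle_degrees(envelope)
print(envelope)
print(round(angle, 1))  # negative: the headline falls to the right
```

In a full pipeline one such estimate would be produced per component and the estimates grouped by text line. That grouping is left out here because the abstract does not specify how the clusters are formed.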
Pages: 292-296
Number of pages: 3