Window-Based Feature Extraction Framework for Machine-Printed/Handwritten and Arabic/Latin Text Discrimination

Cited by: 0
Authors
Mezghani, Anis [1]
Slimane, Fouad [2]
Kanoun, Slim [3]
Kherallah, Monji [1]
Affiliations
[1] Univ Sfax, Res Grp Intelligent Machines Lab, Sfax, Tunisia
[2] Tech Univ Carolo Wilhelmina Braunschweig, Inst Commun Technol IFN, Braunschweig, Germany
[3] Univ Sfax, ISIMS, MIRACL Lab, Sfax, Tunisia
Keywords
Heterogeneous documents; writing type identification; script identification; GMM; sliding window; WRITER IDENTIFICATION; CLASSIFICATION; IMAGES; SYSTEM; SCRIPT
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a new technique for classifying the writing type and script of texts extracted from heterogeneous document images. The documents contain English, French and Arabic text in a mix of handwritten and machine-printed forms. To identify each text-line or word image, we compute 23 features on a fixed-length sliding window and use Gaussian Mixture Models (GMMs) for classification. The framework has been tested on machine-printed and handwritten text-blocks, text-lines and words extracted from document images of the Maurdor database. Experimental results show that the proposed system is effective for writing type and script identification.
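The pipeline summarized in the abstract (features computed on a fixed-length sliding window, then one GMM per class) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three features (ink density, vertical centre of gravity, transition count) stand in for the 23 features, which the abstract does not list; the window width and step, the small diagonal-covariance GMM class `DiagGMM`, and the synthetic `make_line` images are all hypothetical.

```python
import numpy as np

def window_features(img, width=4, step=2):
    """Slide a fixed-width window left to right over a binary image (1 = ink)
    and return one feature vector per window position.
    The three features are illustrative stand-ins, not the paper's 23."""
    img = np.asarray(img, dtype=int)
    h, w = img.shape
    rows = np.arange(h)
    feats = []
    for x in range(0, w - width + 1, step):
        win = img[:, x:x + width]
        ink = win.sum()
        density = ink / win.size
        # vertical centre of gravity of the ink, normalised to [0, 1]
        cog = (win.sum(axis=1) @ rows) / (ink * max(h - 1, 1)) if ink else 0.5
        # mean number of ink/background transitions per column
        trans = np.abs(np.diff(win, axis=0)).sum() / width
        feats.append([density, cog, trans])
    return np.array(feats, dtype=float)

def _logsumexp(a):
    m = a.max(axis=1, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))

class DiagGMM:
    """Small diagonal-covariance Gaussian mixture fitted with EM (hypothetical helper)."""
    def __init__(self, k=2, iters=50, seed=0):
        self.k, self.iters = k, iters
        self.rng = np.random.default_rng(seed)

    def _log_joint(self, X):
        # log weight + log density of each frame under each component
        diff = X[:, None, :] - self.mu[None]
        log_pdf = -0.5 * (np.log(2 * np.pi * self.var)[None]
                          + diff ** 2 / self.var[None]).sum(axis=2)
        return np.log(self.w)[None] + log_pdf

    def fit(self, X):
        n, _ = X.shape
        self.mu = X[self.rng.choice(n, self.k, replace=False)]
        self.var = np.tile(X.var(axis=0) + 1e-3, (self.k, 1))
        self.w = np.full(self.k, 1.0 / self.k)
        for _ in range(self.iters):
            lj = self._log_joint(X)
            resp = np.exp(lj - _logsumexp(lj))           # E-step: responsibilities
            nk = resp.sum(axis=0) + 1e-10                # M-step: update parameters
            self.w = nk / n
            self.mu = (resp.T @ X) / nk[:, None]
            self.var = np.maximum((resp.T @ X ** 2) / nk[:, None] - self.mu ** 2, 0) + 1e-3
        return self

    def score(self, X):
        """Mean log-likelihood per frame."""
        return float(_logsumexp(self._log_joint(X)).mean())

def make_line(kind, rng, h=16, w=64):
    """Synthetic toy 'text line': regular blocks (printed) or a wandering stroke."""
    img = np.zeros((h, w), dtype=int)
    if kind == "printed":
        img[4:12, np.arange(w) % 4 < 2] = 1
    else:
        y = h // 2
        for x in range(w):
            y = int(np.clip(y + rng.integers(-1, 2), 1, h - 3))
            img[y:y + 2, x] = 1
    noise = rng.random((h, w)) < 0.01                    # flip about 1% of pixels
    return np.where(noise, 1 - img, img)

def classify(frames, models):
    """Pick the class whose GMM gives the highest mean frame log-likelihood."""
    return max(models, key=lambda name: models[name].score(frames))

# Train one GMM per class on frames pooled from 20 synthetic lines each.
rng = np.random.default_rng(0)
models = {}
for kind in ("printed", "handwritten"):
    frames = np.vstack([window_features(make_line(kind, rng)) for _ in range(20)])
    models[kind] = DiagGMM(k=2).fit(frames)

print(classify(window_features(make_line("handwritten", rng)), models))
```

Scoring a whole line by the average log-likelihood of its frames means the decision does not depend on line length, which is one plausible reason to pair sliding-window features with per-class generative models; the paper's actual decision rule may differ.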
Pages: 329-335
Number of pages: 7