OCR-independent and Segmentation-free Word-Spotting in Handwritten Arabic Archive Documents

Cited by: 0
Authors
Aouadi, N. [1 ]
Kacem, A. [1 ]
Affiliation
[1] LaTICE, Res Lab Technol Informat & Commun & Elect Engn, Tunis, Tunisia
Keywords
OCR; Word-spotting; Generalized Hough Transform; Clustering; Handwritten Recognition; Historical document; RETRIEVAL;
DOI
Not available
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper presents a word-spotting approach to help in reading handwritten Arabic archive documents. Because these documents are of low quality, the proposed approach is segmentation-free and independent of OCR, and relies on a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT). Words, each described by a model, are detected in document images by finding the model's position in the image. With the GHT, the problem of finding the model's position becomes the problem of finding the transformation parameters that map the model into the image. Parameters such as the Hough threshold and the distance between voting points are tuned to improve the localization and recognition of words. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. Our first experiments reach an average of 94% of well-spotted words.
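The voting idea behind the GHT can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, translation-only variant that omits the gradient-orientation index of the classical R-table, and the names `build_r_table`, `ght_locate`, and the toy point sets are hypothetical. The `threshold` argument stands in for the Hough threshold mentioned in the abstract, expressed here (as an assumption) as a fraction of the model's points; the distance between voting points is not modeled.

```python
from collections import Counter


def build_r_table(model_points, reference):
    """Store the displacement from every model point to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in model_points]


def ght_locate(image_points, r_table, threshold):
    """Let each image point vote for candidate reference positions.

    A position is kept when its vote count reaches `threshold` times the
    number of model points (an assumed form of the Hough threshold).
    """
    votes = Counter()
    for x, y in image_points:
        for dx, dy in r_table:
            votes[(x + dx, y + dy)] += 1
    min_votes = threshold * len(r_table)
    return sorted((pos, n) for pos, n in votes.items() if n >= min_votes)


# Toy model: an L-shaped stroke with its reference point at (1, 1).
model = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
r_table = build_r_table(model, reference=(1, 1))

# Toy image: the model shifted by (10, 5), plus two clutter points.
image = [(x + 10, y + 5) for x, y in model] + [(3, 3), (20, 1)]

hits = ght_locate(image, r_table, threshold=0.8)
print(hits)  # [((11, 6), 5)]
```

The single surviving peak is the model's reference point translated by (10, 5), which is the sense in which locating the model reduces to finding the transformation parameters. Lowering `threshold` admits more candidate positions at the cost of more false detections.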
Pages: 36 - 41
Number of pages: 6
Related papers
50 items in total
  • [11] A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching
    Hast, Anders
    Fornes, Alicia
    [J]. PROCEEDINGS OF 12TH IAPR WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS, (DAS 2016), 2016, : 150 - 155
  • [12] Segmentation-free Keyword Spotting for Handwritten Documents based on Heat Kernel Signature
    Zhang, Xi
    Tan, Chew Lim
    [J]. 2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 827 - 831
  • [13] Segmentation-free word spotting with exemplar SVMs
    Almazan, Jon
    Gordo, Albert
    Fornes, Alicia
    Valveny, Ernest
    [J]. PATTERN RECOGNITION, 2014, 47 (12) : 3967 - 3978
  • [14] Local Feature Based Word Spotting in Handwritten Archive Documents
    Czuni, Laszlo
    Kiss, Peter Jozsef
    Gal, Monika
    Lipovits, Agnes
    [J]. 2013 11TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI 2013), 2013, : 178 - 183
  • [15] Word spotting for Handwritten Arabic documents using Harris detector
    Elfakiri, Youssef
    Chenouni, Driss
    Khaissidi, Ghizlane
    El Yacoubi, Mounim
    Mrabti, Mostafa
    [J]. 2016 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY FOR ORGANIZATIONS DEVELOPMENT (IT4OD), 2016,
  • [16] Segmentation-free word spotting in historical Bangla handwritten document using Wave Kernel Signature
    Das, Sugata
    Mandal, Sekhar
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2020, 23 (02) : 593 - 610
  • [17] Segmentation-free word spotting in historical Bangla handwritten document using Wave Kernel Signature
    Sugata Das
    Sekhar Mandal
    [J]. Pattern Analysis and Applications, 2020, 23 : 593 - 610
  • [18] Learning-based word spotting system for Arabic handwritten documents
    Khayyat, Muna
    Lam, Louisa
    Suen, Ching Y.
    [J]. PATTERN RECOGNITION, 2014, 47 (03) : 1021 - 1030
  • [19] A Novel Word-Spotting Method for Handwritten Documents Using an Optimization-Based Classifier
    Tavoli, Reza
    Keyvanpour, Mohammadreza
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2017, 31 (04) : 346 - 375
  • [20] Statistical script independent word spotting in offline handwritten documents
    Wshah, Safwan
    Kumar, Gaurav
    Govindaraju, Venu
    [J]. PATTERN RECOGNITION, 2014, 47 (03) : 1039 - 1050