OCR-independent and Segmentation-free Word-Spotting in Handwritten Arabic Archive Documents

Authors
Aouadi, N. [1]
Kacem, A. [1]
Affiliation
[1] LaTICE, Res Lab Technol Informat & Commun & Elect Engn, Tunis, Tunisia
Keywords
OCR; Word-spotting; Generalized Hough Transform; Clustering; Handwritten Recognition; Historical document; RETRIEVAL;
DOI
Not available
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
In this paper, a word-spotting approach is presented that can help in reading handwritten Arabic archive documents. Because of the low quality of these documents, the proposed approach is segmentation-free and independent of OCR, and relies on a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT). It detects words, described by their models, in document images by finding the model's position in the image. With the GHT, the problem of finding the model's position is transformed into the problem of finding the parameters of the transformation that maps the model into the image. Parameters such as the Hough threshold and the distance between voting points are taken into account for better localization and recognition of words. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. Our first experiments reach an average of 94% well-spotted words.
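The voting idea behind the GHT can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a translation-only variant with no gradient-orientation indexing, rotation, or scale, and it treats the Hough threshold as a minimum fraction of agreeing model points. The names `build_r_table`, `ght_locate`, and the toy point sets are hypothetical.

```python
from collections import Counter

def build_r_table(model_points, reference):
    """Store the displacement from each model point to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in model_points]

def ght_locate(image_points, r_table, threshold):
    """Vote for candidate reference positions and keep the strong peaks.

    threshold (assumed meaning): minimum fraction of model points that
    must vote for a position before it counts as a detection.
    """
    votes = Counter()
    for x, y in set(image_points):
        # Each image point votes for every position the reference
        # could occupy if this point belonged to the model.
        for dx, dy in r_table:
            votes[(x + dx, y + dy)] += 1
    min_votes = threshold * len(r_table)
    hits = [(pos, n) for pos, n in votes.items() if n >= min_votes]
    return sorted(hits, key=lambda h: -h[1])

# Toy data: an L-shaped "word" model placed at offset (10, 5), plus two noise points.
model = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
r_table = build_r_table(model, reference=(0, 0))
image = [(x + 10, y + 5) for x, y in model] + [(3, 3), (20, 1)]
best = ght_locate(image, r_table, threshold=0.8)
print(best)
```

In this toy case the accumulator peak is the position where the model's reference point lands in the image, which is the "finding the transformation's parameter" step reduced to a pure translation.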
Pages: 36-41
Page count: 6