OCR-independent and Segmentation-free Word-Spotting in Handwritten Arabic Archive Documents

Cited by: 0
Authors
Aouadi, N. [1]
Kacem, A. [1]
Affiliation
[1] LaTICE, Res Lab Technol Informat & Commun & Elect Engn, Tunis, Tunisia
Keywords
OCR; Word-spotting; Generalized Hough Transform; Clustering; Handwritten Recognition; Historical document; RETRIEVAL;
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
In this paper, a word-spotting approach is presented that can help in reading handwritten Arabic archive documents. Because of the low quality of these documents, the proposed approach is segmentation-free and independent of OCR, and uses a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT) technique. It detects words, described by their models, in document images by finding the model's position in the image. With the GHT, the problem of finding the model's position is transformed into a problem of finding the transformation parameters that map the model into the image. Parameters such as the Hough threshold and the distance between voting points are considered for better location and recognition of words. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. In our first experiments, an average of 94% of words were correctly spotted.
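The voting idea behind the GHT can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a translation-only search, represents the word model as a bare list of edge points, and collapses the usual gradient-indexed R-table into a single bin. The names `build_offsets`, `spot`, `threshold` and `step` are hypothetical; `threshold` stands in for the Hough threshold, and `step` is only one possible reading of "distance between voting points" (every `step`-th image point casts votes).

```python
from collections import Counter


def build_offsets(model_points, reference):
    """Offsets from each model point to the reference point.

    Assumption: a one-bin R-table (no gradient-orientation indexing).
    """
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in model_points]


def spot(image_points, offsets, threshold, step=1):
    """Let image points vote for candidate reference positions.

    Returns (position, votes) pairs with votes >= threshold,
    strongest first. `step` subsamples the voting points.
    """
    votes = Counter()
    for x, y in image_points[::step]:
        for dx, dy in offsets:
            votes[(x + dx, y + dy)] += 1
    return sorted(
        ((pos, n) for pos, n in votes.items() if n >= threshold),
        key=lambda item: -item[1],
    )


# Toy data (hypothetical): an L-shaped "word" model, shifted by (5, 3)
# in the image, plus two unrelated noise points.
model = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
offsets = build_offsets(model, reference=(0, 0))
image = [(x + 5, y + 3) for x, y in model] + [(9, 9), (0, 7)]

hits = spot(image, offsets, threshold=len(model))
print(hits)  # [((5, 3), 5)]
```

The accumulator peak at (5, 3) is the translation that maps the model into the image, which is the sense in which locating a word becomes a parameter search. Lowering `threshold` tolerates missing strokes at the cost of more false detections.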
Pages: 36-41
Number of pages: 6
Related papers
50 in total
  • [41] An OCR Free Method for Word Spotting in Printed Documents: the Evaluation of Different Feature Sets
    Rios, Israel
    Britto, Alceu de Souza, Jr.
    Koerich, Alessandro Lameiras
    Soares Oliveira, Luis Eduardo
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2011, 17 (01) : 48 - 63
  • [42] Segmentation-free Query-by-String Word Spotting with Bag-of-Features HMMs
    Rothacker, Leonard
    Fink, Gernot A.
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 661 - 665
  • [43] Segmentation Free Word Spotting for Handwritten Documents Using Bag of Visual Words Based on Co-HOG Descriptor
    Thontadari, C.
    Prabhakar, C. J.
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2019, 9 (02) : 49 - 65
  • [44] Neural Ctrl-F: Segmentation-free Query-by-String Word Spotting in Handwritten Manuscript Collections
    Wilkinson, Tomas
    Lindstrom, Jonas
    Brun, Anders
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4443 - 4452
  • [45] Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
    Wilkinson, Tomas
    Nettelblad, Carl
    2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020), 2020, : 49 - 54
  • [46] Hierarchical representation learning using spherical k-means for segmentation-free word spotting
    Mhiri, Mohamed
    Abuelwafa, Sherif
    Desrosiers, Christian
    Cheriet, Mohamed
    PATTERN RECOGNITION LETTERS, 2018, 101 : 52 - 59
  • [47] Handwritten word recognition using segmentation-free hidden Markov modeling and segmentation-based dynamic programming techniques
    Mohamed, M
    Gader, P
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1996, 18 (05) : 548 - 554
  • [48] Structural information implant in a context based segmentation-free HMM handwritten word recognition system for Latin and Bangla script
    Vajda, S
    Belaïd, A
    EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 1126 - 1130
  • [49] Online Handwritten Cursive Word Recognition Using Segmentation-free MRF in Combination with P2DBMN-MQDF
    Zhu, Bilan
    Shivram, Arti
    Setlur, Srirangaraj
    Govindaraju, Venu
    Nakagawa, Masaki
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 349 - 353
  • [50] Seam carving, horizontal projection profile and contour tracing for line and word segmentation of language independent handwritten documents
    Das, Mamatarani
    Panda, Mrutyunjaya
    RESULTS IN ENGINEERING, 2023, 18