Word matching using single closed contours for indexing handwritten historical documents

Authors
Adamek, Tomasz [1]
O'Connor, Noel E. [1]
Smeaton, Alan F. [1]
Affiliations
[1] Dublin City Univ, Ctr Digital Video Proc, Dublin 9, Ireland
Keywords
historical manuscripts; holistic word recognition; contour matching; annotation indexing
DOI
10.1007/s10032-006-0024-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, holistic word recognition has recently gained popularity as an attractive and more straightforward solution (Lavrenko et al. in Proc. Document Image Analysis for Libraries (DIAL'04), pp. 278-287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour matching technique proposed originally for general shapes (Adamek and O'Connor in IEEE Trans Circuits Syst Video Technol 5: 2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features without any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.
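The idea of elastically matching two closed contours can be illustrated with a minimal sketch. It is not the method of the paper: the multiscale descriptor described in the abstract is replaced here by a simple centroid-distance signature, and the elastic matching by plain dynamic time warping (DTW) tried over every cyclic starting point. All function names are hypothetical and chosen for illustration only.

```python
import math

def centroid_distance_signature(contour):
    """Distance of each contour point from the centroid, divided by the mean distance.

    Assumes a non-degenerate contour (not all points identical).
    """
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]

def dtw_cost(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Allow a point to be matched one-to-one, or stretched in either sequence.
            curr[j] = step + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]

def closed_contour_distance(c1, c2):
    """Smallest DTW cost over all cyclic starting points of the second contour."""
    s1 = centroid_distance_signature(c1)
    s2 = centroid_distance_signature(c2)
    return min(dtw_cost(s1, s2[k:] + s2[:k]) for k in range(len(s2)))

if __name__ == "__main__":
    # Toy contours standing in for word outlines (illustrative data only).
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    rectangle = [(0, 0), (2, 0), (4, 0), (4, 1), (4, 2), (2, 2), (0, 2), (0, 1)]
    print(closed_contour_distance(square, square))
    print(closed_contour_distance(square, rectangle))
```

Because a closed contour has no natural starting point, the sketch tries every cyclic shift of the second signature and keeps the cheapest alignment. Dividing by the mean centroid distance makes the signature independent of translation and scale. A query word could then be labeled with the annotation of the stored contour that gives the smallest distance.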
Pages: 153-165
Number of pages: 13