Word matching using single closed contours for indexing handwritten historical documents

Cited by: 50
Authors
Adamek, Tomasz [1]
O'Connor, Noel E. [1]
Smeaton, Alan F. [1]
Affiliations
[1] Dublin City Univ, Ctr Digital Video Proc, Dublin 9, Ireland
Keywords
historical manuscripts; holistic word recognition; contour matching; annotation indexing;
DOI
10.1007/s10032-006-0024-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, a holistic word recognition approach has recently gained popularity as an attractive and more straightforward solution (Lavrenko et al. in Proc. Document Image Analysis for Libraries (DIAL'04), pp. 278-287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour matching technique originally proposed for general shapes (Adamek and O'Connor in IEEE Trans Circuits Syst Video Technol 5: 2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features without any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.
Pages: 153 - 165
Page count: 13
Related papers
50 in total
  • [1] Word matching using single closed contours for indexing handwritten historical documents
    Tomasz Adamek
    Noel E. O’Connor
    Alan F. Smeaton
    International Journal of Document Analysis and Recognition (IJDAR), 2007, 9 : 153 - 165
  • [2] Holistic word recognition for handwritten historical documents
    Lavrenko, V
    Rath, TM
    Manmatha, R
    FIRST INTERNATIONAL WORKSHOP ON DOCUMENT IMAGE ANALYSIS FOR LIBRARIES, PROCEEDINGS, 2004, : 278 - 287
  • [3] Sequential Word Spotting in Historical Handwritten Documents
    Fernandez-Mota, David
    Llados, Josep
    Fornes, Alicia
    Manmatha, R.
    2014 11TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS 2014), 2014, : 101 - 105
  • [4] Indexing historical documents by word shape signatures
    Llados, Josep
    Sanchez, Gemma
    ICDAR 2007: NINTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS I AND II, PROCEEDINGS, 2007, : 362 - 366
  • [5] ON THE INFLUENCE OF WORD REPRESENTATIONS FOR HANDWRITTEN WORD SPOTTING IN HISTORICAL DOCUMENTS
    Llados, Josep
    Rusinol, Marcal
    Fornes, Alicia
    Fernandez, David
    Dutta, Anjan
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2012, 26 (05)
  • [6] Learning-free handwritten word spotting method for historical handwritten documents
    Mohammed, Hanadi Hassen
    Subramanian, Nandhini
    Al-Madeed, Somaya
    IET IMAGE PROCESSING, 2021, 15 (10) : 2332 - 2341
  • [7] Keyword spotting in historical handwritten documents based on graph matching
    Stauffer, Michael
    Fischer, Andreas
    Riesen, Kaspar
    PATTERN RECOGNITION, 2018, 81 : 240 - 253
  • [8] Word Stretching for Effective Segmentation and Classification of Historical Arabic Handwritten Documents
    Al Aghbari, Zaher
    Brook, Salama
    RCIS 2009: PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON RESEARCH CHALLENGES IN INFORMATION SCIENCE, 2009, : 217 - 224
  • [9] Word indexing of ancient documents using fuzzy classification
    Sousa, João M. C.
    Gil, Joao M.
    Pinto, Joao R. Caldas
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2007, 15 (05) : 852 - 862
  • [10] Exploiting Collection Level for Improving Assisted Handwritten Word Transcription of Historical Documents
    Guichard, Laurent
    Chazalon, Joseph
    Coüasnon, Bertrand
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 875 - 879