A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents

Cited by: 12
Authors
Fischer, Andreas [1]
Baechler, Micheal [2]
Garz, Angelika [2]
Liwicki, Marcus [2]
Ingold, Rolf [2]
Affiliations
[1] Polytech Montreal, Dept Elect Engn, Montreal, PQ, Canada
[2] Univ Fribourg, Dept Informat, CH-1700 Fribourg, Switzerland
Keywords
SEGMENTATION; WORDS;
DOI
10.1109/DAS.2014.51
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.
Pages: 71-75
Page count: 5
Related papers
50 in total
  • [21] A set of benchmarks for Handwritten Text Recognition on historical documents
    Andreu Sanchez, Joan
    Romero, Veronica
    Toselli, Alejandro H.
    Villegas, Mauricio
    Vidal, Enrique
    PATTERN RECOGNITION, 2019, 94 : 122 - 134
  • [22] Feature Extraction Techniques of Online Handwriting Arabic Text Recognition
    Abuzaraida, Mustafa Ali
    Zeki, Akram M.
    Zeki, Ahmed M.
    2013 5TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY FOR THE MUSLIM WORLD (ICT4M), 2013,
  • [23] Fast and Lightweight Text Line Detection on Historical Documents
    Melnikov, Aleksei
    Zagaynov, Ivan
    DOCUMENT ANALYSIS SYSTEMS, 2020, 12116 : 441 - 450
  • [24] Text Line Segmentation in Images of Handwritten Historical Documents
    Sanchez, A.
    Suarez, P. D.
    Melloz, C. A. B.
    Oliveira, A. L. I.
    Alves, V. M. O.
    2008 FIRST INTERNATIONAL WORKSHOPS ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2008, : 232 - +
  • [25] An Interactive Approach with Off-line and On-line Handwritten Text Recognition Combination for Transcribing Historical Documents
    Granell, Emilio
    Romero, Veronica
    Martinez-Hinarejos, Carlos D.
    PROCEEDINGS OF 12TH IAPR WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS, (DAS 2016), 2016, : 269 - 274
  • [26] REVISING DOCUMENTS WITH TEXT EDITORS, HANDWRITING-RECOGNITION SYSTEMS, AND SPEECH-RECOGNITION SYSTEMS
    GOULD, JD
    ALFARO, L
    HUMAN FACTORS, 1984, 26 (04) : 391 - 406
  • [27] Isolated Vietnamese Handwriting Recognition Embedded System Applied Combined Feature Extraction Method
    Thach Tran Van
    Phi Nguyen Huu
    Trang Hoang
    2015 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC), 2015, : 479 - 483
  • [28] Text line extraction for historical document images
    Saabni, Raid
    Asi, Abedelkadir
    El-Sana, Jihad
    PATTERN RECOGNITION LETTERS, 2014, 35 : 23 - 33
  • [29] Touching text line segmentation combined local baseline and connected component for Uchen Tibetan historical documents
    Hu, Pengfei
    Wang, Weilan
    Li, Qiaoqiao
    Wang, Tiejun
    INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (06)
  • [30] Reducing the Human Effort in Text Line Segmentation for Historical Documents
    Granell, Emilio
    Quiros, Lorenzo
    Romero, Veronica
    Andreu Sanchez, Joan
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT III, 2021, 12823 : 523 - 537