Performance evaluation and benchmarking of six page segmentation algorithms

Cited by: 113
Authors
Shafait, Faisal [1]
Keysers, Daniel [1]
Breuel, Thomas M. [2]
Affiliations
[1] DFKI GmbH, German Res Ctr Artificial Intelligence, Image Understanding & Pattern Recognit Res Grp, D-67663 Kaiserslautern, Germany
[2] Tech Univ Kaiserslautern, Dept Comp Sci, D-67663 Kaiserslautern, Germany
Keywords
document page segmentation; OCR; performance evaluation; performance metric
DOI
10.1109/TPAMI.2007.70837
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Informative benchmarks are crucial for optimizing the page segmentation step of an OCR system, which is frequently the performance-limiting step of the overall system. We show that current evaluation scores are insufficient for diagnosing specific errors in page segmentation and fail to identify some classes of serious segmentation errors altogether. This paper introduces a vectorial score that is sensitive to, and identifies, the most important classes of segmentation errors (over-, under-, and mis-segmentation) and which page components (lines, blocks, etc.) are affected. Unlike previous schemes, our evaluation method has a canonical representation of ground-truth data and guarantees pixel-accurate evaluation results for arbitrary region shapes. We present the results of evaluating six widely used segmentation algorithms (x-y cut, smearing, whitespace analysis, constrained text-line finding, docstrum, and Voronoi) on the UW-III database and demonstrate that the new evaluation scheme permits the identification of several specific flaws in individual segmentation methods.
Pages: 941-954
Page count: 14