A statistical approach to the generation of a database for evaluating OCR software

Citations: 0
Authors
Brundick F.S. [1]
Brodeen A.E.M. [1]
Taylor M.S. [2]
Affiliations
[1] US Army Research Laboratory, ATTN: AMSRL-CI-CT, Aberdeen Proving Ground
[2] University of Maryland, Maryland
Keywords
Bootstrap; OCR evaluation; Statistics
Author note: M.S. Taylor is presently at OAO Corporation.
DOI
10.1007/s100320200067
Abstract
In this paper we consider a statistical approach to augment a limited database of groundtruth documents for use in evaluation of optical character recognition software. A modified moving-blocks bootstrap procedure is used to construct surrogate documents for this purpose which prove to serve effectively and, in some regards, indistinguishably from groundtruth. The proposed method is validated through a rigorous statistical procedure. © 2002 Springer-Verlag Berlin Heidelberg.
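As a rough illustration of the resampling idea named in the abstract, here is a minimal sketch of a plain moving-blocks bootstrap applied to a sequence of words. It is not the authors' modified procedure, whose details are not given in this record. The function name `moving_blocks_bootstrap`, the block length of 3, and the sample sentence are all assumptions made for illustration.

```python
import random


def moving_blocks_bootstrap(sequence, block_length, rng=None):
    """Resample `sequence` by concatenating randomly chosen overlapping blocks.

    Hypothetical helper: a plain moving-blocks bootstrap, not the modified
    variant described in the paper.
    """
    n = len(sequence)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(sequence)")
    if rng is None:
        rng = random.Random()
    # Overlapping blocks may start at any index from 0 to n - block_length.
    last_start = n - block_length
    resampled = []
    while len(resampled) < n:
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        resampled.extend(sequence[start:start + block_length])
    # The final block may overshoot, so trim to the original length.
    return resampled[:n]


# Assumed toy input: a short "document" treated as a sequence of words.
words = "the quick brown fox jumps over the lazy dog".split()
surrogate = moving_blocks_bootstrap(words, block_length=3, rng=random.Random(0))
print(" ".join(surrogate))
```

Sampling whole blocks rather than single words keeps short runs of adjacent words together, which is why block resampling is a common choice for data with local dependence such as text.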
Pages: 170-176
Page count: 6