OCRSpell: An interactive spelling correction system for OCR errors in text

Cited by: 31
Authors
Taghva K. [1]
Stofsky E. [1]
Affiliations
[1] Information Science Research Institute, University of Nevada, Las Vegas, Las Vegas
Keywords
Error correction; Information retrieval; OCR-Spell checkers; Scanning
DOI
10.1007/PL00013558
Abstract
In this paper, we describe a spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is presented as well. © 2001 Springer-Verlag Berlin Heidelberg.
Pages: 125-137
Page count: 12
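
The abstract mentions device mappings, a Bayesian ranking of candidate words, and a learning feature that collects confusion information. The following is a minimal sketch of that general idea, not the authors' implementation: the confusion table, the lexicon, and the names `rank_candidates` and `learn` are all hypothetical, and the score is a crude product of a word prior and a relative confusion weight.

```python
from collections import Counter

# Hypothetical confusion counts: (intended string, OCR output) -> times observed.
confusions = Counter({("m", "rn"): 8, ("l", "1"): 5, ("o", "0"): 4})

# Hypothetical lexicon with word frequencies, used as a prior.
lexicon = Counter({"modern": 50, "morning": 30, "model": 20})


def rank_candidates(token, confusions, lexicon):
    """Return (word, score) pairs for lexicon words reachable from `token`
    by undoing a single known confusion, best score first."""
    total_confusions = sum(confusions.values())
    total_words = sum(lexicon.values())
    scored = {}
    for (intended, seen), count in confusions.items():
        start = token.find(seen)
        while start != -1:
            # Undo the confusion at this position only.
            word = token[:start] + intended + token[start + len(seen):]
            if word in lexicon:
                prior = lexicon[word] / total_words
                weight = count / total_confusions  # assumed relative weight
                scored[word] = max(scored.get(word, 0.0), prior * weight)
            start = token.find(seen, start + 1)
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


def learn(confusions, intended, seen):
    """Record one more observation of `intended` being misread as `seen`."""
    confusions[(intended, seen)] += 1


print(rank_candidates("rnodern", confusions, lexicon))
```

Calling `learn` after a user accepts a correction shifts later rankings toward confusions that recur in the same document or collection, which is one simple way to read the "dynamic" mappings the abstract refers to.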