OCRSpell: An interactive spelling correction system for OCR errors in text

Cited by: 31
Authors
Taghva K. [1]
Stofsky E. [1]
Affiliation
[1] Information Science Research Institute, University of Nevada, Las Vegas, Las Vegas
Keywords
Error correction; Information retrieval; OCR; Spell checkers; Scanning
DOI
10.1007/PL00013558
Abstract
In this paper, we describe a spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is presented as well. © 2001 Springer-Verlag Berlin Heidelberg.
Pages: 125-137
Number of pages: 12