Character extraction from documents using wavelet maxima

Cited by: 13
Authors
Hwang, WL [1]
Chang, F [1]
Affiliation
[1] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Keywords
optical character recognition; thresholding; wavelet maxima
DOI
10.1016/S0262-8856(97)00063-2
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The extraction of character images is an important front-end processing task in optical character recognition (OCR) and other applications. This process is extremely important because OCR applications usually extract salient features and process them. The existence of noise not only destroys features of characters, but also introduces unwanted features. We propose a new algorithm which removes unwanted background noise from a textual image. Our algorithm is based on the observation that the magnitude of the intensity variation of character boundaries differs from that of noise at various scales of their wavelet transform. Therefore, most of the edges corresponding to the character boundaries at each scale can be extracted using a thresholding method. The internal region of a character is determined by a voting procedure, which uses the arguments of the remaining edges. The interior of the recovered characters is solid, containing no holes. The recovered characters tend to become fattened because of the smoothness applied in the calculation of the wavelet transform. To obtain a quality restoration of the character image, the precise locations of characters in the original image are then estimated using a Bayesian criterion. We also present some experimental results that suggest the effectiveness of our method. (C) 1998 Elsevier Science B.V. All rights reserved.
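The abstract's central observation, that character boundaries keep a large response across scales while noise does not, can be illustrated with a minimal 1-D sketch. This is not the authors' algorithm: it uses a difference of adjacent box averages as a crude stand-in for a wavelet, works on a single synthetic scanline, and omits the voting and Bayesian steps. The function names, the dyadic scales `(1, 2, 4)`, and the fixed threshold of 50 are all assumptions chosen for illustration.

```python
def smoothed_derivative(signal, scale):
    """Difference of two adjacent box averages of width `scale`.

    A crude stand-in (assumption) for a wavelet that is the derivative
    of a smoothing function; positions too close to the border stay 0.
    """
    n = len(signal)
    out = [0.0] * n
    for i in range(scale, n - scale + 1):
        left = sum(signal[i - scale:i]) / scale
        right = sum(signal[i:i + scale]) / scale
        out[i] = right - left
    return out


def modulus_maxima(coeffs):
    """Indices where |coeffs| is a nonzero local maximum."""
    mags = [abs(c) for c in coeffs]
    return [
        i
        for i in range(1, len(mags) - 1)
        if mags[i] > 0 and mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]
    ]


def thresholded_maxima(signal, scales=(1, 2, 4), threshold=50.0):
    """For each scale, keep the modulus maxima whose magnitude exceeds
    `threshold` (a hypothetical fixed value, not taken from the paper)."""
    kept = {}
    for s in scales:
        coeffs = smoothed_derivative(signal, s)
        kept[s] = [i for i in modulus_maxima(coeffs) if abs(coeffs[i]) > threshold]
    return kept


# Synthetic scanline: bright background (about 200) with small alternating
# noise, and one dark stroke (about 20) covering indices 12..17.
scanline = [200 + (5 if i % 2 else -5) for i in range(30)]
for i in range(12, 18):
    scanline[i] -= 180

edges = thresholded_maxima(scanline)
print(edges)
```

On this scanline the two stroke boundaries (indices 12 and 18) survive the threshold at every scale, while the noise response is small at scale 1 and averages out at coarser scales. A real document image would need a 2-D transform and a data-dependent threshold.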
Pages: 307-315
Page count: 9