Character extraction from documents using wavelet maxima

Cited by: 13
Authors
Hwang, WL [1]
Chang, F [1]
Affiliation
[1] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Keywords
optical character recognition; thresholding; wavelet maxima
DOI
10.1016/S0262-8856(97)00063-2
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The extraction of character images is an important front-end processing task in optical character recognition (OCR) and other applications, because OCR systems usually extract salient features and then process them. Noise not only destroys features of characters but also introduces unwanted features. We propose a new algorithm that removes unwanted background noise from a textual image. The algorithm is based on the observation that the magnitude of the intensity variation at character boundaries differs from that of noise at the various scales of their wavelet transform. Most of the edges corresponding to character boundaries can therefore be extracted at each scale by thresholding. The internal region of a character is determined by a voting procedure that uses the arguments of the remaining edges. The interior of the recovered characters is solid and contains no holes. The recovered characters tend to be fattened because of the smoothing applied in the calculation of the wavelet transform. To obtain a quality restoration of the character image, the precise locations of the characters in the original image are then estimated using a Bayesian criterion. We also present experimental results that suggest the effectiveness of our method. © 1998 Elsevier Science B.V. All rights reserved.
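The thresholding and voting steps described in the abstract can be illustrated with a small one-dimensional sketch. This is not the authors' implementation: it works on a single scanline rather than a 2-D image, substitutes a Haar-like difference of local means for the wavelet transform, uses a fixed threshold, reduces the edge "argument" to the sign of the response, and omits the Bayesian refinement entirely. All function names, the scales, and the threshold value are assumptions chosen for illustration, and ink is assumed to be encoded as high values.

```python
def smoothed_derivative(signal, scale):
    """Haar-like response (assumed stand-in for a wavelet transform):
    mean of the next `scale` samples minus mean of the previous `scale`
    samples, with indices clamped at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        left = [signal[min(max(j, 0), n - 1)] for j in range(i - scale, i)]
        right = [signal[min(max(j, 0), n - 1)] for j in range(i, i + scale)]
        out.append(sum(right) / scale - sum(left) / scale)
    return out


def modulus_maxima(response, threshold):
    """Return {index: sign} for local maxima of |response| that reach
    the threshold. The sign plays the role of the edge argument in 1-D."""
    maxima = {}
    for i in range(1, len(response) - 1):
        m = abs(response[i])
        if (m >= threshold
                and m >= abs(response[i - 1])
                and m >= abs(response[i + 1])):
            maxima[i] = 1 if response[i] > 0 else -1
    return maxima


def interior_votes(signal, scales, threshold):
    """Each scale votes for the samples lying between a rising edge and
    the next falling edge among its surviving maxima."""
    votes = [0] * len(signal)
    for scale in scales:
        maxima = modulus_maxima(smoothed_derivative(signal, scale), threshold)
        start = None
        for i in sorted(maxima):
            if maxima[i] > 0:
                start = i          # rising edge: a stroke may begin here
            elif start is not None:
                for j in range(start, i):
                    votes[j] += 1  # falling edge closes the stroke
                start = None
    return votes


def extract_stroke(signal, scales=(1, 2, 4), threshold=50.0):
    """Keep samples that a strict majority of scales voted as interior."""
    votes = interior_votes(signal, scales, threshold)
    return [2 * v > len(scales) for v in votes]


# Synthetic scanline: low-amplitude alternating noise, a wide stroke on
# samples 10..19, and an isolated one-sample noise spike at sample 25.
signal = [3.0 if i % 2 == 0 else -3.0 for i in range(32)]
for i in range(10, 20):
    signal[i] += 100.0
signal[25] += 60.0

mask = extract_stroke(signal)
print([i for i, inside in enumerate(mask) if inside])
```

On this synthetic scanline the mask covers samples 10 through 19. The spike at sample 25 exceeds the threshold only at the finest scale, so it receives a single vote and is rejected, which mirrors the abstract's observation that noise and character boundaries behave differently across scales.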
Pages: 307-315
Number of pages: 9