Word Extraction Method by Generating Multiple Character Hypotheses

Cited by: 1
Authors
Takebe, Hiroaki [1]
Fujimoto, Katsuhito [1]
Affiliation
[1] Fujitsu Labs Ltd, Kawasaki, Kanagawa 211, Japan
Source
PROCEEDINGS OF THE 8TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS | 2008
DOI
10.1109/DAS.2008.35
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognizing the logical structure of form images requires precise extraction of the words in headers and data. However, word extraction often fails because layout analysis or character recognition errors prevent the correct character hypotheses from being generated. We propose a word extraction method that generates multiple character hypotheses and extracts the combinations that correspond to the character order of words. First, overlapping character hypotheses are generated by combinatorial recognition of connected components, and the combinations that correspond to words are extracted by clique extraction from a graph. Then, character hypotheses are generated by recognition with a limited target, and the combinations that correspond to words are extracted by matching between lattices based on a local optimum, in which the variety of recognition results and the regular expressions of words are taken into account. We confirmed the effectiveness of our method in an experiment on form images.
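The core idea of the abstract, keeping several overlapping character hypotheses and selecting a combination that spells a known word, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain depth-first search over adjacent hypotheses instead of the clique extraction or lattice matching the abstract describes. The `Hypothesis` type, the `find_word` function, and the sample data are all hypothetical names introduced here for illustration.

```python
from typing import List, NamedTuple, Optional


class Hypothesis(NamedTuple):
    """A candidate character covering a run of connected components (assumed model)."""
    start: int  # index of the first connected component covered
    end: int    # index one past the last connected component covered
    char: str   # character label returned by the recognizer


def find_word(hyps: List[Hypothesis], word: str) -> Optional[List[Hypothesis]]:
    """Return a chain of adjacent, non-overlapping hypotheses spelling `word`, or None."""

    def extend(pos: int, idx: int, chain: List[Hypothesis]) -> Optional[List[Hypothesis]]:
        if idx == len(word):
            return chain  # every character of the word has been matched
        for h in hyps:
            # The next hypothesis must begin where the previous one ended
            # and carry the next expected character.
            if h.start == pos and h.char == word[idx]:
                found = extend(h.end, idx + 1, chain + [h])
                if found is not None:
                    return found
        return None

    # Try every component position at which some hypothesis begins.
    for start in sorted({h.start for h in hyps}):
        found = extend(start, 0, [])
        if found is not None:
            return found
    return None


# Hypothetical example: the letter "m" was split into two components, so the
# recognizer proposes "r" and "n" separately and "m" for the merged pair.
HYPS = [
    Hypothesis(0, 1, "N"),
    Hypothesis(1, 2, "a"),
    Hypothesis(2, 3, "r"),
    Hypothesis(3, 4, "n"),
    Hypothesis(2, 4, "m"),
    Hypothesis(4, 5, "e"),
]

match = find_word(HYPS, "Name")
```

In this sketch, `match` selects the merged hypothesis for "m" over the two single-component hypotheses, because only that combination agrees with the character order of the target word.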
Pages: 299-306
Number of pages: 8