Graph Model Optimization based Historical Chinese Character Segmentation Method

Cited by: 6
Authors
Ji, Jingning [1]
Peng, Liangrui [1]
Li, Bohan [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Source
2014 11TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS 2014) | 2014
Keywords
historical Chinese document; character segmentation; graph model
DOI
10.1109/DAS.2014.57
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Historical Chinese document recognition technology is important for digital libraries. However, historical Chinese character segmentation remains a difficult problem because of the complex structure of Chinese characters and the variety of writing styles. This paper presents a novel method for historical Chinese character segmentation based on a graph model. After a preliminary over-segmentation stage, the system applies a merging process. The candidate segmentation positions are represented as the nodes of a graph, and the merging process is treated as selecting an optimal path through the graph. The weight of each edge in the graph is calculated by a cost function that considers geometric features and recognition confidence. Experimental results show that the proposed method is effective, with a detection rate of 94.6% and an accuracy rate of 96.1% on a test set of practical historical Chinese document samples.
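The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function, so the geometric term (deviation from an assumed expected character length), the confidence term (one minus a recognizer score), the `max_span` limit, and all function names below are hypothetical. Candidate cut positions are graph nodes, an edge from cut i to cut j means "merge everything between them into one character", and the optimal path is found by dynamic programming over this left-to-right graph.

```python
from typing import Callable, List, Sequence, Tuple


def make_edge_cost(
    expected_len: float,
    confidence: Callable[[int, int], float],
) -> Callable[[int, int], float]:
    """Build an illustrative edge cost (assumed form, not from the paper).

    Cost = relative deviation of the merged segment length from the
    expected character length + (1 - recognition confidence).
    """
    def edge_cost(start: int, end: int) -> float:
        geometric = abs((end - start) - expected_len) / expected_len
        return geometric + (1.0 - confidence(start, end))
    return edge_cost


def best_segmentation(
    cuts: Sequence[int],
    edge_cost: Callable[[int, int], float],
    max_span: int = 4,
) -> Tuple[List[int], float]:
    """Select the minimum-cost path from the first to the last cut.

    cuts      : sorted candidate segmentation positions (graph nodes)
    edge_cost : weight of merging the pieces between two cuts
    max_span  : maximum number of over-segmented pieces merged at once
    """
    n = len(cuts)
    inf = float("inf")
    best = [inf] * n      # best[j] = cheapest cost to reach cut j
    prev = [-1] * n       # back-pointer for path recovery
    best[0] = 0.0

    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            if best[i] == inf:
                continue
            cost = best[i] + edge_cost(cuts[i], cuts[j])
            if cost < best[j]:
                best[j] = cost
                prev[j] = i

    # Walk the back-pointers from the last cut to the first.
    path = [n - 1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    path.reverse()
    return [cuts[k] for k in path], best[n - 1]


def constant_confidence(start: int, end: int) -> float:
    """Stand-in for a real recognizer score; always returns 0.5."""
    return 0.5


if __name__ == "__main__":
    # Five candidate cuts along one text line, in pixels (toy data).
    toy_cuts = [0, 20, 50, 75, 100]
    cost_fn = make_edge_cost(expected_len=50.0, confidence=constant_confidence)
    print(best_segmentation(toy_cuts, cost_fn))
```

With the toy data, the cuts at 20 and 75 are discarded and the line is split into two characters of length 50. In a real system, `constant_confidence` would be replaced by the score a character recognizer assigns to the image region between the two cuts.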
Pages: 282-286
Number of pages: 5
Related papers
50 items in total
  • [21] A Sequence Labeling Based Approach for Character Segmentation of Historical Documents
    Gao, Liangcai
    Zhang, Xiaode
    Tang, Zhi
    Huang, Yaoxiong
    Jin, Lianwen
    2018 13TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS), 2018, : 305 - 310
  • [22] Novel Character Segmentation Method for Overlapped Chinese Handwriting Recognition based on LSTM Neural Networks
    Su, Tonghua
    Jia, Shukai
    Wang, Qiufeng
    Sun, Li
    Wang, Ruigang
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 1141 - 1146
  • [23] A hidden Markov model based segmentation and recognition algorithm for Chinese handwritten address character strings
    Fu, Q
    Ding, XQ
    Liu, CS
    Jiang, Y
    Ren, Z
    EIGHTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 590 - 594
  • [24] A two-stage character segmentation method for Chinese license plate
    Tian, Jiangmin
    Wang, Ran
    Wang, Guoyou
    Liu, Jianguo
    Xia, Yuanchun
    COMPUTERS & ELECTRICAL ENGINEERING, 2015, 46 : 539 - 553
  • [25] Research on Optimization Segmentation Algorithm for Chinese/English Mixed Character Image in OCR
    Liu Mingzhu
    Suo Yuxiu
    Ding Yinan
    2014 FOURTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC), 2014, : 764 - 769
  • [26] CNN based Transfer Learning for Historical Chinese Character Recognition
    Tang, Yejun
    Peng, Liangrui
    Xu, Qian
    Wang, Yanwei
    Furuhata, Akio
    PROCEEDINGS OF 12TH IAPR WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS, (DAS 2016), 2016, : 25 - 29
  • [27] Character Segmentation for Historical Uchen Tibetan Document Based on Structure Attributes
    Zhang Ce
    Wang Weilan
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (20)
  • [28] Chinese Character Recognition Method Based on Image Processing and Hidden Markov Model
    Wang Zhen-yan
    2014 Fifth International Conference on Intelligent Systems Design and Engineering Applications (ISDEA), 2014, : 276 - 279
  • [29] Graph-based ensemble method for text line segmentation in offline Chinese handwritten documents
    Huang, L. (huangliang1576@gmail.com), 1600, Huazhong University of Science and Technology (42):
  • [30] Radical Similarity Based Model Optimization and Post-correction for Chinese Character Recognition
    Han, Zhongyuan
    Du, Jun
    Ma, Jiefeng
    Hu, Pengfei
    Zhang, Zhenrong
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT I, 2024, 14804 : 152 - 168