Language model of Chinese character recognition and its application

Cited by: 0
Authors
Zhang, S [1 ]
Wu, XL [1 ]
Affiliation
[1] Chinese Acad Sci, Inst Automat, Engn Ctr Character Recognit, Beijing 100080, Peoples R China
Keywords
character recognition; Markov language model; combined model; cache-based language model; trigram model; 3g-gram model
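DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809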
Abstract
This paper introduces several kinds of Markov language models and, building on them, presents a 5-gram combined model that reflects features of both the Chinese language and Chinese character recognition. The major feature of this model is that it captures both the forward and backward statistical characteristics of a word. The model contains three traditional "trigram components", a "cache component" that reflects short-term patterns of word use, and a "3g-gram component" based on a new classification method that is fast and automatic. Experiments on a 1,500,000-word corpus show that the proposed model achieves a significant improvement.
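The abstract does not say how the components are combined. As a rough illustration only, the sketch below assumes simple linear interpolation between one add-one smoothed trigram component and one cache component; the class names, the interpolation weight `lam`, the cache size, and the toy token sequence are all hypothetical, and the "3g-gram" component and the backward-context components are left out because their definitions are not given here.

```python
from collections import Counter, deque


class TrigramModel:
    """Add-one smoothed trigram estimate P(w | u, v) (assumed smoothing)."""

    def __init__(self, tokens, vocab):
        self.vocab = vocab
        trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
        self.tri = Counter(trigrams)
        # Count each two-word context only where a third word follows it.
        self.ctx = Counter((u, v) for u, v, _ in trigrams)

    def prob(self, w, u, v):
        return (self.tri[(u, v, w)] + 1) / (self.ctx[(u, v)] + len(self.vocab))


class CacheModel:
    """Add-one smoothed unigram estimate over the most recent words."""

    def __init__(self, vocab, size=200):
        self.vocab = vocab
        self.recent = deque(maxlen=size)  # old words drop out automatically

    def observe(self, w):
        self.recent.append(w)

    def prob(self, w):
        count = sum(1 for x in self.recent if x == w)
        return (count + 1) / (len(self.recent) + len(self.vocab))


def combined_prob(w, u, v, trigram, cache, lam=0.8):
    """Linear interpolation of the two components (an assumption)."""
    return lam * trigram.prob(w, u, v) + (1 - lam) * cache.prob(w)


# Toy, hand-made token sequence; not data from the paper.
tokens = "我 爱 北京 我 爱 中国 我 爱 北京".split()
vocab = sorted(set(tokens))
trigram = TrigramModel(tokens, vocab)
cache = CacheModel(vocab)

before = combined_prob("中国", "我", "爱", trigram, cache)
for w in ["中国", "中国"]:
    cache.observe(w)  # the word was used recently
after = combined_prob("中国", "我", "爱", trigram, cache)
total = sum(combined_prob(w, "我", "爱", trigram, cache) for w in vocab)
```

In this sketch, observing a word in the cache raises its combined probability while the distribution over the vocabulary still sums to one, which is the short-term adaptation effect the abstract attributes to the cache component.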
Pages: 1507-1513
Number of pages: 7