A language model based on semantically clustered words in a Chinese character recognition system

Cited by: 8
Authors
Lee, HJ
Tung, CH
Affiliation
[1] Dept. of Comp. Sci. and Info. Eng., National Chiao Tung University, Hsinchu
Keywords
contextual postprocessing; language model; semantics; word group
DOI
10.1016/S0031-3203(96)00154-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new method for clustering the words in a dictionary into word groups. A Chinese character recognition system can then use these groups in a language model to improve recognition accuracy, while the number of parameters that must be trained beforehand is kept to a reasonable value. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to calculate the weights of the semantic attributes of the character-based word classes. These weights are then updated according to the words of the Behavior dictionary, which has a rather complete word set. Next, the word classes are clustered into m groups according to a semantic measure by a greedy method, and the words in the Behavior dictionary are finally assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m^2. In the experiments, the recognition system with the proposed model performed better than one using a character-based bigram language model. (C) 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
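The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: once every word is assigned to one of m groups, the bigram table has m^2 entries, and it can be used to rescore recognizer candidates. The greedy semantic clustering itself is not reproduced here; the group assignment is taken as given. All names (`train_group_bigram`, `best_path`, `word_to_group`), the add-alpha smoothing, the Viterbi search, and the English stand-in words are assumptions made for illustration, not the authors' method.

```python
import math

def train_group_bigram(sentences, word_to_group, m, alpha=1.0):
    """Estimate P(next_group | prev_group); the table has m * m entries.
    Add-alpha smoothing is an assumption of this sketch."""
    counts = [[0.0] * m for _ in range(m)]
    for words in sentences:
        groups = [word_to_group[w] for w in words]
        for a, b in zip(groups, groups[1:]):
            counts[a][b] += 1.0
    probs = []
    for row in counts:
        total = sum(row) + alpha * m
        probs.append([(c + alpha) / total for c in row])
    return probs

def best_path(candidates, word_to_group, probs):
    """Viterbi search over per-position candidate lists.
    candidates[i] is a list of (word, recognizer_score), score in (0, 1]."""
    scores = [math.log(s) for _, s in candidates[0]]
    backptrs = []
    for i in range(1, len(candidates)):
        new_scores, ptrs = [], []
        for word, s in candidates[i]:
            g = word_to_group[word]
            # Score of reaching this candidate from each previous candidate.
            options = [
                scores[k] + math.log(probs[word_to_group[pw]][g])
                for k, (pw, _) in enumerate(candidates[i - 1])
            ]
            k_best = max(range(len(options)), key=options.__getitem__)
            new_scores.append(options[k_best] + math.log(s))
            ptrs.append(k_best)
        scores = new_scores
        backptrs.append(ptrs)
    # Trace back from the best final candidate.
    j = max(range(len(scores)), key=scores.__getitem__)
    path = [candidates[-1][j][0]]
    for i in range(len(backptrs) - 1, -1, -1):
        j = backptrs[i][j]
        path.append(candidates[i][j][0])
    return path[::-1]

# Hypothetical toy data: three groups (0 = animals, 1 = verbs, 2 = foods).
word_to_group = {"cat": 0, "dog": 0, "eats": 1, "runs": 1, "fish": 2, "meat": 2}
corpus = [["cat", "eats", "fish"], ["dog", "eats", "meat"], ["dog", "runs"]]
probs = train_group_bigram(corpus, word_to_group, m=3)

# The recognizer alone prefers "meat" in the second position.
candidates = [
    [("dog", 0.6)],
    [("meat", 0.55), ("eats", 0.45)],
    [("fish", 0.5), ("runs", 0.5)],
]
result = best_path(candidates, word_to_group, probs)
print(result)
```

In this toy setup the group bigram overrides the recognizer's slightly higher score for the wrong second candidate, which is the kind of contextual correction the abstract attributes to the language model. The table size depends only on m, not on the vocabulary size.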
Pages: 1339-1346
Number of pages: 8
Related papers
  • [1] A word language model based contextual language processing on Chinese character recognition
    Huang, Chen
    Ding, Xiaoqing
    Chen, Yan
    DOCUMENT RECOGNITION AND RETRIEVAL XVII, 2010, 7534
  • [2] Language model for Chinese character recognition with dense errors
    Zhang, S
    Wu, XL
    IC-AI'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS I-III, 2001, : 598 - 602
  • [3] Variable length language model for Chinese character recognition
    Zhang, S
    Wu, XL
    ADVANCES IN MULTIMODAL INTERFACES - ICMI 2000, PROCEEDINGS, 2000, 1948 : 267 - 271
  • [4] Language model of Chinese character recognition and its application
    Zhang, S
    Wu, XL
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 1507 - 1513
  • [5] Application of bidirectional probabilistic character language model in handwritten words recognition
    Sas, Jerzy
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2006, PROCEEDINGS, 2006, 4224 : 679 - 687
  • [6] A hybrid post-processing system for offline handwritten Chinese character recognition based on a statistical language model
    Xu, RF
    Yeung, DS
    Sh, DM
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2005, 19 (03) : 415 - 428
  • [7] Research on Chinese Character Recognition Using Bag of Words
    Gui, Jiaping
    Zhou, Yi
    Lin, Xinda
    Chen, Kai
    Guan, Haibing
    INFORMATION TECHNOLOGY FOR MANUFACTURING SYSTEMS, PTS 1 AND 2, 2010, : 395 - +
  • [8] Parsing Chinese Synthetic Words with a Character-based Dependency Model
    Cheng, Fei
    Duh, Kevin
    Matsumoto, Yuji
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014,
  • [9] Chinese Character Recognition Based on Character Reconstruction
    Yun Li
    Mei Xie
    2009 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS PROCEEDINGS, VOLUMES I & II: COMMUNICATIONS, NETWORKS AND SIGNAL PROCESSING, VOL I/ELECTRONIC DEVICES, CIRUITS AND SYSTEMS, VOL II, 2009, : 460 - 463
  • [10] Novel method based on integrating characters with words for contextual processing of Chinese character recognition
    Li, Yuan-Xiang
    Ding, Xiao-Qing
    Wu, You-Shou
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2002, 39 (07):