A Hybrid Model for Chinese Spelling Check

Cited by: 16
Authors
Zhao, Hai [1, 2]
Cai, Deng [1, 2]
Xin, Yang [3]
Wang, Yuzhu [3]
Jia, Zhongye [4]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, 800 Dongchuan Rd, Shanghai 200240, Peoples R China
[3] Huawei Technol Co Ltd, 2222 Xinjinqiao Rd, Shanghai 201206, Peoples R China
[4] Baosteel Res Inst, 655 Fujin Rd, Shanghai 201900, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Chinese spelling check; hybrid model; graph model; conditional random field; rule-based model;
DOI
10.1145/3047405
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Spelling check for Chinese is more challenging than for other languages. This article presents a hybrid model for Chinese spelling check. The hybrid model consists of three components: one graph-based model for generic errors and two independently trained models for specific errors. In the graph model, a directed acyclic graph is generated for each sentence, and a single-source shortest-path algorithm is run on the graph to detect and correct general spelling errors at the same time. Prior to that, two types of errors over function words (characters) are first resolved by conditional random fields: the confusion of 在 (at) (pinyin: zai) and 再 (again, more, then) (pinyin: zai), and the confusion of 的 (of) (pinyin: de), 地 (-ly, adverb-forming particle) (pinyin: de), and 得 (so that, have to) (pinyin: de). Finally, a rule-based model is used to distinguish pronoun usage confusion, 她 (she) (pinyin: ta) and 他 (he) (pinyin: ta), along with some other common collocation errors. The proposed model is evaluated on the standard datasets released by the SIGHAN Bake-off shared tasks, giving state-of-the-art results.
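The graph component described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, costs, confusion set, penalty, and the function names `candidates` and `correct` are all invented for illustration. The sketch assumes that graph nodes are character positions, that each edge is a dictionary word (possibly with one character swapped for a confusable one), and that the cheapest path from the start to the end of the sentence yields the corrected text.

```python
import math

# Toy resources, invented for illustration only (not from the paper).
LEXICON_COST = {"我": 2.0, "现在": 1.5, "现": 4.0, "在": 2.0, "再": 2.5, "去": 2.0}
CONFUSION = {"再": ["在"], "在": ["再"]}  # characters that are easily confused
SUBSTITUTION_PENALTY = 1.0               # extra cost for changing a character
UNKNOWN_COST = 8.0                       # cost of an out-of-lexicon single character
MAX_WORD_LEN = 2


def candidates(span):
    """Yield (word, penalty): the span itself and its one-character substitutions."""
    yield span, 0.0
    for k, ch in enumerate(span):
        for alt in CONFUSION.get(ch, []):
            yield span[:k] + alt + span[k + 1:], SUBSTITUTION_PENALTY


def correct(sentence):
    """Shortest path over a DAG whose nodes are character positions 0..n."""
    n = len(sentence)
    dist = [math.inf] * (n + 1)
    back = [None] * (n + 1)
    dist[0] = 0.0
    # Positions are already in topological order, so one left-to-right
    # relaxation pass solves the single-source shortest-path problem.
    for i in range(n):
        for j in range(i + 1, min(n, i + MAX_WORD_LEN) + 1):
            for word, penalty in candidates(sentence[i:j]):
                if word in LEXICON_COST:
                    cost = LEXICON_COST[word] + penalty
                elif j - i == 1 and penalty == 0.0:
                    # Keep unknown single characters so the end stays reachable.
                    cost = UNKNOWN_COST
                else:
                    continue
                if dist[i] + cost < dist[j]:
                    dist[j] = dist[i] + cost
                    back[j] = (i, word)
    # Walk the back-pointers from the end to recover the chosen words.
    words = []
    j = n
    while j > 0:
        i, word = back[j]
        words.append(word)
        j = i
    return "".join(reversed(words))


print(correct("我现再去"))
```

With these toy costs, the path through the substituted word 现在 is cheaper than keeping 现 and 再 as separate characters, so the call prints 我现在去. Detection and correction happen in the same pass, which matches the abstract's description at a high level; a real system would derive edge weights from a language model rather than a hand-written table.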
Pages: 22