CLSpell: Contrastive learning with phonological and visual knowledge for Chinese spelling check

Cited by: 0
Authors
Mao, Xingliang [1, 2]
Shan, Youran [3]
Li, Fangfang [2, 3]
Chen, Xiaohong [4]
Zhang, Shichao [3]
Affiliations
[1] Hunan Univ Technol & Business, Sch Digital Media & Humanities, Changsha 410000, Peoples R China
[2] Xiangjiang Lab, Changsha 410000, Peoples R China
[3] Cent South Univ, Sch Comp Sci & Engn, Changsha 410038, Peoples R China
[4] Hunan Univ Technol & Business, Inst Big Data & Internet Innovat, Changsha 410000, Peoples R China
Keywords
Chinese spelling check; Multi-task training; Comparative learning;
DOI
10.1016/j.neucom.2023.126468
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The task of Chinese Spelling Check (CSC) is to identify and correct spelling errors in text, which are mainly caused by phonologically and visually similar characters. Although pre-trained language models are helpful for this task, they lack phonological and visual information. Previous works have primarily focused on identifying errors based on local contextual data, while neglecting the importance of sentence-level information. To address these issues, we propose Contrastive Learning Spell (CLSpell), which combines phonetic and glyphic information through contrastive learning and simultaneously acquires local and global information through multi-task joint learning. During pretraining, token representations are learned using a combination of phonological, visual, and semantic information. In addition, we include an auxiliary task of correct sentence discrimination in the multi-task joint training process to capture sentence-level information. Experiments on widely used benchmarks demonstrate that the proposed method outperforms all competing methods.
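The abstract does not state the loss functions, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It assumes an InfoNCE-style contrastive loss (a common choice, not confirmed as the paper's) that pulls a character's representation toward a phonologically or visually similar character and pushes it away from unrelated ones, and it assumes that multi-task joint training is a weighted sum of a correction loss and a sentence-discrimination loss. The names `info_nce`, `joint_loss`, `temperature`, and `weight`, as well as the toy vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: -log(exp(s_pos/t) / (exp(s_pos/t) + sum_k exp(s_neg_k/t))).

    Assumption: the positive is the embedding of a phonologically or
    visually similar character; the negatives are unrelated characters.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


def joint_loss(correction_loss, discrimination_loss, weight=0.5):
    """Hypothetical multi-task objective: token-level correction loss plus
    a weighted sentence-level discrimination loss."""
    return correction_loss + weight * discrimination_loss


# Toy 2-d embeddings (illustrative values only, not real data).
anchor = [1.0, 0.0]
similar = [0.9, 0.1]
unrelated = [[0.0, 1.0], [-1.0, 0.0]]

# The loss is small when the similar character is the positive ...
loss_good = info_nce(anchor, similar, unrelated)
# ... and large when an unrelated character is treated as the positive.
loss_bad = info_nce(anchor, unrelated[0], [similar, unrelated[1]])
```

In this sketch the loss is lower when the designated positive is close to the anchor, which is the behavior a contrastive objective is meant to reward. The temperature and the task weight are tunable hyperparameters whose values here are arbitrary.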
Pages: 9