A RULE-BASED METHOD FOR COMMAS' DISAMBIGUATION IN CHINESE PATENT TEXT

Cited by: 0
Authors
Song, Qianqian [1, 2]
Zhu, Yun [1, 2]
Wang, Lixia [1, 2]
Jin, Yaohong [1, 2]
Affiliations
[1] Beijing Normal Univ, Inst Chinese Informat Proc, Beijing 100875, Peoples R China
[2] Beijing Normal Univ, CPIC BNU Joint Lab Machine Translat, Beijing 100875, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program);
Keywords
Rule-based method; Commas' disambiguation; Chinese patent text; MT;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We describe a rule-based method for disambiguating Chinese commas in patent text, which will benefit work on Chinese-English patent machine translation (MT). We annotated ten thousand sentences of patent text and derived a set of rules from the annotations. Experiments were conducted on 5 complete patent documents containing 1219 commas, and our model achieved an overall accuracy of over 90%.
Pages: 1506-1510
Page count: 5