An example-based study on Chinese word segmentation using critical fragments

Cited by: 0
Authors
Hu, QA [1]
Pan, HH [1]
Kit, C [1]
Affiliation
[1] City Univ Hong Kong, Dept Chinese Translat & Linguist, Hong Kong, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In our study, sentences are represented as sequences of critical fragments, and a critical fragment is considered ambiguous if more than one distinct resolution for it is found in the training corpus. Unlike other studies, we disambiguate the ambiguous critical fragments with an example-based system. The contexts on either side of an ambiguous critical fragment, i.e. the adjacent characters, words and critical fragments, are used to measure the distance between training and testing examples. Two measures, the overlap metric and chi-squared feature weighting, are employed, and our system achieves a precision of 93.65% and a recall of 96.56% in the open test.
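The abstract names the two distance measures but gives no formulas. The sketch below is one common reading, assumed here rather than taken from the paper: the overlap metric counts the context features on which two examples disagree, and chi-squared feature weighting scales each disagreement by the chi-squared statistic of that feature's value-by-resolution contingency table in the training data. The feature layout (left context, right context), the resolution labels and the toy data are all hypothetical.

```python
from collections import Counter

def chi_squared_weight(examples, feature_index):
    """Chi-squared statistic of the value-by-label table for one feature."""
    observed = Counter((feats[feature_index], label) for feats, label in examples)
    value_totals = Counter(feats[feature_index] for feats, _ in examples)
    label_totals = Counter(label for _, label in examples)
    n = len(examples)
    chi2 = 0.0
    for value, value_count in value_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n
            chi2 += (observed[(value, label)] - expected) ** 2 / expected
    return chi2

def weighted_overlap_distance(a, b, weights):
    """Sum the weights of the features on which the two examples differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)

def classify(train, weights, query):
    """Return the resolution of the nearest training example (1-NN)."""
    nearest = min(train, key=lambda ex: weighted_overlap_distance(ex[0], query, weights))
    return nearest[1]

# Hypothetical training examples for one ambiguous fragment "ABC":
# features are (left context, right context), labels are its resolutions.
train = [
    (("x", "p"), "AB/C"),
    (("x", "q"), "AB/C"),
    (("y", "p"), "A/BC"),
    (("y", "q"), "A/BC"),
]

# The left context determines the resolution here, the right context does not,
# so only the left context receives a non-zero weight.
weights = [chi_squared_weight(train, i) for i in range(2)]

prediction = classify(train, weights, ("y", "z"))
```

With uniform weights of 1 the same distance function reduces to the plain overlap metric; the chi-squared weights let an informative context position dominate an uninformative one.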
Pages: 714-722
Page count: 9