Research on Enhancing the Effectiveness of the Chinese Text Automatic Categorization Based on ICTCLAS Segmentation Method

Cited by: 0
Authors
Li, Xiangdong [1]
Zhang, Cheng [2]
Affiliations
[1] Wuhan Univ, Sch Informat Management, Ctr Studies Informat Resources, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China
Keywords
Chinese segmentation; text automatic categorization; classification effect; mix; high information
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This article proposes a method for improving the classification performance of the ICTCLAS segmentation method: feature items in the ICTCLAS segmentation result that have low category-discrimination capacity are replaced with feature items of higher category-discrimination capacity drawn from the 2-gram segmentation result. Experiments with the KNN categorization algorithm and the Naive Bayes text categorization method show that the approach works well on the Fudan University corpus. The article also analyzes, through experiments, why the method is relatively ineffective on the Sogou Laboratory corpus.
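The feature-mixing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how "category identification capacity" is measured, so the chi-square statistic is assumed here, and the toy corpus, the hand-written word segmentation (standing in for ICTCLAS output), and the names `bigrams`, `chi_square`, and `mix_features` are all hypothetical.

```python
from collections import Counter

def bigrams(text):
    """Overlapping character 2-grams of a string."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

def chi_square(docs, labels):
    """Per-term chi-square score (max over classes), using document presence.
    Assumption: chi-square stands in for 'category identification capacity'."""
    n = len(docs)
    class_count = Counter(labels)
    df, df_class = Counter(), Counter()
    for terms, label in zip(docs, labels):
        for t in set(terms):
            df[t] += 1
            df_class[(t, label)] += 1
    scores = {}
    for t in df:
        best = 0.0
        for c in class_count:
            a = df_class[(t, c)]         # docs in class c containing t
            b = df[t] - a                # docs outside c containing t
            c_ = class_count[c] - a      # docs in c without t
            d = n - a - b - c_           # docs outside c without t
            denom = (a + c_) * (b + d) * (a + b) * (c_ + d)
            if denom:
                best = max(best, n * (a * d - b * c_) ** 2 / denom)
        scores[t] = best
    return scores

def mix_features(word_docs, gram_docs, labels, k):
    """Drop the k lowest-scoring word features and add the k highest-scoring
    2-grams that are not already word features."""
    word_scores = chi_square(word_docs, labels)
    gram_scores = chi_square(gram_docs, labels)
    ranked = sorted(word_scores, key=lambda t: (-word_scores[t], t))
    kept = ranked[:max(len(ranked) - k, 0)]
    candidates = sorted((g for g in gram_scores if g not in word_scores),
                        key=lambda t: (-gram_scores[t], t))
    return kept + candidates[:k]

# Toy corpus (hypothetical). The word segmentation is written by hand and
# deliberately splits the word "微博" into two single characters.
raw = ["微博用户", "微博的", "股票的", "股票市场"]
labels = ["tech", "tech", "finance", "finance"]
word_docs = [["微", "博", "用户"], ["微", "博", "的"],
             ["股票", "的"], ["股票", "市场"]]
gram_docs = [bigrams(r) for r in raw]

mixed = mix_features(word_docs, gram_docs, labels, k=1)
print(mixed)
```

In this toy case the uninformative word feature "的" is dropped and the 2-gram "微博" takes its place. The mixed feature list would then define the vector space passed to a classifier such as KNN or Naive Bayes.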
Pages: 267-270
Number of pages: 4
Related papers
50 in total
  • [32] The Research of Knowledge-based Chinese Segmentation Method
    Zhou, Guangming
    MATERIAL SCIENCE, CIVIL ENGINEERING AND ARCHITECTURE SCIENCE, MECHANICAL ENGINEERING AND MANUFACTURING TECHNOLOGY II, 2014, 651-653 : 2545 - 2548
  • [33] Using LSA and text segmentation to improve automatic Chinese dialogue text summarization
    Liu Chuan-han
    Wang Yong-cheng
    Zheng Fei
    Liu De-rong
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2007, 8 (01): : 79 - 87
  • [34] ACTS - AN AUTOMATIC CHINESE TEXT SEGMENTATION SYSTEM FOR FULL-TEXT RETRIEVAL
    WU, ZM
    TSENG, G
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1995, 46 (02): : 83 - 96
  • [35] Chinese text categorization based on alternative covering algorithm
    Key Laboratory of Intelligent Computing and Signal Processing Ministry of Education, Anhui University, Hefei 230039, China
    Unknown
    Jisuanji Gongcheng, 2006, 19 (183-184):
  • [36] Chinese Text Categorization Based on Deep Belief Networks
    Song, Jia
    Qin, Sijun
    Zhang, Pengzhou
    2016 IEEE/ACIS 15TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2016, : 1123 - 1127
  • [37] Chinese text categorization based on fuzzy association rules
    Yuan, Fang
    Guo, Yu-Qin
    Yang, Liu
    Yang, Fan
    PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 1030 - +
  • [38] Text categorization method based on Extension Theory
    Yi, Y
    Zheng, Y
    He, ZS
    Wu, ZF
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 646 - 649
  • [39] Automatic text categorization based on K-nearest neighbor
    Sun, J.
    Wang, W.
    Zhong, Y.-X.
    Beijing Youdian Xueyuan Xuebao/Journal of Beijing University of Posts And Telecommunications, 2001, 24 (01): : 42 - 46
  • [40] Research on Chinese text classification based on Naive Bayesian method
    Geng Xinglong
    Gao Xiuyan
    Zhao Bin
    PROCEEDINGS OF THE FIFTH INTERNATIONAL SYMPOSIUM ON TEST AUTOMATION & INSTRUMENTATION, VOLS 1 AND 2, 2014, : 226 - 230