Research on Enhancing the Effectiveness of the Chinese Text Automatic Categorization Based on ICTCLAS Segmentation Method

Cited by: 0
Authors
Li, Xiangdong [1]
Zhang, Cheng [2]
Affiliations
[1] Wuhan Univ, Sch Informat Management, Ctr Studies Informat Resources, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China
Keywords
Chinese segmentation; text automatic categorization; classification effect; mix; high information
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
This article proposes a method for improving the classification performance of the ICTCLAS segmentation method: feature items with low category-identification capacity in the ICTCLAS segmentation result are replaced by feature items with better category-identification capacity drawn from the 2-gram segmentation result. Experiments using the KNN categorization algorithm and the Naive Bayes text categorization method show that the approach works well on the Fudan University corpus. The article also analyzes, through these tests, why the method is relatively ineffective on the Sogou Laboratory corpus.
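The replacement step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how "category identification capacity" is measured, so a chi-square score over document frequencies is assumed here; the ICTCLAS output is assumed to be available already as lists of words; and the names `bigrams`, `chi_square_scores`, and `mix_features` are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Return the overlapping character 2-grams of a string."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def chi_square_scores(docs, labels):
    """Score each term by its maximum chi-square value over all classes.

    docs is a list of term lists (one per document); labels gives the class
    of each document. Chi-square is an assumed stand-in for the paper's
    unspecified measure of category identification capacity.
    """
    n = len(docs)
    class_sizes = Counter(labels)
    df = Counter()        # term -> number of documents containing it
    df_class = Counter()  # (term, class) -> documents of that class containing it
    for terms, label in zip(docs, labels):
        for term in set(terms):
            df[term] += 1
            df_class[(term, label)] += 1

    scores = {}
    for term in df:
        best = 0.0
        for cls, size in class_sizes.items():
            a = df_class[(term, cls)]  # in class, contains term
            b = df[term] - a           # other classes, contains term
            c = size - a               # in class, lacks term
            d = n - a - b - c          # other classes, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def mix_features(word_scores, bigram_scores, k):
    """Drop the k weakest word features and add the k strongest 2-grams.

    2-grams identical to a kept word feature are skipped.
    """
    ranked_words = sorted(word_scores, key=word_scores.get, reverse=True)
    kept = ranked_words[:max(len(ranked_words) - k, 0)]
    kept_set = set(kept)
    ranked_grams = sorted(bigram_scores, key=bigram_scores.get, reverse=True)
    candidates = [g for g in ranked_grams if g not in kept_set]
    return kept + candidates[:k]
```

In use, `chi_square_scores` would be called once on the word-segmented documents and once on the `bigrams` output for the same documents, and the mixed feature list would then be passed to a KNN or Naive Bayes classifier. The number of replaced items `k` is a free parameter in this sketch.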
Pages: 267-270
Page count: 4