Research on Enhancing the Effectiveness of the Chinese Text Automatic Categorization Based on ICTCLAS Segmentation Method

Cited by: 0
Authors
Li, Xiangdong [1]
Zhang, Cheng [2]
Affiliations
[1] Wuhan Univ, Sch Informat Management, Ctr Studies Informat Resources, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China
Keywords
Chinese segmentation; text automatic categorization; classification effect; mix; high information;
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
This article proposes a method for improving the classification performance of the ICTCLAS segmentation method: feature items with low category identification capacity in the ICTCLAS segmentation result are replaced with feature items of better category identification capacity drawn from the 2-gram segmentation result. Experiments with the KNN categorization algorithm and the Naive Bayes text categorization method show that this approach works well on the Fudan University corpus. The article also analyzes, through testing, why the method was relatively ineffective on the Sogou Laboratory corpus.
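The mixing step described in the abstract can be illustrated with a minimal Python sketch. Everything in it beyond the general idea is an assumption, not the authors' implementation: the abstract does not say how "category identification capacity" is measured, so the hypothetical `discrimination_scores` function uses a simple stand-in (the share of a term's document frequency that falls in its most frequent category). The word lists are hard-coded to stand in for ICTCLAS output, and the helper names `bigrams` and `mix_features` are illustrative only.

```python
from collections import Counter, defaultdict


def bigrams(text):
    """Return the overlapping character 2-grams of a string."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def discrimination_scores(docs):
    """Score each term by category concentration (an assumed stand-in measure).

    docs is a list of (label, terms) pairs. The score is the fraction of a
    term's document frequency found in its most frequent category, so 1.0
    means the term occurs in a single category only.
    """
    per_label = defaultdict(Counter)
    total = Counter()
    for label, terms in docs:
        for term in set(terms):
            per_label[label][term] += 1
            total[term] += 1
    return {t: max(c[t] for c in per_label.values()) / n for t, n in total.items()}


def mix_features(word_scores, bigram_scores, k):
    """Replace the k lowest-scoring word features with the k best new 2-grams."""
    def ranked(scores):
        # Highest score first; ties broken by the term itself for determinism.
        return sorted(scores, key=lambda t: (-scores[t], t))

    words = ranked(word_scores)
    grams = [g for g in ranked(bigram_scores) if g not in word_scores]
    kept = words[:max(len(words) - k, 0)]
    return kept + grams[:k]


# Toy corpus: raw texts plus hand-written word segmentations (assumed, not real ICTCLAS output).
raw = [("sports", "足球比赛的"), ("finance", "股票市场的")]
segmented = [("sports", ["足球", "比赛", "的"]), ("finance", ["股票", "市场", "的"])]

word_scores = discrimination_scores(segmented)
bigram_scores = discrimination_scores([(label, bigrams(text)) for label, text in raw])
mixed = mix_features(word_scores, bigram_scores, k=1)
print(mixed)
```

In this toy corpus the function word "的" appears in both categories, so it receives the lowest score and is the item swapped out for a 2-gram. The resulting feature list would then be passed to a classifier such as KNN or Naive Bayes.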
Pages: 267-270
Page count: 4