An improved co-training text categorization algorithm based on diversity measures

Authors
Tang, Huan-Ling [1, 2]
Lin, Zheng-Kui [1 ]
Lu, Ming-Yu [1 ]
Affiliations
[1] College of Information and Science Technique, Dalian Maritime University, Dalian 116026, China
[2] Department of Computer and Information Engineering, Yantai Vocational College, Yantai 264670, China
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2008, Vol. 36, Suppl.
Abstract
The co-training algorithm is constrained by its assumption that the features can be split into two compatible and independent subsets. However, this assumption is usually violated in real-world applications, especially the independence condition. We find that the real purpose of the assumption is to obtain two classifiers with a certain level of accuracy and sufficient diversity to co-train. First, multiple views are created using different term evaluation functions. Second, instead of directly computing the independence between two sub-views, this paper evaluates it indirectly, by applying diversity measures to the two classifiers trained on those sub-views. A pair of classifiers with a certain level of accuracy and greater diversity is then selected. The experimental results show that the two improved algorithms, TV-SC and TV-DC, both outperform Co-Rnd, a co-training algorithm based on random feature splitting, and that TV-DC outperforms TV-SC.
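The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which diversity measures or term evaluation functions the authors use, so everything concrete here is an assumption: the disagreement measure stands in for "diversity measures", the view names (info_gain, chi_square, mutual_info) and the min_accuracy threshold are hypothetical, and the predictions are made-up toy data. This is not the TV-SC or TV-DC algorithm itself.

```python
from itertools import combinations


def accuracy(pred, truth):
    """Fraction of labels predicted correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def disagreement(pred_a, pred_b, truth):
    """Disagreement measure: share of examples where exactly one
    of the two classifiers is correct (assumed diversity measure)."""
    if not truth or not (len(pred_a) == len(pred_b) == len(truth)):
        raise ValueError("inputs must be non-empty and of equal length")
    differ = sum((a == t) != (b == t) for a, b, t in zip(pred_a, pred_b, truth))
    return differ / len(truth)


def select_pair(view_preds, truth, min_accuracy=0.6):
    """Among views whose classifier reaches min_accuracy, return
    (diversity, view_a, view_b) for the most diverse pair, or None."""
    eligible = [name for name, pred in view_preds.items()
                if accuracy(pred, truth) >= min_accuracy]
    best = None
    for a, b in combinations(eligible, 2):
        d = disagreement(view_preds[a], view_preds[b], truth)
        if best is None or d > best[0]:
            best = (d, a, b)
    return best


# Toy validation labels and per-view classifier predictions (hypothetical).
truth = [1, 0, 1, 1, 0, 0, 1, 0]
view_preds = {
    "info_gain":   [1, 0, 1, 1, 0, 0, 0, 0],
    "chi_square":  [1, 0, 1, 1, 0, 0, 0, 1],
    "mutual_info": [1, 0, 0, 1, 1, 0, 1, 0],
}

print(select_pair(view_preds, truth))
```

In this toy setup each view corresponds to a classifier trained on features ranked by one term evaluation function; the pair that is returned would then be the two classifiers used for co-training. Other pairwise measures, such as the Q-statistic, could be substituted for disagreement without changing the structure.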
Pages: 138-143