A Fused Multi-feature Based Co-training Approach For Document Clustering

Cited by: 4
Authors
Wang, Yuanqing [1]
Wang, Wenjun [1]
Dai, Weidi [1]
Jiao, Pengfei [1]
Yu, Wei [1]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin Key Lab Cognit Comp & Applicat, Tianjin, Peoples R China
Keywords
multi-feature; co-training; document clustering; spectral clustering
DOI
10.1109/ICISCE.2016.19
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Document clustering is a popular topic in data mining and information retrieval. Most models and methods for this problem compute the similarity between pairs of documents represented either in the space of all terms or in a new feature space obtained by applying a topic modeling technique to a given corpus. In this paper, we regard these two ideas as clustering on term features and on semantic features, and assume that the two can contribute to each other in clustering. We propose a co-training approach for spectral clustering that takes both features into account. Experiments on four real-world datasets show the feasibility and efficacy of the proposed approach compared with a number of baseline methods.
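The term-feature side of the abstract, pairwise similarity between documents represented in the space of all terms, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, raw term counts, the choice of cosine similarity, and the names `term_vector`, `cosine_similarity`, and `similarity_matrix` are all assumptions made for illustration. A semantic-feature similarity could be computed the same way on topic-proportion vectors, which are not produced here.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for one document (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dict-like objects."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Pairwise document similarity matrix in term space."""
    vectors = [term_vector(d) for d in docs]
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


# Illustrative toy corpus (hypothetical, not one of the paper's datasets).
docs = [
    "spectral clustering of text documents",
    "clustering text documents with spectral methods",
    "stock market prices fell sharply",
]
S = similarity_matrix(docs)
```

A matrix such as `S` is the kind of input a spectral clustering step consumes; how the paper builds and combines its two similarity matrices during co-training is not specified in the abstract.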
Pages: 38-43
Number of pages: 6