SISC: A Text Classification Approach Using Semi Supervised Subspace Clustering

Authors
Ahmed, Mohammad Salim [1 ]
Khan, Latifur [1 ]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75083 USA
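Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract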
Text classification poses some specific challenges. One such challenge is high dimensionality, where each document (data point) contains only a small subset of the dimensions. In this paper, we propose Semi-supervised Impurity based Subspace Clustering (SISC), used in conjunction with a κ-Nearest Neighbor approach. SISC is based on semi-supervised subspace clustering and accounts for both the high dimensionality and the sparse nature of the dimensions in text data. SISC finds clusters in the subspaces of the high-dimensional text data, where each text document has fuzzy cluster membership. This fuzzy clustering exploits two factors: the chi-square statistic of the dimensions and the impurity measure within each cluster. Empirical evaluation on real-world data sets reveals the effectiveness of our approach, as it significantly outperforms other state-of-the-art text classification and subspace clustering algorithms.
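The two factors named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes the standard 2x2 term/class chi-square statistic and a Gini impurity computed over fuzzy membership weights. The function names `chi_square` and `gini_impurity` are hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    # A zero marginal means the term or class carries no information.
    return numerator / denominator if denominator else 0.0


def gini_impurity(memberships, labels):
    """Gini impurity of one cluster, weighting each labeled document
    by its fuzzy membership in that cluster (an assumed impurity choice)."""
    totals = {}
    for weight, label in zip(memberships, labels):
        totals[label] = totals.get(label, 0.0) + weight
    mass = sum(totals.values())
    if mass == 0:
        return 0.0
    return 1.0 - sum((v / mass) ** 2 for v in totals.values())


if __name__ == "__main__":
    # A term that perfectly separates the class scores high.
    print(chi_square(10, 0, 0, 10))
    # A cluster split evenly between two labels is maximally impure.
    print(gini_impurity([1.0, 1.0], ["sports", "politics"]))
```

Under these assumptions, a dimension with a high chi-square score is strongly associated with a class, and a cluster with low impurity is dominated by one class; how SISC combines the two into its objective is specified in the paper, not here.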
Pages: 1 / 6
Page count: 6