Solving Document Clustering Problem through Meta Heuristic Algorithm- Black Hole

Cited by: 0
Authors
Rafi, Muhammad [1]
Aamer, Bilal [1]
Naseem, Mubashir [1]
Osama, Muhammad [1]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Karachi Campus, Peshawar, Pakistan
DOI
10.1145/3184066.3184085
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a soft computing approach to the document clustering problem. Document clustering is a specialized clustering problem in which textual documents are autonomously segregated into a number of identifiable, subject-homogeneous, smaller sub-collections (also called clusters). Identifying implicit textual patterns within the documents is challenging because there can be thousands of such textual features. Partitional clustering algorithms such as k-means are mainly used for this problem. The k-means algorithm has several drawbacks, including (i) dependency on the initial seeds and (ii) a tendency to become trapped in a locally optimal solution. Nevertheless, every k-means solution may contain some good partial arrangements for clustering. Meta-heuristic algorithms such as Black Hole (BH) use a trade-off between randomization and local search to find an optimal or near-optimal solution. Our motivation comes from the observation that meta-heuristic optimization can quickly produce a globally optimal solution starting from random k-means initial solutions. The contributions of this research are (i) an implementation of the Black Hole algorithm with k-means embedded in it and (ii) the use of the phenomena of global search and local search optimization for parameter adjustment. A series of experiments is performed with the proposed method on standard text mining datasets, namely (i) NEWS20, (ii) Reuters, and (iii) WebKB, and the results are evaluated using Purity and the Silhouette Index. In comparison, the proposed method outperforms basic k-means and GA with k-means embedding, and it converges quickly to a global or near-global optimal solution.
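The abstract does not give the authors' update rules, so the following is only a minimal sketch of the general Black Hole scheme for clustering, not the paper's implementation. It assumes dense document vectors in a NumPy array `X`, within-cluster sum of squared distances as the fitness, and a single Lloyd (k-means) update as the "embedding" applied to each randomly seeded star. The function names `sse`, `lloyd_step`, `black_hole_cluster`, and `purity`, and the parameters `n_stars` and `n_iter`, are hypothetical.

```python
import numpy as np

def sse(X, centroids):
    """Return (sum of squared distances to nearest centroid, labels)."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum(), d2.argmin(axis=1)

def lloyd_step(X, centroids):
    """One k-means update; an empty cluster keeps its old centroid."""
    _, labels = sse(X, centroids)
    new = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members):
            new[j] = members.mean(axis=0)
    return new

def black_hole_cluster(X, k, n_stars=10, n_iter=50, seed=0):
    """Sketch of Black Hole search over sets of k centroids (assumed setup)."""
    rng = np.random.default_rng(seed)

    def random_star():
        # Assumption: seed a star with k distinct data points,
        # then refine it with one k-means step.
        init = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        return lloyd_step(X, init)

    stars = [random_star() for _ in range(n_stars)]
    fit = np.array([sse(X, s)[0] for s in stars])
    for _ in range(n_iter):
        bh = int(fit.argmin())  # the best star acts as the black hole
        for i in range(n_stars):
            if i == bh:
                continue
            # Pull the star toward the black hole by a random fraction.
            stars[i] = stars[i] + rng.random() * (stars[bh] - stars[i])
            fit[i] = sse(X, stars[i])[0]
            if fit[i] < fit[bh]:
                bh = i  # a better star becomes the new black hole
        total = fit.sum()
        radius = fit[bh] / total if total > 0 else 0.0  # event horizon
        for i in range(n_stars):
            if i != bh and np.linalg.norm(stars[i] - stars[bh]) < radius:
                # A star crossing the horizon is replaced by a fresh one.
                stars[i] = random_star()
                fit[i] = sse(X, stars[i])[0]
    bh = int(fit.argmin())
    return stars[bh], sse(X, stars[bh])[1], fit[bh]

def purity(labels_true, labels_pred):
    """Fraction of points that fall in their cluster's majority class."""
    total = 0
    for c in np.unique(labels_pred):
        _, counts = np.unique(labels_true[labels_pred == c], return_counts=True)
        total += counts.max()
    return total / len(labels_true)
```

For real document collections the vectors would typically be sparse TF-IDF rows and the distance would more likely be cosine-based; both choices are left out here to keep the sketch short.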
Pages: 77-81
Number of pages: 5
Related papers
50 in total
  • [21] HEURISTIC ALGORITHM FOR SOLVING THE GENERALIZED DELIVERY PROBLEM
    MELAMED, II
    PLOTINSKII, YM
    AUTOMATION AND REMOTE CONTROL, 1979, 40 (12) : 1845 - 1849
  • [22] A meta-heuristic framework based on clustering and preprocessed datasets for solving the link prediction problem
    Barham, Reham Shawqi
    Sharieh, Ahmad
    Sleit, Azzam
    JOURNAL OF INFORMATION SCIENCE, 2019, 45 (06) : 794 - 817
  • [23] A meta-heuristic algorithm for solving the road network design problem in regional contexts
    Gallo, Mariano
    D'Acierno, Luca
    Montella, Bruno
    PROCEEDINGS OF EWGT 2012 - 15TH MEETING OF THE EURO WORKING GROUP ON TRANSPORTATION, 2012, 54 : 84 - 95
  • [24] Black hole: A new heuristic optimization approach for data clustering
    Hatamlou, Abdolreza
    INFORMATION SCIENCES, 2013, 222 : 175 - 184
  • [25] Solving the course timetabling problem with a hybrid heuristic algorithm
    Lue, Zhipeng
    Hao, Jin-Kao
    ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, 2008, 5253 : 262 - 273
  • [26] An Efficient Heuristic Algorithm for Solving Crane Scheduling Problem
    Xie, Xie
    Li, Yanping
    Zheng, Yongyue
    Li, Xiaoli
    MATERIALS ENGINEERING AND MECHANICAL AUTOMATION, 2014, 442 : 443 - +
  • [27] Heuristic algorithm for solving the integer programming of the lottery problem
    Mohammadi, A.
    Abadi, I. Nakhaei Kamal
    SCIENTIA IRANICA, 2012, 19 (03) : 895 - 901
  • [28] Heuristic algorithm for solving the discrete network design problem
    Chang, Chia-Juch
    Chang, Sheng Hsiung
    Transportation Planning and Technology, 1993, 17 (01)
  • [29] New heuristic algorithm solving the linear ordering problem
    Chanas, Stefan
    Kobylanski, Przemyslaw
    Computational Optimization and Applications, 1996, 6 (02) : 191 - 205
  • [30] A brief survey on Meta-heuristic Approaches for Web Document Clustering
    Singh, Manjit
    Bhasin, Anshu
    Jangra, Surender
    2018 4TH INTERNATIONAL CONFERENCE ON COMPUTING SCIENCES (ICCS), 2018, : 98 - 101