Frequent Term Based Text Document Clustering Using Similarity Measures: A Novel Approach

Cited by: 0
Authors
Gupta, Vijay Kumar [1]
Dutta, Maitreyee [2]
Kumar, Manoj [3]
Affiliations
[1] Govt Girls Polytech, Dept IT, Charkhari, Mahoba, India
[2] NITTTR, Dept CS&E, Chandigarh, India
[3] BBDNITM, Dept IT, Lucknow, Uttar Pradesh, India
Keywords
Clustering; Data Mining; Cosine Similarity; Similarity Index; Fuzzy Logic; Support Vector Machine; Algorithm
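DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract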
Clustering is a long-established way to ensure that documents are retrieved quickly and according to the requirement. It keeps similar documents together so that they can be retrieved easily. The measure used to quantify the relation between two documents is called a similarity index, and several kinds of similarity index are already in use. The proposed algorithm combines two kinds of similarity index to produce a new one. The similarity index plays a vital role in both clustering and classification. The proposed algorithm also uses fuzzy logic for the clustering rules, and the resulting clusters are then classified by a Support Vector Machine to validate the accuracy of the proposed solution.
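The abstract does not say which two similarity indices are combined or how. As a minimal sketch only, the idea of merging two indices into one can be illustrated by assuming cosine similarity (listed in the keywords) and Jaccard similarity, blended with a hypothetical weight `alpha`. The function names, the choice of Jaccard, and the linear weighting are all assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(tokens_a, tokens_b):
    """Jaccard similarity between the sets of distinct terms."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def combined_similarity(tokens_a, tokens_b, alpha=0.5):
    """Hypothetical combined index: a weighted average of the two measures.

    alpha is an assumed blending weight in [0, 1]; the source does not
    specify how its two indices are combined.
    """
    cos = cosine_similarity(tokens_a, tokens_b)
    jac = jaccard_similarity(tokens_a, tokens_b)
    return alpha * cos + (1 - alpha) * jac


doc_a = ["data", "mining", "cluster"]
doc_b = ["data", "cluster", "fuzzy"]
score = combined_similarity(doc_a, doc_b)
```

A combined score like this could then feed a clustering step as the pairwise document similarity; with `alpha=1.0` it reduces to plain cosine similarity and with `alpha=0.0` to plain Jaccard similarity.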
Pages: 164-169
Number of pages: 6