FSKNN: Multi-label text categorization based on fuzzy similarity and k nearest neighbors

Cited by: 59
Authors
Jiang, Jung-Yi [1 ]
Tsai, Shian-Chi [1 ]
Lee, Shie-Jue [1 ]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan
Keywords
Document classification; Multi-label classification; Fuzzy similarity measure; k-nearest neighbor algorithm; Maximum a posteriori estimate; LEARNING APPROACH; KNN;
DOI
10.1016/j.eswa.2011.08.141
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose an efficient approach, FSKNN, which employs a fuzzy similarity measure (FSM) and k nearest neighbors (KNN) for multi-label text classification. One of the problems associated with KNN-like approaches is their demanding computational cost in finding the k nearest neighbors among all the training patterns. In FSKNN, the FSM is used to group the training patterns into clusters. Then, only the training documents in those clusters whose fuzzy similarities to the document exceed a predesignated threshold are considered when finding the k nearest neighbors for that document. An unseen document is labeled based on its k nearest neighbors using the maximum a posteriori estimate. Experimental results show that our proposed method can work more effectively than other methods. (C) 2011 Elsevier Ltd. All rights reserved.
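The pipeline in the abstract (cluster, filter clusters by threshold, search neighbors in the reduced pool, label by maximum a posteriori) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define the fuzzy similarity measure, so cosine similarity to a per-category centroid stands in for it, clusters are assumed to be one per category, and the MAP step follows the common ML-KNN style counting scheme with Laplace smoothing `s`. All function names, the `threshold`, `k`, and `s` parameters, and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    # Stand-in similarity (assumption): the paper's fuzzy measure is not given here.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_clusters(X, Y):
    # Assumption: one cluster per category, summarized by its centroid.
    clusters = defaultdict(list)
    for i, labels in enumerate(Y):
        for lab in labels:
            clusters[lab].append(i)
    dim = len(X[0])
    centroids = {
        lab: [sum(X[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        for lab, idxs in clusters.items()
    }
    return dict(clusters), centroids

def candidate_neighbors(x, X, clusters, centroids, threshold, k, exclude=None):
    # Search only documents in clusters whose similarity to x exceeds the threshold.
    pool = set()
    for lab, centroid in centroids.items():
        if cosine(x, centroid) > threshold:
            pool.update(clusters[lab])
    pool.discard(exclude)
    ranked = sorted(pool, key=lambda i: (-cosine(x, X[i]), i))
    return ranked[:k]

def fit(X, Y, k, threshold, s=1.0):
    labels = sorted({lab for ys in Y for lab in ys})
    clusters, centroids = build_clusters(X, Y)
    n = len(X)
    # Smoothed prior probability that a document carries each label.
    prior = {lab: (s + len(clusters[lab])) / (2 * s + n) for lab in labels}
    # c1[lab][c]: training docs WITH lab that have c neighbors with lab; c0: docs without.
    c1 = {lab: [0] * (k + 1) for lab in labels}
    c0 = {lab: [0] * (k + 1) for lab in labels}
    for i in range(n):
        nbrs = candidate_neighbors(X[i], X, clusters, centroids, threshold, k, exclude=i)
        for lab in labels:
            c = sum(1 for j in nbrs if lab in Y[j])
            (c1 if lab in Y[i] else c0)[lab][c] += 1
    return dict(X=X, Y=Y, k=k, threshold=threshold, s=s, labels=labels,
                clusters=clusters, centroids=centroids, prior=prior, c1=c1, c0=c0)

def predict(m, x):
    nbrs = candidate_neighbors(x, m["X"], m["clusters"], m["centroids"],
                               m["threshold"], m["k"])
    assigned = set()
    for lab in m["labels"]:
        c = sum(1 for j in nbrs if lab in m["Y"][j])
        denom1 = m["s"] * (m["k"] + 1) + sum(m["c1"][lab])
        denom0 = m["s"] * (m["k"] + 1) + sum(m["c0"][lab])
        like1 = (m["s"] + m["c1"][lab][c]) / denom1
        like0 = (m["s"] + m["c0"][lab][c]) / denom0
        p1 = m["prior"][lab]
        # MAP decision: assign the label if its posterior beats the alternative.
        if p1 * like1 > (1 - p1) * like0:
            assigned.add(lab)
    return assigned

# Toy term-weight vectors (hypothetical data, two categories).
X = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.1, 1.0]]
Y = [{"sports"}, {"sports"}, {"sports"}, {"politics"}, {"politics"}, {"politics"}]
model = fit(X, Y, k=2, threshold=0.5)
print(predict(model, [0.95, 0.05]))
```

The saving comes from `candidate_neighbors`: similarity to a handful of cluster summaries is computed first, and the per-document neighbor search runs only inside the clusters that pass the threshold rather than over the whole training set.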
Pages: 2813-2821
Number of pages: 9