Nonparametric method of topic identification using granularity concept and graph-based modeling

Cited by: 4
Authors
Ganguli, Isha [1]
Sil, Jaya [1]
Sengupta, Nandita [2]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Sibpur, Howrah, India
[2] Univ Coll Bahrain, Dept Informat Technol, Janabiyah, Bahrain
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 02
Keywords
Granularity; Point-wise mutual information; Graph-based modeling; Hierarchical structure; Computationally efficient algorithm; DOCUMENT; CLASSIFICATION
DOI
10.1007/s00521-020-05662-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper aims to classify large unstructured documents into different topics without requiring large computational resources or a priori knowledge. The concept of granularity is employed to extract contextual information from the documents by hierarchically generating granules of words (GoWs). The proposed granularity-based word grouping (GBWG) algorithm groups words at different layers in a computationally efficient way, using a co-occurrence measure between the words of different granules. The GBWG algorithm terminates when no new GoW is generated at any layer of the hierarchical structure. Thus, multiple GoWs are obtained, each of which contains contextually related words and represents a different topic. However, the GoWs may contain common words, which creates ambiguity in topic identification. The Louvain graph clustering algorithm is therefore employed to automatically identify topics containing unique words, using mutual information as an association measure between the words (nodes) of each GoW. A test document is classified into a particular topic based on the probability that its unique words belong to the different topics. The performance of the proposed method is compared with that of other unsupervised, semi-supervised, and supervised topic modeling algorithms. Experiments show that the proposed method is comparable to or better than state-of-the-art topic modeling algorithms, a result that is further verified statistically with the Wilcoxon rank-sum test.
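As an illustration of the kind of association measure the abstract refers to, the following is a minimal sketch of point-wise mutual information (PMI) between word pairs. It is not the authors' GBWG implementation: the function name `pmi_scores`, the toy documents, and the choice to count co-occurrence at the document level are all assumptions made for this example, since the abstract does not specify how co-occurrence is counted.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return PMI (log base 2) for every word pair that co-occurs in a document.

    Assumption for this sketch: two words "co-occur" when they appear in the
    same document, and probabilities are estimated as document frequencies.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = sorted(set(doc))  # unique words, sorted so pair keys are stable
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))

    scores = {}
    for (w1, w2), joint in pair_counts.items():
        p_joint = joint / n_docs
        p1 = word_counts[w1] / n_docs
        p2 = word_counts[w2] / n_docs
        scores[(w1, w2)] = math.log2(p_joint / (p1 * p2))
    return scores


# Hypothetical toy corpus, already tokenized.
docs = [
    ["graph", "cluster", "node"],
    ["graph", "node", "edge"],
    ["topic", "word", "document"],
    ["topic", "document", "cluster"],
]
scores = pmi_scores(docs)
print(scores[("graph", "node")])     # words that always appear together
print(scores[("cluster", "graph")])  # words that co-occur only at chance level
```

Pairs with high PMI could serve as weighted edges in a word graph, which a community detection method such as Louvain would then partition. How the paper builds and weights that graph is not stated in the abstract, so that step is left out here.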
Pages: 1055-1075
Number of pages: 21