Multi-label text classification based on the label correlation mixture model

Cited by: 6
Authors
He, Zhiyang [1]
Wu, Ji [1]
Lv, Ping [2]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Tsinghua iFlytek Joint Lab Speech Technol, Beijing, Peoples R China
Keywords
Label correlation mixture model; probabilistic generative model; multi-label text classification; label correlation model; label correlation network; Bayes decision theory; DESIGN
DOI
10.3233/IDA-163055
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a probabilistic generative model, the label correlation mixture model (LCMM), to describe multi-labeled document data; the model can be used for multi-label text classification. LCMM assumes two stochastic generative processes, which correspond to two submodels: 1) a label correlation model; and 2) a label mixture model. The first submodel formulates the generative process of labels, in which a label correlation network is created to capture the dependencies between labels. We present an efficient inference algorithm for calculating the generative probability of a multi-label class. To optimize the label correlation network, we also propose a parameter-learning algorithm based on gradient descent. The second submodel depicts the generative process of the words in a document given its labels. Various traditional mixture models can be adopted in this generative process, such as a mixture of language models or topic models. In the multi-label classification stage, we propose a two-step strategy, based on the framework of Bayes decision theory, to use the LCMM efficiently. We conduct extensive multi-label classification experiments on three standard text data sets. The experimental results show significant performance improvements over existing approaches. For example, the improvements in accuracy and macro F-score on the OHSUMED data set reach 28.3% and 37.0%, respectively. These performance gains demonstrate the effectiveness of the proposed models and solutions.
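The Bayes-decision idea in the abstract (score a candidate label set by a label prior times a word likelihood drawn from a mixture of language models) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `LABEL_LM`, `LABEL_SET_PRIOR`, `log_likelihood`, and `classify`, the toy probabilities, the uniform mixture weights, and the use of a fixed prior table in place of the label correlation network. It performs a single exhaustive search over candidate label sets and does not reproduce the paper's two-step strategy or its inference algorithm.

```python
import math

# Hypothetical per-label unigram language models P(w | label); toy values.
LABEL_LM = {
    "cardiology": {"heart": 0.6, "drug": 0.3, "tumor": 0.1},
    "oncology": {"heart": 0.1, "drug": 0.3, "tumor": 0.6},
}

# Hypothetical prior P(L) over label sets. This fixed table only stands in
# for the label correlation model; it is not the paper's network.
LABEL_SET_PRIOR = {
    frozenset({"cardiology"}): 0.4,
    frozenset({"oncology"}): 0.4,
    frozenset({"cardiology", "oncology"}): 0.2,
}


def log_likelihood(words, labels):
    """log P(d | L): each word comes from a uniform mixture of the label LMs."""
    total = 0.0
    for w in words:
        # Small floor (assumed) for words a label model has never seen.
        p = sum(LABEL_LM[label].get(w, 1e-6) for label in labels) / len(labels)
        total += math.log(p)
    return total


def classify(words):
    """Return the label set maximizing log P(L) + log P(d | L)."""
    return max(
        LABEL_SET_PRIOR,
        key=lambda L: math.log(LABEL_SET_PRIOR[L]) + log_likelihood(words, L),
    )
```

With these toy numbers, a document dominated by one label's vocabulary is assigned that single label, while a document mixing both vocabularies can score highest under the two-label set despite its lower prior.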
Pages: 1371-1392
Page count: 22