Integrating information by Kullback-Leibler constraint for text classification

Cited by: 1
Authors
Yin, Shu [1 ,2 ]
Zhu, Peican [2 ]
Wu, Xinyu [3 ]
Huang, Jiajin [4 ]
Li, Xianghua [2 ]
Wang, Zhen [2 ]
Gao, Chao [2 ,3 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Shaanxi, Peoples R China
[3] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
[4] Beijing Univ Technol, Fac Informat Technol, Beijing 100083, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 24
Funding
National Natural Science Foundation of China;
Keywords
Text classification; Graph neural network; Kullback-Leibler divergence; Constraint; CONVOLUTIONAL NEURAL-NETWORKS;
DOI
10.1007/s00521-023-08602-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text classification is an important task that underlies various text-related downstream tasks, such as fake news detection, sentiment analysis, and question answering. In recent years, graph-based methods have achieved excellent results in text classification. Instead of regarding a text as a sequence, these methods regard it as a set of co-occurring words. Text classification is then accomplished by aggregating information from neighboring nodes with a graph neural network. However, existing corpus-level graph models have difficulty incorporating local semantic information and classifying newly arriving texts. To address these issues, we propose a Global-Local Text Classification (GLTC) model, which uses KL constraints to realize inductive learning for text classification. First, a global structural feature extractor and a local semantic feature extractor are designed to capture the structural and semantic information of a text comprehensively. Then, the KL divergence is introduced as a regularization term in the loss calculation, which ensures that the global structural feature extractor constrains the learning of the local semantic feature extractor and thereby achieves inductive learning. Comprehensive experiments on benchmark datasets show that GLTC outperforms baseline methods in terms of accuracy.
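To make the role of the KL regularization term concrete, the following is a minimal sketch, not the GLTC implementation. It assumes the loss is a cross-entropy term on the local branch plus a weighted KL divergence from the global branch's class distribution to the local one. The function names, the direction of the divergence, and the `weight` parameter are hypothetical choices for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def kl_regularized_loss(global_logits, local_logits, label, weight=0.5):
    """Cross-entropy on the local branch plus a KL penalty (assumed form).

    The KL term pulls the local branch's prediction toward the global
    branch's prediction; `weight` is a hypothetical trade-off coefficient.
    """
    p_global = softmax(global_logits)
    p_local = softmax(local_logits)
    ce = cross_entropy(p_local, label)
    kl = kl_divergence(p_global, p_local)
    return ce + weight * kl


if __name__ == "__main__":
    # Hypothetical three-class scores from the two extractors.
    loss = kl_regularized_loss([2.0, 0.5, -1.0], [1.0, 0.8, -0.5], label=0)
    print(f"regularized loss: {loss:.4f}")
```

When the two branches agree exactly, the KL term vanishes and the loss reduces to plain cross-entropy. The penalty grows as the local prediction drifts away from the global one.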
Pages: 17521 / 17535
Page count: 15
Related papers
50 in total
  • [1] Integrating information by Kullback–Leibler constraint for text classification
    Shu Yin
    Peican Zhu
    Xinyu Wu
    Jiajin Huang
    Xianghua Li
    Zhen Wang
    Chao Gao
    [J]. Neural Computing and Applications, 2023, 35 : 17521 - 17535
  • [2] Kullback-Leibler information and interval estimation
    Shanmugam, R
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1999, 28 (09) : 2057 - 2063
  • [3] On cumulative residual Kullback-Leibler information
    Park, Sangun
    Rao, Murali
    Shin, Dong Wan
    [J]. STATISTICS & PROBABILITY LETTERS, 2012, 82 (11) : 2025 - 2032
  • [4] Using Kullback-Leibler distance for text categorization
    Bigi, B
    [J]. ADVANCES IN INFORMATION RETRIEVAL, 2003, 2633 : 305 - 319
  • [5] General cumulative Kullback-Leibler information
    Park, Sangun
    Noughabi, Hadi Alizadeh
    Kim, Ilmun
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2018, 47 (07) : 1551 - 1560
  • [6] Feature Selection Algorithm for Hierarchical Text Classification Using Kullback-Leibler Divergence
    Yao Lifang
    Qin Sijun
    Zhu Huan
    [J]. 2017 2ND IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2017), 2017, : 421 - 424
  • [7] Alternative Kullback-Leibler Information Entropy for Enantiomers
    Janssens, Sara
    Bultinck, Patrick
    Borgoo, Alex
    Van Alsenoy, Christian
    Geerlings, Paul
    [J]. JOURNAL OF PHYSICAL CHEMISTRY A, 2010, 114 (01) : 640 - 645
  • [8] TESTING EXPONENTIALITY BASED ON KULLBACK-LEIBLER INFORMATION
    EBRAHIMI, N
    HABIBULLAH, M
    SOOFI, ES
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1992, 54 (03) : 739 - 748
  • [10] Modulation Classification Based on Kullback-Leibler Divergence
    Im, Chaewon
    Ahn, Seongjin
    Yoon, Dongweon
    [J]. 15TH INTERNATIONAL CONFERENCE ON ADVANCED TRENDS IN RADIOELECTRONICS, TELECOMMUNICATIONS AND COMPUTER ENGINEERING (TCSET - 2020), 2020, : 373 - 376