Text classification on heterogeneous information network via enhanced GCN and knowledge

Cited by: 4
Authors
Li, Hui [1]
Yan, Yan [5]
Wang, Shuo [4]
Liu, Juan [2, 3]
Cui, Yunpeng [2, 3]
Affiliations
[1] Capital Univ Econ & Business, Sch Lab Econ, Beijing, Peoples R China
[2] Minist Agr & Rural, Key Lab Agr Big Data, Beijing, Peoples R China
[3] Chinese Acad Agr Sci, Agr Informat Inst, Beijing, Peoples R China
[4] Commonwealth Sci & Ind Res Org, Data61, Melbourne, Australia
[5] Chinese Acad Agr Sci, Inst Agr Econ & Dev, Beijing, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 20
Keywords
Text classification; Graph convolutional networks; Knowledge graph; Heterogeneous information network; Pre-trained model; CONVOLUTIONAL NEURAL-NETWORK; ATTENTION MECHANISM; BIDIRECTIONAL LSTM; WEB;
DOI
10.1007/s00521-023-08494-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Graph convolutional network (GCN)-based text classification methods have improved classification results by modeling the structural relationships between words and texts. However, existing GCN-based methods tend to ignore the semantic representation of nodes and the global structural information among nodes. In addition, they represent a text using only the word-granularity information within it, i.e., an endogenous source. Furthermore, existing graph convolutional network approaches face major challenges in handling large and dense graphs, namely neighbor explosion and noisy inputs. To address these shortcomings, this paper proposes an inductive learning-based text classification method that utilizes representation learning on heterogeneous information networks and exogenous knowledge. First, a weighted heterogeneous information network for text (HINT) is constructed by introducing exogenous knowledge, in which the node types cover texts, entities and words. The unstructured text is thus represented as a structured heterogeneous information network, which expands the granularity of text features and makes full use of exogenous structural information and explicit semantic information to enhance the interpretability of text information. Second, the graph neural network is enhanced against the neighbor explosion and noisy inputs arising from HINT using two strategies, graph sampling and DropEdge, for semi-supervised learning with improved classification performance. The effectiveness of the model is demonstrated on four publicly available text classification datasets, on which the approach achieves state-of-the-art performance.
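The two ideas named in the abstract, a weighted graph over text, entity and word nodes and random edge removal (DropEdge), can be illustrated with a toy sketch. This is not the authors' implementation: the function names `build_hint` and `drop_edge`, the `entity_lexicon` lookup, and the use of raw term counts as edge weights are all assumptions made for illustration, since the abstract does not specify the weighting scheme or how entities are linked. Graph sampling is not shown.

```python
import random
from collections import defaultdict


def build_hint(texts, entity_lexicon):
    """Build a toy weighted heterogeneous graph (hypothetical helper).

    Nodes are (type, name) tuples with type in {"text", "word", "entity"}.
    Returns a dict mapping an edge (text_node, other_node) to a weight.
    ASSUMPTION: weights are raw term counts; the paper may weight differently.
    """
    edges = defaultdict(float)
    for i, text in enumerate(texts):
        doc = ("text", f"d{i}")
        for token in text.lower().split():
            # text-word edge for every token
            edges[(doc, ("word", token))] += 1.0
            # text-entity edge when the token maps to a known entity
            if token in entity_lexicon:
                edges[(doc, ("entity", entity_lexicon[token]))] += 1.0
    return dict(edges)


def drop_edge(edges, p, rng):
    """DropEdge: remove each edge independently with probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return {edge: w for edge, w in edges.items() if rng.random() >= p}


# Toy inputs; the lexicon stands in for an external knowledge source.
texts = ["Wheat prices rise in Beijing", "Beijing hosts wheat summit"]
lexicon = {"beijing": "Beijing_(city)", "wheat": "Wheat_(crop)"}

graph = build_hint(texts, lexicon)
sparser = drop_edge(graph, p=0.3, rng=random.Random(0))
```

In a training loop, `drop_edge` would typically be re-applied with fresh randomness at each epoch, so the model sees a different sparsified graph every time while the full graph is used at inference.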
Pages: 14911-14927
Page count: 17