Hierarchical multi-instance multi-label learning for Chinese patent text classification

Authors
Liu, Yunduo [1, 2]
Xu, Fang [3]
Zhao, Yushan [2, 4]
Ma, Zichen [1, 2]
Wang, Tengke [1, 2]
Zhang, Shunxiang [1, 2]
Tian, Yuhao [5]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Comp Sci & Engn, Huainan, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
[3] Anhui Univ Sci & Technol, Sch Foreign Languages, Huainan, Peoples R China
[4] Anhui Univ Sci & Technol, Sch Math & Big Data, Huainan, Peoples R China
[5] Macau Univ Sci & Technol, Fac Innovat Engn, Macau, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Patent text classification; IPC; secondary_label; multi-instance multi-label learning; patent claim;
DOI
10.1080/09540091.2023.2295818
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To further improve the accuracy of Chinese patent classification, this paper proposes a model that builds on the structure of patents, takes patent claims as its object of study, and uses multi-instance multi-label learning as its main method. First, the patent claims are divided into multiple independent texts, using the claim sequence number as the splitting token. For each patent, its claims are regarded as multiple instances, and the corresponding IPCs (International Patent Classification codes) serve as its multiple labels. Next, the concept of a secondary_label is introduced following the composition rules of the IPC, and the relationships between instances and multiple secondary_labels are mined by constructing fully connected layers. To capture more comprehensive semantic information from the instances, BiGRU and self-attention are employed to enhance semantics and reduce information loss during training. Finally, max-pooling operations are used to obtain the predicted categories of a patent from the captured relationships between instances and the labels at different hierarchical levels. Experimental results on the '2017 Chinese patent dataset' demonstrate that the multi-instance multi-label approach can effectively mine deeper relationships between patents and labels in classification tasks. As a result, the model significantly improves the accuracy of patent text classification.
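The data flow described in the abstract (claims as instances, IPCs as labels, max-pooling from instance level to patent level) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the splitting regex, the 0.5 threshold, and the example scores are all hypothetical. In particular, the abstract does not define how a secondary_label is derived from an IPC, so the sketch simply assumes it is the three-character class prefix (for example "G06" for "G06F"). The BiGRU, self-attention, and fully connected layers are omitted; the instance scores are taken as given.

```python
import re


def split_claims(claims_text):
    """Split a claims section into instances, one per numbered claim.

    Assumption: each claim starts on its own line with a sequence number
    followed by '.', '．' or '、'. Real patent text may need a different rule.
    """
    parts = re.split(r"(?m)^\s*\d+\s*[.．、]\s*", claims_text)
    return [p.strip() for p in parts if p.strip()]


def secondary_label(ipc):
    """Return a coarser label for an IPC code.

    Assumption (not from the source): the secondary label is the
    section letter plus two class digits, i.e. the first three characters.
    """
    return ipc.strip().upper()[:3]


def max_pool_bag(instance_scores):
    """Max-pool per-instance label scores into one score per label.

    instance_scores is a list of rows, one row per claim (instance),
    each row holding one score per label.
    """
    return [max(column) for column in zip(*instance_scores)]


def predict_labels(instance_scores, label_names, threshold=0.5):
    """Keep every label whose pooled (patent-level) score reaches the threshold."""
    pooled = max_pool_bag(instance_scores)
    return [name for name, score in zip(label_names, pooled) if score >= threshold]


# Hypothetical usage: two claims, three candidate IPC labels.
claims = split_claims(
    "1. A device comprising a sensor.\n"
    "2. The device of claim 1, wherein the sensor is optical."
)
labels = ["G06F", "H04L", "A61K"]
scores = [
    [0.1, 0.8, 0.3],  # made-up scores for claim 1
    [0.7, 0.2, 0.4],  # made-up scores for claim 2
]
print(len(claims), "instances")
print("patent-level labels:", predict_labels(scores, labels))
print("secondary labels:", sorted({secondary_label(l) for l in labels}))
```

Max-pooling means a patent receives a label as soon as any single claim supports it strongly, which matches the multi-instance assumption that one instance can be enough to justify a bag-level label.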
Pages: 24