Hierarchical multi-instance multi-label learning for Chinese patent text classification

Authors
Liu, Yunduo [1, 2]
Xu, Fang [3]
Zhao, Yushan [2, 4]
Ma, Zichen [1, 2]
Wang, Tengke [1, 2]
Zhang, Shunxiang [1, 2]
Tian, Yuhao [5]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Comp Sci & Engn, Huainan, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
[3] Anhui Univ Sci & Technol, Sch Foreign Languages, Huainan, Peoples R China
[4] Anhui Univ Sci & Technol, Sch Math & Big Data, Huainan, Peoples R China
[5] Macau Univ Sci & Technol, Fac Innovat Engn, Macau, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Patent text classification; IPC; secondary_label; multi-instance multi-label learning; patent claim
DOI
10.1080/09540091.2023.2295818
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To further improve the accuracy of Chinese patent classification, this paper proposes a model that builds on the patent structure, takes patent claims as its subject, and uses multi-instance multi-label learning as its main method. First, the patent claims are divided into multiple independent texts, using the sequence number as the splitting token. For each patent, the claims are regarded as multiple instances, and the corresponding IPC codes serve as its multiple labels. Next, the concept of secondary_label is introduced following the composition rules of the IPC, and the relationships between instances and multiple secondary_labels are mined through fully connected layers. To capture more comprehensive semantic information from the instances, BiGRU and self-attention are employed to enhance semantics and reduce information loss during training. Finally, max-pooling operations are used to obtain the predicted categories of each patent from the captured relationships between instances and labels at different hierarchical levels. Experimental results on the '2017 Chinese patent dataset' demonstrate that the multi-instance multi-label approach can effectively mine deeper relationships between patents and labels in classification tasks. As a result, our model significantly improves the accuracy of patent text classification.
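The data flow described in the abstract (claims split into instances, IPC codes mapped to coarser labels, instance scores max-pooled into patent-level predictions) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the numbering regex, the 0.5 threshold, and the example scores are all hypothetical. The abstract does not define secondary_label precisely, so the sketch assumes it is a coarser IPC prefix (here the three-character class, e.g. G06 for G06F17/30). The BiGRU, self-attention, and fully connected layers that would produce the instance scores are omitted.

```python
import re


def split_claims(claims_text):
    """Split a claims section into instances at leading sequence numbers.

    Assumption: each claim starts on its own line with a number followed by
    '.', '、' or '．' (e.g. '1.' or '2、'); real patent texts may differ.
    """
    parts = re.split(r"(?:^|\n)\s*\d+\s*[.、．](?!\d)\s*", claims_text)
    return [p.strip() for p in parts if p.strip()]


def secondary_label(ipc_code, prefix_len=3):
    """Return a coarser IPC prefix as a stand-in for secondary_label.

    Assumption: the secondary label is the IPC class (first three characters).
    """
    return ipc_code.strip().upper()[:prefix_len]


def bag_scores(instance_scores):
    """Max-pool per-instance label scores into one score per label."""
    return [max(column) for column in zip(*instance_scores)]


def predict_labels(instance_scores, labels, threshold=0.5):
    """Keep every label whose max-pooled score reaches the threshold."""
    pooled = bag_scores(instance_scores)
    return [label for label, score in zip(labels, pooled) if score >= threshold]


claims = (
    "1. A device comprising a frame.\n"
    "2. The device of claim 1, wherein the frame is steel."
)
instances = split_claims(claims)

labels = ["G06F", "H04L", "A61K"]
# Hypothetical per-claim scores, one row per instance, one column per label.
scores = [
    [0.9, 0.1, 0.2],
    [0.3, 0.7, 0.1],
]
predicted = predict_labels(scores, labels)
```

Max-pooling encodes the usual multi-instance assumption that a patent carries a label if at least one of its claims supports it, which is why a single high-scoring claim is enough to trigger a prediction.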
Pages: 24