DeepPatent: patent classification with convolutional neural networks and word embedding

Cited by: 108
Authors
Li, Shaobo [1, 2]
Hu, Jie [1, 3]
Cui, Yuxin [3]
Hu, Jianjun [2, 3]
Affiliations
[1] Guizhou Univ, Minist Educ, Key Lab Adv Mfg Technol, Guiyang 550025, Guizhou, Peoples R China
[2] Guizhou Univ, Sch Mech Engn, Guiyang 550025, Guizhou, Peoples R China
[3] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29208 USA
Funding
National Natural Science Foundation of China
Keywords
Patent classification; Text classification; Convolutional neural network; Machine learning; Word embedding; 94-02; Y; TECHNOLOGY; SELECTION; REPRESENTATIONS;
DOI
10.1007/s11192-018-2905-5
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually because current algorithms perform unsatisfactorily. Recently, deep learning methods such as convolutional neural networks (CNNs) have led to great progress in image processing, voice recognition, and speech recognition, but they have yet to be applied to patent classification. We propose DeepPatent, a deep learning algorithm for patent classification based on a CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent, with automatic feature extraction, achieved a classification precision of 83.98%, outperforming all existing algorithms that used the same information for training. It also performs better than the state-of-the-art patent classifier, which reaches a precision of 83.50% but relies on 4000 characters from the description section and extensive feature engineering, whereas DeepPatent uses only the title and abstract. DeepPatent was further tested on USPTO-2M, a patent classification benchmark dataset that we contributed, containing 2,000,147 records obtained by cleaning 2,679,443 raw USA utility patent documents in 637 categories at the subclass level. On this dataset, our algorithm achieved a precision of 73.88%.
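The abstract names the two building blocks, word vector embedding and a CNN, without giving the architecture. As a rough illustration only, the sketch below shows a generic forward pass of a convolutional text classifier: embedding lookup, one-dimensional convolution over word windows, max-over-time pooling, and a softmax output. It is not the DeepPatent model. The vocabulary, dimensions, filter width, label names, and random (untrained) weights are all assumptions made for this example.

```python
import math
import random


def embed(tokens, table, dim):
    """Look up each token's vector; unknown tokens map to a zero vector."""
    return [table.get(tok, [0.0] * dim) for tok in tokens]


def conv_max_pool(vectors, kernel, bias):
    """Slide one filter over windows of consecutive word vectors,
    apply ReLU, and keep the largest activation (max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(vectors) - width + 1):
        total = bias
        for offset in range(width):
            word = vectors[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], word))
        best = max(best, total)
    return best


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify(tokens, table, filters, out_weights, dim):
    """Embedding -> convolution + pooling -> linear layer -> softmax."""
    vectors = embed(tokens, table, dim)
    features = [conv_max_pool(vectors, k, b) for k, b in filters]
    scores = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return softmax(scores)


# Hypothetical toy setup: all sizes, labels, and weights are assumptions.
rng = random.Random(0)
DIM, N_FILTERS, WIDTH = 4, 3, 2
LABELS = ["class_a", "class_b"]
VOCAB = ["patent", "classification", "neural", "network", "embedding"]

table = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in VOCAB}
filters = [
    ([[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(WIDTH)], 0.0)
    for _ in range(N_FILTERS)
]
out_weights = [[rng.uniform(-1, 1) for _ in range(N_FILTERS)] for _ in LABELS]

tokens = "patent classification with neural network".split()
probs = classify(tokens, table, filters, out_weights, DIM)
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.3f}")
```

Because the weights are random, the printed probabilities carry no meaning. In a trained model the embedding table, filters, and output weights would be learned from labelled patent titles and abstracts.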
Pages: 721-744
Page count: 24
Related papers
50 in total
  • [41] An Analysis of Convolutional Neural Networks for Sentence Classification
    Albuquerque Vieira, Joao Paulo
    Moura, Raimundo Santos
    2017 XLIII LATIN AMERICAN COMPUTER CONFERENCE (CLEI), 2017
  • [42] An Ensemble of Convolutional Neural Networks for Audio Classification
    Nanni, Loris
    Maguolo, Gianluca
    Brahnam, Sheryl
    Paci, Michelangelo
    APPLIED SCIENCES-BASEL, 2021, 11 (13)
  • [43] Ship classification based on convolutional neural networks
    Li Zhenzhen
    Zhao Baojun
    Tang Linbo
    Li Zhen
    Feng Fan
    JOURNAL OF ENGINEERING-JOE, 2019, 2019 (21): 7343 - 7346
  • [44] CONVOLUTIONAL NEURAL NETWORKS IN THE TASK OF IMAGE CLASSIFICATION
    Zelenina, Larisa
    Khaimina, Liudmila
    Khaimin, Evgenii
    Khripunov, D.
    Zashikhina, Inga
    MATHEMATICS AND INFORMATICS, 2022, 65 (01): 19 - 29
  • [45] Classification of Radar Signals with Convolutional Neural Networks
    Hong, Seok-Jun
    Yi, Yearn-Gui
    Jo, Jeil
    Seo, Bo-Seok
    2018 TENTH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS (ICUFN 2018), 2018: 894 - 896
  • [46] Plant Classification using Convolutional Neural Networks
    Yalcin, Hulya
    Razavi, Salar
    2016 FIFTH INTERNATIONAL CONFERENCE ON AGRO-GEOINFORMATICS (AGRO-GEOINFORMATICS), 2016: 233 - 237
  • [47] Classification of Environmental Sounds with Convolutional Neural Networks
    Dincer, Yalcin
    Inik, Ozkan
    KONYA JOURNAL OF ENGINEERING SCIENCES, 2023, 11 (02)
  • [48] Sound Classification Using Convolutional Neural Networks
    Jaiswal, Kaustumbh
    Patel, Dhairya Kalpeshbhai
    2018 SEVENTH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING IN EMERGING MARKETS (CCEM), 2018: 81 - 84
  • [49] Aerial Scene Classification with Convolutional Neural Networks
    Jia, Sibo
    Liu, Huaping
    Sun, Fuchun
    ADVANCES IN NEURAL NETWORKS - ISNN 2015, 2015, 9377 : 258 - 265
  • [50] Convolutional Neural Networks for Web Documents Classification
    Artene, Codrut-Georgian
    Tibeica, Marius Nicolae
    Vecliuc, Dumitru Daniel
    Leon, Florin
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2021, 2021, 12672 : 289 - 302