Patent document clustering with deep embeddings

Cited by: 29
Authors
Kim, Jaeyoung [1]
Yoon, Janghyeok [2]
Park, Eunjeong [3]
Choi, Sungchul [1]
Affiliations
[1] Gachon Univ, Dept Ind Management Engn, TEAMLAB, Seongnam Si, Gyeonggi Do, South Korea
[2] Konkuk Univ, Dept Ind Engn, Seoul, South Korea
[3] NAVER, Seongnam Si, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Information embedding; Patent clustering; Deep learning; Text mining; Citation networks
DOI
10.1007/s11192-020-03396-7
Chinese Library Classification number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
The analysis of scientific and technical documents is crucial in the process of establishing science and technology strategies. One popular method for such analysis is for field experts to manually classify each scientific or technical document into one of several predefined technical categories. However, not only is manual classification error-prone and expensive, but it also requires extended efforts to handle frequent data updates. In contrast, machine learning and text mining techniques enable cheaper and faster operations, and can alleviate the burden on human resources. In this paper, we propose a method for extracting embedded feature vectors by applying a neural embedding approach for text features in patent documents and automatically clustering the embedding features by utilizing a deep embedding clustering method.
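The clustering step named in the abstract can be illustrated with a minimal sketch. It assumes that "deep embedding clustering" refers to the DEC formulation (soft assignments from a Student's t kernel, sharpened into a target distribution, with a KL-divergence loss); the record itself does not confirm this. The toy embeddings, centroids, and function names below are hypothetical, and the neural text-embedding and encoder-training stages are not reproduced.

```python
import math


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of embedding i to cluster j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over clusters
    return q


def target_distribution(q):
    """Sharpened targets: p[i][j] proportional to q[i][j]^2 / f_j, f_j = sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all documents and clusters (the clustering loss)."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )


# Hypothetical 2-D document embeddings and initial centroids (assumed values;
# in practice the centroids would typically come from k-means on the embeddings).
embeddings = [[0.0, 0.1], [0.2, 0.0], [2.9, 3.1], [3.0, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)

# Hard cluster label for each document: the cluster with the largest q value.
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
print(labels, round(loss, 4))
```

In a full pipeline the loss would be minimized by gradient descent over both the encoder weights and the centroids, with the targets recomputed periodically; the sketch only evaluates one such step on fixed inputs.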
Pages: 563-577
Page count: 15