Doc2vec-based link prediction approach using SAO structures: application to patent network

Cited by: 12
Authors
Yoon, Byungun [1]
Kim, Songhee [1]
Kim, Sunhye [1]
Seol, Hyeonju [2]
Affiliations
[1] Dongguk Univ, Dept Ind & Syst Engn, Seoul 04620, South Korea
[2] Chungnam Natl Univ, Sch Integrated Natl Secur, Daejeon 34134, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Link prediction; Patent network; Doc2vec; Document embedding; Unmanned aerial vehicle; EXTRACTION;
DOI
10.1007/s11192-021-04187-4
Chinese Library Classification number
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
As the number of documents has exploded in the Internet era, many researchers have tried to understand the relationships between documents and to predict links between similar but unconnected documents. However, existing link prediction techniques that rely on the predefined links of documents might produce incorrect results because of the generic problems of citation analysis. Moreover, they may fail to reflect important document content in the link prediction process. Thus, we propose a new link prediction approach that employs the Doc2vec algorithm, a document-embedding method, to predict potential links between documents by reflecting the functional context of technological words. First, we collected both the citation information and the documents of the patents of interest, and generated a patent network from the citation relationships between patents. Second, we identified unconnected links between nodes and transformed the patent documents into document vectors using the Doc2vec algorithm. In particular, since patent documents describe functions that are useful for solving technological problems, the proposed approach extracts subject-action-object (SAO) structures, which we used to generate the document vectors. We then calculated the similarity between the patents in the unconnected links of the patent network and used that similarity to predict potential links. Third, we validated the results of the proposed approach by comparing them with those of the Adamic-Adar technique, one of the traditional link prediction techniques, and of word vector-based link prediction. We applied the Doc2vec-based link prediction approach to a real case, the unmanned aerial vehicle (UAV) technology field. We found that the proposed approach achieves better prediction performance than the Adamic-Adar technique and the word vector approach. Our results can help analysts accurately forecast future relationships between nodes in a network, and can give R&D managers insightful information on the future direction of technological development by using a patent network.
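The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patent IDs, citation edges, and SAO triples are hypothetical toy data, the citation network is treated as undirected (an assumption), and a simple bag-of-words count vector stands in for the Doc2vec embedding so that the example runs with the standard library only. All function names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data: citation edges and SAO triples per patent.
citations = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
sao = {
    "P1": [("rotor", "generate", "lift"), ("sensor", "measure", "altitude")],
    "P2": [("camera", "capture", "image")],
    "P3": [("controller", "adjust", "rotor")],
    "P4": [("sensor", "measure", "altitude"), ("battery", "supply", "power")],
    "P5": [("rotor", "generate", "lift")],
}

def build_graph(edges):
    """Build an undirected adjacency map (assumption: direction is ignored)."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def unconnected_pairs(graph):
    """List every pair of nodes that has no edge between them."""
    return [(a, b) for a, b in combinations(sorted(graph), 2) if b not in graph[a]]

def adamic_adar(graph, a, b):
    """Adamic-Adar index: sum of 1 / log(degree) over common neighbours."""
    return sum(1.0 / math.log(len(graph[z])) for z in graph[a] & graph[b])

def sao_vector(triples):
    """Stand-in for Doc2vec: count the words that occur in the SAO triples."""
    return Counter(word for triple in triples for word in triple)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(graph, sao_by_patent):
    """Score each unconnected pair by content similarity and by Adamic-Adar."""
    vectors = {p: sao_vector(t) for p, t in sao_by_patent.items()}
    scored = [
        (a, b, cosine(vectors[a], vectors[b]), adamic_adar(graph, a, b))
        for a, b in unconnected_pairs(graph)
    ]
    # Highest content similarity first: these are the candidate future links.
    return sorted(scored, key=lambda row: row[2], reverse=True)

graph = build_graph(citations)
ranking = rank_candidates(graph, sao)
for a, b, similarity, aa in ranking:
    print(f"{a}-{b}: similarity={similarity:.3f}, adamic_adar={aa:.3f}")
```

In practice, `sao_vector` would be replaced by a trained document-embedding model, and the two score columns would be compared against links that actually appear later, which is the kind of validation the abstract describes.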
Pages: 5385-5414
Page count: 30
Source
Scientometrics, 2022, 127: 5385-5414