Effective Query Generation and Postprocessing Strategies for Prior Art Patent Search

Cited by: 13
Authors
Cetintas, Suleyman [1]
Si, Luo [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
National Science Foundation (USA);
Keywords
Compendex;
DOI
10.1002/asi.21708
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rapid increase in global competition demands increased protection of intellectual property rights and underlines the importance of patents as major intellectual property documents. Prior art patent search is the task of identifying related patents for a given patent file, and is an essential step in judging the validity of a patent application. This article proposes an automated query generation and postprocessing method for prior art patent search. The proposed approach first constructs structured queries by combining terms extracted from different fields of a query patent and then reranks the retrieved patents by utilizing the International Patent Classification (IPC) code similarities between the query patent and the retrieved patents along with the retrieval score. An extensive set of experiments carried out on a large-scale, real-world dataset shows that utilizing 20 or 30 query terms extracted from all fields of an original query patent according to their log(tf)idf values helps form a representative search query out of the query patent and is more effective than using any number of query terms from any single field. It is shown that combining terms extracted from different fields of the query patent by giving higher importance to terms extracted from the abstract, claims, and description fields than to terms extracted from the title field is more effective than treating all extracted terms equally while forming the search query. Finally, utilizing the similarities between the IPC codes of the query patent and retrieved patents is shown to improve the effectiveness of the prior art search.
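The two stages named in the abstract, log(tf)idf term selection and IPC-aware reranking, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption: the `log(1 + tf) * log(N / df)` weighting, the prefix-based `ipc_similarity`, the linear blend in `rerank`, and all function and parameter names are hypothetical and not taken from the article. Field-specific weighting of terms is left out for brevity.

```python
import math
from collections import Counter


def top_terms(query_tokens, doc_freq, num_docs, k=20):
    """Rank the query patent's terms by log(tf) * idf and keep the top k.

    Assumed variant: log(1 + tf) * log(N / df); the article's exact
    formula is not stated in the abstract.
    """
    tf = Counter(query_tokens)
    scores = {}
    for term, count in tf.items():
        df = doc_freq.get(term, 0)
        if df == 0:
            continue  # term unseen in the collection, so no idf is available
        scores[term] = math.log(1 + count) * math.log(num_docs / df)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def ipc_similarity(code_a, code_b):
    """Hypothetical similarity: fraction of leading IPC levels shared.

    Levels compared: section (1 char), class (3 chars), subclass (4 chars).
    """
    levels = (1, 3, 4)
    shared = 0
    for n in levels:
        if len(code_a) >= n and len(code_b) >= n and code_a[:n] == code_b[:n]:
            shared += 1
        else:
            break
    return shared / len(levels)


def rerank(results, query_ipc, weight=0.5):
    """Blend the retrieval score with IPC similarity (assumed linear mix).

    results: list of (patent_id, retrieval_score, ipc_code) tuples, with
    retrieval scores assumed to be normalized to [0, 1].
    """
    rescored = [
        (pid, (1 - weight) * score + weight * ipc_similarity(query_ipc, ipc))
        for pid, score, ipc in results
    ]
    return sorted(rescored, key=lambda r: -r[1])


if __name__ == "__main__":
    tokens = ["valve"] * 3 + ["pump"] * 2 + ["the"] * 5
    doc_freq = {"valve": 10, "pump": 100, "the": 1000}
    print(top_terms(tokens, doc_freq, num_docs=1000, k=2))
    print(rerank([("p1", 0.9, "H04L"), ("p2", 0.8, "A61K")], "A61K"))
```

In this toy run a frequent but ubiquitous term such as "the" receives an idf of zero and drops out of the query, and a candidate sharing the query patent's IPC subclass can overtake one with a slightly higher retrieval score. The blending weight would need tuning on held-out data.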
Pages: 512-527
Page count: 16
Related papers
50 in total
  • [1] Query Generation Techniques for Patent Prior-Art Search in Multiple Languages
    Zhou, Dong
    Liu, Jianxun
    Zhang, Sanrong
    [J]. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2013, 2013, 400 : 310 - 321
  • [2] Is your search query well-formed? A natural query understanding for patent prior art search
    Chikkamath, Renukswamy
    Rastogi, Deepak
    Maan, Mahesh
    Endres, Markus
    [J]. WORLD PATENT INFORMATION, 2024, 76
  • [3] Using multiple query representations in patent prior-art search
    Zhou, Dong
    Truran, Mark
    Liu, Jianxun
    Zhang, Sanrong
    [J]. INFORMATION RETRIEVAL, 2014, 17 (5-6) : 471 - 491
  • [4] Using multiple query representations in patent prior-art search
    Dong Zhou
    Mark Truran
    Jianxun Liu
    Sanrong Zhang
    [J]. Information Retrieval, 2014, 17 : 471 - 491
  • [5] Patent Search Basics The Importance of Prior Art
    Henriques, Paul
    [J]. Online Searcher, 2019, 43 (05) : 40 - 42
  • [6] The Search of Prior Art and the Revelation of Information by Patent Applicants
    Corinne Langinier
    Philippe Marcoul
    [J]. Review of Industrial Organization, 2016, 49 : 399 - 427
  • [7] On Term Selection Techniques for Patent Prior Art Search
    Far, Mona Golestan
    Sanner, Scott
    Bouadjenek, Mohamed Reda
    Ferraro, Gabriela
    Hawking, David
    [J]. SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 803 - 806
  • [8] The Search of Prior Art and the Revelation of Information by Patent Applicants
    Langinier, Corinne
    Marcoul, Philippe
    [J]. REVIEW OF INDUSTRIAL ORGANIZATION, 2016, 49 (03) : 399 - 427
  • [9] End to End Neural Retrieval for Patent Prior Art Search
    Stamatis, Vasileios
    [J]. ADVANCES IN INFORMATION RETRIEVAL, PT II, 2022, 13186 : 537 - 544
  • [10] Automating the search for a patent's prior art with a full text similarity search
    Helmers, Lea
    Horn, Franziska
    Biegler, Franziska
    Oppermann, Tim
    Mueller, Klaus-Robert
    [J]. PLOS ONE, 2019, 14 (03):