Patent Document Clustering Using Dimensionality Reduction

Cited by: 1
Authors
Girthana, K. [1]
Swamynathan, S. [1]
Affiliation
[1] Anna Univ, Dept Informat Sci & Technol, Madras 600025, Tamil Nadu, India
Keywords
Prior art search; Dimensionality reduction; Clustering
DOI
10.1007/978-981-10-6875-1_17
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Patents are a type of intellectual property rights that provide exclusive rights to the invention. Whenever there is a novelty or an invention, prior art search on patents is carried out to check the degree of innovation. Clustering is used to group the relevant documents of prior art search to gain insights about the patent document. The patent documents represent hundreds of features (words extracted from the title and abstract fields). The common sets of features between the documents are subtle. Therefore, the number of features for clustering increases drastically. This leads to the curse of dimensionality. Hence, in this work, dimensionality reduction techniques such as PCA and SVD are employed to compare and analyze the quality of clusters formed from the Google patent documents. This comparative analysis was performed by considering title, abstract, and classification code fields of the patent document. Classification code information was used to decide the number of clusters.
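
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the classification codes attached to it, the plain k-means routine, and the purity score are all assumptions chosen for illustration, and the helper names (term_matrix, reduce_svd, reduce_pca, kmeans, purity) are hypothetical. The sketch builds a word-count matrix, reduces it to two dimensions with truncated SVD and with PCA, clusters each reduced matrix, and takes the number of clusters from the number of distinct classification codes.

```python
import numpy as np

# Hypothetical toy corpus standing in for patent title + abstract text.
docs = [
    "battery electrode lithium cell",
    "lithium battery anode electrode",
    "battery cell charging lithium",
    "image sensor pixel camera",
    "camera lens image pixel",
    "pixel sensor camera exposure control",
]
# Illustrative classification codes, one per document (assumed, not from the paper).
codes = ["H01M", "H01M", "H01M", "H04N", "H04N", "H04N"]
n_clusters = len(set(codes))  # cluster count taken from the classification codes


def term_matrix(texts):
    """Build a document-by-term count matrix from whitespace-tokenized text."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.split():
            X[i, index[w]] += 1.0
    return X


def reduce_svd(X, k):
    """Truncated SVD: keep the k leading singular directions (no centering)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def reduce_pca(X, k):
    """PCA: the same projection applied to column-centered data."""
    return reduce_svd(X - X.mean(axis=0), k)


def kmeans(X, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def purity(labels, codes):
    """Fraction of documents whose cluster's majority code matches their own."""
    total = 0
    for c in set(labels.tolist()):
        members = [codes[i] for i in range(len(codes)) if labels[i] == c]
        total += max(members.count(code) for code in set(members))
    return total / len(codes)


X = term_matrix(docs)
labels_svd = kmeans(reduce_svd(X, 2), n_clusters)
labels_pca = kmeans(reduce_pca(X, 2), n_clusters)
print("SVD purity:", purity(labels_svd, codes))
print("PCA purity:", purity(labels_pca, codes))
```

Purity against the classification codes is only one possible quality measure; the abstract does not say which measure the authors used. On a real patent collection the count matrix would typically be TF-IDF weighted and far sparser than this toy example.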
Pages: 167-176
Page count: 10