New classification quality estimators for analysis of documentary information: Application to patent analysis and web mapping

Cited by: 28
Authors
Lamirel, JC
Francois, C
AL Shehabi, S
Hoffmann, M
Affiliations
[1] LORIA, F-54506 Vandoeuvre Les Nancy, France
[2] URI INIST, CNRS, Vandoeuvre Les Nancy, France
DOI
10.1023/B:SCIE.0000034386.05278.e8
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The information analysis process includes a cluster analysis or classification step associated with expert validation of the results. In this paper, we propose new Recall/Precision measures for estimating the quality of cluster analysis. These measures derive from both Galois lattice theory and the Information Retrieval (IR) domain. Unlike classical inertia measures, they have the advantage of being independent both of the classification method and of the difference between the intrinsic dimension of the data and that of the clusters. We present two experiments that use the MultiSOM model, an extension of Kohonen's SOM model, as the cluster analysis method. Our first experiment, on patent data, shows how our measures can be used to compare viewpoint-oriented classification methods, such as MultiSOM, with global cluster analysis methods, such as WebSOM. Our second experiment, which is part of the EICSTES EEC project, is an original Webometrics experiment that combines content and link classification, starting from a large non-homogeneous set of web pages. This experiment highlights the fact that break-even points between our different Recall/Precision measures can be used to determine an optimal number of clusters for web data classification. The contents of the clusters obtained with different break-even points are compared to determine the quality of the resulting maps.
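The abstract does not give the formulas for the Recall/Precision measures, so the following is only a minimal sketch of the break-even idea, assuming the two measures have already been computed for several candidate numbers of clusters. The function name `break_even_cluster_count` and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def break_even_cluster_count(
    cluster_counts: Sequence[int],
    recall: Sequence[float],
    precision: Sequence[float],
) -> int:
    """Return the candidate cluster count where recall and precision are closest.

    This is a generic break-even search, not the authors' procedure.
    """
    if not (len(cluster_counts) == len(recall) == len(precision)):
        raise ValueError("all inputs must have the same length")
    if not cluster_counts:
        raise ValueError("inputs must not be empty")
    # Absolute gap between the two quality curves at each candidate count.
    gaps = [abs(r - p) for r, p in zip(recall, precision)]
    best_index = min(range(len(gaps)), key=gaps.__getitem__)
    return cluster_counts[best_index]


# Made-up illustrative values (assumption: recall falls and precision rises
# as the number of clusters grows). These are not results from the paper.
counts = [4, 9, 16, 25, 36]
recall_values = [0.90, 0.78, 0.66, 0.52, 0.40]
precision_values = [0.35, 0.50, 0.64, 0.75, 0.83]

best = break_even_cluster_count(counts, recall_values, precision_values)
print(best)
```

With measured curves in place of the made-up values, the returned count would be one candidate for the "optimal number of clusters" that the abstract mentions; the paper itself compares the cluster contents at different break-even points before drawing a conclusion.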
Pages: 445-462
Number of pages: 18
Related papers
50 in total
  • [21] Photovoltaic technologies: Mapping from patent analysis
    Vasconcelos Sampaio, Priscila Goncalves
    Aguirre Gonzalez, Mario Orestes
    de Vasconcelos, Rafael Monteiro
    Teixeira dos Santos, Marllen Aylla
    de Toledo, Jose Carlos
    Pinheiro Pereira, Jonathan Paulo
    RENEWABLE & SUSTAINABLE ENERGY REVIEWS, 2018, 93 : 215 - 224
  • [22] Agricultural biotechnology worldwide patent analysis and mapping
    Wang, Xiuhong
    AFRICAN JOURNAL OF BIOTECHNOLOGY, 2011, 10 (10) : 1936 - 1944
  • [23] A Simple Approach of Groundwater Quality Analysis, Classification, and Mapping in Peshawar, Pakistan
    Adnan, Syed
    Iqbal, Javed
    Maltamo, Matti
    Bacha, Muhammad Suleman
    Shahab, Asfandyar
    Valbuena, Ruben
    ENVIRONMENTS, 2019, 6 (12)
  • [24] Analysis of Patent Application Attention: A Network Analysis Method
    Mao, Shihao
    Hu, Yuxia
    Yuan, Xuesong
    Zhang, Mengyue
    Qiu, Qirong
    Wu, Peng
    FRONTIERS IN PHYSICS, 2022, 10
  • [25] A Web API and Web Application Development for Dissemination of Air Quality Information
    Sahin, K.
    Isikdag, U.
    4TH INTERNATIONAL GEOADVANCES WORKSHOP - GEOADVANCES 2017: ISPRS WORKSHOP ON MULTI-DIMENSIONAL & MULTI-SCALE SPATIAL DATA MODELING, 2017, 4-4 (W4) : 373 - 378
  • [26] Analysis and Practical Application of PHP Frameworks in Development of Web Information Systems
    Prokofyeva, Natalya
    Boltunova, Victoria
    ICTE 2016, 2017, 104 : 51 - 56
  • [27] Hybrid-Patent Classification Based on Patent-Network Analysis
    Liu, Duen-Ren
    Shih, Meng-Jung
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2011, 62 (02) : 246 - 256
  • [28] Research and application of patent map analysis
    Fang, Shu
    Zhang, Man
    Xiao, Guo-Hua
    PROCEEDINGS OF ISSI 2007: 11TH INTERNATIONAL CONFERENCE OF THE INTERNATIONAL SOCIETY FOR SCIENTOMETRICS AND INFORMETRICS, VOLS I AND II, 2007 : 254 - 265
  • [29] NEW ESTIMATORS OF DISTURBANCES IN REGRESSION ANALYSIS
    ABRAHAMSE, AP
    KOERTS, J
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1971, 66 (333) : 71 - 74
  • [30] Development of new technology forecasting algorithm: Hybrid approach for morphology analysis and conjoint analysis of patent information
    Yoon, Byungun
    Park, Yongtae
    IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, 2007, 54 (03) : 588 - 599