A review of cluster analysis techniques and their uses in library and information science research: k-means and k-medoids clustering

Cited by: 27
Authors
Lund, Brady [1]
Ma, Jinxuan [1]
Affiliations
[1] Emporia State Univ, Emporia, KS 66801 USA
Keywords
Clustering; Library and information science; Research methods; Cluster analysis; Data analysis; K-means
DOI
10.1108/PMM-05-2021-0026
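Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract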
Purpose - This literature review explores the definitions and characteristics of cluster analysis, a machine-learning technique that is frequently used to identify groupings in large datasets, and its applicability to library and information science (LIS) research. This overview is intended for researchers who are interested in expanding their data analysis repertoire to include cluster analysis, rather than for existing experts in this area.
Design/methodology/approach - A review is performed of LIS articles in the Library and Information Source (EBSCO) database that employ cluster analysis. The review presents an overview of cluster analysis in general (how it works from a statistical standpoint, and how researchers can perform it), the most popular cluster analysis techniques and the uses of cluster analysis in LIS.
Findings - The number of LIS studies that employ a cluster analytic approach has grown from about 5 per year in the early 2000s to an average of 35 studies per year in the mid- and late-2010s. The journal Scientometrics has published the most LIS articles that use cluster analysis (102 studies). Scientometrics is also the most common subject area to employ a cluster analytic approach (152 studies). The findings of this review indicate that cluster analysis could make LIS research more accessible by providing an innovative and insightful process of knowledge discovery.
Originality/value - This review is the first to present cluster analysis as an accessible data analysis approach, specifically from an LIS perspective.
Pages: 161-173
Page count: 13
Related papers
50 in total
  • [21] Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm
    Shi Na
    Liu Xumin
    Guan Yong
    2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67
  • [22] LPOCSIN With K-Means: An Overlapping Clustering Technique with Cluster Information
    Sarker, Partho Sarathi
    Showrov, Md. Imran Hossain
    2018 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT - 2018), 2018, : 21 - 25
  • [23] An efficient incremental clustering based improved K-Medoids for IoT multivariate data cluster analysis
    Sivadi Balakrishna
    M. Thirumaran
    R. Padmanaban
    Vijender Kumar Solanki
    Peer-to-Peer Networking and Applications, 2020, 13 : 1152 - 1175
  • [24] Improved K-Medoids Clustering Based on Cluster Validity Index and Object Density
    Pardeshi, Bharat
    Toshniwal, Durga
    2010 IEEE 2ND INTERNATIONAL ADVANCE COMPUTING CONFERENCE, 2010, : 379 - 384
  • [25] An efficient incremental clustering based improved K-Medoids for IoT multivariate data cluster analysis
    Balakrishna, Sivadi
    Thirumaran, M.
    Padmanaban, R.
    Solanki, Vijender Kumar
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2020, 13 (04) : 1152 - 1175
  • [26] k*-means: A generalized k-means clustering algorithm with unknown cluster number
    Cheung, YM
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2002, 2002, 2412 : 307 - 317
  • [27] Handling Missing Values in Chronic Kidney Disease Datasets Using KNN, K-Means and K-Medoids Algorithms
    Mahboob, Tahira
    Ijaz, Aimen
    Shahzad, Amber
    Kalsoom, Muqadas
    2018 12TH INTERNATIONAL CONFERENCE ON OPEN SOURCE SYSTEMS AND TECHNOLOGIES (ICOSST), 2018, : 76 - 81
  • [28] Review on the Research of K-means Clustering Algorithm in Big Data
    Chen Jie
    Zhang Jiyue
    Wu Junhui
    Wu Yusheng
    Si Huiping
    Lin Kaiyan
    2020 IEEE THE 3RD INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION ENGINEERING (ICECE), 2020, : 107 - 111
  • [29] Using cluster analysis techniques based on K-means and Kohonen clustering methods in credit scoring
    Sopko, Stanislav
    Soldatyuk, Nataliya
    MATHEMATICAL METHODS IN ECONOMICS (MME 2014), 2014, : 932 - 937
  • [30] KM-MIC: An improved maximum information coefficient based on K-Medoids clustering
    Zhang, Yali
    Shang, Pengjian
    COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 2022, 111