Visualizing Profiles of Large Datasets of Weighted and Mixed Data

Cited by: 4
Authors
Grane, Aurea [1 ]
Sow-Barry, Alpha A. [1 ]
Affiliations
[1] Univ Carlos III Madrid, Dept Stat, Getafe 28903, Spain
Keywords
clustering; Gower's interpolation formula; Gower's metric; mixed data; multidimensional scaling; ALGORITHM; HEALTH; EUROPE;
DOI
10.3390/math9080891
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
This work provides a procedure for constructing and visualizing profiles, i.e., groups of individuals with similar characteristics, for weighted and mixed data by combining two classical multivariate techniques: multidimensional scaling (MDS) and the k-prototypes clustering algorithm. The well-known drawback of classical MDS on large datasets is circumvented by selecting a small random sample of the dataset, whose individuals are clustered by means of an adapted version of the k-prototypes algorithm and mapped via classical MDS. Gower's interpolation formula is then used to project the remaining individuals onto this configuration. Throughout the process, Gower's distance is used to measure the proximity between individuals. The methodology is illustrated on a real dataset obtained from the Survey of Health, Ageing and Retirement in Europe (SHARE), which was carried out in 19 countries and represents over 124 million aged individuals in Europe. The performance of the method was evaluated through a simulation study, whose results indicate that the new proposal overcomes the high computational cost of classical MDS with low error.
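The sample-then-project idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the squared Gower distance is taken as one minus Gower's similarity, it ignores individual weights and the k-prototypes clustering step, and all function names (`gower_sq_dist`, `classical_mds`, `gower_interpolate`) and the synthetic data are hypothetical.

```python
import numpy as np

def gower_sq_dist(num_a, cat_a, num_b, cat_b, ranges):
    """Squared Gower distance (assumed here to be 1 - similarity), rows of A vs rows of B."""
    # Numeric variables contribute 1 - |x - y| / range each.
    num_sim = (1.0 - np.abs(num_a[:, None, :] - num_b[None, :, :]) / ranges).sum(axis=2)
    # Categorical variables contribute 1 on a match, 0 otherwise.
    cat_sim = (cat_a[:, None, :] == cat_b[None, :, :]).sum(axis=2)
    p = num_a.shape[1] + cat_a.shape[1]
    return 1.0 - (num_sim + cat_sim) / p

def classical_mds(d2, k):
    """Classical MDS on an n x n matrix of squared distances; keeps k dimensions."""
    n = d2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ d2 @ J                    # doubly centered inner products
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]         # k largest eigenvalues (assumed positive)
    lam = vals[idx]
    X = vecs[:, idx] * np.sqrt(lam)          # principal coordinates of the sample
    return X, lam, np.diag(B)

def gower_interpolate(d2_new, X, lam, b):
    """Gower's interpolation: x = 1/2 * Lambda^{-1} X^T (b - d2), one row per new point."""
    return 0.5 * (b - d2_new) @ X / lam

# Synthetic mixed data: 3 numeric and 2 categorical variables (illustrative only).
rng = np.random.default_rng(0)
num = rng.normal(size=(200, 3))
cat = rng.integers(0, 3, size=(200, 2))
ranges = num.max(axis=0) - num.min(axis=0)   # ranges taken over the full dataset

# Map a small random sample with classical MDS, then project the rest onto it.
sample = rng.choice(200, size=40, replace=False)
rest = np.setdiff1d(np.arange(200), sample)
d2_ss = gower_sq_dist(num[sample], cat[sample], num[sample], cat[sample], ranges)
X, lam, b = classical_mds(d2_ss, 2)
d2_rs = gower_sq_dist(num[rest], cat[rest], num[sample], cat[sample], ranges)
Y = gower_interpolate(d2_rs, X, lam, b)
print(X.shape, Y.shape)
```

Only the sample-by-sample distance matrix is eigendecomposed; the remaining individuals need just their distances to the sample, which is what keeps the cost manageable for large datasets.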
Pages: 20
Related papers
50 in total
  • [1] Graphics of large datasets: Visualizing a million
    Natarajan, Rajesh
    INTERFACES, 2007, 37 (05) : 494 - 496
  • [2] Visualizing large datasets: A case study with data of the buses of Sao Paulo city
    Alves Dias, Felipe Cordeiro
    Cordeiro, Daniel
    2018 IEEE 37TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS WORKSHOPS (SRDSW 2018), 2018, : 10 - 13
  • [3] Software for visualizing volume rendering of large datasets
    Fallah, Navid
    Eydgahi, Ali
    IMECS 2008: INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, VOLS I AND II, 2008, : 697 - 701
  • [4] Visualizing gridded datasets with large number of missing values
    Djurcilov, Suzana
    Pang, Alex
    Proceedings of the IEEE Visualization Conference, 1999, : 405 - 408
  • [5] Datamap Visualization Technique for Interactively Visualizing Large Datasets
    Nykanen, Ossi
    PROCEEDINGS OF THE 17TH INTERNATIONAL ACADEMIC MINDTREK CONFERENCE: MAKING SENSE OF CONVERGING MEDIA, 2013, : 52 - 58
  • [6] Visualizing Large Datasets in TOPCAT v4
    Taylor, M. B.
    ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XXIII, 2014, 485 : 257 - 260
  • [7] MELD: Mixed effects for large datasets
    Nielson, Dylan M.
    Sederberg, Per B.
    PLOS ONE, 2017, 12 (08):
  • [8] VISUALIZING LARGE DATA SETS
    HIBBARD, WL
    SANTEK, DA
    INTERACTIVE INFORMATION AND PROCESSING SYSTEMS FOR METEOROLOGY, OCEANOGRAPHY AND HYDROLOGY, 1988, : 172 - 174
  • [9] Visualizing large data sets
    Unknown
    R&D MAGAZINE, 1998, 40 (01): 73 - 73
  • [10] Visualizing large relational datasets by combining grand tour with footprint splatting of high dimensional data cubes
    Yang, L
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2003, PT 1, PROCEEDINGS, 2003, 2667 : 11 - 20