LDSScanner: Exploratory Analysis of Low-Dimensional Structures in High-Dimensional Datasets

Cited by: 59
Authors
Xia, Jiazhi [1]
Ye, Fenjin [1]
Chen, Wei [2]
Wang, Yusi [1]
Chen, Weifeng [3]
Ma, Yuxin [2]
Tung, Anthony K. H. [4]
Affiliations
[1] Cent South Univ, Changsha, Hunan, Peoples R China
[2] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
[3] Zhejiang Univ Finance & Econ, Hangzhou, Zhejiang, Peoples R China
[4] Natl Univ Singapore, Singapore, Singapore
Funding
US National Science Foundation; Major Program of the National Natural Science Foundation of China
Keywords
High-dimensional data; low-dimensional structure; subspace; manifold; visual exploration; VISUAL EXPLORATION; VISUALIZATION; REDUCTION; METRICS;
DOI
10.1109/TVCG.2017.2744098
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Many approaches for analyzing a high-dimensional dataset assume that the dataset contains specific structures, e.g., clusters in linear subspaces or non-linear manifolds. This yields a trial-and-error process to verify the appropriate model and parameters. This paper contributes an exploratory interface that supports visual identification of low-dimensional structures in a high-dimensional dataset, and facilitates the optimized selection of data models and configurations. Our key idea is to abstract a set of global and local feature descriptors from the neighborhood graph-based representation of the latent low-dimensional structure, such as pairwise geodesic distance (GD) among points and pairwise local tangent space divergence (LTSD) among pointwise local tangent spaces (LTS). We propose a new LTSD-GD view, which is constructed by mapping LTSD and GD to the x axis and y axis using 1D multidimensional scaling, respectively. Unlike traditional dimensionality reduction methods that preserve various kinds of distances among points, the LTSD-GD view presents the distribution of pointwise LTS (x axis) and the variation of LTS in structures (the combination of x axis and y axis). We design and implement a suite of visual tools for navigating and reasoning about intrinsic structures of a high-dimensional dataset. Three case studies verify the effectiveness of our approach.
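The construction of the LTSD-GD view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact LTSD formula, so this sketch assumes the divergence between two local tangent spaces is the Euclidean norm of their principal angles, estimates each tangent space by local PCA over the k nearest neighbors, and uses classical (eigendecomposition-based) MDS for the 1D embeddings. All function names and the parameters `k` and `d` are hypothetical, and the neighborhood graph is assumed to be connected.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbor graph; missing edges are np.inf."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.argsort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    W = np.full((n, n), np.inf)
    rows = np.repeat(np.arange(n), k)
    W[rows, idx.ravel()] = D[rows, idx.ravel()]
    W = np.minimum(W, W.T)  # keep an edge if either endpoint selected it
    np.fill_diagonal(W, 0.0)
    return W, idx

def geodesic_distances(W):
    """All-pairs shortest paths on the graph (Floyd-Warshall)."""
    G = W.copy()
    for m in range(G.shape[0]):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G

def local_tangent_spaces(X, idx, d):
    """Orthonormal basis (ambient_dim x d) from local PCA at every point."""
    bases = []
    for i in range(X.shape[0]):
        nbrs = X[np.append(idx[i], i)]
        centered = nbrs - nbrs.mean(axis=0)
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        bases.append(Vt[:d].T)
    return bases

def tangent_divergence(bases):
    """ASSUMPTION: divergence = norm of principal angles between subspaces."""
    n = len(bases)
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            s = np.linalg.svd(bases[i].T @ bases[j], compute_uv=False)
            angles = np.arccos(np.clip(s, -1.0, 1.0))
            L[i, j] = L[j, i] = np.linalg.norm(angles)
    return L

def mds_1d(D):
    """Classical MDS of a distance matrix onto a single axis."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))

def ltsd_gd_view(X, k=8, d=1):
    """2D coordinates: x from LTSD, y from geodesic distance (GD)."""
    W, idx = knn_graph(X, k)
    gd = geodesic_distances(W)
    ltsd = tangent_divergence(local_tangent_spaces(X, idx, d))
    return np.column_stack([mds_1d(ltsd), mds_1d(gd)])
```

Under these assumptions, points sampled from a straight line share one tangent space and collapse onto a single x position, while points on a curved structure spread along the x axis as the tangent direction changes.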
Pages: 236-245
Number of pages: 10