Very Fast Interactive Visualization of Large Sets of High-dimensional Data

Cited by: 9
Authors
Dzwinel, Witold [1]
Wcislo, Rafal [1]
Affiliations
[1] AGH Univ Sci & Technol, PL-30059 Krakow, Poland
Keywords
multidimensional scaling; particle-based stress minimization; interactive visualization
DOI
10.1016/j.procs.2015.05.325
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Embedding high-dimensional data into 2D/3D space is the most popular approach to data visualization. Despite recent advances in very accurate dimensionality reduction algorithms such as BH-SNE, Q-SNE and LoCH, their relatively high computational complexity remains an obstacle to interactive visualization of truly large datasets consisting of M ~ 10^(6+) high-dimensional (N ~ 10^(3+)) feature vectors. We show that a new variant of multidimensional scaling (MDS), nr-MDS, can be up to two orders of magnitude faster than modern dimensionality reduction algorithms. We postulate that its computational and memory complexities are linear, O(M). At the same time, our method preserves high separability of the data in the 2D/3D target space, similar to that obtained by state-of-the-art dimensionality reduction algorithms. We present the results of applying nr-MDS to the visualization of data repositories such as 20 Newsgroups (M = 1.8 × 10^4), MNIST (M = 7 × 10^4) and REUTERS (M = 2.67 × 10^5).
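The abstract does not spell out how nr-MDS works, so the following is only an illustrative sketch of the general idea behind a linear-cost stress minimization, not the authors' algorithm. It assumes that each point is compared against a small fixed set of k randomly chosen reference points instead of all M - 1 others, so that each iteration touches M * k distance terms rather than M^2. The function name `sparse_mds`, its parameters, and the plain gradient-descent update are all hypothetical choices made for this example.

```python
import numpy as np

def sparse_mds(X, k=10, dim=2, n_iter=300, lr=0.005, seed=0):
    """Toy stress minimization over M*k point pairs (an assumption, not nr-MDS)."""
    rng = np.random.default_rng(seed)
    M = X.shape[0]
    # For each point i, pick k reference points j != i (fixed for the whole run).
    offsets = rng.integers(1, M, size=(M, k))
    idx = (np.arange(M)[:, None] + offsets) % M
    # Target distances in the original space, shape (M, k).
    D = np.linalg.norm(X[:, None, :] - X[idx], axis=2)
    # Random initial low-dimensional configuration.
    Y = rng.normal(size=(M, dim))
    history = []
    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[idx]                  # (M, k, dim)
        d = np.linalg.norm(diff, axis=2) + 1e-12       # embedded distances
        history.append(float(np.sum((d - D) ** 2)))    # stress before this update
        # Gradient of (d - D)^2 with respect to y_i for each pair (i, j).
        pair_grad = 2.0 * ((d - D) / d)[:, :, None] * diff
        grad = pair_grad.sum(axis=1)
        # The same pair pulls y_j in the opposite direction.
        np.add.at(grad, idx, -pair_grad)
        Y -= lr * grad
    return Y, history

# Usage: two well-separated Gaussian blobs in 20 dimensions.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([demo_rng.normal(0, 1, (100, 20)),
                    demo_rng.normal(10, 1, (100, 20))])
Y_demo, stress_history = sparse_mds(X_demo)
print(Y_demo.shape, stress_history[0] > stress_history[-1])
```

With k held constant, both the stored distances and the per-iteration work grow linearly with M, which is the kind of scaling the abstract postulates. The accuracy of such a reduced-pair scheme depends heavily on how the reference points are chosen, which this sketch does not attempt to address.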
Pages: 572-581
Number of pages: 10