Visualization of Big Data

Cited by: 0
Author
Kung, Sun-Yuan [1]
Affiliation
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
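Keywords
Big data; visualization; data mining; learning; complexity; projection; dimensional reduction; tools
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract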
Big data comes from many divergent types of sources, from physical (sensor/IoT) to social and cyber (web), which renders it messy, imprecise, and incomplete. Because of its quantitative (volume and velocity) and qualitative (variety) challenges, big data appears to its users much like "the elephant to the blind men". It is imperative to enact a major paradigm shift in data mining and learning tools so that information from diversified sources can be integrated to unravel what is hidden in the massive and messy data; metaphorically speaking, this would let the blind men "see" the elephant. This talk will address yet another vital "V"-paradigm: "Visualization". Visualization tools are meant to supplement (rather than replace) domain expertise (e.g., that of a cardiologist) and to provide a big picture that helps users formulate critical questions and subsequently postulate heuristic and insightful answers. For big data, the curse of high feature dimensionality raises grave concerns about computational complexity and over-training. In this talk, we shall explore various projection methods for dimension reduction, a prelude to the visualization of vectorial and non-vectorial data. A popular visualization tool for unsupervised learning is Principal Component Analysis (PCA). PCA aims at the best recoverability of the original data in the Euclidean Vector Space (EVS). We shall propose a supervised counterpart of PCA, Discriminant Component Analysis (DCA), in a Canonical Vector Space (CVS). Simulations confirm that DCA far outperforms PCA, both numerically and visually. More importantly, via proper interplay between anti-recoverability in EVS and discriminant power in CVS, DCA is promising for privacy protection when personal data are shared on the cloud in collaborative learning environments. We shall extend PCA/DCA to kernel PCA/DCA for the purpose of visualizing non-vectorial data.
The success of kernel methods depends critically on which kernel function is used to represent the similarity of a pair of objects. For the visualization of non-vectorial and incompletely specified data, our experimental study points to a promising application of multi-kernels, including an imputed Gaussian RBF kernel and a partial correlation kernel.
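As an illustration of the projection step the abstract describes, the following is a minimal sketch of standard PCA and kernel PCA used to reduce data to two dimensions for plotting. It is not the author's DCA or multi-kernel method, which the abstract does not specify in enough detail to reproduce. The function names (`pca_project`, `rbf_kernel`, `kernel_pca_project`), the `gamma` value, and the random test data are all assumptions made for this example.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def rbf_kernel(X, gamma=1.0):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off


def kernel_pca_project(K, n_components=2):
    """Embed the training points given only their kernel (similarity) matrix."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc = J @ K @ J                       # center the data in feature space
    w, V = np.linalg.eigh(Kc)            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:n_components]
    w, V = w[idx], V[:, idx]
    # Coordinates of sample i along component k: sqrt(w_k) * V[i, k].
    return V * np.sqrt(np.maximum(w, 0.0))


# Hypothetical data: 200 samples with 10 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

Z = pca_project(X)                                   # linear 2-D view
Zk = kernel_pca_project(rbf_kernel(X, gamma=0.1))    # nonlinear 2-D view
Zlin = kernel_pca_project(X @ X.T)                   # linear kernel
```

Because `kernel_pca_project` needs only a similarity matrix, it can in principle be applied to non-vectorial objects once a suitable kernel is defined for them. With the linear kernel `X @ X.T`, it reproduces the PCA coordinates up to the sign of each component.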
Pages: 447-448
Number of pages: 2