Visualization of Big Data

Authors
Kung, Sun-Yuan [1]
Affiliation
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Keywords
Big data; visualization; data mining; learning; complexity; projection; dimensional reduction; tools
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Big data comes from many divergent types of sources, from physical (sensor/IoT) to social and cyber (web), which renders it messy, imprecise, and incomplete. Because of its quantitative (volume and velocity) and qualitative (variety) challenges, big data appears to its users much like "the elephant to the blind men". It is imperative to enact a major paradigm shift in data mining and learning tools so that information from diverse sources can be integrated to uncover what is hidden in massive and messy big data; metaphorically speaking, this would let the blind men "see" the elephant. This talk will address yet another vital "V"-paradigm: "Visualization". Visualization tools are meant to supplement (instead of replace) domain expertise (e.g., that of a cardiologist) and to provide a big picture that helps users formulate critical questions and subsequently postulate heuristic and insightful answers. For big data, the curse of high feature dimensionality raises serious concerns about computational complexity and over-training. In this talk, we shall explore various projection methods for dimension reduction - a prelude to visualization of vectorial and nonvectorial data. A popular visualization tool for unsupervised learning is Principal Component Analysis (PCA). PCA aims at the best recoverability of the original data in the Euclidean Vector Space (EVS). We shall propose a supervised counterpart of PCA, Discriminant Component Analysis (DCA), formulated in a Canonical Vector Space (CVS). Simulations confirm that DCA far outperforms PCA, both numerically and visually. More importantly, via proper interplay between anti-recoverability in EVS and discriminant power in CVS, DCA is promising for privacy protection when personal data are shared on the cloud in collaborative learning environments. We shall extend PCA/DCA to kernel PCA/DCA for the purpose of visualizing nonvectorial data.
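To make the PCA recoverability criterion mentioned above concrete, here is a minimal sketch of a standard PCA projection to two dimensions using NumPy. It is textbook PCA, not the DCA method of the talk; the function name `pca_project`, the synthetic data, and the noise level are assumptions chosen only for illustration.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                      # center the data
    # SVD of the centered data: rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    Z = Xc @ components.T              # low-dimensional coordinates for plotting
    X_rec = Z @ components + mean      # reconstruction in the original space
    return Z, components, X_rec

# Synthetic data (an assumption): 200 points in 10 dimensions near a 2-D plane
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

Z, components, X_rec = pca_project(X, n_components=2)
# Relative reconstruction error: small when 2 components capture the data
rel_err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
```

The two columns of `Z` can be passed directly to a scatter plot, and `rel_err` measures how well the original data can be recovered from the projection.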
The success of kernel methods depends critically on which kernel function is used to represent the similarity of a pair of objects. For visualization of nonvectorial and incompletely specified data, our experimental study points to a promising application of multi-kernels, including an imputed Gaussian RBF kernel and a partial correlation kernel.
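As a rough illustration of how a kernel function drives the visualization, the sketch below runs standard kernel PCA on a plain Gaussian RBF kernel matrix. It does not implement the imputed RBF, partial correlation, or multi-kernel schemes named above; the helper names, the value of `gamma`, and the random data are assumptions. Because `kernel_pca_embed` only needs a pairwise similarity matrix `K`, the same step would apply to nonvectorial objects once a suitable kernel is available.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))   # clip tiny negative distances

def kernel_pca_embed(K, n_components=2):
    """Embed training objects from a symmetric kernel matrix (illustrative helper)."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    Kc = J @ K @ J                                # kernel centered in feature space
    evals, evecs = np.linalg.eigh(Kc)             # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:n_components]  # indices of the largest ones
    lam = np.maximum(evals[idx], 0.0)
    return evecs[:, idx] * np.sqrt(lam)           # coordinates of each object

# Random vectors stand in for real objects (an assumption)
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
K = rbf_kernel(X, gamma=0.5)
Z_kpca = kernel_pca_embed(K, n_components=2)
```

Swapping `rbf_kernel` for another positive semidefinite similarity measure changes the picture without changing the embedding step, which is why the choice of kernel matters so much.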
Pages: 447-448
Number of pages: 2