An Empirical Evaluation of the t-SNE Algorithm for Data Visualization in Structural Engineering

Cited by: 13
Authors
Hajibabaee, Parisa [1 ]
Pourkamali-Anaraki, Farhad [1 ]
Hariri-Ardebili, Mohammad Amin [2 ]
Affiliations
[1] Univ Massachusetts, Comp Sci, Lowell, MA 01854 USA
[2] Univ Colorado, Civil Environm & Architectural Engn, Boulder, CO 80309 USA
Keywords
Classification algorithms; supervised learning; dimensionality reduction; feature extraction; oversampling; EPISTEMIC UNCERTAINTY; RELIABILITY-ANALYSIS; SMOTE; CHALLENGES; REDUCTION;
DOI
10.1109/ICMLA52953.2021.00267
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fundamental task in machine learning is visualizing the high-dimensional data sets that arise in high-impact application domains. The problem becomes much more challenging when the data are large and imbalanced. In this paper, the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm is used to reduce the dimensions of an earthquake engineering data set for visualization purposes. Because imbalanced data sets greatly affect the accuracy of classifiers, we employ the Synthetic Minority Oversampling Technique (SMOTE) to address the imbalance in this data set. We present the results obtained with t-SNE and SMOTE and compare them with the basic approaches in several respects. Considering four options and six classification algorithms, we show that when t-SNE is applied to the imbalanced data and SMOTE to the training data set, neural network classifiers yield promising results without sacrificing accuracy. Hence, we can transform the studied scientific data into a two-dimensional (2D) space, enabling visualization of the classifier and the resulting decision surface in a 2D plot.
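The oversampling step named in the abstract can be illustrated with a minimal pure-Python sketch of the SMOTE interpolation idea: each synthetic point is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy data are hypothetical, chosen only for illustration, and a real study would more likely rely on an established library. In line with the abstract, the sketch is applied to training rows only.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (minority-class training rows).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # random point on the segment between the base point and the neighbour
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy imbalanced training split (assumed data): 4 minority rows vs. 12 majority rows.
minority_train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
majority_count = 12
new_points = smote_oversample(minority_train, majority_count - len(minority_train), k=3)
balanced_minority = minority_train + new_points
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already spanned by the minority class. Oversampling only the training split keeps synthetic rows out of the data used for evaluation.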
Pages: 1674-1680
Number of pages: 7