Exploration and Deconstruction of Correlation Cycles in Multidimensional Datasets

Times cited: 0
Authors
Dudas, Adam [1]
Krsak, Emil [2]
Kvassay, Miroslav [2]
Affiliations
[1] Matej Bel Univ, Fac Nat Sci, Banska Bystrica 97401, Slovakia
[2] Univ Zilina, Fac Management & Informat, Zilina 01026, Slovakia
Keywords
correlation cycles; correlation chains; correlation analysis; visual data analysis; regression analysis; predictive data analysis; MODEL
DOI
10.3390/technologies13020085
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Correlation analysis is one of the most widely used statistical methods in data analysis, in knowledge mining focused on relationships between attributes in large datasets, and in predictive tasks that use statistical, machine learning, and deep learning models. This approach to analyzing functional relationships in multidimensional datasets is commonly combined with visual analysis approaches, which offer novel context for the relationships in the data and clarify the results presented in large correlation matrices. One such visualization method uses graphical models called correlation graphs and chains. These visualize individual direct and indirect relationships between pairs of attributes in a dataset of interest as a graph structure, where the vertices represent attributes of the dataset and the edges between vertices represent the correlation of these attributes. This work focuses on the definition, identification, and exploration of so-called correlation cycles, which can be used, through their deconstruction, to lower error values in regression tasks. After the implementation of correlation cycle identification and deconstruction, the proposed concept is evaluated on predictive analysis tasks using three benchmark datasets from the engineering field: the Sensor dataset, the Superconductivity dataset, and the Energy Farm dataset. The results obtained in this study show that, when simple, explainable regressors are used, the method utilizing deconstructed correlation cycles reaches a lower error rate in 83.3% of regression cases than the same regression models without cycle incorporation.
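The correlation graph described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formal definition of a correlation cycle or the deconstruction procedure, so everything here is an assumption made for illustration: an edge is drawn when the absolute Pearson correlation of two attributes reaches a hypothetical threshold of 0.7, only cycles of length three are reported, and the helper names `correlation_graph` and `find_triangles` as well as the synthetic data are not taken from the paper.

```python
from itertools import combinations

import numpy as np


def correlation_graph(data, names, threshold=0.7):
    """Return {attribute: set of attributes with |Pearson r| >= threshold}.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    corr = np.corrcoef(data, rowvar=False)  # columns are attributes
    graph = {name: set() for name in names}
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i, j]) >= threshold:
            graph[names[i]].add(names[j])
            graph[names[j]].add(names[i])
    return graph


def find_triangles(graph):
    """List 3-cycles: attribute triples that are pairwise connected."""
    return [
        (a, b, c)
        for a, b, c in combinations(sorted(graph), 3)
        if b in graph[a] and c in graph[a] and c in graph[b]
    ]


# Synthetic data (not from the paper): x, y, z share one signal, w is independent.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
data = np.column_stack([
    base + 0.1 * rng.normal(size=500),
    2.0 * base + 0.1 * rng.normal(size=500),
    -base + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
names = ["x", "y", "z", "w"]

graph = correlation_graph(data, names)
triangles = find_triangles(graph)
print(triangles)
```

In this sketch the three attributes built from the shared signal form a single cycle, while the independent attribute stays isolated. How such a cycle is then deconstructed and fed into a regressor is specific to the paper and is not reproduced here.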
Pages: 19