Interactive visual formula composition of multidimensional data classifiers

Authors
Derstroff, Adrian [1]
Leistikow, Simon [1]
Nahardani, Ali [2]
Gruen, Katja [3]
Franz, Marcus [3]
Hoerr, Verena [2]
Linsen, Lars [1]
Affiliations
[1] Univ Munster, Inst Comp Sci, Einsteinstra 62, D-48149 Munster, Nordrhein Westf, Germany
[2] Univ Hosp Bonn, Heart Ctr Bonn, Dept Internal Med 2, Bonn, Germany
[3] Jena Univ Hosp, Dept Internal Med 1, Div Cardiol Angiol Pneumol & Intens Med Care, Jena, Germany
Keywords
Classification; feature space; formulas; multidimensional data; visual analysis; FEATURE-SELECTION; RELEVANCE;
DOI
10.1177/14738716241270288
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Understanding how a classification result is generated and what role individual features play in it is crucial in many applications, particularly in medical contexts such as the translation of diagnostic biomarkers into clinical practice. The goal is to find (ideally simple) relationships between the features of multidimensional data and the classification that explain the underlying phenomenon. Mathematical formulas can express these relationships and can serve as classifiers. However, infinitely many mathematical formulas can be built from the given features, and they bear an inherent trade-off between complexity and accuracy. We present an interactive visual approach that supports domain experts in mitigating this trade-off. Core to our approach is a novel feature selection method; formulas are composed from the selected features using symbolic regression, with state-of-the-art classifiers serving as a reference. To evaluate our approach and compare its classification performance to that of other state-of-the-art feature selection techniques, we test our methods on well-known machine learning data sets. Our evaluation shows that our feature selection method performs better than randomly selecting features for data sets with many features or when a low number of generations in the symbolic regression is required. Moreover, it consistently matches or outperforms state-of-the-art methods. In addition, we apply our approach in a case study to a hemodynamic cohort data set and report our findings and domain expert feedback. Our approach found formulas containing features that agree with the literature. We also found formulas that achieved a higher micro-averaged F1 score than established histological indices.
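To illustrate what it means for a formula to serve as a classifier and to be scored with the micro-averaged F1 score, here is a minimal Python sketch. It does not reproduce the authors' method: the formula (a ratio of two features), the feature names `a` and `b`, the thresholds, and the toy data are all hypothetical and chosen for illustration only.

```python
def formula_classifier(sample):
    """Hypothetical formula classifier: threshold the ratio a / b into three classes."""
    score = sample["a"] / sample["b"]  # the "formula" over two assumed features
    if score < 0.5:
        return 0
    if score < 1.5:
        return 1
    return 2


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1 once."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up): five samples with two features and a known class label.
samples = [
    {"a": 1, "b": 4},
    {"a": 2, "b": 2},
    {"a": 6, "b": 2},
    {"a": 1, "b": 1},
    {"a": 3, "b": 1},
]
y_true = [0, 1, 2, 2, 2]
y_pred = [formula_classifier(s) for s in samples]

print(y_pred)                    # [0, 1, 2, 1, 2]
print(micro_f1(y_true, y_pred))  # 0.8
```

In this single-label setting the micro-averaged F1 score equals plain accuracy (4 of 5 samples correct), which is one reason it is a convenient summary when comparing a short formula against a reference classifier.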
Pages: 42-61
Page count: 20