The Impact of Variable Selection and Transformation on the Interpretability and Accuracy of Fuzzy Models

Authors
Fuchs, Caro [1 ,2 ]
Spolaor, Simone [3 ]
Kaymak, Uzay [1 ]
Nobile, Marco S. [2 ,4 ,5 ]
Affiliations
[1] Eindhoven Univ Technol, Jheronimus Acad Data Sci, 's-Hertogenbosch, Netherlands
[2] Eindhoven Univ Technol, Dept Ind Engn & Innovat Sci, Eindhoven, Netherlands
[3] Eindhoven Univ Technol, Dept Mech Engn, Microsyst, Eindhoven, Netherlands
[4] Ca Foscari Univ Venice, Dept Environm Sci Informat & Stat, Venice, Italy
[5] Bicocca Bioinformat Biostat & Bioimaging Ctr B4, Milan, Italy
Funding
EU Horizon 2020;
Keywords
interpretable AI; data transformation; log-transformation; data normalization; machine learning; genetic algorithm; fuzzy model; fuzzy logic;
DOI
10.1109/CIBCB55180.2022.9863019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data transformation is an important step in machine learning pipelines that can strongly improve their performance. For instance, min-max normalization is often used to make all variables lie in the same range, while log-transformation is used to map data that is scattered across several orders of magnitude to a logarithmic space. Such transformations can be beneficial when the machine learning approach measures distance in a metric space, as cluster-based approaches do. These two transformations can be combined to reveal hidden patterns in log-normally distributed data points, which commonly occur in biological and medical data. In this work we introduce a novel evolutionary approach designed to automatically determine the optimal log-transformation and selection of variables. Our approach is built around an interpretable AI system (created by pyFUME), so that all transformations are followed by inverse transformations that map the values back into the original universe of discourse and preserve the interpretability of the results. We test our approach on two synthetic datasets, designed to reproduce a condition in which some variables are normally distributed, some variables are log-normally distributed, and some variables are just noise. Our results show that our approach yields better-performing models than conventional methods, and that the resulting model is also more interpretable, making the approach particularly useful for studying biomedical datasets.
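The combination of log-transformation, min-max normalization, and the inverse mapping described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses neither pyFUME nor a genetic algorithm, and the function names `forward` and `inverse` and the `use_log` flag are hypothetical. The flag stands in for the per-variable choice that the evolutionary search would make.

```python
import numpy as np

def forward(x, use_log):
    """Optionally apply log10, then min-max scale to [0, 1].

    Returns the scaled values and the (min, max) needed to invert the scaling.
    `use_log` is a hypothetical flag; x must be strictly positive when it is True.
    """
    z = np.log10(x) if use_log else np.asarray(x, dtype=float)
    lo, hi = z.min(), z.max()
    return (z - lo) / (hi - lo), (lo, hi)

def inverse(s, params, use_log):
    """Undo the min-max scaling, then undo the log10 if it was applied."""
    lo, hi = params
    z = s * (hi - lo) + lo
    return 10.0 ** z if use_log else z

# Synthetic log-normally distributed variable (illustrative data only).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=2.0, sigma=1.5, size=1000)

scaled, params = forward(x, use_log=True)
restored = inverse(scaled, params, use_log=True)
```

Because `inverse` recovers the original values up to floating-point error, quantities computed in the transformed space, such as cluster centers, can be reported in the original units, which is the interpretability property the abstract emphasizes.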
Pages: 155-162
Number of pages: 8