XyGen: Synthetic data generator for feature selection

Cited by: 2
Authors
Kamalov, Firuz [1]
Elnaffar, Said [1]
Sulieman, Hana [2]
Cherukuri, Aswani Kumar [3]
Affiliations
[1] Canadian Univ Dubai, Dubai, U Arab Emirates
[2] Amer Univ Sharjah, Sharjah, U Arab Emirates
[3] Vellore Inst Technol, Vellore, India
Keywords
Feature selection; Synthetic data; Machine learning; Data mining; Mutual information; Algorithms
DOI
10.1016/j.simpa.2023.100485
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Given the large number of feature selection algorithms, it has become imperative to have a uniform procedure for evaluating their performance. We propose a library of synthetic datasets designed specifically to test the effectiveness of feature selection algorithms. The datasets are inspired by applications in the field of electronics and have a range of characteristics to provide a variety of test scenarios. The software comes in the form of a Python library with a standard interface for loading and generating datasets. Each dataset is implemented as a function that allows control of various parameters of the data.
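To illustrate the idea of a dataset implemented as a parameterized function, here is a minimal sketch in plain Python. It is not XyGen's actual API: the function name `make_logic_dataset`, its parameters, and the logic-gate target are all assumptions chosen for illustration, loosely following the abstract's mention of electronics-inspired data. Because the relevant features are known by construction, the output of a feature selection algorithm can be compared against them.

```python
import random


def make_logic_dataset(n_samples=200, n_redundant=2, n_irrelevant=5, seed=0):
    """Hypothetical generator, not XyGen's API: binary features with a logic-gate target."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        # Three relevant binary inputs drive the target: y = (x0 AND x1) OR x2.
        relevant = [rng.randint(0, 1) for _ in range(3)]
        # Redundant features are negated copies of relevant inputs (no new information).
        redundant = [1 - relevant[j % 3] for j in range(n_redundant)]
        # Irrelevant features are independent coin flips.
        irrelevant = [rng.randint(0, 1) for _ in range(n_irrelevant)]
        X.append(relevant + redundant + irrelevant)
        y.append((relevant[0] & relevant[1]) | relevant[2])
    relevant_idx = [0, 1, 2]  # ground truth a feature selector should recover
    return X, y, relevant_idx


X, y, relevant_idx = make_logic_dataset(n_samples=500, n_redundant=2, n_irrelevant=5, seed=42)
print(len(X), len(X[0]), relevant_idx)
```

In this sketch the column order is fixed (relevant, then redundant, then irrelevant), which keeps the ground truth easy to read; a real benchmark would likely shuffle columns so that position gives nothing away.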
Pages: 3