DataGen: a generator of datasets for evaluation of classification algorithms

Cited by: 13
Authors
Rachkovskij, DA [1]
Kussul, EM [1]
Affiliations
[1] Ukrainian Acad Sci, Cybernet Ctr, UA-252650 Kiev, Ukraine
Keywords
benchmarking; evaluation; classification; supervised learning; datasets; data generator; synthetic data
DOI
10.1016/S0167-8655(98)00053-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Dataset generators are useful for evaluating an algorithm's performance because they allow control over the characteristics and amount of data used for benchmarking. We propose a dataset generator called DataGen that allows varying the number of input features and output classes, the complexity and realizations of class regions, the distributions of data samples, the noise level, and the number of data samples. A C language listing of the basic DataGen version is provided. (C) 1998 Published by Elsevier Science B.V. All rights reserved.
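The kind of control the abstract describes can be illustrated with a minimal sketch. This is not the DataGen listing from the paper, which is not reproduced in this record. The function `make_dataset`, its parameters, and the nearest-prototype scheme for defining class regions are all assumptions made here for illustration only.

```python
import random


def make_dataset(n_samples, n_features, n_classes,
                 regions_per_class=2, noise=0.0, seed=0):
    """Return (samples, labels) for a synthetic classification task.

    Hypothetical scheme, not the paper's: class regions are the
    nearest-prototype cells of random points in the unit hypercube.
    """
    rng = random.Random(seed)  # a different seed gives a different realization
    # One prototype per region; more regions per class gives
    # more fragmented, and so more complex, class regions.
    prototypes = [
        ([rng.random() for _ in range(n_features)], label)
        for label in range(n_classes)
        for _ in range(regions_per_class)
    ]
    samples, labels = [], []
    for _ in range(n_samples):
        # Samples are drawn uniformly; another distribution could be used here.
        x = [rng.random() for _ in range(n_features)]
        # The label is that of the nearest prototype (squared Euclidean distance).
        _, label = min(
            prototypes,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p[0])),
        )
        # Label noise: with probability `noise`, draw the class at random.
        if rng.random() < noise:
            label = rng.randrange(n_classes)
        samples.append(x)
        labels.append(label)
    return samples, labels


if __name__ == "__main__":
    X, y = make_dataset(n_samples=5, n_features=2, n_classes=3, noise=0.1, seed=42)
    for features, label in zip(X, y):
        print(label, [round(v, 3) for v in features])
```

Fixing `seed` keeps a benchmark reproducible, while changing it alone yields a new realization of the class regions with the same nominal difficulty.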
Pages: 537-544
Page count: 8