Benchmark AFLOW Data Sets for Machine Learning

Authors
Conrad L. Clement
Steven K. Kauwe
Taylor D. Sparks
Affiliation
[1] University of Utah, Department of Materials Science and Engineering
Keywords
AFLOW; Benchmark data sets; Machine learning; Materials informatics
Abstract
Materials informatics is increasingly finding ways to exploit machine learning algorithms. Techniques such as decision trees, ensemble methods, support vector machines, and a variety of neural network architectures are used to predict likely material characteristics and property values. Supplemented with laboratory synthesis, applications of machine learning to compound discovery and characterization represent one of the most promising research directions in materials informatics. A shortcoming of this trend, in its current form, is a lack of standardized materials data sets on which to train, validate, and test model effectiveness. Applied machine learning research depends on benchmark data to make sense of its results. Fixed, predetermined data sets allow for rigorous model assessment and comparison. Machine learning publications that do not refer to benchmarks are often hard to contextualize and reproduce. In this data descriptor article, we present a collection of data sets of different material properties taken from the AFLOW database. We describe them, the procedures that generated them, and their use as potential benchmarks. We provide a compressed ZIP file containing the data sets and a GitHub repository of associated Python code. Finally, we discuss opportunities for future work incorporating the data sets and creating similar benchmark collections.
Pages: 153-156
Published in
Integrating Materials and Manufacturing Innovation, 2020, 9 (02): 153-156