PERFGEN: A Synthesis and Evaluation Framework for Performance Data using Generative AI

Cited by: 0
Authors
Banday, Banooqa H. [1]
Islam, Tanzima Z. [1]
Marathe, Aniruddha [2]
Affiliations
[1] Texas State Univ, San Marcos, TX 78666 USA
[2] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
Keywords
Large Language Model; Generative Modeling; Evaluation; Scientific Data
DOI
10.1109/COMPSAC61105.2024.00035
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Collecting data in High-Performance Computing (HPC) is a laborious task that requires application scientists to execute an application many times with different configurations. Because performance modeling and root cause analysis are essential first phases of performance optimization, the data collection phase prolongs the whole optimization process. Motivated by this observation, we investigate the feasibility of leveraging recent advances in generative Artificial Intelligence (AI) to synthesize performance samples. However, generating synthetic performance data introduces an additional hurdle: the absence of ground truth against which to assess the quality of the synthetic data. This work takes a step toward bridging this gap: we propose a framework, PERFGEN, for generating performance data and evaluating its quality using a novel metric called Dissimilarity. We evaluate quality by measuring how well the generated data enables a downstream Machine Learning (ML) task to generalize. Our experiments with three performance datasets and five machine learning datasets (three classification and two regression) confirm that the proposed Dissimilarity metric correlates with model accuracy better than three state-of-the-art metrics: SD quality, Kullback-Leibler (KL) divergence, and TabSyndex. This demonstrates that the Dissimilarity metric strongly correlates with the quality of generated scientific data. Since performance data is a special case of scientific data (typically stored in tabular format and consisting of numerical, categorical, and ordinal features), our methodologies and metrics apply to scientific data from other domains as well.
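The abstract names Kullback-Leibler divergence as one of the baseline metrics for comparing synthetic data with real data. The sketch below shows one common way such a baseline could be computed for a single numerical column, using a smoothed histogram. It is an illustration only and is not the PERFGEN Dissimilarity metric, which the abstract does not define. The function names, the bin count, the add-one smoothing, and the sample runtime values are all assumptions made for this example.

```python
import math


def histogram(values, bins, lo, hi):
    """Return smoothed bin probabilities for `values` over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1
    # Add-one smoothing (an assumption of this sketch) keeps every bin
    # non-zero so that the logarithm below is always defined.
    total = len(values) + bins
    return [(c + 1) / total for c in counts]


def kl_divergence(real, synthetic, bins=10):
    """Estimate KL(real || synthetic) for one numerical column."""
    lo = min(min(real), min(synthetic))
    hi = max(max(real), max(synthetic))
    p = histogram(real, bins, lo, hi)
    q = histogram(synthetic, bins, lo, hi)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical runtimes in seconds; these are not data from the paper.
real_runtimes = [1.0, 1.2, 1.1, 1.3, 1.2, 1.1]
synthetic_runtimes = [5.0, 5.2, 5.1, 5.3, 5.2, 5.1]

print(kl_divergence(real_runtimes, real_runtimes))
print(kl_divergence(real_runtimes, synthetic_runtimes))
```

A value of zero means the two binned distributions are identical, and larger values mean the synthetic column departs further from the real one. A per-column score like this ignores relationships between features, which is one reason a task-based check such as downstream model accuracy is a useful complement.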
Pages: 188-197
Number of pages: 10