Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Cited by: 0
Authors
Alsofyani, May [1]
Wang, Liqiang [1]
Affiliation
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo
DOI
10.1145/3677333.3678160
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.
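For readers unfamiliar with the bug class the abstract refers to, the following is a minimal illustrative sketch of a data race in an OpenMP loop and one common way to remove it. It is not taken from the paper or its datasets; the function names `sum_racy` and `sum_reduction` are hypothetical and chosen only for this example.

```c
/* Illustrative only: not drawn from the paper's datasets.
   Build with an OpenMP-capable compiler, e.g. gcc -fopenmp. */

/* Racy version: all threads update the shared variable `sum`
   without synchronization, so concurrent read-modify-write
   operations can overwrite each other and lose updates. */
long sum_racy(int n) {
    long sum = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        sum += i;  /* unsynchronized write to shared data: data race */
    }
    return sum;
}

/* Race-free version: the reduction clause gives each thread a
   private copy of `sum` and combines the copies after the loop. */
long sum_reduction(int n) {
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
```

The two functions differ only in the `reduction(+:sum)` clause, which hints at why such bugs are hard to spot by inspection: whether a loop is racy often depends on a single directive or clause. With more than one thread, `sum_racy` may return different values from run to run, while `sum_reduction(1000)` consistently returns the sum of 0 through 999.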
Pages: 96-103
Number of pages: 8