Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Cited by: 0
Authors
Alsofyani, May [1]
Wang, Liqiang [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo
DOI
10.1145/3677333.3678160
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.
Pages: 96 - 103
Number of pages: 8
Related papers
50 items in total
  • [21] Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models
    Yue, Tianwei
    Wang, Yuanxin
    Zhang, Longxiang
    Gu, Chunming
    Xue, Haoru
    Wang, Wenping
    Lyu, Qi
    Dun, Yujie
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2023, 24 (21)
  • [23] Deep Learning and Web Applications Vulnerabilities Detection: An Approach Based on Large Language Models
    Nana, Sidwendluian Romaric
    Bassole, Didier
    Guel, Desire
    Sie, Oumarou
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 1391 - 1399
  • [24] Advanced deep learning and large language models for suicide ideation detection on social media
    Qorich, Mohammed
    El Ouazzani, Rajae
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024, 13 (02) : 135 - 147
  • [25] Deep learning detection method for large language models-generated scientific content
    Alhijawi, Bushra
    Jarrar, Rawan
    AbuAlRub, Aseel
    Bader, Arwa
    Neural Computing and Applications, 2025, 37 (01) : 91 - 104
  • [26] Automated detecting, segmenting and measuring of grains in images of fluvial sediments: The potential for large and precise data from specialist deep learning models and transfer learning
    Mair, David
    Witz, Guillaume
    Do Prado, Ariel Henrique
    Garefalakis, Philippos
    Schlunegger, Fritz
    EARTH SURFACE PROCESSES AND LANDFORMS, 2024, 49 (03) : 1099 - 1116
  • [27] Language Models Based on Deep Learning: A Review
    Wang N.-Y.
    Ye Y.-X.
    Liu L.
    Feng L.-Z.
    Bao T.
    Peng T.
    Peng, Tao (tpeng@jlu.edu.cn), 1600, Chinese Academy of Sciences (32): 1082 - 1115
  • [28] ChatPhishDetector: Detecting Phishing Sites Using Large Language Models
    Koide, Takashi
    Nakano, Hiroki
    Chiba, Daiki
    IEEE Access, 2024, 12 : 154381 - 154400
  • [29] Devising and Detecting Phishing Emails Using Large Language Models
    Heiding, Fredrik
    Schneier, Bruce
    Vishwanath, Arun
    Bernstein, Jeremy
    Park, Peter S.
    IEEE ACCESS, 2024, 12 : 42131 - 42146
  • [30] Detecting hallucinations in large language models using semantic entropy
    Farquhar, Sebastian
    Kossen, Jannik
    Kuhn, Lorenz
    Gal, Yarin
    NATURE, 2024, 630 (8017) : 625 - 630