Detecting Data Races in OpenMP with Deep Learning and Large Language Models

Cited by: 0

Authors:

Alsofyani, May [1]

Wang, Liqiang [1]

Affiliation:

[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA

Source:

53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024 | 2024

Keywords:

data race; race condition; bug detection; OpenMP; transformer encoder; large language model; CodeBERTa; GPT-4 Turbo

DOI:

10.1145/3677333.3678160

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Transformer-based neural network models are increasingly employed to handle software engineering issues, such as bug localization and program repair. These models, equipped with a self-attention mechanism, excel at understanding source code context and semantics. Recently, large language models (LLMs) have emerged as a promising alternative for analyzing and understanding code structure. In this paper, we propose two novel methods for detecting data race bugs in OpenMP programs. The first method is based on a transformer encoder trained from scratch. The second method leverages LLMs, specifically extending GPT-4 Turbo through the use of prompt engineering and fine-tuning techniques. For training and testing our approach, we utilized two datasets comprising different OpenMP directives. Our experiments show that the transformer encoder achieves competitive accuracy compared to LLMs, whether through fine-tuning or prompt engineering techniques. This performance may be attributed to the complexity of many OpenMP directives and the limited availability of labeled datasets.

Pages: 96-103

Page count: 8
