Towards Robust Models of Code via Energy-Based Learning on Auxiliary Datasets

Cited by: 0
Authors
Bui, Nghi D. Q. [1 ,2 ]
Yu, Yijun [2 ]
Affiliations
[1] Singapore Management Univ, Singapore, Singapore
[2] Huawei Ireland Res Ctr, Dublin, Ireland
DOI
10.1145/3551349.3561171
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing approaches to improving the robustness of source code models concentrate on recognizing adversarial samples rather than valid samples that fall outside a given distribution, which we refer to as out-of-distribution (OOD) samples. To this end, we propose using an auxiliary out-of-distribution dataset that, when trained on together with the main dataset, enhances the model's robustness. To incorporate such out-of-distribution samples into the training process of source code models, we adapt an energy-bounded learning objective function that assigns a higher score to in-distribution samples and a lower score to out-of-distribution samples. Our evaluation on OOD detection and adversarial sample detection shows that existing source code models trained this way become more accurate at recognizing OOD data while also being more resistant to adversarial attacks.
Pages: 3
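The abstract does not spell out the objective, so the following is only a minimal sketch of one common way an energy-bounded objective is written in the energy-based OOD detection literature: a squared hinge that pushes in-distribution energies below a margin and auxiliary (OOD) energies above another. The function names, the margins `m_in` and `m_out`, the temperature, and the example logits are all assumptions for illustration, not the authors' implementation. The usual classification loss on in-distribution samples, which such a term would normally be added to, is omitted here.

```python
import math


def energy(logits, temperature=1.0):
    """Free energy E(x) = -T * logsumexp(logits / T), computed stably."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return -temperature * log_sum_exp


def score(logits, temperature=1.0):
    """Negative energy: higher values suggest an in-distribution sample."""
    return -energy(logits, temperature)


def energy_bounded_loss(in_logits, out_logits, m_in, m_out, temperature=1.0):
    """Squared-hinge regularizer (assumed form, not taken from the paper).

    Penalizes in-distribution samples whose energy exceeds m_in and
    auxiliary OOD samples whose energy falls below m_out.
    """
    in_term = sum(
        max(0.0, energy(z, temperature) - m_in) ** 2 for z in in_logits
    ) / len(in_logits)
    out_term = sum(
        max(0.0, m_out - energy(z, temperature)) ** 2 for z in out_logits
    ) / len(out_logits)
    return in_term + out_term


# Hypothetical logits: a confident prediction and a nearly flat one.
confident = [8.0, 0.0, 0.0]
flat = [0.1, 0.0, 0.1]

# Margins chosen by hand for this toy example only.
well_separated = energy_bounded_loss([confident], [flat], m_in=-5.0, m_out=-2.0)
swapped = energy_bounded_loss([flat], [confident], m_in=-5.0, m_out=-2.0)
```

In this toy setting the confident logits receive the higher score, the loss vanishes when the two groups already sit on the correct sides of their margins, and it becomes positive when the roles are swapped. At test time a threshold on `score` would separate the two groups, with the threshold itself being a further tuning choice.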