Canary Extraction in Natural Language Understanding Models

Cited by: 0
Authors
Parikh, Rahil [1 ]
Dupuy, Christophe [2 ]
Gupta, Rahul [2 ]
Affiliations
[1] Univ Maryland, Inst Syst Res, Baltimore, MD 21201 USA
[2] Amazon Alexa AI, New York, NY USA
Keywords
DOI
None
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural Language Understanding (NLU) models can be trained on sensitive information such as phone numbers, zip codes, etc. Recent literature has focused on Model Inversion Attacks (ModIvA) that can extract training data from model parameters. In this work, we present a version of such an attack by extracting canaries inserted in NLU training data. In the attack, an adversary with open-box access to the model reconstructs the canaries contained in the model's training set. We evaluate our approach by performing text completion on canaries and demonstrate that, using only the prefix (non-sensitive) tokens of a canary, we can generate the full canary. For example, in its best configuration our attack reconstructs a four-digit code in the NLU model's training dataset with a probability of 0.5. As countermeasures, we identify several defense mechanisms that, when combined, effectively eliminate the risk of ModIvA in our experiments.
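The prefix-completion idea in the abstract can be illustrated with a toy sketch. This is not the paper's actual attack (which inverts an NLU model's parameters with open-box access); it only shows, on a hypothetical count-based bigram language model, how a memorized canary's sensitive suffix can be recovered from its non-sensitive prefix. All training utterances and the canary below are made up for illustration.

```python
from collections import defaultdict, Counter

def train_bigram(corpus):
    """Count token bigrams over a small training corpus."""
    counts = defaultdict(Counter)
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            counts[a][b] += 1
    return counts

def complete(model, prefix, max_len=10):
    """Greedily extend the prefix with the most likely next token."""
    cur = prefix.split()[-1]
    out = []
    for _ in range(max_len):
        if cur not in model:
            break
        nxt = model[cur].most_common(1)[0][0]
        if nxt == "</s>":
            break
        out.append(nxt)
        cur = nxt
    return " ".join(out)

# Hypothetical training data with one inserted canary; the digits are
# the "sensitive" payload the attacker tries to recover.
corpus = [
    "set an alarm for seven am",
    "play some relaxing music",
    "turn off the kitchen lights",
    "my secret pin is 1 2 3 4",  # the canary
]
model = train_bigram(corpus)

# The attacker knows only the non-sensitive prefix and completes it.
print(complete(model, "my secret pin is"))  # -> 1 2 3 4
```

Because the canary pattern occurs nowhere else in the training data, the model memorizes it, and greedy completion from the benign prefix leaks the full secret, which is the failure mode the paper's countermeasures aim to eliminate.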
Pages: 552 - 560 (9 pages)
Related papers (50 in total)
  • [1] Models of natural-language understanding
    Bates, M
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1995, 92 (22) : 9977 - 9982
  • [2] Fertility models for statistical natural language understanding
    Della Pietra, S
    Epstein, M
    Roukos, S
    Ward, T
    [J]. 35TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 8TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 1997, : 168 - 173
  • [3] Shortcut Learning of Large Language Models in Natural Language Understanding
    Du, Mengnan
    He, Fengxiang
    Zou, Na
    Tao, Dacheng
    Hu, Xia
    [J]. COMMUNICATIONS OF THE ACM, 2024, 67 (01) : 110 - 120
  • [4] Understanding natural language: Potential application of large language models to ophthalmology
    Yang, Zefeng
    Wang, Deming
    Zhou, Fengqi
    Song, Diping
    Zhang, Yinhang
    Jiang, Jiaxuan
    Kong, Kangjie
    Liu, Xiaoyi
    Qiao, Yu
    Chang, Robert T.
    Han, Ying
    Li, Fei
    Tham, Clement C.
    Zhang, Xiulan
    [J]. ASIA-PACIFIC JOURNAL OF OPHTHALMOLOGY, 2024, 13 (04):
  • [5] Using Natural Sentences for Understanding Biases in Language Models
    Alnegheimish, Sarah
    Guo, Alicia
    Sun, Yi
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2824 - 2830
  • [6] The use of the natural language understanding agents with conceptual models
    Vasilecas, Olegas
    Laukaitis, Algirdas
    [J]. ICEIS 2006: PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS: ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS, 2006, : 308 - 311
  • [7] Grey-box Extraction of Natural Language Models
    Zanella-Beguelin, Santiago
    Tople, Shruti
    Paverd, Andrew
    Koepf, Boris
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [8] Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models
    Iki, Taichi
    Aizawa, Akiko
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 2189 - 2196
  • [9] Reliable Natural Language Understanding with Large Language Models and Answer Set Programming
    Rajasekharan, Abhiramon
    Zeng, Yankai
    Padalkar, Parth
    Gupta, Gopal
    [J]. Electronic Proceedings in Theoretical Computer Science, EPTCS, 2023, 385 : 274 - 287