Towards Effective Paraphrasing for Information Disguise

Cited by: 2
Authors
Agarwal, Anmol [1]
Gupta, Shrey [1]
Bonagiri, Vamshi [1]
Gaur, Manas [2]
Reagle, Joseph [3]
Kumaraguru, Ponnurangam [1]
Affiliations
[1] Int Inst Informat Technol, Hyderabad, India
[2] Univ Maryland, Baltimore, MD USA
[3] Northeastern Univ, Boston, MA USA
Source
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II | 2023 / Vol. 13981
Keywords
Neural information retrieval; Adversarial retrieval; Paraphrasing; Information disguise; Computational ethics;
DOI
10.1007/978-3-031-28238-6_22
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Information Disguise (ID), a part of computational ethics in Natural Language Processing (NLP), is concerned with best practices of textual paraphrasing to prevent the non-consensual use of authors' posts on the Internet. Research on ID becomes important when authors' written online communication pertains to sensitive domains, e.g., mental health. Over time, researchers have utilized AI-based automated word spinners (e.g., SpinRewriter, WordAI) for paraphrasing content. However, these tools fail to satisfy the purpose of ID as their paraphrased content still leads to the source when queried on search engines. There is limited prior work on judging the effectiveness of paraphrasing methods for ID on search engines or their proxies, neural retriever (NeurIR) models. We propose a framework where, for a given sentence from an author's post, we perform iterative perturbation on the sentence in the direction of paraphrasing with an attempt to confuse the search mechanism of a NeurIR system when the sentence is queried on it. Our experiments involve the subreddit "r/AmItheAsshole" as the source of public content and Dense Passage Retriever as a NeurIR system-based proxy for search engines. Our work introduces a novel method of phrase-importance rankings using perplexity scores and involves multilevel phrase substitutions via beam search. Our multi-phrase substitution scheme succeeds in disguising sentences 82% of the time and hence takes an essential step towards enabling researchers to disguise sensitive content effectively before making it public. We also release the code of our approach. (https://github.com/idecir/idecir-Towards-Effective-Paraphrasing-for-Information-Disguise)
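The pipeline the abstract describes (rank phrases by importance, then substitute several of them under beam search until the retriever no longer favors the source) can be illustrated with a small sketch. This is not the authors' released code. Everything below is an assumption made for illustration: a toy token-overlap scorer stands in for Dense Passage Retriever, a leave-one-out score drop stands in for the perplexity-based phrase-importance ranking, and the `substitutes` dictionary and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[str, str], float]


def overlap_score(query: str, source: str) -> float:
    """Toy stand-in for a neural retriever: Jaccard overlap of lowercase tokens."""
    q, s = set(query.lower().split()), set(source.lower().split())
    return len(q & s) / len(q | s) if (q | s) else 0.0


def rank_positions(tokens: List[str], source: str, score: Scorer) -> List[int]:
    """Order token positions by how much the score drops when that token is removed.

    Assumption: this leave-one-out ranking replaces the paper's
    perplexity-based ranking, which would need a language model.
    """
    base = score(" ".join(tokens), source)
    drops = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        drops.append((base - score(" ".join(reduced), source), i))
    # Largest drop first; ties are broken by position.
    return [i for _, i in sorted(drops, key=lambda d: (-d[0], d[1]))]


def disguise(
    sentence: str,
    source: str,
    substitutes: Dict[str, List[str]],  # hypothetical paraphrase table
    score: Scorer = overlap_score,
    beam_width: int = 2,
    max_edits: int = 2,
) -> Tuple[str, float]:
    """Beam search over phrase substitutions that minimizes the retriever score."""
    tokens = sentence.split()
    order = [
        i for i in rank_positions(tokens, source, score)
        if tokens[i].lower() in substitutes
    ]
    beam = [(score(sentence, source), tokens)]
    for pos in order[:max_edits]:
        candidates = list(beam)  # keeping the unedited variants is allowed
        for _, cand in beam:
            for alt in substitutes[tokens[pos].lower()]:
                new = cand[:pos] + [alt] + cand[pos + 1:]
                candidates.append((score(" ".join(new), source), new))
        candidates.sort(key=lambda c: c[0])  # lower score = better disguised
        beam = candidates[:beam_width]
    best_score, best_tokens = beam[0]
    return " ".join(best_tokens), best_score
```

In practice the scorer would be replaced by a real dense retriever and the substitution table by a paraphrase generator; the sketch only shows how ranking and beam search fit together.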
Pages: 331-340
Page count: 10