N-Gram Based Secure Similar Document Detection

被引:0
|
作者
Jiang, Wei [1 ]
Samanthula, Bharath K. [1 ]
机构
[1] Missouri S&T, Dept Comp Sci, Rolla, MO 65401 USA
关键词
privacy; security; n-gram;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Secure similar document detection (SSDD) plays an important role in many applications, such as justifying the need-to-know basis and facilitating communication between government agencies. The SSDD problem considers situations where Alice with a query document wants to find similar information from Bob's document collection. During this process, the content of the query document is not disclosed to Bob, and Bob's document collection is not disclosed to Alice. Existing SSDD protocols are developed under the vector space model, which has the advantage of identifying global similar information. To effectively and securely detect similar documents with overlapping text fragments, this paper proposes a novel n-gram based SSDD protocol.
引用
收藏
页码:239 / 246
页数:8
相关论文
共 50 条
  • [1] Similar N-gram Language Model
    Gillot, Christian
    Cerisara, Christophe
    Langlois, David
    Haton, Jean-Paul
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1824 - 1827
  • [2] N-gram Density based Malware Detection
    O'Kane, Philip
    Sezer, Sakir
    McLaughlin, Kieran
    [J]. 2014 WORLD SYMPOSIUM ON COMPUTER APPLICATIONS & RESEARCH (WSCAR), 2014,
  • [3] NOVEL TOPIC N-GRAM COUNT LM INCORPORATING DOCUMENT-BASED TOPIC DISTRIBUTIONS AND N-GRAM COUNTS
    Haidar, Md. Akmal
    O'Shaughnessy, Douglas
    [J]. 2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 2310 - 2314
  • [4] Character n-Gram Spotting in Document Images
    Praveen, Sudha M.
    Sankar, Pramod K.
    Jawahar, C. V.
    [J]. 11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 941 - 945
  • [5] A Semantic Similarity Measure for Scholarly Document Based on the Study of n-gram
    Samen, Yannick-Ulrich Tchantchou
    [J]. JOURNAL OF WEB ENGINEERING, 2022, 21 (07): : 2095 - 2114
  • [6] N-Gram Based Paraphrase Generator from Large text Document
    Gadag, Ashwini I.
    Sagar, B. M.
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTATION SYSTEM AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTIONS (CSITSS), 2016, : 91 - 94
  • [7] N-gram MalGAN: Evading machine learning detection via feature n-gram
    Zhu, Enmin
    Zhang, Jianjie
    Yan, Jijie
    Chen, Kongyang
    Gao, Chongzhi
    [J]. DIGITAL COMMUNICATIONS AND NETWORKS, 2022, 8 (04) : 485 - 491
  • [8] N-gram MalGAN:Evading machine learning detection via feature n-gram
    Enmin Zhu
    Jianjie Zhang
    Jijie Yan
    Kongyang Chen
    Chongzhi Gao
    [J]. Digital Communications and Networks., 2022, 8 (04) - 491
  • [9] Analysis of N-gram model on Telugu Document Classification
    Rani, B. Padmaja
    Vardhan, B. Vishnu
    Durga, A. Kanaka
    Reddy, L. Pratap
    Babu, A. Vinaya
    [J]. 2008 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-8, 2008, : 3199 - +
  • [10] N-gram language models for document image decoding
    Kopec, GE
    Said, MR
    Popat, K
    [J]. DOCUMENT RECOGNITION AND RETRIEVAL IX, 2002, 4670 : 191 - 202