A Word-Embedding-Based Steganalysis Method for Linguistic Steganography via Synonym Substitution

Cited by: 22
Authors
Xiang, Lingyun [1, 2]
Yu, Jingmin [2 ]
Yang, Chunfang [3 ]
Zeng, Daojian [1, 2]
Shen, Xiaobo [4 ]
Affiliations
[1] Changsha Univ Sci & Technol, Hunan Prov Key Lab Intelligent Proc Big Data Tran, Changsha 410114, Hunan, Peoples R China
[2] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Hunan, Peoples R China
[3] Zhengzhou Sci & Technol Inst, Zhengzhou 450001, Henan, Peoples R China
[4] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Source
IEEE ACCESS | 2018 / Vol. 6
Funding
National Natural Science Foundation of China
Keywords
Steganalysis; steganography; word embedding; Skip-gram language model; TF-IDF
DOI
10.1109/ACCESS.2018.2878273
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The development of steganography technology threatens the security of private information in smart campuses. To prevent privacy disclosure, a linguistic steganalysis method based on word embedding is proposed to detect private information hidden in synonyms in texts. With the continuous Skip-gram language model, each synonym and the words in its context are represented as word embeddings, which encode the semantic meanings of words into low-dimensional dense vectors. The context fitness, which characterizes the suitability of a synonym by its semantic correlations with context words, is estimated from the corresponding word embeddings and weighted by the TF-IDF values of the context words. By analyzing the differences between the context fitness values of synonyms in the same synonym set, and the differences between those values in cover and stego texts, three features are extracted and fed into a support vector machine classifier for the steganalysis task. The experimental results show that the proposed steganalysis method improves the average F-value by at least 4.8% over two baselines. In addition, the detection performance can be further improved by learning better word embeddings.
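The context fitness idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes a TF-IDF-weighted average of cosine similarities between a synonym's embedding and its context words' embeddings. The function names, the toy two-dimensional embeddings, and the TF-IDF values are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_fitness(synonym, context_words, embeddings, tfidf):
    """Assumed definition: TF-IDF-weighted mean cosine similarity
    between the synonym and each context word that has an embedding."""
    weighted_sum = 0.0
    weight_total = 0.0
    for word in context_words:
        if word == synonym or word not in embeddings:
            continue
        weight = tfidf.get(word, 0.0)
        weighted_sum += weight * cosine(embeddings[synonym], embeddings[word])
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else 0.0


# Toy data (hypothetical): real embeddings would come from a trained
# Skip-gram model and have far more dimensions.
embeddings = {
    "big": [0.9, 0.1],
    "large": [0.8, 0.3],
    "house": [1.0, 0.0],
    "the": [0.0, 1.0],
}
tfidf = {"house": 2.0, "the": 0.1}
context = ["the", "house"]

# Score every member of the synonym set against the same context.
fitness = {s: context_fitness(s, context, embeddings, tfidf) for s in ("big", "large")}
best = max(fitness, key=fitness.get)
```

In this sketch, the low TF-IDF weight of "the" keeps a function word from dominating the score, which is the role the abstract assigns to TF-IDF weighting. How the three classifier features are derived from such scores is not specified in the abstract and is not modeled here.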
Pages: 64131-64141
Page count: 11