Linguistic Steganalysis Based on Clustering and Ensemble Learning in Imbalanced Scenario

Authors
Guo, Shengnan [1]
Chen, Xuekai [1]
Wang, Zhuang [1]
Yang, Zhongliang [1]
Zhou, Linna [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Linguistic Steganalysis; Clustering; Ensemble Learning; Features
DOI
10.1007/978-981-97-2585-4_22
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid development of the Internet, a growing number of text steganography methods have emerged. These methods are easily abused on public networks for malicious purposes, which poses a serious threat to cyberspace security. Many text steganalysis methods have been proposed to counter text steganography, but existing methods typically assume a balanced class distribution. In reality, stego texts are far fewer than cover texts, so accurately detecting stego texts among massive volumes of text is a challenge. In this paper, we propose a text steganalysis method for imbalanced scenarios that combines under-sampling with ensemble learning. Specifically, we use clustering to under-sample the majority class (cover texts) according to the detection difficulty of each sample, so that information-rich samples are selected. Ensemble learning then combines the detection results of multiple base classifiers and guides the sampling process. We designed several experiments to test the detection performance of the proposed model. The experimental results show that the proposed model effectively compensates for the deficiencies of existing methods and still detects stego texts effectively even on highly imbalanced datasets.
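The loop described in the abstract (score the cover texts by detection difficulty, cluster them, under-sample across clusters, train another base classifier, repeat) can be illustrated with a small sketch. This is not the authors' implementation; the abstract does not specify the features, the base classifier, or the clustering algorithm. The nearest-centroid base classifier, the 1-D k-means over difficulty scores, the round-robin sampling rule, and every function name below (`train_base`, `kmeans_1d`, `undersample`, `fit`, `predict`) are assumptions chosen only to keep the example self-contained. Samples are assumed to be plain numeric feature vectors.

```python
import math
import random

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train_base(cover, stego):
    """Toy base classifier (assumption): nearest centroid, returned as a stego-score function."""
    c0, c1 = centroid(cover), centroid(stego)
    def score(x):
        d0, d1 = math.dist(x, c0), math.dist(x, c1)
        return d0 / (d0 + d1 + 1e-12)  # close to 1 when x lies near the stego centroid
    return score

def ensemble_score(models, x):
    """Average the stego scores of all base classifiers."""
    return sum(m(x) for m in models) / len(models)

def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means; returns one cluster index per value."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return assign

def undersample(cover, difficulty, n_keep, k, rng):
    """Cluster covers by difficulty, then draw round-robin so every cluster is represented."""
    assign = kmeans_1d(difficulty, k)
    clusters = [[i for i, a in enumerate(assign) if a == j] for j in range(k)]
    clusters = [c for c in clusters if c]
    for c in clusters:
        rng.shuffle(c)
    picked = []
    while len(picked) < n_keep and any(clusters):
        for c in clusters:
            if c and len(picked) < n_keep:
                picked.append(c.pop())
    return [cover[i] for i in picked]

def fit(cover, stego, n_rounds=5, k=3, seed=0):
    """Grow an ensemble; the current ensemble's scores guide the next under-sampling step."""
    rng = random.Random(seed)
    subset = rng.sample(cover, min(len(stego), len(cover)))
    models = [train_base(subset, stego)]
    for _ in range(n_rounds - 1):
        # A cover text the ensemble scores as stego-like counts as hard to detect.
        difficulty = [ensemble_score(models, x) for x in cover]
        subset = undersample(cover, difficulty, len(stego), k, rng)
        models.append(train_base(subset, stego))
    return models

def predict(models, x, threshold=0.5):
    """Return 1 for stego, 0 for cover."""
    return int(ensemble_score(models, x) >= threshold)
```

In this sketch each base classifier sees a balanced training set (as many covers as stego samples), and the round-robin draw keeps hard covers in the sample even when the hard cluster is small. A real system would replace the toy classifier with a text steganalysis model and could weight clusters differently.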
Pages: 304-318
Number of pages: 15