Spam filtering based on online ranking logistic regression

Cited by: 0
Authors
[1] Sun, Guanglu
[2] Qi, Haoliang
Source
Sun, G. (guanglu_sun@163.com) | Year: 1600 / Volume: Tsinghua University / Issue: 53
Keywords
Binary classification - Classification models - Discriminative models - Logistic Regression modeling - Machine learning methods - On-line rankings - Spam - Statistical significance
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Spam filtering is an important problem in Web information processing, and many machine learning methods have been applied to it. Current research casts filtering as binary classification, where the optimization target of the classification model is not consistent with 1-AUC, the usual evaluation measure. This inconsistency biases model optimization and degrades filtering results. In this study, spam filtering is recast as a ranking problem through optimization oriented to 1-AUC, and an online ranking logistic regression model is proposed to address the deviation of the model's scores in the online learning module. TONE (train on or near error), re-sampling, and weight-update methods are used to speed up learning during online adjustment of the model's parameters. Experiments on open evaluation datasets show that the proposed method outperforms the traditional online logistic regression model, with statistical significance.
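The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic detail, so the function names, the buffer of recent messages used to form (spam, ham) pairs, the learning rate, and the margin rule standing in for TONE are all assumptions made for illustration. The re-sampling and weight-update methods mentioned in the abstract are not shown. The sketch computes 1-AUC as the fraction of misordered (spam, ham) pairs and trains a linear scorer with a pairwise logistic loss, updating only when a pair is misordered or separated by less than the margin.

```python
import math
from collections import deque


def one_minus_auc(scores, labels):
    """Fraction of (spam, ham) pairs ranked in the wrong order; ties count half."""
    spam = [s for s, y in zip(scores, labels) if y == 1]
    ham = [s for s, y in zip(scores, labels) if y == 0]
    bad = sum((p < n) + 0.5 * (p == n) for p in spam for n in ham)
    return bad / (len(spam) * len(ham))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    """Linear score of a sparse feature dict x under the weight dict w."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def pairwise_update(w, x_spam, x_ham, lr, margin):
    """One gradient step on log(1 + exp(-d)), where d = score(spam) - score(ham).

    Assumed TONE-style rule: skip the update when the pair is already
    ordered correctly by at least `margin`.
    """
    d = score(w, x_spam) - score(w, x_ham)
    if d >= margin:
        return False
    g = lr * sigmoid(-d)  # magnitude of the negative gradient with respect to d
    for f, v in x_spam.items():
        w[f] = w.get(f, 0.0) + g * v
    for f, v in x_ham.items():
        w[f] = w.get(f, 0.0) - g * v
    return True


def train_online(stream, lr=0.5, margin=1.0, buffer_size=50):
    """Process (features, label) pairs in arrival order; label 1 = spam, 0 = ham.

    Each new message is paired with recently seen messages of the other
    class (an assumed pairing scheme, chosen to keep the sketch short).
    """
    w = {}
    recent = {0: deque(maxlen=buffer_size), 1: deque(maxlen=buffer_size)}
    for x, y in stream:
        for other in recent[1 - y]:
            x_spam, x_ham = (x, other) if y == 1 else (other, x)
            pairwise_update(w, x_spam, x_ham, lr, margin)
        recent[y].append(x)
    return w
```

Calling `train_online` on a stream of labelled messages returns the weight dictionary; scoring held-out messages with `score` and passing the results to `one_minus_auc` gives the evaluation measure the abstract refers to.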
Related papers
50 in total
  • [1] The Improved Logistic Regression Models for Spam Filtering
    Han, Yong
    Yang, Muyun
    Qi, Haoliang
    He, Xiaoning
    Li, Sheng
    2009 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2009, : 314 - 317
  • [2] Spam filtering based on preference ranking
    Lan, MJ
    Zhou, WL
    Fifth International Conference on Computer and Information Technology - Proceedings, 2005, : 223 - 227
  • [3] Spam filtering using a logistic regression model trained by an artificial bee colony algorithm
    Dedeturk, Bilge Kagan
    Akay, Bahriye
    APPLIED SOFT COMPUTING, 2020, 91
  • [4] Spam Filtering: Online Naive Bayes Based on TONE
    Guanglu Sun
    Hongyue Sun
    Yingcai Ma
    Yuewu Shen
    ZTE Communications, 2013, 11 (02) : 51 - 54
  • [5] Active learning for online spam filtering
    Liu, Wuying
    Wang, Ting
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 555 - 560
  • [6] Adaptive Sensor Ranking Based on Utility Using Logistic Regression
    Sundar, S.
    Baby, Cyril Joe
    Itagi, Anirudh
    Soni, Siddharth
    SOFT COMPUTING FOR PROBLEM SOLVING, SOCPROS 2018, VOL 1, 2020, 1048 : 365 - 376
  • [7] SPEED UP INFORMATION GAIN BASED ONLINE SVM FOR SPAM FILTERING
    Sun, Guanglu
    Shen, Yuewu
    Qi, Haoliang
    4TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING (ICACTE 2011), 2011, : 663 - 666
  • [8] Index-based Online Text Classification for SMS Spam Filtering
    Liu, Wuying
    Wang, Ting
    JOURNAL OF COMPUTERS, 2010, 5 (06) : 844 - 851
  • [9] Automatic detection of Spam using Bayesian Logistic Regression
    Ortiz Martos, Antonio Jesus
    Martin Valdivia, Maria Teresa
    Urena Lopez, L. Alfonso
    Garcia Cumbreras, Miguel Angel
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2005, (35) : 127 - 133
  • [10] POSTER: Online Spam Filtering in Social Networks
    Gao, Hongyu
    Chen, Yan
    Lee, Kathy
    Palsetia, Diana
    Choudhary, Alok
    PROCEEDINGS OF THE 18TH ACM CONFERENCE ON COMPUTER & COMMUNICATIONS SECURITY (CCS 11), 2011, : 769 - 771