Scalable Spam Classifier for Web Tables

Cited by: 0

Authors:

Villasenor, Santiago [1]

Nguyen, Tom [1]

Kola, Anusha [1]

Soderman, Sean [1]

Gubanov, Michael [1]

Affiliation:

[1] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA

Source:

2017 IEEE International Conference on Big Data (Big Data) | 2017

Keywords:

Web-search; Large-scale Data Management; Cloud Computing; Data Fusion and Cleaning; Summarization; Human-Computer Interaction;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Internet mail spam is a problem for most organizations and individuals. Spam falls into two categories: fraud and commercial. The fraud category includes phishing, scams, malware, counterfeit products, and other criminal activity. The commercial category includes unwanted promotional messages and newsletters sent illegally by legitimate organizations. Fraud spam can be seen as a high-threat, high-volume category, while commercial spam is the opposite. As with mail, there are spam Web tables that contain no useful content. Here we describe our machine-learning classifier for efficient and effective spam filtering of Web tables, tested on a large-scale corpus of approximately 36 million Web tables.

Pages: 4849-4851

Page count: 3

