Mining the Global Terrorism Dataset using Machine Learning Algorithms

Cited by: 0

Authors:

Alsaedi, Alaa S. [1]

Almobarak, Arwa S. [1]

Alharbi, Saad T. [1]

Affiliation:

[1] Taibah Univ, Coll Comp Sci, Dept Comp Sci & Engn, Madinah, Saudi Arabia

Source:

2019 IEEE/ACS 16TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA 2019) | 2019

Keywords:

classification; K nearest neighbor; Naive Bayes; Random Forest; Cross Validation; Accuracy; global terrorism;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Since the beginning of the present century, terrorist attacks have been a major concern for developed and developing countries alike, and countries spare no effort to fight and eradicate them. In this paper, we therefore apply available data mining and machine learning techniques to the Global Terrorism Database (GTD) in order to acquire valuable information about predicted attacks and attackers. We believe that such models will be of great benefit to governments and intelligence agencies, since they can help them make pre-emptive strikes against terrorist groups quickly. We focus on two tasks: predicting the success of an attack and predicting the identity of the terrorist organization behind an attack. To this end, we adopted three machine learning algorithms to train our models: K-nearest neighbor (KNN), Naive Bayes (NB) and Random Forest (RF). Each algorithm was used to construct two models, one for each task; the data for the first model were sampled using the holdout method, while cross-validation was employed for the second model. Finally, we compared the performance of the models in terms of accuracy, precision, recall and F-measure. The RF models outperformed the other models, while the NB models performed worst among the three algorithms.
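The evaluation protocol named in the abstract (holdout sampling, cross-validation, and the accuracy, precision, recall and F-measure metrics) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 30% test fraction, the choice of 5 folds and the toy labels are all assumptions made for illustration, and no GTD data or classifier is involved.

```python
import random


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Holdout: shuffle the indices once and cut them into train and test parts.

    The 0.3 test fraction is an assumed default, not a value from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=5, seed=0):
    """Cross-validation: yield k (train, test) pairs of index lists.

    Every sample lands in exactly one test fold. k=5 is an assumption.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a two-class task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels (1 = attack succeeded, 0 = attack failed); purely illustrative.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))

    train_idx, test_idx = holdout_split(len(y_true))
    print("holdout:", len(train_idx), "train /", len(test_idx), "test")

    for fold_no, (tr, te) in enumerate(k_fold_splits(len(y_true), k=4)):
        print("fold", fold_no, "test indices:", sorted(te))
```

The holdout method evaluates on a single fixed test part, whereas cross-validation averages over several, which is why the two sampling schemes can report different scores for the same algorithm. For the multi-class task of identifying the responsible organization, the per-class metrics would additionally need to be averaged across classes; the sketch covers only the two-class case.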

Pages: 7