Automatic classification of social media reports on violent incidents in South Africa using machine learning

Cited by: 5
Authors
Kotze, Eduan [1]
Senekal, Burgert A. [2]
Daelemans, Walter [3]
Affiliations
[1] Univ Free State, Dept Comp Sci & Informat, Bloemfontein, South Africa
[2] Univ Free State, Dept South African Sign Language & Deaf Studies, Bloemfontein, South Africa
[3] Univ Antwerp, CLiPS Res Ctr, Antwerp, Belgium
Keywords
WhatsApp; text classification; Word2Vec; protests; open-source intelligence; OSINT; EVENT; CONFLICT
DOI
10.17159/sajs.2020/6557
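Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract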
With the growing amount of data available in the digital age, it has become increasingly important to use automated methods to extract useful information from data. One such application is the extraction of events from news sources for quantitative analysis, which removes the need for someone to read through thousands of news articles. Overseas, projects such as the Integrated Crisis Early Warning System (ICEWS) monitor news stories and extract events using automated coding. However, not all violent events are reported in the news, and while monitoring only news agencies is sufficient for projects such as ICEWS, which have a global focus, more news sources are required when assessing a local situation. We used WhatsApp as a news source to identify the occurrence of violent incidents in South Africa. Using machine learning, we have shown how violent incidents can be coded and recorded, allowing for local-level recording of these events over time. Our experimental results show good performance on both training and testing data sets using a logistic regression classifier with unigram and Word2Vec feature models. Future work will evaluate the inclusion of pre-trained word embeddings for both Afrikaans and English words to improve the performance of the machine learning classifier.
Significance: A logistic regression classifier using TF-IDF unigram, CBOW and skip-gram Word2Vec feature models was successfully implemented to automatically analyse and classify WhatsApp messages from groups that share information on protests and mass violence in South Africa. At the time of publishing, messages were collected from 26 WhatsApp groups across South Africa and automatically classified on an hourly basis.
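To make the approach in the abstract concrete, the following is a minimal sketch of a logistic regression classifier over TF-IDF unigram features, written in plain Python. It is not the authors' implementation: the toy messages and labels are invented for illustration, the function names (`fit_tfidf`, `transform`, `train_logreg`, `predict_proba`) and the hyperparameters are hypothetical, and the Word2Vec (CBOW and skip-gram) feature models are omitted.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercased whitespace unigrams; real messages would need more cleaning.
    return text.lower().split()

def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized documents."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vocab = {tok: i for i, tok in enumerate(sorted(df))}
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1.0 for tok in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    """Map one tokenized document to an L2-normalised TF-IDF vector."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(doc).items():
        if tok in vocab:  # out-of-vocabulary tokens are ignored
            vec[vocab[tok]] = count * idf[tok]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(text, vocab, idf, w, b):
    """Estimated probability that a message reports a violent incident."""
    x = transform(tokenize(text), vocab, idf)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Invented toy data (assumption): 1 = violent incident, 0 = other.
messages = [
    ("protesters burning tyres on the main road", 1),
    ("crowd throwing stones at passing cars", 1),
    ("road blocked by protesters burning rubbish", 1),
    ("shots fired near the taxi rank", 1),
    ("community meeting at the hall tonight", 0),
    ("water supply restored in the area", 0),
    ("lost dog found near the school", 0),
    ("traffic flowing normally on the main road", 0),
]
docs = [tokenize(text) for text, _ in messages]
labels = [label for _, label in messages]

vocab, idf = fit_tfidf(docs)
X = [transform(doc, vocab, idf) for doc in docs]
w, b = train_logreg(X, labels)

p_violent = predict_proba("protesters burning tyres and throwing stones", vocab, idf, w, b)
p_other = predict_proba("meeting at the school hall tonight", vocab, idf, w, b)
print(f"violent-like message: {p_violent:.2f}, other message: {p_other:.2f}")
```

In practice a library implementation would likely replace the hand-written vectoriser and optimiser, and mixed Afrikaans and English messages would need more careful tokenisation than a whitespace split.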
Pages: 43-50
Page count: 8