Resources and benchmark corpora for hate speech detection: a systematic review

Cited by: 153

Authors:

Poletto, Fabio [1]

Basile, Valerio [1]

Sanguinetti, Manuela [1]

Bosco, Cristina [1]

Patti, Viviana [1]

Affiliations:

[1] Univ Turin, Turin, Italy

Source:

LANGUAGE RESOURCES AND EVALUATION | 2021 / Vol. 55 / Issue 02

Keywords:

Hate speech detection; Benchmark corpora; Natural Language Processing shared tasks; Systematic review;

DOI:

10.1007/s10579-020-09502-8

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Hate Speech in social media is a complex phenomenon, whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica play an important role as well for the development of hate speech detection systems. In this review, we systematically analyze the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors. The results of our analysis highlight a heterogeneous, growing landscape, marked by several issues and venues for improvement.


Pages: 477-523

Number of pages: 47

50 items in total

[1] Resources and benchmark corpora for hate speech detection: a systematic review
Fabio Poletto
Valerio Basile
Manuela Sanguinetti
Cristina Bosco
Viviana Patti
[J]. Language Resources and Evaluation, 2021, 55 : 477 - 523
[2] Systematic Literature Review Of Hate Speech Detection With Text Mining
Rini
Utami, Ema
Hartanto, Anggit Dwi
[J]. PROCEEDINGS OF ICORIS 2020: 2020 THE 2ND INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEM (ICORIS), 2020, : 228 - 233
[3] HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
Mathew, Binny
Saha, Punyajoy
Yimam, Seid Muhie
Biemann, Chris
Goyal, Pawan
Mukherjee, Animesh
[J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14867 - 14875
[4] Detection of fake news and hate speech for Ethiopian languages: a systematic review of the approaches
Wubetu Barud Demilie
Ayodeji Olalekan Salau
[J]. Journal of Big Data, 9
[5] A systematic review of hate speech automatic detection using natural language processing
Jahan, Md Saroar
Oussalah, Mourad
[J]. NEUROCOMPUTING, 2023, 546
[6] Detection of fake news and hate speech for Ethiopian languages: a systematic review of the approaches
Demilie, Wubetu Barud
Salau, Ayodeji Olalekan
[J]. JOURNAL OF BIG DATA, 2022, 9 (01)
[7] Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection
Bose, Tulika
Aletras, Nikolaos
Illina, Irina
Fohr, Dominique
[J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 372 - 382
[8] Systematic keyword and bias analyses in hate speech detection
Sarracén, Gretel Liz De la Peña
Rosso, Paolo
[J]. INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (05)
[9] Twitter Hate Speech Detection: A Systematic Review of Methods, Taxonomy Analysis, Challenges, and Opportunities
Mansur, Zainab
Omar, Nazlia
Tiun, Sabrina
[J]. IEEE ACCESS, 2023, 11 : 16226 - 16249
[10] Modern Standard Arabic Speech Corpora: A Systematic Review
Alqadasi, Ammar Mohammed Ali
Abdulghafor, Rawad
Sunar, Mohd Shahrizal
Salam, Md. Sah Bin H. J.
[J]. IEEE ACCESS, 2023, 11 : 55771 - 55796
