Towards Generalized Offensive Language Identification

Cited by: 0

Authors:

Dmonte, Alphaeus [1]

Arya, Tejas [2]

Ranasinghe, Tharindu [3]

Zampieri, Marcos [1]

Affiliations:

[1] George Mason Univ, Fairfax, VA 22030 USA

[2] Rochester Inst Technol, Rochester, NY USA

[3] Univ Lancaster, Lancaster, England

Source:

SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2024, PT I | 2025 / Vol. 15211

Keywords:

Offensive Language; Large Language Models; Generalizability;

DOI:

10.1007/978-3-031-78541-2_17

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The prevalence of offensive content on the internet, encompassing hate speech and cyberbullying, is a pervasive issue worldwide. Consequently, it has garnered significant attention from the machine learning (ML) and natural language processing (NLP) communities. As a result, numerous systems have been developed to automatically identify potentially harmful content and to mitigate its impact. These systems can follow two approaches: (i) use publicly available models and application endpoints, including prompting large language models (LLMs), or (ii) annotate datasets and train ML models on them. However, it remains unclear how well either approach generalizes. Furthermore, the applicability of these systems is often questioned in off-domain and practical environments. This paper empirically evaluates the generalizability of offensive language detection models and datasets across a novel generalized benchmark: GenOffense. We answer three research questions on generalizability. Our findings will be useful in creating robust real-world offensive language detection systems.


Pages: 271-286

Number of pages: 16

