Artificial Intelligence inspired method for cross-lingual cyberhate detection from low resource languages

Cited by: 0
Authors
Kaur, Manpreet [1]
Saini, Munish [1]
Affiliations
[1] Guru Nanak Dev Univ, Dept Comp Engn & Technol, Amritsar, Punjab, India
Keywords
Artificial Intelligence; cross-lingual; cyberhate; low resource languages; social media; HIGHER-EDUCATION; HATE SPEECH; COMMUNITY
DOI
10.1145/3677176
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Inflammatory language posted on social media by college and university students is prevalent, prompting platforms to adopt community safety mechanisms. The escalation of hate speech calls for sophisticated artificial intelligence-based machine learning and deep learning algorithms to detect offensive internet content. With a few noteworthy exceptions, most studies on automatic hate speech recognition have focused on high-resource languages, mainly English. We bridge this gap by addressing hate speech detection in Punjabi (Gurmukhi), a low-resource Indo-Aryan language used in Indian educational institutions. This research identifies cross-lingual hate speech in the code-switched English-Punjabi language used on social media. It proposes an approach that combines the best hate speech detection techniques to address the gaps and limitations of existing state-of-the-art systems. In this method, Roman Punjabi text is transliterated, and Bidirectional Encoder Representations from Transformers (BERT) based models are then employed for hate detection. The proposed model achieved 0.86 precision and 0.83 recall, and higher educational institutions could employ it to discover the issues and domains where hate is most prevalent.
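The abstract reports precision and recall but no combined score. As a rough illustration only, the sketch below computes the F1 score (the harmonic mean of precision and recall) that those two figures would imply. It assumes both numbers describe the same class and the same evaluation set, which the abstract does not state; the function name `f1_score` and the constant names are hypothetical and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Figures quoted in the abstract. Treating them as belonging to the same
# class and test set is an assumption made for this illustration.
REPORTED_PRECISION = 0.86
REPORTED_RECALL = 0.83

implied_f1 = f1_score(REPORTED_PRECISION, REPORTED_RECALL)
print(f"Implied F1: {implied_f1:.3f}")  # prints "Implied F1: 0.845"
```

The implied value is not a result reported by the authors; it only shows how the two published figures relate under the stated assumption.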
Pages: 23