DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances

Cited by: 1
Authors
Ghosh, Sreyan [1]
Lepcha, Samden [3]
Sakshi, S. [4]
Shah, Rajiv Ratn [2]
Umesh, S. [1]
Affiliations
[1] IIT Madras, Speech Lab, Department of Electrical Engineering, Chennai, Tamil Nadu, India
[2] IIIT Delhi, MIDAS Labs, Delhi, India
[3] TEG Analytics, Bangalore, Karnataka, India
[4] Cisco Systems, Bangalore, Karnataka, India
Source
Interspeech 2022
Keywords
Speech Toxicity Analysis; End-to-End; 2-step; Multimodal
DOI
10.21437/Interspeech.2022-10752
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Toxic speech is regarded as one of the crucial issues plaguing online social media today. Most recent work on toxic speech detection is constrained to the modality of text and written conversations, with very limited work on toxicity detection from spoken utterances or using the modality of speech. In this paper, we introduce a new dataset, DeToxy, the first publicly available toxicity-annotated dataset for the English language. DeToxy is sourced from various openly available speech databases and consists of over 2 million utterances. We believe that our dataset will act as a benchmark for the relatively new and unexplored Spoken Language Processing (SLP) task of detecting toxicity from spoken utterances and will boost further research in this space. Finally, we also provide strong unimodal baselines for our dataset and compare traditional two-step cascade and End-to-End (E2E) approaches. Our experiments show that, in the case of spoken utterances, text-based approaches depend largely on gold human-annotated transcripts for their performance and also suffer from the problem of keyword bias. However, the presence of speech files in DeToxy facilitates the development of E2E speech models, which alleviate both of the above-stated problems by better capturing speech clues.
Pages: 5185-5189
Page count: 5