A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research

Cited by: 28
Authors
Rezvan, Mohammadreza [1]
Shekarpour, Saeedeh [2]
Balasuriya, Lakshika [1]
Thirunarayan, Krishnaprasad [1]
Shalin, Valerie L. [1]
Sheth, Amit [1]
Affiliations
[1] Kno.e.sis Ctr, Dayton, OH 43017 USA
[2] Univ Dayton, Dayton, OH 45469 USA
Funding
National Science Foundation (USA)
Keywords
Annotated corpus; context; sexual; racial; political; appearance-related; intellectual; cyberbullying; harassment; offensive lexicon; profane word
DOI
10.1145/3201064.3201103
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A quality annotated corpus is essential to research. Despite the recent focus of the Web science community on cyberbullying research, the community lacks standard benchmarks. This paper provides both a quality annotated corpus and an offensive-word lexicon capturing five types of harassment content: (i) sexual, (ii) racial, (iii) appearance-related, (iv) intellectual, and (v) political. We first crawled data from Twitter using this content-tailored offensive lexicon. Because the mere presence of an offensive word is not a reliable indicator of harassment, human judges then annotated the tweets for the presence of harassment. Our corpus consists of 25,000 annotated tweets covering the five types of harassment content and is available in a Git repository.
Pages: 33-36
Page count: 4
    Zhao, Peixiang
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 15 (02): : 335 - 347