A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research

Cited by: 28
Authors
Rezvan, Mohammadreza [1]
Shekarpour, Saeedeh [2]
Balasuriya, Lakshika [1]
Thirunarayan, Krishnaprasad [1]
Shalin, Valerie L. [1]
Sheth, Amit [1]
Affiliations
[1] Kno.e.sis Ctr, Dayton, OH 43017 USA
[2] Univ Dayton, Dayton, OH 45469 USA
Funding
National Science Foundation (USA)
Keywords
Annotated corpus; context; sexual; racial; political; appearance-related; intellectual; cyberbullying; harassment; offensive lexicon; profane word
DOI
10.1145/3201064.3201103
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
A quality annotated corpus is essential to research. Despite the recent focus of the Web science community on cyberbullying research, the community lacks standard benchmarks. This paper provides both a quality annotated corpus and an offensive-word lexicon capturing five types of harassment content: (i) sexual, (ii) racial, (iii) appearance-related, (iv) intellectual, and (v) political. We first crawled data from Twitter using this content-tailored offensive lexicon. Because the mere presence of an offensive word is not a reliable indicator of harassment, human judges then annotated the tweets for the presence of harassment. Our corpus consists of 25,000 annotated tweets covering the five types of harassment content and is available in a Git repository.
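The two-stage process the abstract describes (lexicon-based selection followed by human annotation) can be illustrated with a minimal sketch. This is not the authors' code: the lexicon entries (`term_a` and so on) are neutral placeholders, and the names `LEXICON`, `candidate_types`, and `select_candidates` are hypothetical. It only shows why a lexicon match yields a candidate rather than a label.

```python
import re

# Hypothetical placeholder lexicon: harassment type -> set of lowercase terms.
# The real lexicon is distributed with the corpus; these entries are stand-ins.
LEXICON = {
    "sexual": {"term_a"},
    "racial": {"term_b"},
    "appearance-related": {"term_c"},
    "intellectual": {"term_d", "term_e"},
    "political": {"term_f"},
}

# Simple tokenizer (an assumption): runs of lowercase letters, digits, _ and '.
TOKEN_RE = re.compile(r"[a-z0-9_']+")


def candidate_types(text, lexicon=LEXICON):
    """Return the sorted harassment types whose lexicon terms occur in the text."""
    tokens = set(TOKEN_RE.findall(text.lower()))
    return sorted(t for t, terms in lexicon.items() if tokens & terms)


def select_candidates(tweets, lexicon=LEXICON):
    """Keep tweets that match the lexicon; the label stays empty for human judges."""
    selected = []
    for tweet in tweets:
        types = candidate_types(tweet, lexicon)
        if types:
            # A lexicon hit is only a candidate: 'label' is left as None
            # because harassment must be judged by annotators in context.
            selected.append({"text": tweet, "candidate_types": types, "label": None})
    return selected
```

Calling `select_candidates` on a list of tweet texts returns only the tweets containing at least one lexicon term, each tagged with its candidate types and an unset `label` field for the annotation step.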
Pages: 33-36
Page count: 4