Towards Better Detection of Biased Language with Scarce, Noisy, and Biased Annotations

Cited by: 2
Authors
Li, Zhuoyan [1 ]
Lu, Zhuoran [1 ]
Yin, Ming [1 ]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
Keywords
Biased Language; Bias Detection; Contrastive Learning; Fairness;
DOI
10.1145/3514094.3534142
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Biased language is prevalent in today's online social media. To reduce the amount of biased language online, a critical first step is to detect it accurately, ideally automatically. This is a challenging problem, however, because the annotated data needed to train a biased language classifier are either scarce and costly (e.g., when collected from experts) or noisy and potentially biased themselves (e.g., when collected from crowd workers). A biased language classifier built on these annotations may thus be inaccurate and sometimes unfair (e.g., exhibit systematic accuracy disparities across texts with different political leanings). In this paper, we propose a novel method, CLEARE, for biased language detection, in which we use self-supervised contrastive learning to enhance the biased language classifier: we learn a robust encoder of the textual data by solving a min-max optimization problem, so that the encoder helps achieve the best classification performance even if the worst data augmentation strategy is selected. Extensive evaluations suggest that CLEARE substantially improves on state-of-the-art biased language detection methods on several benchmark datasets, in both the accuracy and the fairness of detection.
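The min-max problem mentioned in the abstract can be written schematically as follows. This is only a sketch of the general form the abstract describes, not the paper's exact objective: the symbols f_theta (the text encoder with parameters theta), A (the set of candidate data augmentation strategies), and L_con (a contrastive loss) are assumed notation that does not appear in this record.

```latex
% Schematic only; notation is assumed, not taken from the paper.
% Outer min: choose encoder parameters \theta.
% Inner max: the worst-case augmentation strategy a from the candidate set \mathcal{A}.
\theta^{*} \;=\; \arg\min_{\theta} \; \max_{a \in \mathcal{A}} \;
  \mathcal{L}_{\mathrm{con}}\bigl(f_{\theta};\, a\bigr)
```

Read this way, the encoder is trained against whichever augmentation strategy makes the contrastive loss largest, which matches the abstract's description of an encoder that performs well even if the worst augmentation strategy is selected.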
Pages: 411-423
Number of pages: 13