Five sources of bias in natural language processing

Cited by: 74
Authors:
Hovy, Dirk [1 ]
Prabhumoye, Shrimai [2 ]
Affiliations:
[1] Bocconi Univ, Mkt Dept, Via Roentgen 1-2, I-20136 Milan, Italy
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Source:
LANGUAGE AND LINGUISTICS COMPASS, 2021, Vol. 15, No. 08
Funding:
EU Horizon 2020; European Research Council
DOI: 10.1111/lnc3.12432
CLC number: H [Language and Writing]
Discipline code: 05
Abstract:
Recently, there has been an increased interest in demographically grounded bias in natural language processing (NLP) applications. Much of the recent work has focused on describing bias and providing an overview of bias in a larger context. Here, we provide a simple, actionable summary of this recent work. We outline five sources where bias can occur in NLP systems: (1) the data, (2) the annotation process, (3) the input representations, (4) the models, and finally (5) the research design (or how we conceptualize our research). We explore each of the bias sources in detail in this article, including examples and links to related work, as well as potential counter-measures.
Pages: 19