Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

Cited by: 4
Authors
Fort, Karen [1]
Adda, Gilles [2]
Sagot, Benoit [3]
Mariani, Joseph [2, 4]
Couillault, Alain [5]
Affiliations
[1] LORIA, Vandoeuvre Les Nancy, France
[2] LIMSI CNRS, Spoken Language Proc Grp, Orsay, France
[3] Univ Paris 07, INRIA Paris Rocquencourt, Alpage, Rocquencourt, France
[4] CNRS, IMMI, Orsay, France
[5] Univ Rochelle, L3i Lab, Rochelle, France
Funding
European Union Seventh Framework Programme (FP7)
Keywords
Amazon Mechanical Turk; Language resources; Ethics
DOI
10.1007/978-3-319-08958-4_25
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article is a position paper about Amazon Mechanical Turk, the use of which has been growing steadily in language processing over the past few years. According to the mainstream opinion expressed in articles of the domain, this type of online working platform makes it possible to develop all sorts of quality language resources quickly and at a very low price, with the work done by people who treat it as a hobby. We shall demonstrate here that the situation is far from that ideal. Our goal here is manifold: (1) to inform researchers, so that they can make their own choices; (2) to develop alternatives with the help of funding agencies and scientific associations; (3) to propose practical and organizational solutions in order to improve language resource development, while limiting the risks of ethical and legal issues without sacrificing price or quality; (4) to introduce an Ethics and Big Data Charter for the documentation of language resources.
Pages: 303-314
Page count: 12