Gamified crowdsourcing for idiom corpora construction

Cited by: 2
Authors
Eryigit, Gulsen [1, 2]
Sentas, Ali [1]
Monti, Johanna [3]
Affiliations
[1] Istanbul Tech Univ, Fac Comp & Informat, Istanbul, Turkey
[2] Istanbul Tech Univ, Dept Artificial Intelligence & Data Engn, Istanbul, Turkey
[3] Univ Naples Orientale, Dept Literary Linguist & Comparat Studies, Naples, Italy
Keywords
Crowdsourcing; Gamification; Game with a purpose (GWAP); Idiomatic expressions; Language resources; GAMIFICATION; WORK;
DOI
10.1017/S1351324921000401
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning idiomatic expressions is seen as one of the most challenging stages in second-language learning because of their unpredictable meaning. A similar situation holds for their identification within natural language processing applications such as machine translation and parsing. The lack of high-quality usage samples exacerbates this challenge not only for humans but also for artificial intelligence systems. This article introduces a gamified crowdsourcing approach for collecting language learning materials for idiomatic expressions; a messaging bot is designed as an asynchronous multiplayer game for native speakers who compete with each other while providing idiomatic and nonidiomatic usage examples and rating other players' entries. As opposed to classical crowd-processing annotation efforts in the field, for the first time in the literature, a crowd-creating & crowd-rating approach is implemented and tested for idiom corpora construction. The approach is language-independent and evaluated on two languages in comparison to traditional data preparation techniques in the field. The reaction of the crowd is monitored under different motivational means (namely, gamification affordances and monetary rewards). The results reveal that the proposed approach is powerful in collecting the targeted materials, and although being an explicit crowdsourcing approach, it is found entertaining and useful by the crowd. The approach has been shown to have the potential to speed up the construction of idiom corpora for different natural languages to be used as second-language learning material, training data for supervised idiom identification systems, or samples for lexicographic studies.
Pages: 909-941
Page count: 33
Related papers
50 in total
  • [31] Pastoral power in leadership work: the relational leadership idiom in the construction industry
    Styhre, Alexander
    Fasth, Jonas
    Lowstedt, Martin
    QUALITATIVE RESEARCH IN ORGANIZATIONS AND MANAGEMENT, 2023, 18 (01): : 84 - 101
  • [33] Construction of Chinese Conversational Corpora for Spontaneous Speech Recognition and Comparative Study on the Trilingual Parallel Corpora
    Hu, Xinhui
    Isotani, Ryosuke
    Nakamura, Satoshi
    ORIENTAL COCOSDA 2009 - INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2009, : 56 - 59
  • [34] Construction Grammar and Phraseology: From Corpora to Theoretical Insights
    Colson, Jean-Pierre
    LANGAGES, 2022, (225) : 19 - +
  • [35] A hybrid approach toward biomedical relation extraction training corpora: combining distant supervision with crowdsourcing
    Sousa, Diana
    Lamurias, Andre
    Couto, Francisco M.
    DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2020,
  • [36] Cotranslate: A Web-Based Tool for Crowdsourcing High-Quality Sentence Pair Corpora
    National Center for Artificial Intelligence, Chile
    Unknown
    2023,
  • [37] JustWalk: A Crowdsourcing Approach for the Automatic Construction of Indoor Floorplans
    Elhamshary, Moustafa
    Alzantot, Moustafa
    Youssef, Moustafa
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2019, 18 (10) : 2358 - 2371
  • [38] Termonet: Terminology construction from WordNet and technical corpora
    Solla Portela, Miguel Anxo
    Guinovart, Xavier Gomez
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2015, (55): : 165 - 168
  • [39] CoTranslate: A web-based tool for crowdsourcing high-quality sentence pair corpora
    Carvallo, Andres
    Jorquera, Ignacio
    Aspillaga, Carlos
    SOFTWAREX, 2023, 23
  • [40] Construction of Crowdsourcing Environment for Creation of Voice Interaction Scenario
    Matsushita, Yuichi
    Uchiya, Takahiro
    Nishimura, Ryota
    Yamamoto, Daisuke
    Takumi, Ichi
    2014 IEEE 3RD GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE), 2014, : 689 - 690