Creating a system for lexical substitutions from scratch using crowdsourcing

Cited by: 22

Authors:

Biemann, Chris [1]

Affiliation:

[1] Tech Univ Darmstadt, D-64289 Darmstadt, Germany

Source:

LANGUAGE RESOURCES AND EVALUATION | 2013 / Vol. 47 / Issue 01

Keywords:

Amazon Turk; Lexical substitution; Word sense disambiguation; Language resource creation; Crowdsourcing;

DOI:

10.1007/s10579-012-9180-5

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203; 0835;

Abstract:

This article describes the creation and application of the Turk Bootstrap Word Sense Inventory for 397 frequent nouns, which is a publicly available resource for lexical substitution. This resource was acquired using Amazon Mechanical Turk. In a bootstrapping process with massive collaborative input, substitutions for target words in context are elicited and clustered by sense; then, more contexts are collected. Contexts that cannot be assigned to a current target word's sense inventory re-enter the bootstrapping loop and get a supply of substitutions. This process yields a sense inventory with its granularity determined by substitutions as opposed to psychologically motivated concepts. It comes with a large number of sense-annotated target word contexts. Evaluation on data quality shows that the process is robust against noise from the crowd, produces a less fine-grained inventory than WordNet and provides a rich body of high-precision substitution data at low cost. Using the data to train a system for lexical substitutions, we show that the amount and quality of the data are sufficient for producing high-quality substitutions automatically. In this system, co-occurrence cluster features are employed as a means to cheaply model topicality.
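
The bootstrapping loop in the abstract can be illustrated with a small toy sketch. This is not the article's implementation: the names `Sense`, `overlap`, `cluster_into`, `bootstrap`, `elicit` and `match` are hypothetical, the Jaccard overlap and the `threshold` value are assumed stand-ins for whatever clustering the article uses, and the two callbacks simulate crowd workers with hard-coded toy data.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Sense:
    """One sense of a target word, represented by its pooled substitutions."""
    substitutions: Set[str] = field(default_factory=set)
    contexts: List[str] = field(default_factory=list)


def overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two substitution sets (an assumed similarity measure)."""
    return len(a & b) / len(a | b) if a and b else 0.0


def cluster_into(inventory: List[Sense], context: str, subs: Set[str],
                 threshold: float) -> None:
    """Merge a context into the most similar sense, or open a new sense."""
    best = max(inventory, key=lambda s: overlap(s.substitutions, subs), default=None)
    if best is not None and overlap(best.substitutions, subs) >= threshold:
        best.substitutions |= subs
        best.contexts.append(context)
    else:
        inventory.append(Sense(set(subs), [context]))


def bootstrap(seed: List[str], more: List[str],
              elicit: Callable[[str], Set[str]],
              match: Callable[[str, List[Sense]], Optional[int]],
              threshold: float = 0.2) -> List[Sense]:
    """Elicit and cluster substitutions, then route further contexts."""
    inventory: List[Sense] = []
    for context in seed:
        cluster_into(inventory, context, elicit(context), threshold)
    for context in more:
        index = match(context, inventory)
        if index is None:
            # No sense fits: the context re-enters the loop and gets substitutions.
            cluster_into(inventory, context, elicit(context), threshold)
        else:
            inventory[index].contexts.append(context)
    return inventory


# Toy stand-ins for crowd answers (invented data, for illustration only).
toy_subs = {
    "deposit money at the bank": {"financial institution", "lender"},
    "the bank raised interest rates": {"lender", "credit union"},
    "fishing from the river bank": {"shore", "riverside"},
    "a bank of fog rolled in": {"mass", "wall"},
}


def toy_match(context: str, inventory: List[Sense]) -> Optional[int]:
    """Pretend a worker assigns loan-related contexts to the first sense."""
    return 0 if "loan" in context else None


inventory = bootstrap(
    seed=list(toy_subs)[:3],
    more=["the bank approved my loan", "a bank of fog rolled in"],
    elicit=lambda c: toy_subs[c],
    match=toy_match,
)
```

In this toy run the loan context is attached to an existing sense, while the fog context matches nothing, receives its own substitutions and opens a new sense, so sense granularity follows from the substitutions alone.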


Pages: 97-122

Number of pages: 26
