Technical Infrastructure at Linguistic Data Consortium: Software and Hardware Resources for Linguistic Data Creation

Authors
Maeda, Kazuaki [1]
Lee, Haejoong [1]
Grimes, Stephen [1]
Wright, Jonathan [1]
Parker, Robert [1]
Lee, David [1]
Mazzucchi, Andrea [1]
Affiliation
[1] Univ Penn, Linguist Data Consortium, Philadelphia, PA 19104 USA
DOI
Not available
Chinese Library Classification (CLC) number
H [Language, Writing];
Discipline classification code
05;
Abstract
The Linguistic Data Consortium (LDC) at the University of Pennsylvania has participated as a data provider in a variety of government-sponsored programs that support the development of Human Language Technologies. As the number of projects has grown, the quantity and variety of the data LDC produces have increased dramatically in recent years. In this paper, we describe the technical infrastructure, both hardware and software, that LDC has built to support these complex, large-scale linguistic data creation efforts. Because it is not possible to cover all aspects of LDC's technical infrastructure in one paper, we focus on recent developments. We also report on our plans for making our custom-built software resources available to the community as open source software, and introduce an initiative to collaborate with software developers outside LDC. We hope that our approaches and software resources will be useful to community members who take on similar challenges.
Pages: 6