MultiWiBi: The multilingual Wikipedia bitaxonomy project

Cited by: 15
Authors
Flati, Tiziano [1]
Vannella, Daniele [1]
Pasini, Tommaso [1]
Navigli, Roberto [1]
Affiliations
[1] Sapienza Univ Roma, Dipartimento Informat, Rome, Italy
Funding
European Research Council
Keywords
Taxonomy extraction; Taxonomy induction; Machine learning; Natural language processing; Collaborative resources; Wikipedia; KNOWLEDGE; WEB
DOI
10.1016/j.artint.2016.08.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present MultiWiBi, an approach to the automatic creation of two integrated taxonomies for Wikipedia pages and categories written in different languages. In order to create both taxonomies in an arbitrary language, we first build them in English and then project the two taxonomies to other languages automatically, without the help of language-specific resources or tools. The process crucially leverages a novel algorithm which exploits the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments show that the taxonomical information in MultiWiBi is characterized by a higher quality and coverage than state-of-the-art resources like DBpedia, YAGO, MENTA, WikiNet, LHD and WikiTaxonomy, also across languages. MultiWiBi is available online at http://wibitaxonomy.org/multiwibi. (C) 2016 Elsevier B.V. All rights reserved.
Pages: 66-102
Page count: 37