WebChild: Harvesting and Organizing Commonsense Knowledge from the Web

Cited by: 56
Authors
Tandon, Niket [1]
de Melo, Gerard [2]
Suchanek, Fabian [3]
Weikum, Gerhard [1]
Affiliations
[1] Max Planck Inst Informat, Saarbrucken, Germany
[2] Tsinghua Univ, IIIS, Beijing, Peoples R China
[3] Telecom ParisTech, Paris, France
Keywords
Knowledge Bases; Commonsense Knowledge; Web Mining; Label Propagation; Word Sense Disambiguation;
DOI
10.1145/2556195.2556245
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web contents. WebChild contains triples that connect nouns with adjectives via fine-grained relations like hasShape, hasTaste, evokesEmotion, etc. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses. Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained disambiguated assertions) of WebChild.
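The Label Propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is a generic clamped label-propagation scheme, not necessarily the variant used in the paper; the function name `propagate_labels`, the sense-style node names, the edge weights, and the seed scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=50):
    """Generic clamped label propagation (illustrative, not the paper's exact algorithm).

    edges: list of (node_a, node_b, weight) for an undirected weighted graph.
    seeds: dict mapping seed nodes to fixed scores in [0, 1].
    Returns a dict mapping every node to a score in [0, 1].
    """
    neighbors = defaultdict(list)
    for a, b, weight in edges:
        neighbors[a].append((b, weight))
        neighbors[b].append((a, weight))

    # Unlabeled nodes start at 0.0; seed nodes start at their seed score.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for node, score in seeds.items():
        scores.setdefault(node, score)

    for _ in range(iterations):
        updated = {}
        for node in scores:
            if node in seeds:
                # Seeds are clamped: their score never changes.
                updated[node] = seeds[node]
                continue
            total = sum(w for _, w in neighbors[node])
            # Each unlabeled node takes the weighted average of its neighbors.
            updated[node] = (
                sum(scores[n] * w for n, w in neighbors[node]) / total
                if total
                else 0.0
            )
        scores = updated
    return scores


# Hypothetical candidate adjectives for a relation such as hasTaste.
edges = [
    ("sweet#a#1", "sour#a#1", 3.0),
    ("sour#a#1", "bitter#a#1", 2.0),
    ("bitter#a#1", "fast#a#1", 1.0),
]
seeds = {"sweet#a#1": 1.0, "fast#a#1": 0.0}  # positive and negative seed

scores = propagate_labels(edges, seeds)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for node, score in ranked:
    print(f"{node}\t{score:.3f}")
```

Sorting nodes by their final score gives a confidence ranking in the spirit of the confidence-ranked assertions described in the abstract; a threshold on the score would then decide which candidates enter a relation's range set.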
Pages: 523-532
Page count: 10
Related papers
50 in total
  • [1] Harvesting and organizing knowledge from the web
    Weikum, Gerhard
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS, 2007, 4690 : 12 - 13
  • [2] WebChild 2.0: Fine-Grained Commonsense Knowledge Distillation
    Tandon, Niket
    de Melo, Gerard
    Weikum, Gerhard
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017): SYSTEM DEMONSTRATIONS, 2017, : 115 - 120
  • [3] Commonsense Knowledge Mining from the Web
    Yu, Chi-Hsin
    Chen, Hsin-Hsi
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1480 - 1485
  • [4] Acquiring Comparative Commonsense Knowledge from the Web
    Tandon, Niket
    de Melo, Gerard
    Weikum, Gerhard
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 166 - 172
  • [5] A manual experiment on commonsense knowledge acquisition from web corpora
    Zhu, Yao
    Zang, Liang-Jun
    Cao, Ya-Nan
    Wang, Dong-Sheng
    Cao, Cun-Gen
    [J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 1564 - 1569
  • [6] Refined Commonsense Knowledge From Large-Scale Web Contents
    Nguyen, Tuan-Phong
    Razniewski, Simon
    Romero, Julien
    Weikum, Gerhard
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8431 - 8447
  • [7] Knowledge Harvesting from Text and Web Sources
    Suchanek, Fabian
    Weikum, Gerhard
    [J]. 2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 1250 - 1253
  • [8] Organizing knowledge in a Semantic Web for pathology
    Tolksdorf, R
    Bontas, EP
    [J]. OBJECT-ORIENTED AND INTERNET-BASED TECHNOLOGIES, PROCEEDINGS, 2004, 3263 : 39 - 54
  • [9] Extracting Comparative Commonsense from the Web
    Cao, Yanan
    Cao, Cungen
    Zang, Liangjun
    Wang, Shi
    Wang, Dongsheng
    [J]. INTELLIGENT INFORMATION PROCESSING V, 2010, 340 : 154 - 162
  • [10] From Information to Knowledge: Harvesting Entities and Relationships from Web Sources
    Weikum, Gerhard
    Theobald, Martin
    [J]. PODS 2010: PROCEEDINGS OF THE TWENTY-NINTH ACM SIGMOD-SIGACT-SIGART SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2010, : 65 - 76