Building a large-scale commonsense knowledge base by converting an existing one in a different language

Cited by: 0
Authors
Jung, Yuchul [1]
Lee, Joo-Young [2]
Kim, Youngho [1]
Park, Jaehyun [2]
Myaeng, Sung-Hyon [1]
Rim, Hae-Chang [2]
Affiliations
[1] Informat & Commun Univ, Sch Engn, Taejon 305732, South Korea
[2] Korea Univ, Dept Comp Sci & Engn, Seoul 136701, South Korea
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes our effort to build a large-scale commonsense knowledge base in Korean by converting a pre-existing English one, ConceptNet. The English commonsense knowledge base is essentially a huge net consisting of concepts and relations. Its triplets, in the form Concept-Relation-Concept, were extracted from English sentences that volunteers interested in contributing commonsense knowledge entered through a Web site. Our effort is an attempt to obtain a Korean version by utilizing a variety of language resources and tools. We not only employed a morphological analyzer and existing commercial machine translation software but also developed our own special-purpose translation and out-of-vocabulary handling methods. To handle ambiguity, we also devised noisy-concept filtering and concept generalization methods. Out of the 2.4 million assertions (i.e., Concept-Relation-Concept triplets) in the English ConceptNet, we have generated about 200,000 Korean assertions so far. Based on our manual judgment of a 5% sample, the accuracy was 84.4%.
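The conversion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Assertion` class, the `TOY_LEXICON` dictionary, and the `convert` function are hypothetical names introduced here, and the toy lexicon lookup merely stands in for the morphological analyzer, machine translation software, and out-of-vocabulary handling that the abstract mentions. Assertions whose concepts cannot be translated are simply dropped, which is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    """A Concept-Relation-Concept triplet (hypothetical representation)."""
    left: str
    relation: str
    right: str


# Hypothetical toy English-to-Korean lexicon. It stands in for the real
# translation resources; it is NOT the method used in the paper.
TOY_LEXICON = {
    "dog": "개",
    "animal": "동물",
    "bark": "짖다",
}


def translate_concept(concept: str) -> Optional[str]:
    """Return the Korean concept, or None if it is out of vocabulary."""
    return TOY_LEXICON.get(concept.lower())


def convert(assertion: Assertion) -> Optional[Assertion]:
    """Translate both concepts; keep the relation label unchanged.

    Returns None when either concept is out of vocabulary
    (assumption: such assertions are discarded in this sketch).
    """
    left = translate_concept(assertion.left)
    right = translate_concept(assertion.right)
    if left is None or right is None:
        return None
    return Assertion(left, assertion.relation, right)


english = [
    Assertion("dog", "IsA", "animal"),
    Assertion("dog", "CapableOf", "bark"),
    Assertion("dog", "Desires", "frisbee"),  # "frisbee" is not in the toy lexicon
]

# Keep only the assertions that converted successfully.
korean = [c for a in english if (c := convert(a)) is not None]

# Fraction of source assertions that survived conversion.
coverage = len(korean) / len(english)
```

In this toy run two of the three assertions survive. For comparison, the figures in the abstract (about 200,000 Korean assertions from 2.4 million English ones) correspond to a yield of roughly 8%.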
Pages: 23 / +
Page count: 3