Building a large-scale commonsense knowledge base by converting an existing one in a different language

Cited: 0
Authors
Jung, Yuchul [1 ]
Lee, Joo-Young [2 ]
Kim, Youngho [1 ]
Park, Jaehyun [2 ]
Myaeng, Sung-Hyon [1 ]
Rim, Hae-Chang [2 ]
Affiliations
[1] Informat & Commun Univ, Sch Engn, Taejon 305732, South Korea
[2] Korea Univ, Dept Comp Sci & Engn, Seoul 136701, South Korea
Keywords
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
This paper describes our effort to build a large-scale commonsense knowledge base in Korean by converting a pre-existing English one, ConceptNet. The English commonsense knowledge base is essentially a huge net of concepts and relations. Triplets of the form Concept-Relation-Concept were extracted from English sentences collected through a Web site from volunteers interested in entering commonsense knowledge. Our effort is an attempt to obtain a Korean version of this net by utilizing a variety of language resources and tools. We not only employed a morphological analyzer and existing commercial machine translation software but also developed our own special-purpose translation and out-of-vocabulary handling methods. To handle ambiguity, we also devised noisy-concept filtering and concept-generalization methods. Out of the 2.4 million assertions (i.e., concept-relation-concept triplets) in the English ConceptNet, we have generated about 200,000 Korean assertions so far. Based on our manual judgments of a 5% sample, the accuracy was 84.4%.
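The conversion pipeline the abstract outlines can be illustrated with a minimal sketch: translate both concepts of each assertion and drop assertions whose concepts cannot be translated, a crude stand-in for the paper's out-of-vocabulary handling and noisy-concept filtering. All function names and the toy lexicon below are illustrative assumptions, not the authors' actual implementation.

```python
# A ConceptNet assertion is a (concept, relation, concept) triplet.
english_assertions = [
    ("book", "UsedFor", "reading"),
    ("zzxq", "IsA", "thing"),  # a noisy, untranslatable concept
]

# Toy bilingual lexicon standing in for the machine-translation component.
lexicon = {"book": "책", "reading": "읽기", "thing": "것"}

def translate_concept(concept):
    """Return the Korean translation, or None if out of vocabulary."""
    return lexicon.get(concept)

def convert(assertions):
    """Translate both concepts of each triplet, keeping the relation label.

    Assertions with an untranslatable concept are discarded -- a crude
    form of the noisy-concept filtering described in the abstract.
    """
    korean = []
    for c1, rel, c2 in assertions:
        k1, k2 = translate_concept(c1), translate_concept(c2)
        if k1 is not None and k2 is not None:
            korean.append((k1, rel, k2))
    return korean

print(convert(english_assertions))  # [('책', 'UsedFor', '읽기')]
```

In the actual system, the translation step involved a morphological analyzer and commercial MT software, and filtering was considerably more sophisticated; this sketch only shows the triplet-level shape of the conversion.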
Pages: 23 / +
Page count: 3