Taxonomy-based information content and wordnet-wiktionary-wikipedia glosses for semantic relatedness

Authors
Mohamed Ben Aouicha
Mohamed Ali Hadj Taieb
Abdelmajid Ben Hamadou
Affiliation
[1] Sfax University, Multimedia Information System and Advanced Computing Laboratory
Source
Applied Intelligence | 2016 / Vol. 45
Keywords
Information content; Gloss; WordNet; Wikipedia; Wiktionary; MeSH; DAG algorithms; Semantic similarity; Semantic relatedness
DOI
Not available
Abstract
Computing the semantic similarity/relatedness between terms is an important research area for several disciplines, including artificial intelligence, cognitive science, linguistics, psychology, biomedicine and information retrieval. Measures of similarity and relatedness exploit knowledge bases to express the semantics of concepts. Some approaches, such as the information-theoretic ones, rely on knowledge structure, while others, such as the gloss-based ones, use knowledge content. First, working from structure, we propose a new intrinsic Information Content (IC) computing method based on quantifying the subgraph formed by the ancestors of the target concept. Taxonomic measures, including the IC-based ones, depend on topological parameters that must be extracted from taxonomies treated as Directed Acyclic Graphs (DAGs). Accordingly, we propose a set of graph algorithms that provide basic parameters such as depth, ancestors, descendants and the Lowest Common Subsumer (LCS). The IC-computing method is assessed on several knowledge structures: the noun and verb WordNet “is a” taxonomies, the Wikipedia Category Graph (WCG), and the MeSH taxonomy. We also propose an aggregation schema that exploits the WordNet “is a” taxonomy and the WCG in a complementary way through the IC-based measures to improve coverage. Second, working from content, we propose a gloss-based semantic similarity measure that weights nouns using our IC-computing method and draws on the WordNet, Wiktionary and Wikipedia resources. Further evaluation is performed on nouns, verbs, multiword expressions and biomedical datasets, using well-recognized benchmarks. The results indicate an improvement in the accuracy of similarity and relatedness assessment.
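The topological parameters named in the abstract (ancestors, descendants, depth, LCS) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the toy taxonomy `PARENTS`, the function names, and the choice of longest-path depth are all assumptions made for illustration only.

```python
from functools import lru_cache

# Hypothetical toy "is a" taxonomy stored as child -> list of parents.
# It is a DAG with a single root; "dog" and "cat" each have two parents.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "pet": ["entity"],
    "dog": ["animal", "pet"],
    "cat": ["animal", "pet"],
    "puppy": ["dog"],
}


def ancestors(node):
    """Return all strict ancestors of node by walking parent links."""
    seen = set()
    stack = list(PARENTS[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def descendants(node):
    """Return all strict descendants of node (nodes having it as ancestor)."""
    return {n for n in PARENTS if node in ancestors(n)}


@lru_cache(maxsize=None)
def depth(node):
    """Longest path length from the root to node (assumed convention)."""
    parents = PARENTS[node]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def lowest_common_subsumers(a, b):
    """Return the deepest concepts that subsume both a and b.

    A concept is treated as subsuming itself; a DAG may yield several
    equally deep candidates, so a set is returned.
    """
    common = (ancestors(a) | {a}) & (ancestors(b) | {b})
    if not common:
        return set()
    deepest = max(depth(c) for c in common)
    return {c for c in common if depth(c) == deepest}
```

For example, `lowest_common_subsumers("dog", "cat")` returns both `"animal"` and `"pet"` in this toy graph, which shows why multiple inheritance in a DAG needs a tie-breaking rule that a tree-shaped taxonomy would not.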
Pages: 475 - 511
Page count: 36
Related papers
50 in total
  • [1] Taxonomy-based information content and wordnet-wiktionary-wikipedia glosses for semantic relatedness
    Ben Aouicha, Mohamed
    Taieb, Mohamed Ali Hadj
    Ben Hamadou, Abdelmajid
    [J]. APPLIED INTELLIGENCE, 2016, 45 (02) : 475 - 511
  • [2] MeSH taxonomy-based intrinsic information content method
    Gabsi, Imen
    Kammoun, Hager
    Bougares, Hatem
    Ben Aouicha, Mohamed
    Amous, Ikram
    [J]. PROCEEDINGS OF THE 2016 INTERNATIONAL SYMPOSIUM ON INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS (INISTA), 2016,
  • [3] Semantic Relatedness Measurement between Words based on Link Information of Wikipedia
    Wang, Rui-Qin
    [J]. 2011 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION AND INDUSTRIAL APPLICATION (ICIA2011), VOL I, 2011, : 153 - 157
  • [4] Semantic Relatedness Measurement between Words based on Link Information of Wikipedia
    Wang, Rui-Qin
    [J]. 2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL VI, 2010, : 155 - 159
  • [5] Semantic Relatedness Measurement from Wikipedia and WordNet Using Modified Normalized Google Distance
    Karve, Saket
    Shende, Vasisht
    Hople, Swaroop
    [J]. DATA ANALYTICS AND LEARNING, 2019, 43 : 143 - 154
  • [6] EXPANDING APPROACH TO INFORMATION RETRIEVAL USING SEMANTIC SIMILARITY ANALYSIS BASED ON WORDNET AND WIKIPEDIA
    Zhao, Feng
    Fang, Fei
    Yan, Fengwei
    Jin, Hai
    Zhang, Qin
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2012, 22 (02) : 305 - 322
  • [7] Faceted taxonomy-based information management
    Tzitzikas, Yannis
    Analyti, Anastasia
    [J]. DEXA 2007: 18TH INTERNATIONAL CONFERENCE ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2007, : 207 - +
  • [8] Wikipedia-based information content and semantic similarity computation
    Jiang, Yuncheng
    Bai, Wen
    Zhang, Xiaopei
    Hu, Jiaojiao
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2017, 53 (01) : 248 - 265
  • [9] Semantic Relatedness Estimation using the Layout Information of Wikipedia Articles
    Chan, Patrick
    Hijikata, Yoshinori
    Kuramochi, Toshiya
    Nishida, Shogo
    [J]. INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2013, 7 (02) : 30 - 48
  • [10] A personalizable agent for semantic taxonomy-based web search
    Kerschberg, L
    Kim, W
    Scime, A
    [J]. INNOVATIVE CONCEPTS FOR AGENT-BASED SYSTEMS, 2002, 2564 : 3 - 31