An Efficient Approach for Measuring Semantic Similarity Combining WordNet and Wikipedia

Cited by: 14
Authors
Li, Fei [1, 2]
Liao, Lejian [1]
Zhang, Lanfang [3]
Zhu, Xinhua [2]
Zhang, Bo [4]
Wang, Zheng [5]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Guangxi Normal Univ, Fac Educ, Guilin 541004, Peoples R China
[4] Hezhou Univ, Sch Math & Comp Sci, Hezhou 542899, Peoples R China
[5] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China
Keywords
Semantics; Encyclopedias; Electronic publishing; Internet; Weight measurement; Ontologies; Semantic similarity; edge weight model; word disambiguation strategy; WordNet; Wikipedia; INFORMATION-CONTENT; RELATEDNESS; RETRIEVAL; MODEL; EDGE;
DOI
10.1109/ACCESS.2020.3025611
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The measurement of semantic similarity between concepts is an important research topic in natural language processing. In the past, several approaches for measuring the semantic similarity between concepts have been proposed based on WordNet or Wikipedia. However, improvements in the measurement accuracy of most methods have led to a dramatic increase in time complexity, and the existing methods do not effectively integrate WordNet and Wikipedia. In this paper, we focus on designing an efficient semantic similarity method based on WordNet and Wikipedia. First, to improve the accuracy of WordNet edge-based measures, we propose an edge weight model that combines edge and density information and assigns a weight to each edge adaptively based on the number of direct hyponyms of the subsumer. Second, to improve the computational efficiency of the existing Wikipedia link vector-based measures, we propose a new Wikipedia link feature-based semantic similarity method that converts Wikipedia links into semantic knowledge and replaces the TF-IDF statistical weight model in the existing measures. In addition, we propose two new word disambiguation strategies to further improve the accuracy of Wikipedia link-based measures. Finally, to fully exploit the advantages of WordNet and Wikipedia, we propose two new aggregation schemas for combining WordNet "is-a" semantics and Wikipedia link semantics to replace the current aggregation schemas that combine WordNet "is-a" semantics with category semantics in Wikipedia. The experimental results show that our aggregation models are outstanding in terms of accuracy, efficiency and word coverage compared to state-of-the-art similarity measures.
Pages: 184318-184338
Page count: 21
Related papers
50 in total
  • [1] Relational Similarity Measure: An Approach Combining Wikipedia and WordNet
    Cao, Yanjiao
    Lu, Zhao
    Cai, Songmei
    [J]. RECENT TRENDS IN MATERIALS AND MECHANICAL ENGINEERING MATERIALS, MECHATRONICS AND AUTOMATION, PTS 1-3, 2011, 55-57 : 955 - 960
  • [2] Measuring semantic similarity in WordNet
    Liu, Xiao-Ying
    Zhou, Yi-Ming
    Zheng, Ruo-Shi
    [J]. PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3431 - +
  • [3] A novel wordnet-based approach for measuring semantic similarity
    Zhu, Xinhua
    Li, Fei
    Chen, Hongchao
    Mao, Junqing
    [J]. Journal of Information and Computational Science, 2015, 12 (13): : 4919 - 4927
  • [4] A fuzzy approach for measuring the semantic similarity between words in WordNet
    Song, Ling
    Ma, Jun
    Lei, Jingsheng
    Li, Chao
    [J]. Journal of Information and Computational Science, 2009, 6 (03): : 1673 - 1680
  • [5] Measuring Semantic Similarity Based On WordNet
    Zhao, Zhongcheng
    Yan, Jianzhuo
    Fang, Liying
    Wang, Pu
    [J]. 2009 SIXTH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE, PROCEEDINGS, 2009, : 89 - 92
  • [6] EXPANDING APPROACH TO INFORMATION RETRIEVAL USING SEMANTIC SIMILARITY ANALYSIS BASED ON WORDNET AND WIKIPEDIA
    Zhao, Feng
    Fang, Fei
    Yan, Fengwei
    Jin, Hai
    Zhang, Qin
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2012, 22 (02) : 305 - 322
  • [7] Evaluating semantic similarity and relatedness between concepts by combining taxonomic and non-taxonomic semantic features of WordNet and Wikipedia
    Hussain, Muhammad Jawad
    Bai, Heming
    Wasti, Shahbaz Hassan
    Huang, Guangjian
    Jiang, Yuncheng
    [J]. INFORMATION SCIENCES, 2023, 625 : 673 - 699
  • [8] A Hybrid Approach for Measuring Semantic Similarity between Ontologies Based on WordNet
    He, Wei
    Yang, Xiaoping
    Huang, Dupei
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, 2011, 7091 : 68 - +
  • [9] New Model of Semantic Similarity Measuring in WordNet
    Zhou, Zili
    Wang, Yanna
    Gu, Junzhong
    [J]. 2008 3RD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2008, : 256 - +
  • [10] A semantic approach for question classification using WordNet and Wikipedia
    Ray, Santosh Kumar
    Singh, Shailendra
    Joshi, B. P.
    [J]. PATTERN RECOGNITION LETTERS, 2010, 31 (13) : 1935 - 1943