Sentence similarity based on semantic nets and corpus statistics

Cited by: 443
Authors
Li, Yuhua [1]
McLean, David
Bandar, Zuhair A.
O'Shea, James D.
Crockett, Keeley
Affiliations
[1] Univ Ulster, Sch Comp & Intelligent Syst, Coleraine BT48 7JL, Londonderry, North Ireland
[2] Manchester Metropolitan Univ, Dept Comp & Math, Manchester M1 5GD, Lancs, England
Keywords
sentence similarity; semantic nets; corpus; natural language processing; word similarity;
DOI
10.1109/TKDE.2006.130
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input, and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.
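The abstract describes combining semantic information with word order information but gives no formulas. The following is only a minimal sketch of that general idea, not the authors' implementation. The toy similarity table `TOY_SIM` stands in for a structured lexical database, corpus statistics are left out entirely, and the weight `delta`, the `threshold`, and all function names are illustrative assumptions.

```python
import math

# Hypothetical toy table standing in for a lexical database (assumption).
TOY_SIM = {
    frozenset(("dog", "puppy")): 0.8,
    frozenset(("man", "boy")): 0.7,
}

def word_sim(a, b):
    """Similarity of two words in [0, 1]; identical words score 1."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)

def semantic_vector(words, joint):
    """For each joint word, the best similarity to any word in the sentence."""
    return [max(word_sim(j, w) for w in words) for j in joint]

def order_vector(words, joint, threshold=0.4):
    """For each joint word, the 1-based position of its best match, or 0."""
    vec = []
    for j in joint:
        best_idx, best = 0, 0.0
        for i, w in enumerate(words, start=1):
            s = word_sim(j, w)
            if s > best:
                best_idx, best = i, s
        vec.append(best_idx if best >= threshold else 0)
    return vec

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def order_similarity(r1, r2):
    """1 minus the normalized distance between two word order vectors."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0

def sentence_similarity(s1, s2, delta=0.85):
    """Weighted mix of semantic and word order similarity (delta is assumed)."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    joint = list(dict.fromkeys(w1 + w2))  # distinct words, first-seen order
    sem = cosine(semantic_vector(w1, joint), semantic_vector(w2, joint))
    order = order_similarity(order_vector(w1, joint), order_vector(w2, joint))
    return delta * sem + (1 - delta) * order
```

In this sketch, two sentences with the same words in a different order, such as "a dog bites a man" and "a man bites a dog", get a full semantic score but a reduced word order score, so the combined value falls slightly below 1.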
Pages: 1138-1150
Number of pages: 13
Related papers
50 in total
  • [1] RETRACTED: Sentence similarity computation based on WordNet and corpus statistics (Retracted Article)
    Selvi, P.
    Gopalan, N. P.
    ICCIMA 2007: INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, VOL I, PROCEEDINGS, 2007, : 9 - +
  • [2] A sentence similarity metric based on semantic patterns
    Lee, Ming Che
    Chang, Jia Wei
    Hsieh, Tung Cheng
    Chen, Hui Hui
    Chen, Ching Hui
    Advances in Information Sciences and Service Sciences, 2012, 4 (18): : 576 - 585
  • [3] Sentence Similarity Based on Semantic Vector Model
    Zhao Jingling
    Zhang Huiyun
    Cui Baojiang
    2014 NINTH INTERNATIONAL CONFERENCE ON P2P, PARALLEL, GRID, CLOUD AND INTERNET COMPUTING (3PGCIC), 2014, : 499 - 503
  • [4] Eigenvalue Based Features For Semantic Sentence Similarity
    Vardasbi, Ali
    Faili, Heshaam
    Asadpour, Masoud
    2017 19TH CSI INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2017, : 184 - 189
  • [5] Sentence Semantic Similarity based on Word Embedding and WordNet
    Farouk, Mamdouh
    PROCEEDINGS OF 2018 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES), 2018, : 33 - 37
  • [6] Calculation of Sentence Semantic Similarity Based on Syntactic Structure
    Li, Xiao
    Li, Qingsheng
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [7] Chinese Sentence Similarity based on Word Context and Semantic
    Gu, Tianjiao
    Ren, Fuji
    IEEE NLP-KE 2009: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2009, : 535 - 539
  • [8] A French Corpus for Semantic Similarity
    Cardon, Remi
    Grabar, Natalia
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6889 - 6894
  • [9] Sentence similarity calculation method based on lexical, syntactic and semantic
    Zhai S.
    Li Z.
    Duan H.
    Li J.
    Dong D.
    Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2019, 49 (06): : 1094 - 1100
  • [10] The Semantic Computing Model of Sentence Similarity Based on Chinese FrameNet
    Li, Ru
    Li, Shuanghong
    Zhang, Zezheng
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 3, 2009, : 255 - +