Sentence similarity calculation method based on lexical, syntactic and semantic

Cited by: 0
Authors
Zhai S. [1, 2]
Li Z. [1 ]
Duan H. [1 ]
Li J. [1 ]
Dong D. [1 ]
Affiliations
[1] School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an
[2] Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an
Keywords
Lexical layer; Ontology; Semantic layer; Sentence similarity; Syntactic layer
DOI
10.3969/j.issn.1001-0505.2019.06.011
Abstract
To address the problem that existing sentence similarity algorithms do not consider semantic information, a similarity computation method based on lexical, syntactic and semantic layers is proposed. Sentence similarity is divided into three levels: the lexical layer, the syntactic layer and the semantic layer. In the lexical layer, a lexical similarity matrix and a digital sequence similarity matrix are constructed to calculate sentence similarity. In the syntactic layer, sentence similarity is calculated from the similarity of the resource description framework (RDF) triples converted from conceptual vocabularies. In the semantic layer, the semantic distance, represented by the shortest path in the ontology structure, is used to calculate the semantic similarity. On this basis, a semantic similarity calculation model for sentences is proposed. Sentence pairs in the book domain were collected as the test set, and a book ontology was constructed as the knowledge source. Experimental results show that the proposed method achieves higher accuracy and recall, and its F-measure reaches 0.6499. Compared with the cosine similarity algorithm, the Levenshtein algorithm and the TF-IDF (term frequency-inverse document frequency) algorithm, the F-measure is increased by about 12%, 17% and 16%, respectively. © 2019, Editorial Department of Journal of Southeast University. All rights reserved.
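The semantic-layer idea in the abstract (similarity derived from the shortest path between concepts in an ontology) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy book ontology, the function names, and the conversion from distance to similarity as 1 / (1 + d). The abstract does not give the paper's actual formula or ontology, so this should not be read as the authors' implementation.

```python
from collections import deque

# Hypothetical toy book ontology (child concept -> parent concept).
# The paper's real book ontology is not given in the abstract.
PARENT = {
    "Novel": "Fiction",
    "ShortStory": "Fiction",
    "Fiction": "Book",
    "Textbook": "NonFiction",
    "NonFiction": "Book",
}


def build_graph(parent):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, par in parent.items():
        graph.setdefault(child, set()).add(par)
        graph.setdefault(par, set()).add(child)
    return graph


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph[node]:
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def concept_similarity(graph, a, b):
    """Assumed mapping from distance d to similarity: 1 / (1 + d)."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


if __name__ == "__main__":
    g = build_graph(PARENT)
    print(concept_similarity(g, "Novel", "ShortStory"))
    print(concept_similarity(g, "Novel", "Textbook"))
```

In this sketch, sibling concepts such as Novel and ShortStory are two edges apart, while Novel and Textbook are four edges apart, so the first pair scores higher. A sentence-level score would additionally need a rule for aggregating concept-pair similarities, which the abstract does not specify.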
Pages: 1094-1100
Number of pages: 6