An improved Chinese word semantic similarity algorithm based on CiLin

Cited by: 0

Authors:

Li, Fei [1]

Zhu, Xinhua [1]

Chen, Hongchao [1]

Ma, Runcong [1]

Deng, Han [1]

Affiliation:

[1] Guangxi Key Lab. of Multi-Source Information Mining & Security and College of Computer Science & Information Technology, Guangxi Normal University, Guilin, China

Source:

Journal of Information and Computational Science | 2015, Vol. 12, No. 10

Keywords:

Correlation methods;

DOI:

10.12733/jics20106030

Chinese Library Classification (CLC) number:

O212 [Mathematical Statistics];

Subject classification code:

Abstract:

CiLin is a well-known semantic dictionary of Chinese synonyms whose structure and function closely resemble those of WordNet for English. This paper improves the existing CiLin-based algorithm for Chinese word semantic similarity; the improved algorithm integrates word distance, the density of the lowest common parent node, and branch layer spacing. First, an initial value of word semantic similarity is calculated from the word distance. Then an adjusting parameter, which depends on the lowest common parent node density n and the branch interval k, revises the initial value downward. Taking the fourth root of an expression in the parameters k and n keeps the revision of the initial similarity below 16%, which avoids the unreasonable case where word pairs at a near distance receive a low similarity because of a far branch interval. On the Miller & Charles word pair set, the method attains a Pearson correlation coefficient as high as 0.8464 against human judgment.

1548-7741/Copyright © 2015 Binary Information Press
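The evaluation described in the abstract compares algorithm scores with human ratings through the Pearson correlation coefficient. A minimal sketch of that comparison is given below, assuming plain Python lists of scores. The function name `pearson` and the sample values are hypothetical illustrations, not data or code from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (NOT the Miller & Charles data):
# human ratings on a 0-4 scale and algorithm similarities on a 0-1 scale.
human_ratings = [3.9, 3.1, 1.2, 0.4]
algorithm_scores = [0.95, 0.70, 0.35, 0.10]

r = pearson(human_ratings, algorithm_scores)
print(f"Pearson r = {r:.4f}")
```

Because the coefficient is invariant to linear rescaling, the human ratings and the algorithm scores do not need to share a scale before being compared.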

Pages: 3799-3807

50 items in total

[31] Sentence Semantic Similarity based on Word Embedding and WordNet
Farouk, Mamdouh
PROCEEDINGS OF 2018 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES), 2018: 33 - 37
[32] Improved fast algorithm for Chinese word segmentation
Chen, Guilin
Wang, Yongcheng
Han, Kesong
Wang, Gang
Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2000, 37 (04): 418 - 424
[33] Short texts semantic similarity based on word embeddings
Babic, Karlo
Martincic-Ipsic, Sanda
Mestrovic, Ana
Guerra, Francesco
CENTRAL EUROPEAN CONFERENCE ON INFORMATION AND INTELLIGENT SYSTEMS (CECIIS 2019), 2019: 27 - 33
[34] Improved fast algorithm for Chinese word segmentation
Chen, Guilin
Wang, Yongcheng
Han, Kesong
Wang, Gang
2000, Sci Press (37):
[35] An improved Dijkstra algorithm in Chinese Word Segmentation
Zhang Xueyan
Xue Xiao
Yang Shenggang
Zhao Limei
ITESS: 2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES, PT 2, 2008: 909 - 914
[36] Word Semantic Similarity based on document's title
Hamani, Mohamed Said
Maamri, Ramdane
2013 24TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA 2013), 2013: 43 - 47
[37] Semantic text similarity using corpus-based word similarity and string similarity
University of Ottawa
Unknown
ACM Transactions on Knowledge Discovery from Data, 2008, 2 (02)
[38] An Algorithm of Semantic Similarity Between Words Based on Word Single-meaning Embedding Model
Li X.-T.
You S.-J.
Chen W.
Zidonghua Xuebao/Acta Automatica Sinica, 2020, 46 (08): 1654 - 1669
[39] Research and Application of News-text Similarity Algorithm based on Chinese word segmentation
Guan, Wei
Zhang, Pengzhou
2013 3RD INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, COMMUNICATIONS AND NETWORKS (CECNET), 2013: 484 - 487
[40] Research on Semantic Similarity Algorithm of Chinese Words in a Specified Domain
Niu, Qinzhou
Zhao, Xiang
FUZZY SYSTEMS AND DATA MINING III (FSDM 2017), 2017, 299 : 285 - 294