An improved Chinese word semantic similarity algorithm based on CiLin

Authors
Li, Fei [1]
Zhu, Xinhua [1]
Chen, Hongchao [1]
Ma, Runcong [1]
Deng, Han [1]
Affiliations
[1] Guangxi Key Lab. of Multi-Source Information Mining & Security and College of Computer Science & Information Technology, Guangxi Normal University, Guilin, China
Source
Journal of Information and Computational Science | 2015 / Vol. 12 / No. 10
Keywords
Correlation methods;
DOI
10.12733/jics20106030
Chinese Library Classification number
O212 [Mathematical Statistics];
Subject classification number
Abstract
CiLin is a well-known semantic dictionary of Chinese synonyms whose structure and function closely resemble those of WordNet for English. This paper improves the existing CiLin-based algorithm for Chinese word semantic similarity, which integrates word distance, the density of the lowest common parent node, and branch layer spacing. First, an initial value of the word semantic similarity is calculated from the word distance. An adjusting parameter that depends on the lowest common parent node density n and the branch interval k is then used to revise the initial value downward. Because the adjustment takes the fourth root of an expression in k and n, the revision of the initial similarity is limited to below 16%, which avoids the unreasonable result that word pairs at a short distance receive a low similarity because of a large branch interval. On the Miller & Charles word pair set, the method achieves a Pearson correlation coefficient of 0.8464 against human judgments. 1548-7741/Copyright © 2015 Binary Information Press
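The abstract evaluates the method by Pearson correlation between computed similarities and human judgments. A minimal sketch of that evaluation step is given here, assuming the standard sample Pearson formula. The function name `pearson` and the four score pairs are hypothetical placeholders for illustration only; they are not the Miller & Charles ratings and not the paper's results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder data: human ratings on a 0-4 scale and
# algorithm similarities on a 0-1 scale for four word pairs.
human_ratings = [3.9, 3.1, 1.2, 0.4]
algorithm_scores = [0.95, 0.70, 0.30, 0.10]

r = pearson(human_ratings, algorithm_scores)
print(f"Pearson r = {r:.4f}")
```

Pearson correlation is invariant to linear rescaling of either sequence, so the human ratings and the algorithm scores do not need to share a scale before they are compared.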
Pages: 3799-3807