NGram Approach for Semantic Similarity on Arabic Short Text

Cited by: 0
Authors
Al-Mahmoud, Rana Husni [1]
Sharieh, Ahmad [2]
Affiliations
[1] Appl Sci Private Univ, Fac Informat Technol, Amman, Jordan
[2] Univ Jordan, Comp Sci Dept, King Abdullah II Sch Informat Technol, Amman, Jordan
Keywords
Arabic text; Ngram; semantic sentences similarity; short text; ALMaany; natural language; semantic similarity of words; corpus-based measures; TWEETS
DOI
10.14569/IJACSA.2022.0131199
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Measuring the semantic similarity between words requires a method that can simulate human thought. The use of computers to quantify and compare semantic similarities has become an important research area in various fields, including artificial intelligence, knowledge management, information retrieval, and natural language processing. Computational semantics requires efficient measures for computing concept similarity, which still need to be developed. Several computational measures quantify semantic similarity based on knowledge resources such as the WordNet taxonomy. Several measures based on taxonomical parameters have been applied to optimize the expression for content semantics. This paper presents a new similarity measure for quantifying the semantic similarity between concepts, words, sentences, short text, and long text based on NGram features and synonyms of NGrams related to the same domain. The proposed algorithm was tested on 700 tweets, and its semantic similarity values were compared with cosine similarity on the same dataset. The results were analyzed manually by a domain expert, who concluded that, within the selected domain, the values provided by the proposed algorithm reflected the semantic similarity between the dataset's short texts better than the cosine similarity values did.
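The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea it contrasts: a bag-of-words cosine similarity versus an n-gram overlap computed after synonym normalization. The function names, the Jaccard overlap, the `max_n` parameter, and the toy synonym table are all assumptions made for illustration; they are not taken from the paper, and the English tokens stand in for Arabic text.

```python
from collections import Counter
from math import sqrt


def ngrams(tokens, n):
    """Return the word n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between bag-of-words count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ngram_synonym_similarity(tokens_a, tokens_b, synonyms, max_n=2):
    """Jaccard overlap of 1..max_n-grams after synonym normalization.

    `synonyms` is a hypothetical lookup mapping a word to a canonical
    form; the paper's actual synonym resource and scoring may differ.
    """
    def grams(tokens):
        canon = [synonyms.get(t, t) for t in tokens]
        return {g for n in range(1, max_n + 1) for g in ngrams(canon, n)}

    grams_a, grams_b = grams(tokens_a), grams(tokens_b)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


# Toy example: two short texts that share meaning but few surface words.
text_a = "the car is fast".split()
text_b = "the automobile is quick".split()
toy_synonyms = {"automobile": "car", "quick": "fast"}

print(cosine_similarity(text_a, text_b))
print(ngram_synonym_similarity(text_a, text_b, toy_synonyms))
```

In this toy case the surface-level cosine score sees only the shared function words, while the synonym-normalized n-gram score treats the two texts as equivalent, which is the kind of gap a domain-specific synonym resource is meant to close.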
Pages: 857-866
Page count: 10