Semantic Similarity for English and Arabic Texts: A Review

Cited by: 7
Authors
Alian, Marwah [1, 2]
Awajan, Arafat [1]
Affiliations
[1] Princess Sumaya Univ Technol, Amman, Jordan
[2] Hashemite Univ, Zarqa, Jordan
Keywords
Semantic similarity; feature-based; word embeddings; statistical corpus-based; sentence similarity; word similarity; document similarity; INFORMATION-CONTENT; CONTEXT
DOI
10.1142/S0219649220500331
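Chinese Library Classification (CLC) number
G25 [Library Science, Librarianship]; G35 [Information Science, Information Work]
Discipline classification code
1205; 120501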
Abstract
Semantic similarity is the task of measuring relations between sentences or words to determine their degree of similarity or resemblance. Several natural language processing applications require semantic similarity measurement to achieve good results, including plagiarism detection, textual entailment, text summarisation, paraphrase identification, and information extraction. Many researchers have proposed methods to measure the semantic similarity of Arabic and English texts. This review surveys and compares these methods. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.67 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76-0.80. The best results are achieved by the feature-based approach.
Pages: 29