Semantic Similarity for English and Arabic Texts: A Review

Cited by: 7
Authors
Alian, Marwah [1, 2]
Awajan, Arafat [1]
Affiliations
[1] Princess Sumaya Univ Technol, Amman, Jordan
[2] Hashemite Univ, Zarqa, Jordan
Keywords
Semantic similarity; feature-based; word embeddings; statistical corpus-based; sentence similarity; word similarity; document similarity; INFORMATION-CONTENT; CONTEXT
DOI
10.1142/S0219649220500331
Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Subject classification code
1205; 120501
Abstract
Semantic similarity is the task of measuring relations between sentences or words to determine the degree of similarity or resemblance. Several applications of natural language processing require semantic similarity measurement to achieve good results; these applications include plagiarism detection, text entailment, text summarisation, paraphrasing identification, and information extraction. Many researchers have proposed new methods to measure the semantic similarity of Arabic and English texts. In this research, these methods are reviewed and compared. Results show that the precision of the corpus-based approach exceeds 0.70. The precision of the descriptive feature-based technique is between 0.67 and 0.86, with a Pearson correlation coefficient of over 0.70. Meanwhile, the word embedding technique has a correlation of 0.67, and its accuracy is in the range 0.76-0.80. The best results are achieved by the feature-based approach.
Pages: 29