Semantically Enhanced Term Frequency based on Word Embeddings for Arabic Information Retrieval

Authors
El Mahdaouy, Abdelkader [1, 2]
El Alaoui, Said Ouatik [1]
Gaussier, Eric [2]
Affiliations
[1] Univ USMBA, FSDM, LIM, Fes, Morocco
[2] Univ Grenoble Alpes, CNRS, LIG, AMA, Grenoble, France
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Traditional Information Retrieval (IR) models are based on the bag-of-words paradigm, where relevance scores are computed from exact keyword matches. Although these models already achieve good performance, it has been shown that most cases of user dissatisfaction with relevance are due to term mismatch between queries and documents. In this paper, we introduce a novel method to compute term frequency based on semantic similarities, using distributed representations of words in a vector space (word embeddings). Our main goal is to allow distinct but semantically related terms to match each other and contribute to the relevance scores. Hence, Arabic documents are retrieved beyond the bag-of-words paradigm, based on semantic similarities between word vectors. The results on standard Arabic TREC data sets show significant improvement over the baseline bag-of-words models.
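The abstract does not give the exact formula, so the following is only a minimal sketch of the general idea, under stated assumptions: a query term's frequency in a document is increased by occurrences of other terms whose embedding is close enough to the query term's embedding. The function names `cosine` and `semantic_tf`, the `threshold` parameter, the similarity-weighted sum, and the toy two-dimensional English vectors are all hypothetical illustrations, not the authors' method or data.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_tf(query_term, doc_tokens, vectors, threshold=0.7):
    """Illustrative semantic term frequency (an assumption, not the paper's formula).

    Exact matches count as 1 each. Any other document term counts with a
    weight equal to its cosine similarity to the query term, provided that
    similarity reaches `threshold` (a hypothetical parameter).
    """
    score = 0.0
    for term, count in Counter(doc_tokens).items():
        if term == query_term:
            score += count
        elif query_term in vectors and term in vectors:
            sim = cosine(vectors[query_term], vectors[term])
            if sim >= threshold:
                score += sim * count
    return score


# Toy vectors invented for illustration; real word embeddings would be
# learned from a large Arabic corpus and have hundreds of dimensions.
TOY_VECTORS = {
    "car": (1.0, 0.0),
    "automobile": (0.8, 0.6),
    "banana": (0.0, 1.0),
}

doc = ["automobile", "banana", "car", "automobile"]
exact_tf = doc.count("car")                          # plain bag-of-words count
enhanced_tf = semantic_tf("car", doc, TOY_VECTORS)   # also credits "automobile"
```

In this toy setting the exact count for "car" is 1, while the enhanced count also credits the two occurrences of "automobile", each weighted by a similarity of about 0.8. The resulting value could then be substituted for the raw term frequency in a standard ranking function; which ranking functions the paper uses is not stated in the abstract.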
Pages: 385 - 389
Number of pages: 5
Related papers
  • [1] Semantically enhanced pseudo relevance feedback for Arabic information retrieval
    Atwan, Jaffar
    Mohd, Masnizah
    Rashaideh, Hasan
    Kanaan, Ghassan
    JOURNAL OF INFORMATION SCIENCE, 2016, 42 (02) : 246 - 260
  • [2] Semantically Enhanced Term Frequency
    Mueller, Christof
    Gurevych, Iryna
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2010, 5993 : 598 - 601
  • [3] A Context-based Semantically Enhanced Information Retrieval Model
    Cioara, Tudor
    Anghel, Ionut
    Salomie, Ioan
    Dinsoreanu, Mihaela
    2009 IEEE 5TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING, PROCEEDINGS, 2009, : 245 - 250
  • [4] Semantically enhanced Information Retrieval: An ontology-based approach
    Fernandez, Miriam
    Cantador, Ivan
    Lopez, Vanesa
    Vallet, David
    Castells, Pablo
    Motta, Enrico
    JOURNAL OF WEB SEMANTICS, 2011, 9 (04) : 434 - 452
  • [5] Enhanced Arabic information retrieval system based on Arabic text classification
    Ghwanmeh, Sameh
    Kanaan, Ghassan
    Al-Shalabi, Riyad
    Ababneh, Ahmad
    2007 INNOVATIONS IN INFORMATION TECHNOLOGIES, VOLS 1 AND 2, 2007, : 527 - +
  • [6] Semantically enhanced uyghur information retrieval model
    Ma, Bo
    Yang, Yating
    Zhou, Xi
    Zhou, Junlin
    Journal of Software, 2012, 7 (06) : 1315 - 1320
  • [7] Query Expansion based on Word Embeddings and Ontologies for Efficient Information Retrieval
    Rastogi, Namrata
    Verma, Parul
    Kumar, Pankaj
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (11) : 367 - 373
  • [8] Arabic Word Sense Disambiguation for Information Retrieval
    Abderrahim, Mohammed Alaeddine
    Abderrahim, Mohammed El-Amine
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (04)
  • [9] Semantically Enhanced Medical Information Retrieval System: A Tensor Factorization Based Approach
    Wang, Haolin
    Zhang, Qingpeng
    Yuan, Jiahu
    IEEE ACCESS, 2017, 5 : 7584 - 7593
  • [10] Arabic Text Classification Based on Word and Document Embeddings
    El Mahdaouy, Abdelkader
    Gaussier, Eric
    El Alaoui, Said Ouatik
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT SYSTEMS AND INFORMATICS 2016, 2017, 533 : 32 - 41