Information content measures of semantic similarity between documents based on Hadoop system

Cited by: 0
Authors
Birjali, Marouane [1]
Beni-Hssane, Abderrahim [1]
Erritali, Mohammed [2]
Madani, Youness [2]
Affiliations
[1] Univ Chouaib Doukkali, Fac Sci, Dept Comp Sci, El Jadida, Morocco
[2] Univ Sultan Moulay Slimane, Fac Sci & Technol, Dept Comp Sci, Beni Mellal, Morocco
Keywords
Distributed processing; Hadoop; Big Data; Semantic similarity; MapReduce programming; WordNet
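DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract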
Retrieving documents in response to a user's query is the most common text retrieval task. Our work focuses on detecting the semantic similarity between queries and the documents in a large document collection. In this paper, we investigate MapReduce as a framework for managing distributed processing of datasets and for computing semantic similarity measures between documents. We then review the state of the art of approaches for computing the semantic similarity of documents. We propose an approach based on a parallel algorithm for semantic similarity measures, using MapReduce and WordNet, to detect the documents relevant to a query. Finally, we conduct basic experiments to assess the performance of the proposed approach and note the benefit that Hadoop and MapReduce bring to semantic similarity measures between documents.
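The abstract does not specify which similarity measure or job layout the authors use, so the following is only a minimal sketch of the general idea, not their algorithm. It assumes a Resnik-style information content measure, a tiny hypothetical taxonomy (`PARENT`) and hypothetical concept frequencies (`FREQ`) standing in for WordNet and a real corpus, and plain Python functions (`map_doc`, `reduce_scores`, `rank`) simulating the map, shuffle, and reduce phases that Hadoop would distribute. All terms are assumed to exist in the taxonomy.

```python
import math
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent); a real system would use WordNet.
PARENT = {
    "dog": "mammal", "cat": "mammal", "mammal": "animal",
    "bird": "animal", "animal": "entity",
    "car": "vehicle", "vehicle": "entity",
}
# Hypothetical corpus frequencies of each concept's own occurrences.
FREQ = {"dog": 4, "cat": 4, "mammal": 1, "bird": 3, "animal": 1,
        "car": 2, "vehicle": 1, "entity": 0}


def ancestors(concept):
    """Return the concept followed by all its ancestors up to the root."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


# Cumulative counts: an occurrence of a concept also counts for its ancestors.
CUM = defaultdict(int)
for concept, n in FREQ.items():
    for a in ancestors(concept):
        CUM[a] += n
TOTAL = CUM["entity"]


def ic(concept):
    """Information content: negative log of the concept's probability."""
    return -math.log(CUM[concept] / TOTAL)


def resnik(a, b):
    """Similarity = IC of the most informative common subsumer of a and b."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(ic(c) for c in common)


def map_doc(doc_id, terms, query):
    """Map: for each query term, emit (doc_id, best similarity to the doc)."""
    for q in query:
        yield doc_id, max(resnik(q, t) for t in terms)


def reduce_scores(doc_id, scores):
    """Reduce: average the per-query-term scores of one document."""
    scores = list(scores)
    return doc_id, sum(scores) / len(scores)


def rank(docs, query):
    """Run map, group by key (stand-in for the shuffle), reduce, then sort."""
    grouped = defaultdict(list)
    for doc_id, terms in docs.items():
        for key, value in map_doc(doc_id, terms, query):
            grouped[key].append(value)
    results = [reduce_scores(k, v) for k, v in grouped.items()]
    return sorted(results, key=lambda kv: kv[1], reverse=True)


DOCS = {"d1": ["cat", "bird"], "d2": ["car"]}
ranking = rank(DOCS, ["dog"])
print(ranking)
```

With the query term "dog", document d1 ranks first because "dog" and "cat" share the fairly specific subsumer "mammal", while "dog" and "car" share only the root, whose information content is zero. In an actual Hadoop job the map and reduce functions would run on separate nodes over document splits; the in-memory `grouped` dictionary here only imitates that grouping step.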
Pages: 187 - 192
Page count: 6
Related papers
50 in total
  • [1] Applying semantic similarity measures based on information content in the evaluation of a domain ontology
    Hernandez Garcia, Aimee Cecilia
    Tovar Vidal, Mireya
    Lavalle Martinez, Jose de Jesus
    2018 17TH MEXICAN INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (MICAI 2018), 2018, : 8 - 12
  • [2] Wikipedia-based information content and semantic similarity computation
    Jiang, Yuncheng
    Bai, Wen
    Zhang, Xiaopei
    Hu, Jiaojiao
    INFORMATION PROCESSING & MANAGEMENT, 2017, 53 (01) : 248 - 265
  • [3] A New Model of Information Content for Measuring the Semantic Similarity Between Concepts
    Yuan, Qingbo
    Yu, Zhongqing
    Wang, Kaixi
    2013 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CLOUDCOM-ASIA), 2013, : 141 - 146
  • [4] Semantic similarity measures for enhancing information retrieval in folksonomies
    Uddin, Mohammed Nazim
    Trong Hai Duong
    Ngoc Thanh Nguyen
    Qi, Xin-Min
    Jo, Geun Sik
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (05) : 1645 - 1653
  • [5] Information theoretic similarity measures for content based image retrieval
    Zachary, J
    Iyengar, SS
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2001, 52 (10) : 856 - 867
  • [6] Ontology based semantic similarity comparison of documents
    Oleshchuk, V
    Pedersen, A
    14TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2003, : 735 - 738
  • [7] A Review of Information Content Metric for Semantic Similarity
    Meng, Lingling
    Gu, Junzhong
    Zhou, Zili
    ADVANCES ON DIGITAL TELEVISION AND WIRELESS MULTIMEDIA COMMUNICATIONS, 2012, 331 : 299 - +
  • [8] Information Content Based Semantic Similarity Approaches for Multiple Biomedical Ontologies
    Saruladha, K.
    Aghila, G.
    Bhuvaneswary, A.
    ADVANCES IN COMPUTING AND COMMUNICATIONS, PT 2, 2011, 191 : 327 - 336
  • [9] A Semantic Information Content Based Method for Evaluating FCA Concept Similarity
    Huang, Hongtao
    Liang, Cunliang
    Ye, Haizhi
    INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2018, 12 (02) : 77 - 93
  • [10] A semantic similarity method based on information content exploiting multiple ontologies
    Sanchez, David
    Batet, Montserrat
    EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (04) : 1393 - 1399