QSST: A Quranic Semantic Search Tool based on word embedding

Cited by: 9
Authors
Mohamed, Ensaf Hussein [1]
Shokry, Eyad Mohamed [1]
Affiliations
[1] Helwan Univ, Fac Comp & Artificial Intelligence, Comp Sci Dept, Cairo, Egypt
Keywords
Information Retrieval; Word Embedding; Concept-based Search; Ontology; Semantic Search; Arabic Natural Language Processing; Holy Quran
DOI
10.1016/j.jksuci.2020.01.004
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Retrieving information from the Quran is an important field for Quran scholars and Arabic researchers. There are two types of Quran searching techniques: semantic (concept-based) and keyword-based. Concept-based search is a challenging task, especially in a complex corpus such as the Quran. This paper presents a concept-based searching tool (QSST) for the Holy Quran. It consists of four phases. In the first phase, the Quran dataset is built by manually annotating Quran verses based on the ontology of Mushaf Al-Tajweed. The second phase is word embedding: feature vectors for words are generated by training a Continuous Bag of Words (CBOW) architecture on a large Quranic and Classical Arabic corpus. The third phase calculates the feature vectors of both the input query and the Quranic topics. In the fourth phase, the most relevant verses are retrieved by computing the cosine similarity between the topic and query vectors. The performance of the proposed QSST is measured by comparing its results against Mushaf Al-Tajweed; the resulting precision, recall, and F-score were 76.91%, 72.23%, and 69.28%, respectively. In addition, the results were evaluated by three Islamic experts, and the average precision was 91.95%. Finally, QSST results were compared with recent existing tools; QSST outperformed them. (C) 2020 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University.
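The third and fourth phases described in the abstract (building query and topic vectors, then ranking by cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embedding table, the topic descriptions, and the choice to average word vectors into a single text vector are all assumptions made for illustration, and the names `text_vector` and `rank_topics` are hypothetical. A real system would load the vectors from a trained CBOW model.

```python
import math

# Hypothetical toy embeddings standing in for vectors from a trained CBOW model.
EMBEDDINGS = {
    "prayer": [0.9, 0.1, 0.0],
    "worship": [0.8, 0.2, 0.1],
    "charity": [0.1, 0.9, 0.1],
    "alms": [0.2, 0.8, 0.0],
    "journey": [0.0, 0.1, 0.9],
}

# Hypothetical topics, each described by a few words (assumption for illustration).
TOPICS = {
    "Prayer": "prayer worship",
    "Charity": "charity alms",
    "Travel": "journey",
}


def text_vector(text, embeddings):
    """Average the vectors of in-vocabulary words (an assumed aggregation)."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return None  # no known words, so no vector can be built
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_topics(query, topics, embeddings):
    """Return (topic, score) pairs sorted from most to least similar."""
    query_vec = text_vector(query, embeddings)
    if query_vec is None:
        return []
    scored = []
    for name, description in topics.items():
        topic_vec = text_vector(description, embeddings)
        if topic_vec is not None:
            scored.append((name, cosine_similarity(query_vec, topic_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_topics("worship", TOPICS, EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

In a tool of this kind, the top-ranked topics would then be mapped back to the verses annotated with them in the first phase; that lookup is omitted here because the abstract does not describe its details.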
Pages: 934-945
Page count: 12