A generic framework for ontology-based information retrieval and image retrieval in web data

Cited by: 17
Authors
Vijayarajan, V. [1]
Dinakaran, M. [2]
Tejaswin, Priyam [1]
Lohani, Mayank [1]
Affiliations
[1] VIT Univ, Sch Comp Sci & Engn, Vellore 632014, Tamil Nadu, India
[2] VIT Univ, Sch Informat Technol & Engn, Vellore 632014, Tamil Nadu, India
Keywords
Information retrieval; Ontology; Image retrieval; Natural language processing; SPARQL query; SPARQL; QUERIES;
DOI
10.1186/s13673-016-0074-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In the internet era, search engines play a vital role in information retrieval from web pages. Search engines arrange the retrieved results using various ranking algorithms. Additionally, retrieval is based on statistical searching techniques or content-based information extraction methods. It is still difficult for the user to understand the abstract details of every web page unless the user opens it separately to view the web content. This key point provided the motivation to propose and display an ontology-based object-attribute-value (O-A-V) information extraction system as a web model that acts as a user dictionary to refine the search keywords in the query for subsequent attempts. This first model is evaluated using various natural language processing (NLP) queries given as English sentences. Additionally, image search engines, such as Google Images, use content-based image information extraction and retrieval of web pages against the user query. To minimize the semantic gap between the image retrieval results and the expected user results, the domain ontology is built using image descriptions. The second proposed model initially examines natural language user queries using an NLP parser algorithm that identifies the subject-predicate-object (S-P-O) for the query. S-P-O extraction is an extended idea from the ontology-based O-A-V web model. Using this S-P-O extraction, and considering how complex it is for a user to write SPARQL Protocol and RDF Query Language (SPARQL) queries by hand, a SPARQL auto query generation module is proposed that generates the SPARQL query automatically. Then, the query is deployed on the ontology, and images are retrieved based on the auto-generated SPARQL query. With the proposed methodology above, this paper seeks answers to the following two questions. First, how can domain ontology and semantics be combined to improve information retrieval and user experience? Second, does this new unified framework improve on standard information retrieval systems? To answer these questions, a document retrieval system and an image retrieval system were built to test the proposed framework. The web document retrieval was tested against three keyword/bag-of-words models and a semantic ontology model. Image retrieval was tested on the IAPR TC-12 benchmark dataset. The precision, recall and accuracy results were then compared against standard information retrieval systems using TREC_EVAL. The results indicated improvements over the standard systems. A controlled experiment was performed in which test subjects queried the retrieval system in the absence and presence of the proposed framework. The queries were measured using two metrics, time and click-count. Comparisons were made between retrieval performed with and without the proposed framework. The results were encouraging.
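To illustrate the idea of generating SPARQL from an extracted S-P-O triple, the following is a minimal sketch, not the authors' implementation. It assumes the NLP parsing step has already produced the triple, that a missing slot (the thing the user is asking for) is represented as `None`, and that ontology terms live under a hypothetical namespace `http://example.org/ontology#`. The names `Triple`, `term`, and `build_query` are illustrative only.

```python
from typing import NamedTuple, Optional

# Hypothetical ontology namespace, used only for this illustration.
EX = "http://example.org/ontology#"


class Triple(NamedTuple):
    """A subject-predicate-object triple; None marks the unknown slot."""
    subject: Optional[str]
    predicate: Optional[str]
    obj: Optional[str]


def term(value: Optional[str], var: str) -> str:
    """Return a SPARQL variable for a missing slot, else a prefixed name.

    Assumes values are simple words; spaces become underscores.
    """
    if value is None:
        return f"?{var}"
    return "ex:" + value.strip().replace(" ", "_")


def build_query(triple: Triple) -> str:
    """Build a one-pattern SPARQL query from an S-P-O triple."""
    s = term(triple.subject, "s")
    p = term(triple.predicate, "p")
    o = term(triple.obj, "o")
    pattern = f"{{ {s} {p} {o} . }}"
    variables = [t for t in (s, p, o) if t.startswith("?")]
    if not variables:
        # Fully specified triple: ask whether it holds in the ontology.
        return f"PREFIX ex: <{EX}>\nASK {pattern}"
    return f"PREFIX ex: <{EX}>\nSELECT {' '.join(variables)}\nWHERE {pattern}"


if __name__ == "__main__":
    # "Which images depict a snow mountain?" with the subject unknown.
    print(build_query(Triple(None, "depicts", "snow mountain")))
```

In this sketch the unknown slot becomes the selected variable, so a query about "images that depict a snow mountain" selects `?s`. A real system would also need to map query words onto the actual class and property names of the domain ontology, which is not shown here.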
Pages: 30