A Model Driven Approach to Building Domain Specific Search Engines

Cited by: 0
Authors
Raveendran, Vishnudas [1]
Shah, Sapan [1]
Reddy, Sreedhar [1]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Res, Pune, Maharashtra, India
DOI
10.1109/MODELS-C53483.2021.00029
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
In many application domains, general purpose search engines are not very effective because they are not designed for deep processing of the semantic relations present in these domains. Such domains need a search engine that understands domain concepts and relations. However, building a domain specific search engine is a non-trivial task. To build an effective search engine, one has to master not only text processing technologies but also the nuances of domain entities and their relationships. Domain knowledge such as generalization hierarchies, relation types, cardinalities, property value types, units, and ranges plays a key role in understanding the text. To achieve the right accuracy levels, these domain specific nuances have to be encoded into text processing algorithms. Developing a search engine that takes all this into account requires substantial time and effort. To address this, we present a model driven approach to realize domain specific search engines. We propose a metamodel to specify a domain in terms of concepts, relations and their extraction models. An information extraction component interprets this metamodel to access the domain model and the information contained in it. It uses this information to determine which entities, relations and properties need to be extracted, which extraction models to use for them, and how to resolve the ambiguities that arise from the text processing stage. We also show how a domain specific query language interface can be generated from the domain model. We discuss the results of applying the proposed approach to two domains: materials science and urban mining. The model driven approach results in substantial savings in development effort.
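As an illustration only, the idea of a domain model that names concepts, relations and their extraction models could be sketched in Python as below. This is not the metamodel from the paper: every class, field and extraction model name here (`Concept`, `Relation`, `DomainModel`, `"dictionary"`, `"pattern"`), and the example materials science entries, are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Property:
    name: str
    value_type: str                      # e.g. "float" or "string"
    unit: Optional[str] = None           # e.g. "MPa"
    value_range: Optional[tuple] = None  # (low, high) bounds, if known


@dataclass
class Concept:
    name: str
    parent: Optional[str] = None          # link in the generalization hierarchy
    properties: list = field(default_factory=list)
    extraction_model: str = "dictionary"  # hypothetical extractor name


@dataclass
class Relation:
    name: str
    source: str
    target: str
    cardinality: str = "many-to-many"
    extraction_model: str = "pattern"     # hypothetical extractor name


@dataclass
class DomainModel:
    concepts: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_concept(self, concept):
        self.concepts[concept.name] = concept

    def ancestors(self, name):
        """Walk the generalization hierarchy upward from a concept."""
        chain = []
        parent = self.concepts[name].parent
        while parent is not None:
            chain.append(parent)
            parent = self.concepts[parent].parent
        return chain

    def extraction_plan(self):
        """List (kind, name, extraction model) for everything to extract."""
        plan = [("concept", c.name, c.extraction_model)
                for c in self.concepts.values()]
        plan += [("relation", r.name, r.extraction_model)
                 for r in self.relations]
        return plan


# Hypothetical materials science fragment, invented for this sketch.
model = DomainModel()
model.add_concept(Concept("Material"))
model.add_concept(Concept(
    "Alloy",
    parent="Material",
    properties=[Property("tensile_strength", "float", unit="MPa")],
))
model.relations.append(Relation("contains", source="Alloy", target="Material"))

for kind, name, extractor in model.extraction_plan():
    print(kind, name, extractor)
```

An extraction component written against such a structure would read the plan rather than hard-code domain entities, which is the kind of reuse the abstract attributes to the model driven approach.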
Pages: 167-174
Number of pages: 8