Clustering-based fragmentation and data replication for flexible query answering in distributed databases

Cited by: 16
Author
Wiese L. [1]
Affiliation
[1] Institute of Computer Science, Georg-August-Universität Göttingen, Goldschmidtstraße 7, Göttingen
Keywords
Bin packing with conflicts; Clustering; Data replication; Distributed database; Flexible query answering; Fragmentation; Load balancing
DOI
10.1186/s13677-014-0018-0
Abstract
One feature of cloud storage systems is data fragmentation (or sharding), so that data can be distributed over multiple servers and subqueries can be run in parallel on the fragments. On the other hand, flexible query answering can enable a database system to find related information for a user whose original query cannot be answered exactly. Query generalization is a way to implement flexible query answering on the syntax level. In this paper we study a clustering-based fragmentation for the generalization operator Anti-Instantiation, with which related information can be found in distributed data. We use a standard clustering algorithm to derive a semantic fragmentation of data in the database. The database system uses the derived fragments to support an intelligent flexible query answering mechanism that avoids overgeneralization but supports data replication in a distributed database system. We show that the data replication problem can be expressed as a special Bin Packing Problem and can hence be solved by an off-the-shelf solver for integer linear programs. We present a prototype system that makes use of a medical taxonomy to determine similarities between medical expressions. © 2014, Wiese; licensee Springer.
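To illustrate the bin-packing view of replication mentioned in the abstract, here is a minimal Python sketch. It is not the integer linear program from the paper: it swaps in a greedy first-fit-decreasing heuristic so that it runs without a solver. The function name, the fragment names, the sizes, and the reading of a "conflict" as "these two fragments must not share a server" (for example, two replicas of the same fragment) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def first_fit_with_conflicts(
    sizes: Dict[str, int],
    capacity: int,
    conflicts: Iterable[Tuple[str, str]],
) -> List[List[str]]:
    """Assign fragments (items) to servers (bins) greedily.

    Hypothetical helper: a first-fit-decreasing heuristic, not an exact
    ILP solution. Two fragments listed as a conflict pair never end up
    on the same server.
    """
    # Build a symmetric lookup of conflicting fragments.
    conflict_map: Dict[str, Set[str]] = {name: set() for name in sizes}
    for a, b in conflicts:
        conflict_map[a].add(b)
        conflict_map[b].add(a)

    servers: List[List[str]] = []  # fragments placed on each server
    loads: List[int] = []          # used capacity of each server

    # Place the largest fragments first; ties are broken by name.
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        size = sizes[name]
        if size > capacity:
            raise ValueError(f"fragment {name!r} exceeds server capacity")
        for i, members in enumerate(servers):
            fits = loads[i] + size <= capacity
            clashes = any(m in conflict_map[name] for m in members)
            if fits and not clashes:
                members.append(name)
                loads[i] += size
                break
        else:
            # No existing server can take the fragment: open a new one.
            servers.append([name])
            loads.append(size)
    return servers


# Assumed toy instance: two fragments, each replicated twice (suffix _a/_b).
sizes = {"f1_a": 4, "f1_b": 4, "f2_a": 3, "f2_b": 3}
conflicts = [("f1_a", "f1_b"), ("f2_a", "f2_b")]
placement = first_fit_with_conflicts(sizes, capacity=8, conflicts=conflicts)
print(placement)
```

On this toy instance the two replicas of each fragment land on different servers. A heuristic of this kind gives no optimality guarantee; the abstract's approach of handing the problem to an integer linear programming solver would be the route to an exact minimum number of servers.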
Pages: 16