Querying documents in object databases

Cited by: 315
Authors
Abiteboul S. [1]
Cluet S. [1]
Christophides V. [1]
Milo T. [2]
Moerkotte G. [3]
Siméon J. [1]
Affiliations
[1] INRIA-Rocquencourt, F-78153 Le Chesnay Cedex
[2] Tel Aviv University, Ramat Aviv
[3] Lehrstuhl für Praktische Informatik III, Seminargebäude A5, Universität Mannheim
Keywords
Generalized path expressions; ODMG; OQL; Pattern matching
DOI
10.1007/s007990050001
Abstract
We consider the problem of storing and accessing documents (SGML and HTML in particular) using database technology. To specify the database image of documents, we use structuring schemas, which consist of grammars annotated with database programs. To query documents, we introduce an extension of OQL, the ODMG standard query language for object databases. Our extension, named OQL-doc, allows us to query documents without precise knowledge of their structure, using in particular generalized path expressions and pattern matching. This lets us introduce navigational and information-retrieval styles of data access into a declarative language in the style of SQL or OQL. Query processing in the context of documents and path expressions leads to challenging implementation issues. We extend an object algebra with new operators to deal with generalized path expressions. We then consider two essential, complementary optimization techniques. First, we show that almost standard database optimization techniques can be used to answer queries without having to load the entire document into the database. Second, we consider the interaction of full-text indexes (e.g., inverted files) with standard database collection indexes (e.g., B-trees) that provide an important speed-up. © Springer-Verlag 1997.
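As a rough illustration of the idea behind generalized path expressions (reaching a field without knowing where it sits in the document structure), the following Python sketch walks a nested document and yields every path ending in a given label. It is not OQL-doc syntax and not the paper's algebra; the function `find_paths` and the sample `article` structure are hypothetical names chosen for this example only.

```python
from typing import Any, Iterator, Tuple

Path = Tuple[Any, ...]


def find_paths(node: Any, label: str, prefix: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every field named `label`, at any depth."""
    if isinstance(node, dict):
        for key, child in node.items():
            path = prefix + (key,)
            if key == label:
                yield path, child
            # Keep descending: the label may also occur deeper in the tree.
            yield from find_paths(child, label, path)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_paths(child, label, prefix + (index,))


# Hypothetical document: titles occur at several, unpredictable depths.
article = {
    "title": "Sample article",
    "sections": [
        {"title": "Introduction", "body": "Some text."},
        {"title": "Results", "subsections": [{"title": "Setup", "body": "More text."}]},
    ],
}

matches = list(find_paths(article, "title"))
titles = [value for _, value in matches]

# A simple pattern-matching filter on the retrieved values.
titles_with_s = [t for t in titles if t.startswith("S")]
```

The caller names only the label `title`; the traversal discovers the concrete paths, which is the flavor of structure-independent querying the abstract describes. A real system would evaluate such expressions against the schema and indexes rather than by exhaustive traversal.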
Pages: 5-19
Number of pages: 14