Viewing Heterogeneous Text Databases: Object Technology for the View AND its Implementation

Paepcke, Andreas



Abstract: We have implemented unified access to heterogeneous, semi-structured text sources by creating the illusion that the items in those sources are objects in a virtual object-oriented database. Queries are formulated in OSQL, an object-oriented extension to SQL. Data sources are modeled as types, indexes on these sources are modeled as functions, and the text records within the sources are viewed as objects. Inheritance is used to reflect semantic similarities among the sources and to control the search range of queries. This paper describes the implementation of the system. In particular, we focus on the advantages we gain from making our query translation and object materialization mechanisms object-oriented. We use type-allocated attributes and a hierarchy of "phrasebooks" that mirrors the data source hierarchy; the phrasebooks hold the vocabulary information needed to drive the translation of queries from OSQL into the languages of the underlying search engines. Function overloading is used in the translation engine to manage the different target languages. We use WS-Iris, a persistent object system that serves simultaneously as our implementation platform and as the database that holds materialized information.
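
As an illustration only, the phrasebook idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the system's actual code: the class names, the vocabulary entries, and the two target query syntaxes below are all hypothetical, and Python's `functools.singledispatch` stands in for the function overloading mentioned in the abstract. It shows a child phrasebook inheriting vocabulary from its parent and a translation function selected by the type of the target engine.

```python
from functools import singledispatch


class Phrasebook:
    """Vocabulary for one node of a (hypothetical) data source hierarchy."""

    def __init__(self, name, parent=None, terms=None):
        self.name = name
        self.parent = parent            # phrasebook of the supertype, if any
        self.terms = dict(terms or {})  # index name -> engine field code

    def lookup(self, index_name):
        # Walk up the hierarchy so a source inherits the vocabulary
        # of the more general sources above it.
        book = self
        while book is not None:
            if index_name in book.terms:
                return book.terms[index_name]
            book = book.parent
        raise KeyError(index_name)


class PrefixEngine:
    """Hypothetical engine whose queries look like 'au=Smith'."""


class SuffixEngine:
    """Hypothetical engine whose queries look like 'Smith/au'."""


@singledispatch
def translate(engine, book, index_name, value):
    # Fallback when no translation is registered for this engine type.
    raise NotImplementedError(type(engine).__name__)


@translate.register
def _(engine: PrefixEngine, book, index_name, value):
    return f"{book.lookup(index_name)}={value}"


@translate.register
def _(engine: SuffixEngine, book, index_name, value):
    return f"{value}/{book.lookup(index_name)}"


# Assumed vocabulary: the patent source overrides 'author', inherits 'title'.
bibliographic = Phrasebook("Bibliographic", terms={"author": "au", "title": "ti"})
patents = Phrasebook("Patents", parent=bibliographic, terms={"author": "in"})

print(translate(PrefixEngine(), patents, "author", "Smith"))
print(translate(SuffixEngine(), patents, "title", "laser"))
```

In this sketch, adding a new source means adding one phrasebook that lists only the vocabulary differing from its parent, and adding a new search engine means registering one more `translate` variant.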

