Heterogeneous database system


A heterogeneous database system is an automated system that integrates heterogeneous, disparate database management systems and presents the user with a single, unified query interface.
The term covers both the computational models and the software implementations that provide this integration.

Problems of heterogeneous database integration

This article does not cover distributed database management systems in detail.

Technical heterogeneity

Constituent systems may use different file formats, access protocols, query languages and so on. From the point of view of the data, this is often called syntactic heterogeneity.
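One common way to hide such differences is to put a small wrapper in front of each source and route every query through a shared function. The following is a minimal sketch, not a description of any particular product: the two sources (an in-memory SQLite table and a CSV string), the sample rows, and the names `sqlite_wrapper`, `csv_wrapper` and `genes_on_chromosome` are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Source 1 (assumed): a relational database reached through SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (symbol TEXT, chromosome TEXT)")
conn.execute("INSERT INTO gene VALUES ('TP53', '17')")

# Source 2 (assumed): a flat file in CSV format, held in a string here.
CSV_TEXT = "symbol,chromosome\nBRCA1,17\nCFTR,7\n"


def sqlite_wrapper(chromosome):
    """Answer the query by translating it into SQL."""
    rows = conn.execute(
        "SELECT symbol, chromosome FROM gene WHERE chromosome = ?",
        (chromosome,),
    )
    return [{"symbol": s, "chromosome": c} for s, c in rows]


def csv_wrapper(chromosome):
    """Answer the same query by scanning the CSV rows."""
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    return [dict(row) for row in reader if row["chromosome"] == chromosome]


def genes_on_chromosome(chromosome):
    """Unified query: ask every wrapper, then merge the answers."""
    results = []
    for wrapper in (sqlite_wrapper, csv_wrapper):
        results.extend(wrapper(chromosome))
    return sorted(results, key=lambda r: r["symbol"])
```

The caller of `genes_on_chromosome` never sees which format or query language each source uses; only the wrappers do.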

Data model heterogeneity

Constituent databases may represent and store the same data in different ways: table decompositions may vary, column names may differ, and data encoding schemes may vary. This is also referred to as schematic heterogeneity.
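As an illustration, the sketch below maps two differently shaped sources onto one shared schema. Everything in it is hypothetical: the sample records, the column names, the date formats, the numeric code table `SEX_CODES_B`, and the target fields `name`, `birth_date` and `sex` were chosen only to show differing decompositions, column names and encodings.

```python
# Source A (assumed): one name column, ISO dates, letter codes.
SOURCE_A = [{"full_name": "Ada Lovelace", "born": "1815-12-10", "sex": "F"}]

# Source B (assumed): name split in two columns, day/month/year dates,
# numeric codes instead of letters.
SOURCE_B = [{"first": "Alan", "last": "Turing", "dob": "23/06/1912", "sex_code": 1}]

# Hypothetical code table used by source B.
SEX_CODES_B = {1: "M", 2: "F"}


def from_a(row):
    """Rename source A's columns to the shared schema."""
    return {"name": row["full_name"], "birth_date": row["born"], "sex": row["sex"]}


def from_b(row):
    """Recombine columns and re-encode values from source B."""
    day, month, year = row["dob"].split("/")
    return {
        "name": f"{row['first']} {row['last']}",
        "birth_date": f"{year}-{month}-{day}",
        "sex": SEX_CODES_B[row["sex_code"]],
    }


def unified_people():
    """Return all records from both sources in the shared schema."""
    return [from_a(r) for r in SOURCE_A] + [from_b(r) for r in SOURCE_B]
```

Keeping one mapping function per source means a new source can be added without touching the mappings that already exist.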

Semantic heterogeneity

Data across constituent databases may be related but different. For example, a database system may need to integrate genomic and proteomic data. The two are related (a gene may have several protein products), but the data are different. There may be many ways of looking at semantically similar, but distinct, datasets.
The system may also be required to present "new" knowledge to the user. Relationships may be inferred between data according to rules specified in domain ontologies.
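A rule of this kind can be sketched as a join across facts drawn from two sources. The example below is only an illustration under stated assumptions: the identifiers (`geneA`, `proteinA1`, `pathwayX` and so on), the two relations `ENCODES` and `PARTICIPATES_IN`, and the rule itself are invented, and a real system would take its rules from a domain ontology rather than hard-code them.

```python
# Facts from a hypothetical genomic source: (gene, protein product).
# A gene may have several protein products.
ENCODES = {
    ("geneA", "proteinA1"),
    ("geneA", "proteinA2"),
    ("geneB", "proteinB1"),
}

# Facts from a hypothetical proteomic source: (protein, pathway).
PARTICIPATES_IN = {
    ("proteinA1", "pathwayX"),
    ("proteinB1", "pathwayY"),
}


def infer_gene_pathway(encodes, participates_in):
    """Apply the assumed rule:
    encodes(g, p) and participates_in(p, w) => associated_with(g, w).
    """
    return {
        (gene, pathway)
        for gene, protein in encodes
        for other_protein, pathway in participates_in
        if protein == other_protein
    }
```

Neither source states a gene-to-pathway relationship on its own; it appears only once the two datasets are combined under the rule.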