Database Systems
URL: http://db.uwaterloo.ca
Contact Person: M. Tamer Özsu, tozsu@cs.uwaterloo.ca
Group Members:
Ashraf Aboulnaga, Edward P. F. Chan, Khuzaima Daudjee, Ihab F. Ilyas, M. Tamer Özsu, Kenneth Salem, David Toman, Frank Wm. Tompa, Grant Weddell
Overview
Database research at Waterloo is broad-based, and mirrors the expanding role of
database systems in managing very large amounts of diverse data. The group's
research can be broadly classified into two categories: (1) extending & enhancing
database systems and (2) data management for new domains. Within these two broad
categories, there are research projects on adaptive
& self-managing database systems; cloud data management; database schema languages;
distributed data management; multiprocessor-hosted database systems;
security in databases; spatial, temporal, & multimedia databases;
storage management for database systems; text data management; and top-k querying.
- Adaptive and self-managing database systems. Existing database systems are
complex and operate in dynamic environments that make it difficult to
configure them properly. Part of our work involves improving query optimizer statistics
and cost estimates by observing the estimation errors and learning from them. A
related project studies ways of enhancing the capabilities of current query
optimization and execution techniques to cope with continuously changing
computing environments.
- Cloud data management. Our goal is to develop database management systems
that are suitable for virtualized computing environments,
such as public or private clouds, which make computing resources
available on demand. Database managers for clouds should be fault-tolerant, scalable, and elastic
— able to grow and shrink their capacity on the fly —
while providing useful query and update capabilities to applications.
- Database schema languages. The metadata for a database is expressed in a schema language;
examples include relational schemas, ER models, UML class and object diagrams, object-relational schemas,
and the first two levels of OWL.
We examine how the facilities of such languages can be modeled by description logic
together with identification, containment, and temporal constraints.
A goal is to develop practical and theoretically sound methods
to answer questions about query containment and to reason about temporal properties.
We explore the application of such reasoning to query rewriting and optimization, to keyword search,
and to query compilation into code that is executed over low-level physical layouts of data,
such as records and pointer structures.
- Distributed data management. Projects in this area investigate
data management problems in Internet-scale
(i.e., very large and widespread) distributed environments.
One project focuses on the
management of stream data, where we study query languages and query processing
over data streams. Other projects investigate data management
problems in large-scale peer-to-peer networks.
- Multiprocessor-hosted database systems. Today's microprocessor vendors offer systems with increased processing power by placing multiple processing units on each chip. Many emerging applications benefit from executing their sequential components on multi-core CPUs and their computationally intensive components on massively parallel, many-core Graphics Processing Units (GPUs). We study some fundamental problems that affect the design and implementation of scalable, effective, and massively parallel subsystems (such as those using GPUs or joint CPU/GPU execution) and the implications for database processing.
- Security in databases. Our goal is to develop a well-defined, efficiently
implementable, fine-grained (i.e., at the element level) access control model for
XML and object-oriented data against which a variety of operations or methods are
to be applied. We are also investigating how relational data can be managed
so as to meet compliance constraints that
are specified through record retention and destruction policies.
- Spatial, temporal, and multimedia databases. Our research addresses the
development of database technology to fulfill the requirements of large classes of
applications (e.g., geographic information systems, location-based services,
graphic and simulation systems, historical data warehouses, multimedia systems)
that deal with spatial objects that evolve over time, move, and may
consist of multiple media types (video, audio, text, images). Using weighted graphs
as a model for such applications, one project investigates route queries, which are
posed against the set of all paths. The goal is to develop a fast unified disk-based
algorithm and associated optimization techniques that can be used to answer a large
variety of route queries. Along the temporal dimension, we study the automatic
expiration of data that is no longer needed. We also investigate problems related
to the modeling, representation, and efficient retrieval of video objects.
- Storage management for database systems. Database management
systems rely on an underlying storage system to store the
database safely. Our goal is to develop mechanisms through which database management systems
and storage systems can cooperate to optimize
overall performance of the database/storage stack. For example,
database-provided hints can be used to guide data placement
and caching in the storage tier. Furthermore, database systems
might benefit from the dynamic provisioning of data replicas in
multi-tier systems and from maintaining materialized views to
answer database queries.
- Text data management. Much of business intelligence is embodied in the form of text,
and our goal is to provide access to this information as conveniently as if it were
stored in a conventional database system.
One project examines how to extract relational data from text, either at query time or in anticipation of potential queries.
Other projects examine how to manage structured text, and in particular XML,
either by mapping the data and queries to relational equivalents
or by developing native XML database systems. In the case of relational
mapping, we investigate powerful labeling schemes and XML query translation
algorithms. For native XML databases, we study storage systems, indexing, query
optimization, and distributed XML processing.
- Top-k querying. Our work focuses on enriching query
processors/optimizers to handle Information Retrieval (IR) style queries that are
common in some emerging applications such as multimedia systems and Web databases.
These queries do not have exact answers, requiring the ranking of the results
according to their suitability as an answer. The goal of this line of research is
to develop a formal and generalized ranked-retrieval framework that adapts to
non-traditional processing environments and is applicable to a variety of data
models.
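
As a rough illustration of the ranked retrieval described under top-k querying, the
sketch below returns the k best-scoring records instead of an exact match set. It is
not the group's framework: the records, the two score attributes, the weights, and the
names top_k and combined_score are all assumptions made up for this example.

```python
import heapq


def top_k(items, score, k):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=score)


# Hypothetical records: (title, relevance, recency), both scores in [0, 1].
docs = [
    ("a", 0.9, 0.2),
    ("b", 0.6, 0.8),
    ("c", 0.4, 0.9),
    ("d", 0.2, 0.2),
]


def combined_score(doc):
    # Assumed ranking function: a weighted sum of the two attributes.
    _, relevance, recency = doc
    return 0.7 * relevance + 0.3 * recency


best = top_k(docs, combined_score, 2)
print([title for title, _, _ in best])
```

This version scores every record before selecting the best k. Avoiding that full scan
is one of the things a ranking-aware query processor would aim for.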


Last modified: Monday, 20-Dec-2010 14:43:43 EST