Undisplayed Graphic

Oracle8 for Data Warehousing

Features Overview

June 1997

INTRODUCTION

Database management systems are at the core of every application that supports business decisions¸ from enterprise data warehouses to subject-specific data marts¸ providing robust data management and scalable, high performance query processing. Oracle has been one of the earliest in the industry to recognize the significance of data warehousing and has, over the last several years, systematically evolved the Oracle server to address the specific needs of decision support applications. Key technologies such as parallel query execution, parallel data management, sophisticated cost-based query optimization, efficient bitmap indexing, and specialized query execution algorithms like hash joins, just to mention a few examples, have been incorporated in the Oracle7 server to deliver industry-leading performance and scalability for data warehousing. Oracle8, the next generation Oracle server product, continues along the same direction and advances the state-of-the-art in several key areas. In this paper we provide an overview of the significant new capabilities in Oracle8 that specifically address the needs of data warehousing.

ORACLE8 FOR DATA WAREHOUSING

Oracle8 delivers new functionality in the following major areas related to data warehousing:

  • Data Management
  • Scalable Query Processing
  • Business-Critical Availability

In the rest of this document, we provide an outline of the new capabilities.

DATA MANAGEMENT

Data warehouses and data marts seem to be constantly growing in size. ItÝs not uncommon to find a single table in a data warehouse that is 100 GB or larger in size. When tables and the associated indexes get to be this large, the database server has to offer a robust, scalable data management infrastructure that can meet the availability, manageability and performance requirements at this large scale. Oracle8 incorporates Data Partitioning as the key enabling technology to meet the data management needs of data warehouses.

Data Partitioning

Data Partitioning is a åDivide and ConquerÝ technology that enables better management of very large tables and indexes by dividing them up into smaller pieces called partitions. This seemingly simple idea has the power to deliver very significant improvements in data management:

  • Each partition can be stored on a separate set of storage devices. This physical separation enables continued data availability in the face of partial failures, since the loss of a disk drive will only affect the availability of a single partition or a subset of partitions, and not the entire table.
  • With partitioning, data management operations such as loading data and building indexes can be performed at the granularity of a partition as opposed to at the table level. As a result, these tasks can now be efficiently performed within finite operational maintenance windows, independent of the actual size of the underlying table. Further, availability of the rest of the table is maintained while these operations are being performed on a subset of partitions.
  • Further, since the database server is aware of the contents of each partition, this knowledge can be effectively applied in query processing to provide scalable query execution performance, even with substantial growth in data volumes.

Scalable Operations

Data Partitioning enables data management operations to be performed in parallel at the partition level and delivers the full benefits of parallel processing to data warehouses and data marts. Virtually every operation¸data loading, building indexes, enforcing constraints, gathering optimizer statistics, and backup and recovery¸can be done at the granularity of a partition. Further, data reorganization tasks such as moving partitions and splitting partitions can be performed using scalable parallel data flow operations.

Oracle8 introduces support for the execution of bulk insert, update and delete operations in parallel. These parallel data manipulation operators deliver high performance, scalability and efficient utilization of all hardware resources and enable the completion of data management tasks within ever-shrinking batch windows. Parallel insert, update and delete capabilities are very useful for bulk operations such as the creation of summary tables, purging historical data, and batch updates in support of re-statement operations.

Breaking Size Barriers

Oracle8 provides an infrastructure that will scale with your needs, into the terabytes and beyond. A number of internal system limits have been substantially raised, to enable the creation and management of larger tables, tables with more columns, more tablespaces and datafiles, and much larger databases. The table below summarizes some of the revised limits:

Internal Limit Oracle8
Database Size

Theoretical

Practical

512 PB

Tens of TB

Tablespaces per database 64 K
Files per database 64 - 256 M
Columns per table 1000

Revised Size Limit in Oracle8

SCALABLE QUERY PROCESSING

Oracle8 includes several innovations in query processing and delivers significant improvements in performance and scalability to meet the demanding ad-hoc query needs of data warehousing.

Next Generation Star Joins

Oracle8 introduces ground-breaking new technology for star query processing to deliver dramatic performance gains and superior flexibility. This patent-pending new approach to star joins incorporates intelligent query transformations and an efficient algorithm for joining multiple tables in a single step and eliminates the need to generate a cartesian product of the dimension tables. Further, the new method makes innovative use of bitmap indexes and avoids the need for one or more multi-column b-tree indexes on the fact table. Here are the major benefits of this new technology:

  • Performance Gains: The elimination of cartesian products, innovative use of bitmap indexes and parallel execution combine to deliver dramatic gains in query execution performance.
  • Improved scalability: The new method provides superior scalability to handle large or unconstrained dimensions and large number of dimension tables, and efficiently deals with sparse fact tables.
  • Flexibility: The Oracle8 approach dynamically combines single-column bitmap indexes based on query criteria and eliminates the need to create and maintain multiple concatenated indexes, resulting in improved flexibility and reduced administrative overheads. Further, the method is flexible enough to deal with complex schemas involving multiple fact tables and advanced models such as snowflake schemas, providing additional choices in modeling without any trade-off in access performance.
  • Efficient Storage, Easy Maintenance: Oracle8 enables efficient storage of bitmap indexes through the use of advanced data compression technology. Parallel index creation and automatic index maintenance with update operations facilitate easy maintenance.

Innovative Parallel Query

Oracle8 advances OracleÝs innovative åbest-of-both-worldsÝ parallel query architecture. Unlike competing implementations that base parallel execution completely on data partitioning, OracleÝs industry-leading technology combines intelligent use of data partitioning in the form of partition elimination and parallelism across partitions, with OracleÝs unique dynamic, intra-partition parallel query execution to deliver superior scalability and uncompromised manageability.

Query Optimization Enhancements

The Oracle8 cost-based query optimizer incorporates powerful query transformation technology that automatically årewritesÝ queries generated by end-user tools for efficient query execution. New query transforms include the star query transformation and numerous other syntactic modifications. Further, the Oracle8 optimizer is fully parallel-aware and partition-aware, supporting efficient, scalable parallel query execution.

Powerful New Indexing

Oracle8 offers a rich array of indexing options to suit every access need. Both b-tree and bitmap indexes can be partitioned, to achieve improved manageability and access performance. A number of choices are available in the configuration of indexes on partitioned tables¸partitioned indexes: local prefixed or non-prefixed, and global, as well as non-partitioned indexes¸providing maximum flexibility to achieve the right levels of availability and parallel access performance. Oracle8 index-organized tables enable fast primary key-based access and efficient storage.

BUSINESS-CRITICAL AVAILABILITY

As data warehousing moves mainstream, from being just useful to truly business-critical, highly available access to decision support data has become a key requirement. Oracle8 data partitioning provides a robust infrastructure that enables continued data availability in the face of partial failures and during maintenance operations. In addition, Oracle8 incorporates several other enhancements that significantly improve the availability of data warehouses and data marts.

Server-Managed Backup and Recovery

Oracle8 introduces an intelligent, server-managed infrastructure for backup, restore and recovery tasks that enables simpler, safer operations. Here are the highlights of this technology:

  • With Oracle8, details related to backup, restore and recovery operations are maintained by the server and automatically utilized as part of these operations, reducing administrative burden and minimizing the possibility of human errors.
  • Oracle8 includes support for incremental backup and recovery, enabling operations to be completed efficiently within times proportional to the amount of changes, rather than the overall size of the database.
  • The backup/recovery technology is highly scalable and provides tight interfaces to industry-leading storage subsystems for efficient operations that can scale up to handle terabytes of data.

Highly Available Parallel Server

Oracle's parallel server technology is the key architectural foundation that delivers high availability and proven scalability on clustered and massively parallel hardware. Oracle8 incorporates advanced fail-over features that further improve availability and simplify manageability. Oracle8 provides transparent and automatic reconnection of users to a designated backup node in the event of node failure, with little or no downtime. Advanced fail-over options further improve availability by enabling users to pre-establish backup connections and even continue queries from the point of failure without having to re-start them.

Undisplayed Graphic

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
+1.415.506.7000
Fax +1.415.506.7200
http://www.oracle.com/

Copyright © Oracle Corporation 1997
All Rights Reserved

This document is provided for informational purposes only, and the information herein is subject to change without notice. Please report any errors herein to Oracle Corporation. Oracle Corporation does not provide any warranties covering and specifically disclaims any liability in connection with this document.

Oracle is a registered trademark, and Enabling the Information Age, Oracle8, and Oracle7 are trademarks of Oracle Corporation.

All other company and product names mentioned are used for identification purposes only and may be trademarks of their
respective owners.



This is a copy of an article published @ http://www.oracle.com/