By Guy Harrison
This article explores Oracle8's "big set" optimizations, including table partitioning together with improved parallel and DML capabilities, which support the creation and maintenance of monster-sized databases.
Although the press has almost exclusively concentrated on the object-relational features of Oracle8, such as as nested tables, array data types, large objects, and user-defined objects, there is another side to this new release. Oracle8 also has some significant "big set" optimizations that support the creation and maintenance of much larger databases than are practical with Oracle7. These features include table partitioning together with improved parallel and DML capabilities. This month, I look at some of the "big set" features, particularly those that impact performance.
Partitioned Tables and Indexes
As databases increase in size, maintaining and manipulating the data
in very large tables can become problematic. For instance, it may be impractical
to rebuild or export a very large table. Bulk updates and purges may consume
excessive resources or over-run maintenance windows. Partitioning provides
a "divide-and-conquer" solution to some of these problems. Oracle8
provides a partitioning scheme based on column ranges. For example, you
might partition a SALES table by year, so that the table stores sales for
each year in distinct partitions. The following SQL shows how to create
such a partitioned table:
CREATE TABLE sales_partition
(column definitions for SALES)
PARTITION BY RANGE (sale_date)
(PARTITION s1 VALUES LESS THAN
(`01-JAN-93'),
PARTITION s2 VALUES LESS THAN
(`01-JAN-94'),
- Other years
PARTITION s5 VALUES GREATER THAN
(`01-JAN-98'));
Each partition is a separate segment, and you can store partitions in
separate tablespaces and assign the partitions distinct storage definitions.
Partitions are implemented transparently and an application doesn't need
to know the names of inividual paritions. However, your SQL statements can
specify a partition name. For example, the following statement updates an
individual partition (S1) only:
UPDATE sales_partition PARTITION(s1) SET purge_ind='Y' WHERE sale_value < 10000
Partition Indexes
You can also partition indexes. You will typically create a partitioned index on a partitioned table, although it is possible to create a partitioned index on an unpartitioned table. If an index on a partitioned table is itself unpartitioned or is partitioned on different column range conditions than those of the source table, then the index is known as a global index. If an index is paritioned in exactly the same fashion as the parent table, it is known as a local index. If the leading columns of the index are also the columns upon which the index is partitioned, then the index is known as a local prefixed index. Local indexes have some significant advantages over global indexes. If a partitioned table with a global index has a partition split, merge, or move, then the corresponding index partitions will automatically be subjected to the same maintenance operations. Global indexes also preclude some new Oracle8 operations - such as direct load inserts (which I discuss shortly).
Partition Maintenance
Oracle8 allows you to manage partitions in a number of ways:
ALTER TABLE sales_partition
SPLIT PARTITION s5
AT (`30-JUN-97')
INTO ( partition s5a,
partition s5b);
The Advantages of Partitions
Oracle8's partitions enable you to perform administrative work on subsets
of very large tables. In Oracle7, these very large tables pose special challenges.
For example, in Oracle7 you may not be able export very large tables within
an acceptable time frame. With Oracle8's partitioned tables, you can reduce
maintenance time by restricting maintenance to an individual partition or
by performing operations on multiple partitions in parallel. Partitioned
tables can also improve the performance of table scans. If you issue a query
based on the same columns by which a table is partitioned, the Oracle8 optimizer
can restrict the scans to the relevant partitions. You can also manually
restrict operations to a particular partition by specifying the partition
name after the table name. You can subject partitioned tables to parallel
update and delete operations and to more efficient parallel inserts. (I
discuss these operations later in this article.) Partitioned tables also
offer an excellent means of purging historical data. For example, in the
SALES partition, you could quickly purge all data for a particular financial
year simply by dropping the relevant partition.
Restrictions on Partitioning
Although Oracle8 partitioning is powerful and effective, there are some limitations:
Direct Load and Unlogged Inserts
Oracle8 has two enhancements that improve the performance of inserts based on queries (for example, INSERT INTO table1 SELECT columns FROM table2). These are:
NOLOGGING Mode.
In Oracle7 you can specify that a CREATE TABLE … AS SELECT statement or a CREATE INDEX statement is UNRECOVERABLE. This instructs Oracle not to generate redo log information for the operation and can result in very significant reductions in build time - especially when you use it in conjunction with the PARALLEL clause. However, tables and indexes created this way cannot be recovered from information contained in redo logs. The NOLOGGING keyword in Oracle8 replaces the UNRECOVERABLE keyword in Oracle7, but UNRECOVERABLE is retained for compatibility reasons. You can use the NOLOGGING mode for direct load inserts, as well as in CREATE TABLE and CREATE INDEX statements. Because redo log information is not generated, the insert will be faster. But if there should be a disk failure before the next tablespace backup, you will lose all the information inserted. You can specify the NOLOGGING keyword in CREATE TABLE, CREATE INDEX, ALTER TABLE, or ALTER INDEX statements, or you can specify it as the default for a tablespace. Once you associate NOLOGGING with a table, any direct mode inserts will automatically be performed without logging. Watch out for this if you currently use the Oracle7 UNRECOVERABLE option to build tables: Under Oracle8 you'll need to issue an ALTER TABLE statement to turn logging mode back on if you need to log subsequent inserts. Because the data inserted with the NOLOGGING mode can't be restored from redo logs, you're not likely to use this facility for critical application data. However, you might use the NOLOGGING feature in the following circumstances:
Oracle 8 offers the following additional parallel operations:
Parallel Insert.
Unlike other forms of parallel DML, it's possible to perform an INSERT that uses a subquery in parallel, even if the table is not partitioned. A parallel insert into nonpartitioned tables works in APPEND mode by default. That is, blocks are created and formatted in session memory and inserted directly into the datafiles, bypassing the buffer cache. Even when you tell Oracle not to use the APPEND mode (by using the NOAPPEND hint), each parallel query server process must insert into a separate extent or partition. For parallel insert into nonpartitioned tablesthis means that for each degree of parallelism at least one new extent must be created. As a result, a parallel insert might result in a greater increase in storage allocation than a serial insert.
For a parallel insert into a partitioned table, each parallel query server
process is assigned a separate partition. They may insert into an existing
extent if blocks in the extent are on the free list. If there is no space
available in existing extents, then new extents must be allocated. Parallel
Update and Delete. Updates and deletes on partitioned tables may be processed
in parallel. Multiple parallel query processes are allocated to the operation,
although each partition is allocated no more than a single process. For
this reason, the degree of parallelism (number of parallel threads of execution)
may be no more than the number of partitions in the table. Using Parallel
DML. To activate parallel DML, you should first issue the ALTER SESSION
ENABLE PARALLEL DML command. As with parallel queries, parallel DML is initiated
if there is a PARALLEL clause on the table or a PARALLEL hint within the
SQL statement. For example, the following statement inserts all transactions
older than one week into a TRANSACTION_ARCHIVE table. Note that the PARALLEL
hint is used both in the INSERT statement and in the underlying SELECT.
I also used the APPEND hint to invoke the direct insert mode.
insert /*+PARALLEL (t,5) APPEND */ INTO transaction_archive ta SELECT /*+PARALLEL (t,5) */ * FROM transactions t WHERE transaction_date < SYSTDATE -7;
When using parallel DML, you should try to avoid the following pitfalls:
Any Oracle user who has to deal with large tables can probably imagine a number of effective uses for parallel DML. Following are some of the more obvious:
Restrictions on Parallel DML
There are a number of restrictions on parallel DML:
Parallel Queries Using Partitioned Indexes
Prior to Oracle8, only queries that were based on a full table scan could be processed in parallel. In Oracle8, queries based on index range scans can be parallelized if the index is partitioned. The algorithm for this is similar to that used for parallel DML for a partitioned table: Each slave process is allocated one of the index partitions and performs a scan on that partition only. Therefore, the degree of parallelism can be no greater than the number of partitions in the index. A new hint, PARALLEL_INDEX, lets you request a parallel partitioned index scan.
Index-Organized Tables
In Oracle7 you may occasionally create an index that includes all or most of the columns in a table. The table itself may then become superfluous because all queries are satisfied from the index. However, the table still consumes storage and incurs insert/update/delete overhead. To address this situation, Oracle8 enables you to create an index-organized table in which the table itself is stored in an index format.
You can create an index-organized table using the ORGANIZATION keyword
of the CREATE TABLE statement. For example:
CREATE TABLE iot_table (pk_col1 number NOT NULL, pk_col2 number NOT NULL, data_col1 VARCHAR2(30) NOT NULL, data_col2 LONG , constraint iot_table_pk PRIMARY KEY (pk_col1,pk_col2)) ORGANIZATION index PCTTHRESHOLD 50 OVERFLOW TABLESPACE long_tbs;
Index-organized tables are organized as a B-tree index constructed against their primary key. The primary key and additional columns are stored in the leaf blocks of the B-tree. However, only the columns that contribute to less than PCTTHRESHOLD of the total row length are stored in the B-tree leaf blocks. The remainder of the row is stored within chained blocks in the OVERFLOW TABLESPACE.
The ability to relocate longer columns away from the B-tree structure lets Oracle keep the B-tree relatively small and efficient, while keeping small, frequently accessed columns in the B-tree. When creating an index-organized table, define frequently accessed, small columns early in the column list and large, infrequently accessed columns later in the list. Define PCTTHRESHOLD so that only the frequently accessed columns are retained in the B-tree. The primary key should include all columns that might appear in the WHERE clause. It is not possible to create a secondary (that is, nonprimary key) index on an index-organized table, because the index-organized table rows are not associated with a ROWID and hence there is nothing to which the secondary index can point. Index-organized tables are suitable for the following circumstances:
Other Enhancements
I've looked at just some of the new performance-enhancing features in Oracle8. I've detailed table and index partitioning, direct load and unlogged insert, parallel DML, and partitioned index scans. These facilities work very well together: Many of the new parallel operations work only on partitioned tables and the direct mode and nologging options are often combined with parallel insert to rapidly populate temporary or derived tables.
In subsequent months I'll continue to examine Oracle8 features and facilities.
In particular, look forward to an article on Oracle8 object-oriented facilities
and how they can improve - or degrade - the performance of your Oracle system.
Guy Harrison is an independent Oracle consultant specializing in Oracle development and performance issues. He is the author of Oracle SQL High-Performance Tuning (Prentice Hall, 1997). You can contact Guy through the Internet at [email protected] or at his home page http://werple.net.au/~gharriso.