
Welcome to OReview's new performance-tuning column. In this column I hope to provide regular and reliable performance-related strategies, news, and information that will supplement, although not replace, the more in-depth performance-related feature articles that appear periodically in OReview.
Most performance experts agree that tuning application SQL is usually the most effective way of improving performance, and that SQL tuning is an essential prerequisite to effective server or operating-system tuning. Unfortunately, SQL tuning often takes a back seat to server tuning because SQL tuning is less of an exact science and is often outside the DBA's control.
I have 10 hints that can help you tune your SQL and improve your application performance. In this inaugural column, I give you five hints; look for five more hints in my August column.
It's not uncommon for SQL that works well in a development environment to exhibit poor performance once it is in a production system. One of the primary causes of such unpleasant surprises is an inadequate development or volume-testing environment. In particular, environments without realistic or representative data volumes are bound to lead to unrealistic SQL performance.
The ideal tuning or development environment is one in which:
- Data volumes are realistic or at least representative of production volumes.
- Tuning and diagnostic tools are installed and available.
- Application design documentation is available.
Become familiar with Oracle's EXPLAIN PLAN, SQL_TRACE, and tkprof facilities, and ensure that the relevant database options are effectively set. (See my article "Getting the Most from the SQL_TRACE Facility," OTJ, Winter 1996.)
Oracle provides powerful facilities for measuring the execution of SQL
statements, such as the EXPLAIN PLAN command, which can generate
the execution plan for a SQL statement. To use EXPLAIN PLAN:
- Create the plan table PLAN_TABLE by running the script $ORACLE_HOME/rdbms/admin/utlxplan.sql (the location might be slightly different on non-Unix hosts).
- Issue an EXPLAIN PLAN FOR sql_statement statement.
- Retrieve the execution plan from PLAN_TABLE with a hierarchical query such as:

  SELECT rtrim(lpad(' ',2*level)||
         rtrim(operation)||' '||
         rtrim(options)||' '||
         object_name) query_plan
    FROM plan_table
  CONNECT BY prior id=parent_id
   START WITH id=0;
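Putting the steps together, a minimal session might look like the following sketch. The customers table and the query itself are hypothetical; any statement can be substituted.

```sql
-- Clear out rows left over from earlier EXPLAIN PLAN runs
-- (assumes PLAN_TABLE has been created with utlxplan.sql).
DELETE FROM plan_table;

-- Generate the execution plan for a hypothetical query.
EXPLAIN PLAN FOR
  SELECT * FROM customers WHERE surname = 'SMITH';

-- Display the plan, indented by level in the plan tree.
SELECT rtrim(lpad(' ',2*level)||
       rtrim(operation)||' '||
       rtrim(options)||' '||
       object_name) query_plan
  FROM plan_table
CONNECT BY prior id = parent_id
 START WITH id = 0;
```

If an index on SURNAME exists, you would expect to see an INDEX RANGE SCAN step beneath a TABLE ACCESS BY ROWID operation; a TABLE ACCESS FULL step indicates a full table scan.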
A more powerful relative of the EXPLAIN PLAN command is
the SQL_TRACE facility. To use SQL_TRACE, do the following:
- Issue ALTER SESSION SET SQL_TRACE=TRUE from within an Oracle session, then run the SQL you want to measure.
- Locate the trace file that Oracle generates in the directory specified by the server parameter USER_DUMP_DEST.
- Format the trace file with the tkprof utility; for example:

  tkprof input_trace_file output_report_file sort='(prsela,exeela,fchela)' explain=username/password
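A minimal tracing session might look like the following sketch (the query and table name are illustrative):

```sql
-- Turn tracing on for the current session.
ALTER SESSION SET SQL_TRACE = TRUE;

-- Run the statement(s) you want to measure.
SELECT COUNT(*) FROM customers WHERE surname = 'SMITH';

-- Turn tracing off; the trace file appears in the
-- directory named by the USER_DUMP_DEST server parameter.
ALTER SESSION SET SQL_TRACE = FALSE;
```

You would then format the resulting trace file with tkprof as described above.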
Interpreting tkprof output and execution plans is a complex topic that I can't cover adequately here. For more information, see my article "Getting the Most from the SQL_TRACE Facility," OTJ, Winter 1996.
Indexes exist to improve the performance of SQL statements. To establish the "best" indexes and achieve optimal performance, you should do the following:
- Use concatenated indexes. If you frequently query on two columns together, such as SURNAME
and FIRSTNAME, don't create separate indexes for each column.
Instead, create a concatenated index on both SURNAME and FIRSTNAME.
You can then use the "leading" portion of the concatenated index
on its own; thus, if you sometimes query on the SURNAME column
without supplying the FIRSTNAME, SURNAME should
come first in the index. Note that you can't use the trailing portions
of a concatenated index. For example, the concatenated index on SURNAME
and FIRSTNAME can't help you find all people with a first
name of "Fred."
- Improve SQL execution with "overindexing." Overindexing involves
adding to the index columns that appear in the SELECT clause
but not in the WHERE clause. Imagine that you are
searching on SURNAME and FIRSTNAME in order to
find EMPLOYEE_ID. A concatenated index on SURNAME
and FIRSTNAME will let you quickly locate the row containing
the appropriate EMPLOYEE_ID, but you will need to access both
the index and the table. If there is an index on SURNAME,
FIRSTNAME, and EMPLOYEE_ID, then the query can
be satisfied using the index alone. This technique is particularly useful
when optimizing joins, because intermediate tables in a join are sometimes
queried merely to obtain the join key for the next table.
- Index clusters let you locate rows from one or more tables in cluster key order. Clustering tables can result in a substantial improvement in join performance. However, the performance of table scans of individual tables in the cluster may be severely degraded. Index clusters are usually recommended only for tables that are always accessed together. Even then, you should consider alternatives such as denormalization.
- In hash clusters, the key values are translated mathematically to a hash value. Rows are stored in the hash cluster based on this hash value. Locating a row when the hash key is known may require only a single I/O rather than the two or three I/Os required by an index lookup. However, you cannot perform range scans of the hash key. Furthermore, if the cluster is poorly configured or the size of the cluster changes, overflows on the hash keys may occur or the cluster may become sparsely populated. In the first case, the performance of hash-key retrieval can degrade; in the second case, table scans will be less efficient.
- Bitmapped indexes were introduced in Oracle 7.3. These indexes suit queries on multiple columns that each have only a few distinct values. They are more compact than a corresponding concatenated index and, unlike the concatenated index, you may query columns in any combination. However, bitmapped indexes are not suitable for tables with high modification rates, because locking of bitmapped indexes occurs at the block level rather than the row level. Bitmapped indexes are also not suitable for columns with large numbers of distinct values.
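The concatenated-index and overindexing techniques described above can be sketched as follows. The employees table and the index names are hypothetical:

```sql
-- Concatenated index: supports searches on SURNAME alone, or on
-- SURNAME and FIRSTNAME together, but not on FIRSTNAME alone.
CREATE INDEX emp_name_idx
    ON employees (surname, firstname);

-- "Overindexed" version: appending EMPLOYEE_ID lets Oracle satisfy
-- the query below from the index alone, without visiting the table.
CREATE INDEX emp_name_id_idx
    ON employees (surname, firstname, employee_id);

SELECT employee_id
  FROM employees
 WHERE surname   = 'SMITH'
   AND firstname = 'FRED';
```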
Parsing a SQL statement includes the process of validating the SQL and determining the optimal execution plan. For SQL that has low I/O requirements but is frequently executed (for example, SQL generated by OLTP-type applications), reducing the overhead of SQL parsing is extremely important.
When an Oracle session needs to parse a SQL statement, it first looks for an identical shared SQL statement in the Oracle shared pool. If it cannot find a matching statement, Oracle will determine the optimal execution plan for the statement and store the parsed representation in the shared pool.
The process of parsing SQL is CPU-intensive. When I/O is well tuned, the overhead of parsing a SQL statement can be a significant portion of the total overhead of executing that statement. You can reduce the parsing overhead by:
- Using bind variables in place of literals, so that repeated executions of otherwise identical statements can share a single parsed representation in the shared pool.
- Retaining SQL cursors in your application so that frequently executed statements need not be reparsed. (The techniques for doing this vary by development environment; see the accompanying table.)
- Implementing an Oracle cursor cache with the SESSION_CACHED_CURSORS
server parameter. If SESSION_CACHED_CURSORS is greater than
zero, Oracle will store recently executed cursors in a cache and, if you
reexecute such a SQL statement, you can avoid reparsing the statement.
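For example, compare a literal-based statement with its bind-variable equivalent (the customers table and :cust_id variable are illustrative); only the second form lets executions with different values share one parsed statement:

```sql
-- Each distinct literal produces a distinct SQL text that
-- must be parsed separately:
SELECT * FROM customers WHERE customer_id = 1001;
SELECT * FROM customers WHERE customer_id = 1002;

-- With a bind variable, every execution matches the same
-- shared SQL text and can reuse the cached parse:
SELECT * FROM customers WHERE customer_id = :cust_id;

-- Enable the session cursor cache (the value 50 is illustrative):
ALTER SESSION SET SESSION_CACHED_CURSORS = 50;
```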
The Oracle software component that determines the execution plan for a SQL statement is called the optimizer. Oracle supports two approaches to query optimization: rule-based and cost-based. The rule-based optimizer determines the execution plan based on a set of rules that rank various access paths. For example, an index-based retrieval has a lower rank than a full table scan; therefore, the rule-based optimizer will use indexes wherever possible.
The cost-based optimizer determines the execution plan based on an estimate of the computer resources (the cost) required to satisfy various access methods. The cost-based optimizer uses statistics, including the number of rows in a table and the number of distinct values in indexes, to determine this ideal plan.
Many people's early experiences with the cost-based optimizer in Oracle 7.0 and 7.1 were disappointing, which gave the cost-based optimizer a bad reputation in some quarters. However, the cost-based optimizer has been improving with each release of Oracle7, and many advanced SQL access methods, such as star and hash joins, are only available when you use the cost-based optimizer.
The cost-based optimizer is your best choice for almost all new projects, and it may be cost-effective to convert your existing projects from rule-based to cost-based optimization. Consider the following guidelines for getting the most from the cost-based optimizer:
- Set the optimizer goal appropriately. The default goal (OPTIMIZER_MODE=CHOOSE)
attempts to optimize the throughput (the time taken to retrieve all rows)
of SQL statements and often favors full table scans over index lookups.
When converting to cost-based optimization, many users are disappointed
to find that previously well-tuned index lookups change to long-running
table scans. To avoid this situation, set OPTIMIZER_MODE=FIRST_ROWS
in the server's parameter file or issue ALTER SESSION SET OPTIMIZER_GOAL=FIRST_ROWS
in your code. These settings instruct the cost-based optimizer to minimize
the time taken to retrieve the first row in your result set and encourage
the use of indexes.
- Use hints where necessary. You can direct the optimizer's choice of execution plan by embedding hints in your SQL; the accompanying table lists some common hints. For example, the following statement instructs the optimizer to use the cust_i2 index:

  SELECT /*+ INDEX(customers cust_i2) */ * FROM customers WHERE name=:cust_name
- Collect optimizer statistics with the ANALYZE command. Make sure that
you analyze all of your tables regularly and analyze them at peak volumes
(for example, don't analyze a table just before it is about to be loaded
by a batch job). For small or medium-sized tables, use the ANALYZE
TABLE table_name COMPUTE STATISTICS statement. For larger tables,
analyzing every row of the table may be too time-consuming. In this case,
analyze a random sample of rows using a statement such as ANALYZE
TABLE table_name ESTIMATE STATISTICS SAMPLE 20 PERCENT.
- Create column histograms for columns with uneven data distributions by using the FOR
COLUMNS clause of the ANALYZE command (for example,
ANALYZE TABLE table_name COMPUTE STATISTICS FOR ALL INDEXED COLUMNS).
Note that you can't take advantage of histograms if you are using bind
variables (which I discussed earlier). (For a further discussion of histograms,
please see Steve Bobrowski's article "Tuning
Application Logic," OTJ, Summer 1996.)
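Putting these guidelines together, a typical statistics-gathering and tuning sequence might look like this sketch (the departments and sales tables are hypothetical):

```sql
-- Gather complete statistics on a small table:
ANALYZE TABLE departments COMPUTE STATISTICS;

-- Sample a large table rather than reading every row:
ANALYZE TABLE sales ESTIMATE STATISTICS SAMPLE 20 PERCENT;

-- Build histograms on indexed columns with skewed distributions:
ANALYZE TABLE sales COMPUTE STATISTICS FOR ALL INDEXED COLUMNS;

-- Favor index-driven, first-row-fast plans for an interactive session:
ALTER SESSION SET OPTIMIZER_GOAL = FIRST_ROWS;
```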
To briefly review this month's SQL tuning hints:
Hint #1. Establish a tuning environment that will enable you to effectively write and tune efficient SQL. This environment should contain realistic or representative data volumes, include tuning and diagnostic tools, and have application design documentation available.
Hint #2. Use the tuning tools that are included in Oracle. In particular, learn to use EXPLAIN PLAN, SQL_TRACE, and tkprof.
Hint #3. Make effective use of Oracle indexes. Take advantage of concatenated indexes, overindexing, and "advanced" indexing options, such as bitmapped indexes and hash and index clusters.
Hint #4. Reduce the overhead of SQL statement parsing by using bind variables, retaining SQL "cursors," and implementing an Oracle cursor cache.
Hint #5. Take advantage of the Oracle cost-based optimizer. If
you're still using the older rule-based optimizer, consider switching to
the cost-based optimizer. When using the cost-based optimizer, ensure that
you analyze all of your tables regularly. Also, make sure that OPTIMIZER_MODE
is set appropriately. Using hints and column histograms can improve the
performance of SQL statements that are particularly difficult to optimize.
Next month, I'll look at some of the performance features of Oracle version 8. In August, I'll present a final list of five hints for optimizing SQL, which will include tips on avoiding accidental table scans, optimizing unavoidable scans, using array processing, effective locking strategies, and leveraging features introduced in recent releases of Oracle.
Guy Harrison is an independent Oracle consultant specializing
in Oracle development and performance issues. He is the author of Oracle
SQL High-Performance Tuning (Prentice Hall, 1997). You can contact Guy via
email at [email protected] or at his home page, werple.net.au/~gharriso.
| Development Environment | How to Reduce Parsing by Retaining Cursors |
|---|---|
| PowerBuilder | The SQLCache setting in the DBParm determines the number of cursors to keep open after execution. The default value is zero, which means that no cursors are retained. Setting this parameter to 20 or 30 usually reduces parse overhead significantly without using excessive memory. |
| SQL*Windows | It's hard to avoid destroying cursors in SQL*Windows, but it is possible. |
| Precompilers (Pro*C, Pro*COBOL, and so on) | The precompilers will cache SQL cursors. The number of SQL statements in the cache is determined by the MAXOPENCURSORS compiler option. You can use the HOLD_CURSOR and RELEASE_CURSOR options to determine which statements get preference in the cache. |
| Oracle Forms (Developer/2000) | Oracle Forms automatically retains and reuses cursors effectively. If you specify the OptimizeTP=NO option on the command line, only SELECT statements are allocated separate cursors. |
| Oracle Objects for OLE | Cursors are represented as dynasets in Oracle Objects for OLE. As long as your dynasets are not deallocated, your cursors will be reusable. |
| Oracle Call Interface (OCI) | In OCI, you have complete control over cursor management. To keep a cursor in memory, do not close the cursor with the oclose call after statement execution, which completely deallocates the memory and the SQL statement. You can reuse the cursor simply by rebinding and reexecuting it. |
| Oracle Power Objects | Cursor reuse is transparent and automatic. |
| PL/SQL | Cursors are kept in memory as required. If you specify CLOSE_CACHED_OPEN_CURSORS=TRUE in an ALTER SESSION statement or in the server's parameter file, cursors are discarded on COMMIT or ROLLBACK. |
| Hint | Usage |
|---|---|
| ALL_ROWS | Uses the cost-based optimizer to optimize for the retrieval of all rows. |
| CACHE | Encourages rows retrieved by a full table scan to remain in the buffer cache of the SGA. |
| FIRST_ROWS | Specifies that the cost-based optimizer should optimize the statement to reduce the cost of retrieving the first row only. |
| FULL(table_name) | Uses a full table scan to retrieve rows from the specified table. |
| INDEX(table_name [index_name]) | Uses the specified index to retrieve rows from the table or, if no index is specified, any suitable index. |
| ORDERED | Instructs the optimizer to join the tables in exactly the left-to-right order specified in the FROM clause. |
| PARALLEL(table_name, degree_of_parallelism) | Instructs the optimizer to perform parallel scans on the nominated table. If no degree of parallelism is specified, the default will be used. |
| RULE | Uses rule-based optimization. |