Cookbook for Sizing Objects

By James F. Koopmann

Had your fill of reading manuals and following cumbersome methods for sizing tables and indexes? Here's a recipe you'll find to your taste

I usually dread having to pull out pencil and paper and drill through software manuals to revisit the proper methods for sizing tables and indexes. Friends and colleagues have offered help in the way of spreadsheets and other methods, but I've always found these solutions a bit cumbersome and hard to maintain when trying to update new information from analysts, developers, and users. And it was hard to change the data to reflect current trends for existing objects in the database.

So I decided to automate this process to my own satisfaction. I created a couple of tables that I could maintain in an Oracle database that would give me the power to source in the volumetrics on the objects in question. Essentially, it would give me information about the number of rows inserted into an object, plus how the data in the object will grow or mutate over time. Keeping the information in tables allows me to fully plan for current and future needs. Not only do I have at my fingertips the current information about num_rows, pct_free, pct_used, and the like through the DBA tables but I can also link in future growth patterns on the database. The ability to tie into the database where existing objects reside is critical to fully understanding the current space needs of your database and predicting the future ones.

I have developed a recipe for setting up the tables in question, adding rows to those tables, and running a couple of scripts that will get you started on reporting space requirements for your database.

Select the Main course

First of all, you need to capture all those little variables that Oracle maintains for its inner workings. The variables come from v$parameter and v$type_size tables and are all important in determining the overhead for each data block. The query in Listing 1 assumes a default of 1 for INITRANS and 20 for PCTFREE to show the working of the script. (I will use real values later in the article.) The query also makes extensive use of Oracle's ability to create dynamic views in the FROM clause of a SQL statement. If you run the script in Listing 1, you should get output like that shown in Table 1.

TABLE 1: Running the script in Listing 1 should produce output similar to that shown here.

HSIZE AVAIL
SPACE DR_BLOCK_
SIZE KCBH UB4 KTBBH KTBIT KDBH KDBT UB1 SB21 N1_TRANS PCT_
FREE

8106 6481 8192 20 4 48 24 14 4 1 2 1 20

The variables generated by the script are important, because they are used in virtually any script that looks at Oracle data blocks and measures how well they are being utilized. To check the utilization, you must know the current database block size (db_block_size); how much space is left after Oracle has allocated header space (HSIZE); and, most important, how much of the block you can use to store data (AVAILSPACE). The best part of this script is that it is independent of any machine and any db_block_size. In past scripts, I hard-coded block_size. I have since substituted the hard-coded values with this script. One less thing to worry about!

Prepare the Potatoes

Use the statements in Listing 2 to create the sz_tables and sz_indexes tables to store sizing information about current or future tables and indexes. I have created the sz_tables and sz_indexes tables as a subset of the dba_tables and dba_indexes tables for two reasons: First, it allows me to run a set of space-estimation scripts against both the current statistics and the data in the DBA tables, as well as against future space requirements. I can then load that data into the sz_tables. Second, I wanted a way to store information about objects that hadn't been created within the database and then still run the same space-estimation scripts. To accomplish this, I just needed to load the sz_tables with data representative of these future tables and indexes.

To accomplish both of these things, the SQL makes extensive use of decodes, outer joins, and view materialization to impose a form of logic for determining whether it should pull statistics from sz_tables, dba_tables, or a combination of the two to calculate sizing requirements for the objects.

Add Some Salt, Maybe Some Pepper

Getting data into the tables can be the most cumbersome task in this process. Typically, the tables and indexes are defined somewhere in a database or spreadsheet, or on paper. If the tables and indexes are already in a database and the objects have been analyzed, you only have to do a couple of insert statements from the catalog and then modify the sizing tables for the future expectations. Inserting index information can be tricky, because you have to sum the column lengths while accounting for the lengths of individual columns. Listing 3 does it for you. Inserting the table data in sz_tables is straightforward:

insert into sz_tables (select owner,table_name,",avg_row_len, 
      ini_trans,pct_free,pct_used,num_rows
	from dba_tables where owner = '<some_owner>')

If you are using information from users, data analysts, or your own research--and the objects are not defined in a database--loading the data can be more difficult. I suggest that you create a flat file with all the basic information, such as owner, table_name, and num_rows. Then create a small script to generate SQL statements from the flat file. The flat file might look something like this:

Example : file.lst
	<some_owner>	Table01	50
	<some_owner>	Table02	100000
	.	.	.
	.	.	.
	<some_owner>	Table0n	13000

A sample script for generating insert or update statements:

# crszsql.sh more $1 | while read LINE do OWNER='echo $LINE | awk '{print $1}' - | tr a-z A-Z' TABLE_NAME='echo $LINE | awk '{print $2}' - | tr a-z A-Z' NUM_ROWS='echo $LINE | awk "{print $3}" NEW_LINE="insert into sz_tables(owner,table_name,num_rows) \ values ("$OWNER","$TABLE_NAME","$NUM_ROWS");'" echo $NEW_LINE done

You can pipe the output to another flat file, using:

<unix prompt> crszsql.sh  file.lst > crszsql.sql

Run this flat file in SQL*Plus to populate the sz_tables table. (This example is applicable only to UNIX installations. All other scripts, when in SQL*Plus, will work on any platform.) There are other, similar, ways to populate the sz_indexes table. It may seem cumbersome to create all the SQL statements through a script, but if you manage more than one database or a couple of clients, you can lose sight of how or where you generated statements. This single point of creation from one flat file per database or client helps eliminate this problem. You can populate the tables any way you want: The important thing is to get the initial object information into the tables.

Bring to Boil

I've also written a tblrowsz.sql script, which reports on estimated table sizes and generates two basic types of numbers--Catalog numbers and Analyzed numbers--in the report (see Listing 4, in the online version of this article, at www.oramag.com). The Catalog numbers are generated from the dba_tables, whereas the Analyzed numbers are generated from a combination of dba_tables and the sz_tables table.

This script prompts you to "Enter value for dba_tables:" which is where you specify dba_tables or sz_tables. If you enter dba_tables, the report will compare sizing estimates based entirely on data kept in the current catalog. If you enter sz_tables, the report will calculate sizing estimates based on both the dba_tables and sz_tables. You will also be prompted to enter a value for table_name that specifies a table name or pattern on which to report. Press Enter for the default (which shows all tables), or type in a pattern on which to search.

Use Table 2 to help determine the output of the report and from where the data is derived. Data must exist in the sz_tables table; otherwise, it won't generate a report. The output from tblrowsz.sql looks like that shown in Table 3.

TABLE 3: Sample output from the tblrowsz.sql script.

Owner Table
Name Catalog
Rowsize Catalog
Rows
per
Block Analyzed
Rowsize Analyzed
Rows
per
Block Analyzed
Number
of Rows Analyzed
Blocks
Needed Analyzed
Space
Needed
(Meg) Catalog
Space
Needed
(Meg) Pct
Free Pct_
Used

<OWNER> <TABLE_01> 87 83 61 119 15963 134 1 1 10 60
<OWNER> <TABLE_02> 841 7 267 24 1958100 81587 637 2185 20 60

<OWNER> <TABLE_03> 1060 6 425 15 4220403 281360 2198 5495 20 40

<OWNER> <TABLE_04> 264 27 139 52 46322 890 6 13 10 60

<OWNER> <TABLE_05> 76 95 71 102 44200 433 3 3 10 60

There you have it: a script that can do sizing for tables with data from the analyzed data in dba_tables or your own sizing table in sz_tables. The Catalog Space Needed amount typically exceeds that for Analyzed Space Needed, because full column sizes are used from the dba_tab_columns in the calculation of the catalog space. This calculation is useful because if all else stays the same, this is the largest amount of space you will need. You can run this script from dba_tables to report current space requirements and then run it from sz_tables to see future requirements, take the difference, and determine whether you have enough space to extend the tablespace.

Let Simmer

The next script, ixrowsz.sql, calculates index sizes. Included as Listing 5, in the online version of this article, this script also generates two types of numbers in its report: Catalog numbers and Sizing numbers. The Catalog numbers are generated from dba_tables, and the Sizing numbers are generated from a combination of sz_tables and dba_tables. The script will prompt you to "Enter value for dba_indexes:" which is where you enter dba_indexes or sz_indexes. If you specify dba_indexes, the report will be driven from the dba_indexes table. Any tables that also exist on the sz_indexes table will also be reported. If you specify sz_indexes, the report will be driven off the sz_indexes table. If any of the same indexes are defined in the catalog, they will also be reported.

Be aware, if you want to use the sz_indexes table, that the script uses the sz_tables table to determine the number of rows for the table on which the index resides. To obtain sizing estimates on indexes, you must have an entry in both the sz_tables table and the sz_indexes table.

Use the information in Table 4 to view your calculated index sizes and to see the source from which the data is collected (assuming you've entered data in sz_tables). You will also be prompted to enter a value for index_name, for which you can press the Enter key (to get the default of all tables) or enter a index name or pattern on which to search.

The output from ixrowsz.sql will be similar to that shown in Table 5. There's nothing fancy about the report: It shows current and future size estimates on the same line of a report so you can plan for growth.

TABLE 5: Sample output from ixrowsz.sql.

Owner Index
Name Catalog
Index
Entry
Size Catalog
Blocks
for
Index Catalog
Number
of Rows Catalog
Space
Needed
(Meg) Sizing
Index
Entry
Size Sizing
Blocks
for
Index Sizing
Number
of Rows Sizing
Space
Needed
(Meg)

<OWNER> <INDEX_01> 19 11649 4220403 91 19 17473 6330605 137

<OWNER> <INDEX_02> 37 10529 1958850 82 37 15787 2937150 123

<OWNER> <INDEX_03> 28 197 48442 2 28 283 69483 2

<OWNER> <INDEX_04> 19 11649 4220403 91 29 26670 6330605 208

<OWNER> <INDEX_05> 37 10529 1958850 82 66 28161 2937150 220
<OWNER> <INDEX_06> 28 197 48442 2 48 485 69483 4

Serve and Enjoy

Oracle has removed many of the barriers and headaches associated with sizing and managing objects in a database, but you still must consider size when creating new objects, in order to avoid performance and maintenance problems.

This recipe covers just the basics, but it should be enough to get you started in automating the process of sizing database objects. Remember, there is much more to it than just selecting an initial_extent and a next_extent: You truly have to know your data and watch it grow over a given period of time to determine if the pctfree, pctused, ini_trans, and next_extent are set properly to facilitate growth without having a negative impact on performance.

Start capturing growth patterns in another set of tables. Key tables to mirror and monitor are dba_tables, dba_tablespaces, and dba_segments. If you do this, you will gain a wealth of information on growth patterns--or lack of growth--within your database within two or three weeks.

You might want to start running a couple of scripts, such as blkixchk.sql, that I introduced in the March/April 1997 issue of Oracle Magazine to monitor block usage for indexes and tables. (See www.oramag.com for an archive of articles from past issues.)

James F. Koopmann ([email protected]) is a database consultant for Integrated Server Solutions, Inc., in Denver, Colorado. His Web address is www.pcisys.net/~jkoopmann.

If you're interested in further reading on the performance aspects of sizing objects, check out the following books and papers:

Space Estimations for Schema Objects, Appendix A,
Oracle Server Administrator's Guide Release 8.0; Oracle Corporation, 1997.

The following are available through OraPub of Earth at www.europa.com/~orapub:

Millsap, Cary V.--Oracle7 Server Space Management, Revision 1.4b (10/31/95)
Shallahamer, Craig A.--Avoiding a Database Reorganization (11/2/94) V2.2

TABLE 2: Use the information in this table to select a response for the "Enter value for dba_tables:" prompt.

Column dba_indexes sz_indexes

Catalog
Rowsize The sum of column lengths from dba_tab_columns Same as for dba_tables
if table exists in catalog

Catalog Rows
per Block The number of rows, given
the catalog rowsize, that
will fit into a block Same as for dba_tables
if table exists in catalog

Analyzed
Rowsize The avg_row_len from
dba_tables after an analyze
has been done on the table The avg_row_len from
sz_tables

Analyzed Rows
per block The number of rows, given
the analyzed avg_row_len,
that will fit into a block The number of rows, given
the avg_row_len from
sz_tables, that will fit
into a block

Analyzed
Number of Rows The value of num_rows from
dba_tables after an analyze
has been done on the table The num_rows from sz_tables

Analyzed
Blocks
Needed The number of data blocks
needed, given the analyzed
rows per block and
analyzed number of rows The number of data
blocks needed, given the
rows per block and
the number of rows
from the sz_tables data

Analyzed
Space Needed
(megabytes) Total space needed,
determined by dba_block_
size, dba_tables, and
analyzed blocks needed Total space needed,
determined by dba_block_
size, sz_tables, and
analyzed blocks needed

Catalog
Space Needed
(megabytes) Total space needed
determined by catalog
blocks needed (not in report)
and analyzed num_rows
from dba_tables Total space needed
determined by catalog
blocks needed (not in
report) and num_rows
from sz_tables

Pct Free The percentage free from
dba_tables The percentage free from
sz_tables

Pct Used The percentage used from
dba_tables The percentage used from
sz_tables

TABLE 4: Use the following to select the information to answer the "Enter value for dba_indexes:" prompt.

Column dba_indexes sz_indexes

Catalog Index
Entry Size The sum of column lengths
from dba_ind_columns Same as for dba_indexes, if
index exists in the catalog

Catalog Blocks
for Index The computed number of
blocks to hold index, given
the catalog index-entry size,
num_rows from dba_tables,
and available space
per block Same as for dba_indexes, if
index exists in the catalog

Catalog Number
of Rows The value of num_rows
from dba_tables Same as for dba_indexes, if
index exists in the catalog

Catalog Space
Needed
(megabytes) The calaculated space given
the cataloged blocks for
index and available space
per block Same as for dba_indexes, if
index exists in the catalog

Sizing Index
Entry Size Same as sz_indexes, if entry
in sz_indexes exists Index_entry_size from
sz_indexes

Sizing Blocks
for Index Same as sz_indexes, if entry
in sz_indexes exists Computed number of
blocks to hold index, given
the sizing index entry size,
num_rows from sz_tables,
and available space
per block

Sizing Number
of Rows Same as sz_indexes, if
entry in sz_indexes exists The value of num_rows
from sz_tables

Sizing Space
Needed
(megabytes) Same as sz_indexes, if
entry in sz_indexes exists The calculated space, given
the sizing blocks for index
and available space
per block

This is a copy of an article published @ http://www.oramag.com/

HSIZE	AVAIL SPACE	DR_BLOCK_ SIZE	KCBH	UB4	KTBBH	KTBIT	KDBH	KDBT	UB1	SB21	N1_TRANS	PCT_ FREE
8106	6481	8192	20	4	48	24	14	4	1	2	1	20

Owner	Table Name	Catalog Rowsize	Catalog Rows per Block	Analyzed Rowsize	Analyzed Rows per Block	Analyzed Number of Rows	Analyzed Blocks Needed	Analyzed Space Needed (Meg)	Catalog Space Needed (Meg)	Pct Free	Pct_ Used
<OWNER>	<TABLE_01>	87	83	61	119	15963	134	1	1	10	60
<OWNER>	<TABLE_02>	841	7	267	24	1958100	81587	637	2185	20	60
<OWNER>	<TABLE_03>	1060	6	425	15	4220403	281360	2198	5495	20	40
<OWNER>	<TABLE_04>	264	27	139	52	46322	890	6	13	10	60
<OWNER>	<TABLE_05>	76	95	71	102	44200	433	3	3	10	60

Owner	Index Name	Catalog Index Entry Size	Catalog Blocks for Index	Catalog Number of Rows	Catalog Space Needed (Meg)	Sizing Index Entry Size	Sizing Blocks for Index	Sizing Number of Rows	Sizing Space Needed (Meg)
<OWNER>	<INDEX_01>	19	11649	4220403	91	19	17473	6330605	137
<OWNER>	<INDEX_02>	37	10529	1958850	82	37	15787	2937150	123
<OWNER>	<INDEX_03>	28	197	48442	2	28	283	69483	2
<OWNER>	<INDEX_04>	19	11649	4220403	91	29	26670	6330605	208
<OWNER>	<INDEX_05>	37	10529	1958850	82	66	28161	2937150	220
<OWNER>	<INDEX_06>	28	197	48442	2	48	485	69483	4

Column	dba_indexes	sz_indexes
Catalog Rowsize	The sum of column lengths from dba_tab_columns	Same as for dba_tables if table exists in catalog
Catalog Rows per Block	The number of rows, given the catalog rowsize, that will fit into a block	Same as for dba_tables if table exists in catalog
Analyzed Rowsize	The avg_row_len from dba_tables after an analyze has been done on the table	The avg_row_len from sz_tables
Analyzed Rows per block	The number of rows, given the analyzed avg_row_len, that will fit into a block	The number of rows, given the avg_row_len from sz_tables, that will fit into a block
Analyzed Number of Rows	The value of num_rows from dba_tables after an analyze has been done on the table	The num_rows from sz_tables
Analyzed Blocks Needed	The number of data blocks needed, given the analyzed rows per block and analyzed number of rows	The number of data blocks needed, given the rows per block and the number of rows from the sz_tables data
Analyzed Space Needed (megabytes)	Total space needed, determined by dba_block_ size, dba_tables, and analyzed blocks needed	Total space needed, determined by dba_block_ size, sz_tables, and analyzed blocks needed
Catalog Space Needed (megabytes)	Total space needed determined by catalog blocks needed (not in report) and analyzed num_rows from dba_tables	Total space needed determined by catalog blocks needed (not in report) and num_rows from sz_tables
Pct Free	The percentage free from dba_tables	The percentage free from sz_tables
Pct Used	The percentage used from dba_tables	The percentage used from sz_tables