|
|
Cooperative Extension Service |
|
|
|
||
|
|
||
|
Agricultural
Experiment Station |
||
|
|
|
|
Here are some Oracle database tuning topics that I have picked out from some of my classes and other sources. They are in no particular order - just whatever came across a listserv or prompted me from a book or manual. Tuning is an ongoing process, but, don't let it dominate your life! You might want to check out your configuration every month or so, or when there has been some large structural change made to it, or if your users are noticing a slowdown, but, it's usually not something that demands your constant attention.
I've tried to focus on statistics that are immediately available in tables, rather than having to run statistics gathering routines such as utlbstat/utlestat, since the reports from those can contain a majority of information that you will probably never use and will have to sift through to find what you are really looking for. Note that if you have just started up your Oracle database instance, this information will probably be irrelevant - you should probably wait several hours after startup to get a representative sample of your users' interactions with the database. Also, be aware that some statistics may be expressed as a "hit ratio", while others may be expressed as a "miss ratio" - they are different, and you can convert one to the other by subtracting it from 1. All of this information is generic to Oracle. Check back here again for additions and updates!
Redo Log Buffer Latches
Database Buffer Cache Size
Shared Pool Size
Tuning Scripts
When a transaction is ready to write its changes to the redo log, it
first has to grab the Redo Allocation Latch, of which there is only one, to keep others
from writing to the log at the same time. If someone else has that latch, it has to
wait for the latch, resulting in a "miss".
Once it grabs that latch, if the change is larger than log_small_entry_max_size bytes and
if your server has multiple CPU's, it then tries to grab a Redo Copy Latch, of which there
can be up to 2 times the number of CPU's, which would allow it to release the Redo
Allocation Latch for someone else to use. If none of them are available, resulting
in an "immediate miss", it will not wait for a Redo Copy Latch (thus, the
"immediate"), but, instead, hangs on to the Redo Allocation Latch until the
change is written.
Oracle keeps statistics for these latches in v$latch, including the number of gets and
misses for the Redo Allocation Latch and the number of immediate gets and immediate misses
for the Redo Copy Latches, which are cumulative values since instance startup. If
you've got a 100% hit ratio for either of those latch types, that's a good thing. It
just means that all of your transactions were able to grab and use the latch without
retrying. It's when you get below a 99% hit ratio that you need to start looking out. The
following sql figures the current hit ratios for those latches:
column latch_name format a20
select name latch_name, gets, misses,
round(decode(gets-misses,0,1,gets-misses)/
decode(gets,0,1,gets),3) hit_ratio
from v$latch where name = 'redo allocation';
column latch_name format a20
select name latch_name, immediate_gets, immediate_misses,
round(decode(immediate_gets-immediate_misses,0,1,
immediate_gets-immediate_misses)/
decode(immediate_gets,0,1,immediate_gets),3) hit_ratio
from v$latch where name = 'redo copy';
If your Redo Allocation Latch hit ratio consistently falls below
99%, and if you have a multi-CPU machine, you can lower the value for
log_small_entry_max_size (see below) in your init.ora file (ours is currently 800 bytes,
but, maybe 100 or so bytes may be better - you'll have to try out different values over
time), which says that any change smaller than that will hang onto the Redo Allocation
Latch until Oracle is finished writing that change. Anything larger than that grabs
a Redo Copy Latch, if currently available, and releases the Redo Allocation Latch for
another transaction to use.
If your Redo Copy Latch hit ratio consistently falls below 99%, and if you have a
multi-CPU machine, you can raise the value of log_simultaneous_copies in your init.ora
file up to twice the number of CPU's to provide more Redo Copy Latches (there is only one
Redo Allocation Latch, so it is at a premium). Remember that you have to shut down
your database instance and restart it to reread the new parameter values in the init.ora
file ($ORACLE_HOME/dbs/initSID.ora). The following sql shows the current
values for those associated parameters:
column name format a30
column value format a10
select name,value from v$parameter where name in
('log_small_entry_max_size','log_simultaneous_copies',
'cpu_count');
The Database Buffer Cache is part of the Shared Global Area (SGA) in memory for a single database instance (SID) and holds the blocks of data and indexes that you and everyone else is currently using. It may even contain multiple copies of the same data block if, for example, more than one transaction is making changes to it but not yet committed, or, if you are looking at the original copy (select) and someone else is looking at their modified but uncommitted copy (insert, update, or delete). The parameters db_block_buffers and db_block_size in your init.ora file determine the size of the buffer cache. db_block_size, in bytes, is set at database creation, and cannot be changed (unless you recreate the database from scratch), so, the only thing that you can adjust is the number of blocks in db_block_buffers (one buffer holds one block).
The Cache Hit Ratio shows how many blocks were already in memory (logical reads, which include "db block gets" for blocks you are using and "consistent gets" of original blocks from rollback segments that others are updating) versus how many blocks had to be read from disk ("physical reads"). Oracle recommends that this ratio be at least 80%, but, I like at least 90% myself. The ratio can be obtained from values in v$sysstat, which are constantly being updated and show statistics since database startup (it is only accessable from a DBA user account). You will get a more representative sample if the database has been running several hours with normal user transactions taking place. The Cache Hit Ratio is determined as follows:
select (1-(pr.value/(dbg.value+cg.value)))*100
from v$sysstat pr, v$sysstat dbg, v$sysstat cg
where pr.name = 'physical reads'
and dbg.name = 'db block gets'
and cg.name = 'consistent gets';
If you have a low Cache Hit Ratio, you can test to see what the effect of adding buffers would be by putting "db_block_lru_extended_statistics = 1000" in the init.ora file, doing a shutdown and startup of the database, and waiting a few hours to get a representative sample. Oracle determines how many Additional Cache Hits (ACH) would occur for each query and transaction for each of the 1000 buffer increments (or whatever other maximum value you might want to try out), and places them into the x$kcbrbh table, which is only accessable from user "sys". To measure the new Cache Hit Ratio with, for example, 100 extra buffers, determine ACH as follows:
select sum(count) "ACH" from x$kcbrbh where indx < 100;
and plug that value into the Cache Hit Ratio formula as follows:
select (1-((pr.value-&ACH)/(dbg.value+cg.value)))*100
from v$sysstat pr, v$sysstat dbg, v$sysstat cg
where pr.name = 'physical reads'
and dbg.name = 'db block gets'
and cg.name = 'consistent gets';
If the ratio originally was lower than 80% and is now higher with ACH, you may want to increase db_block_buffers by that number of extra buffers, restarting your database to put the increase into effect. Be sure to try several values for the number of extra buffers to find an optimum for your work load. Also, remove db_block_lru_extended_statistics from your init.ora file before restarting your database to stop gathering statistics, which tends to slow down the transaction time. (Removing that clears the x$kcbrbh table.) Also, make sure that your server has enough memory to accomodate the increase!
If you are running really tight on memory, and the Cache Hit Ratio is running well above 80%, you might want to check the effect of lowering the number of buffers, which would release Oracle memory that could then be used by other processes, but would also potentially slow down database transactions. To test this, put "db_block_lru_statistics = true" in your init.ora file and restart your database. This gathers statistics for Additional Cache Misses (ACM) that would occur for each query and transaction for each of the buffer decrements up to the current db_block_buffers value, placing them into the x$kcbcbh table, also only accessable from user "sys". To measure the new Cache Hit Ratio with, for example, 100 fewer buffers, determine ACM as follows:
select sum(count) "ACM" from x$kcbcbh
where indx >= (select max(indx)+1-100 from x$kcbcbh);
and plug that value into the Cache Hit Ratio formula as follows:
select (1-((pr.value+&ACM)/(dbg.value+cg.value)))*100
from v$sysstat pr, v$sysstat dbg, v$sysstat cg
where pr.name = 'physical reads'
and dbg.name = 'db block gets'
and cg.name = 'consistent gets';
If the ratio is still above 80%, you may want to decrease db_block_buffers by that number of fewer buffers, restarting your database to put the decrease into effect. Be sure to try several values for the number of fewer buffers to find an optimum for your work load. Also, remove db_block_lru_statistics from your init.ora file before restarting your database to stop gathering statistics, which tends to slow down the transaction time. (Removing that clears the x$kcbcbh table.)
I have three scripts which you can use to figure your instance's optimum number of db_block_buffers. The cache_hit_ratio.sql script computes the current ratio for the database buffer cache, and can be run from any DBA account. The adding_buffers.sql script computes the resulting ratio for an increase in the buffer cache size of the given number of buffer blocks (figuring ACH itself). It must be run from user "sys", after a representative sampling time with db_block_lru_extended_statistics in place. The removing_buffers.sql script computes the resulting ratio for a decrease in the buffer cache size of the given number of buffer blocks (figuring ACM itself). It must be run from user "sys", after a representative sampling time with db_block_lru_statistics in place.
The Shared Pool is also part of the Shared Global Area (SGA) in memory for a single database instance (SID) and holds the Library Cache with the most recently used SQL statements and parse trees along with PL/SQL blocks, and the Data Dictionary Cache with definitions of tables, views, and other dictionary objects. Both of those sets of cached objects can be used by one or more users, and are aged out (Least Recently Used) as other objects need the space. (You can pin large frequently-used objects in the Shared Pool for performance and other reasons, but, I won't go into that here.)
There are several ratios that you can check after a representative sample time that may indicate that you need to enlarge the shared pool, which is set by the shared_pool_size parameter in your init.ora file and defaults to 3500000 (3.5 Meg). One indicator is the Library Cache Get Hit Ratio, which shows how many cursors are being shared (SQL statements (gets) which were already found and parsed (gethits) in the shared pool, with no parsing or re-parsing needed), and is determined by:
select gethits,gets,gethitratio from v$librarycache
where namespace = 'SQL AREA';
If the gethitratio is less than 90%, you should consider increasing the shared pool size. Another indicator is the reloads per pin ratio, which shows how many parsed statements (pins) have been aged out (reloaded) of the shared pool for lack of space (ideally 0), and is determined by:
select reloads,pins,reloads/pins from v$librarycache
where namespace = 'SQL AREA';
If the reloads/pins ratio is more than 1%, you should consider increasing the shared pool size. A third indicator, which is not as important as the first two, is the dictionary object getmisses per get ratio, which shows how many cached dictionary object definitions in the dictionary cache are encountering too many misses (aged out?), and is determined by:
select sum(getmisses),sum(gets),sum(getmisses)/sum(gets)
from v$rowcache;
If the getmisses/gets ratio is more than 15%, you should consider increasing the shared pool size.
If these ratios indicate that your shared pool is too small, you can estimate the size of the shared pool by doing the following. Set the shared_pool_size to a very large number, maybe a fourth or more of your system's available memory, depending on how many other instances and processes that you have running that are also using memory, then shutdown and startup your database and let it run for a representative time (like all day or when a large batch job is running that you want to accomodate), then, figure the memory required for packages and views, memory required for frequently used SQL statements, and memory required for users SQL statements executed, as shown below:
select sum(sharable_mem) "Packages/Views" from v$db_object_cache;
select sum(sharable_mem) "SQL Statements" from v$sqlarea
where executions > 5;
select sum(250 * users_opening) "SQL Users" from v$sqlarea;
Then, add the above three numbers and multiply the results by 2.5. Use this estimated size as a guideline for the value for shared_pool_size, changing that parameter to the estimated size or back to the original size and doing another shutdown/startup to put the value into effect. The shared_pool_size.sql script can be used to figure these values for you, which uses an example of the Select From Selects tip:
select sum(a.spspv) "Packages/Views", sum(a.spssql) "SQL Statements",
sum(a.spsusr) "SQL Users", round((sum(a.spspv) + sum(a.spssql) +
sum(a.spsusr)) * 2.5,-6) "Estimated shared_pool_size"
from (select sum(sharable_mem) spspv, 0 spssql, 0 spsusr
from v$db_object_cache
union all
select 0, sum(sharable_mem), 0 from v$sqlarea
where executions > 5
union all
select 0, 0, sum(250 * users_opening) from v$sqlarea) a;
Here are some tuning scripts that you could try, that also give
suggestions on what to do to correct these performance problems:
tuning.sql - By David Midgett at Eastern
Kentucky University, with some modifications by me.
tuning2.sql - From
http://www.oramag.com/code/cod46dba.html (no longer available).
>>> Stay Tuned (sorry, couldn't pass that one up!) - More To Come! <<<
This Page was Last Updated on 06/15/06
You Are Visitor Number |
|
|
© 2006 |
|
|
University of Arkansas • Division of Agriculture |
Mission
•
Disclaimer
•
EEO
•
|