Search Oracle Related Sites

Thursday, March 31, 2011

AWR TOP 5 Timed Event Analysis - Cache Buffers Chains

TOP 5 Timed Event -Cache Buffers Chains

Problem: High CPU and delay in the transaction processing.

Definition: The cache buffers chains latches are used to protect a buffer list in the buffer cache. These latches are used when searching for, adding, or removing a buffer from the buffer cache. Contention on this latch usually means that there is a block that is greatly contended for (known as a hot block).

The reasons for this latch can be
1)Sequence number generation code that updates a row in a table to generate the number, rather than using a sequence number generator
2)Index leaf chasing from very many processes scanning the same unselective index with very similar predicate

How to identify the Identify the segment the hot block belongs to  -- Steps given below.. ( Metalink - How To Identify a Hot Block Within The Database Buffer Cache. [ID 163424.1]

To solve a hot block, the application maybe need to be reviewed.

By examining the waits on this latch, information about the segment and the
specific block can be obtained using the following queries.

First determine which latch id(ADDR) are interesting by examining the number of
sleeps for this latch. The higher the sleep count, the more interesting the
latch id(ADDR) is:
SQL> select CHILD#  "cCHILD"
     ,      ADDR    "sADDR"
     ,      GETS    "sGETS"
     ,      MISSES  "sMISSES"
     ,      SLEEPS  "sSLEEPS" 
     from v$latch_children 
     where name = 'cache buffers chains'
     order by 5, 1, 2, 3;

Run the above query a few times to to establish the id(ADDR) that has the most
consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found
then this latch address can be used to get more details about the blocks
currently in the buffer cache protected by this latch.
The query below should be run just after determining the ADDR with
the highest sleep count.
SQL> column segment_name format a35
     select /*+ RULE */
       e.owner ||'.'|| e.segment_name  segment_name,
       e.extent_id  extent#,
       x.dbablk - e.block_id + 1  block#,
       x.tch,
       l.child#
     from
       sys.v$latch_children  l,
       sys.x$bh  x,
       sys.dba_extents  e
     where
       x.hladdr  = '&ADDR' and
       e.file_id = x.file# and
       x.hladdr = l.addr and
       x.dbablk between e.block_id and e.block_id + e.blocks -1
     order by x.tch desc ;

Example of the output :SEGMENT_NAME                     EXTENT#      BLOCK#       TCH    CHILD#
-------------------------------- ------------ ------------ ------ ----------
SCOTT.EMP_PK                       5            474          17     7,668
SCOTT.EMP                          1            449           2     7,668

Depending on the TCH column (The number of times the block is hit by a SQL
statement), you can identify a hotblock. The higher the value of the TCH column,
the more frequent the block is accessed by SQL statements.

In order to reduce contention for this object the following mechanisms can be put in place:
1) Examine the application to see if the execution of certain DML and SELECT statements
    can be reorganized to eliminate contention on the object.

2) Decrease the buffer cache -although this may only help in a small amount of cases.

3) DBWR throughput may have a factor in this as well.
    If using multiple DBWR's then increase the number of DBWR's.

4) Increase the PCTFREE for the table storage parameters via ALTER TABLE
    or rebuild. This will result in less rows per block.

5) Consider implementing reverse key indexes
   (if range scans aren't commonly used against the segment)

6)  In the AWR you can look for segments with high buffer waits and sql statements whose elapsed times are high on a particular objects. If you are having high transaction OLTP databases it is always better if we can have the lesser block size to avoid the contention like 4k block size.

Monday, March 14, 2011

AWR TOP 5 Timed Event Analysis - Latch Free


Top 5 Times Events

latch free , latch: library cache , latch: library cache lock

latch free – System is waiting for a latch to become free


Basically it means some session needed a latch (on the library cache for example to parse some SQL). it tried to get the latch, but failed (because someone else had it). So, it goes to sleep (waits on a latch free) and wakes up and tries again. The time it was asleep - that is the wait time for "latch free"

A latch is a low-level internal lock used by Oracle to protect memory structures. The latch free event is updated when a server process attempts to get a latch, and the latch is unavailable on the first attempt.

There is a dedicated latch-related wait event for the more popular latches that often generate significant contention. For those events, the name of the latch appears in the name of the wait event, such as latch: library cache or latch: cache buffers chains. This enables you to quickly figure out if a particular type of latch is responsible for most of the latch-related contention. Waits for all other latches are grouped in the generic latch free wait event.

Actions

This event should only be a concern if latch waits are a significant portion of the wait time on the system as a whole, or for individual users experiencing problems.

• Examine the resource usage for related resources. For example, if the library cache latch is heavily contended for, then examine the hard and soft parse rates.

• Examine the SQL statements for the sessions experiencing latch contention to see if there is any commonality.

Once you see this event as TOP, you need to further drill down the AWR report to Latch Statistics section to see Latch Activity and Latch Sleep Breakdown to determine which latch is doing more no of misses, sleeps and spin gets. Based on which one has to look for data what is causing these.

If you see library cache latch, then possible causes might be lack of statement reuse, statements not using bind variables, cursors closed explicitly after each execution.

If one see cache buffers chains as top event with more misses and sleeps then the possible cause might be repeated access to a block known as hot block which is caused due to Sequence number generation code that updates a row in a table to generate the number, rather than using a sequence number generator.

We have found that in our case a poorly written package was hitting us hard where huge hard parses were found. There are lot many queries to find out the queries which are performing the hard parses like

SELECT SQL_TEXT FROM V$SQLSTATS WHERE EXECUTIONS < 4 ORDER BY SQL_TEXT;

SELECT SUBSTR(SQL_TEXT, 1, 60), COUNT(*) FROM V$SQLSTATS WHERE EXECUTIONS < 4 GROUP BY SUBSTR(SQL_TEXT, 1, 60)HAVING COUNT(*) > 1;

Check the V$SQLSTATS view. Enter the following query:

SELECT SQL_TEXT, PARSE_CALLS, EXECUTIONS FROM V$SQLSTATS ORDER BY PARSE_CALLS;

When the PARSE_CALLS value is close to the EXECUTIONS value for a given statement, you might be continually reparsing that statement. Tune the statements with the higher numbers of parse calls.

Tuesday, March 1, 2011

AWR TOP 5 Timed Event Analysis


gc buffer busy


The gc current block busy and gc cr block busy wait events indicate that the local instance that is making the request did not immediately receive a current or consistent read block. The term "busy" in these events' names indicates that the sending of the block was delayed on a remote instance. For example, a block cannot be shipped immediately if Oracle Database has not yet written the redo for the block's changes to a log file.

In comparison to "block busy" wait events, a gc buffer busy event indicates that Oracle Database cannot immediately grant access to data that is stored in the local buffer cache. This is because a global operation on the buffer is pending and the operation has not yet completed. In other words, the buffer is busy and all other processes that are attempting to access the local buffer must wait to complete.
The existence of gc buffer busy events also means that there is block contention that is resulting in multiple requests for access to the local block. Oracle Database must queue these requests.

Usually, either interconnect or load issues or SQL execution against a large shared working set can be found to be the root cause.

gc buffer busy‟ waits can happen for many reasons. Few of them are: CPU starvation issues, Swapping issues, interconnect issues etc. For example, if the process that opened the request for a block did not get enough CPU, then it might not drain the network buffers to copy the buffer to buffer cache. Other sessions accessing that buffer will wait on „gc buffer busy‟ waits. Or If there is a network issue and the Global cache messages are slower, then it might induce higher gc buffer busy waits too. Statistics „gc lost packets‟ is a good indicator for network issues, but not necessarily a complete indicator.

Solution

If this wait event was found on the top 5 wait events then one has to immediately look for "Segments by Global Cache Buffer Busy" section of the AWR to check buffers of which objects are contributing to the wait event. Then look for Sql Statements which are being fired from multiple instances and see if one can reduce the contention either by ensuring that applications which perform transactions on these objects very frequently should be pointed to a single node or look for other issues like cache values of sequences on these objects or CPU , network, interconnect , large loads etc.