
Friday, April 29, 2011

AWR TOP 5 Timed Event Analysis - Global Cache Wait Events

The Global Cache Service (GCS) is the controlling process that implements Cache Fusion. It maintains the block mode for blocks in the global role and is responsible for block transfers between instances. It employs background processes such as the Global Cache Service Processes (LMSn) and the Global Enqueue Service Daemon (LMD).


Global cache (“gc”) events and statistics indicate that Oracle searched the cache hierarchy to find data quickly. They are as “normal” as an I/O wait event (e.g. db file sequential read).

GC events tagged as “busy” or “congested” that consume a significant amount of database time should be investigated. As a first step, assume a load or I/O problem on one or more of the cluster nodes.


  

 
 
 
 
 
All Global Cache Events will follow the following format:


RAC wait events are grouped in the “Cluster” wait class, and global cache events are characterized as either Current or CR.

• Current - the most recent version of a block, typically requested in order to modify it
• Consistent Read (CR) - a read-consistent version of a block, requested for read access

The following wait events show that remotely cached blocks were shipped to the local instance without having been busy, pinned, or requiring a log flush. Put simply, the buffer was requested and received for read or write.

gc current block 2-way
gc current block 3-way
gc cr block 2-way
gc cr block 3-way

gc cr/current block congested
  • Repeated requests by foreground processes were not serviced by LMS
  • Indicates that LMS is not able to keep up
  • Queue lengths and scheduling delays in the OS can cause LMS delays

gc cr/current block busy
  • The block was delayed for some reason before it was sent to the requestor

gc current grant busy
  • Permission to access the block was granted, but the request is blocked by other requests ahead of it

gc cr/current block request
  • Time spent waiting while a cr or current block is being retrieved
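
As a rough illustration of how the event names above break down, here is a minimal Python sketch that splits a gc event name into its mode (cr or current), its kind (block or grant) and its qualifier, and flags the “busy” and “congested” ones for a closer look. The pattern is inferred only from the event names listed in this post and is not an official Oracle grammar; the function names are made up for the example.

```python
import re

# Pattern inferred from the event names listed in this post (an assumption,
# not an official Oracle definition of gc event names).
GC_EVENT = re.compile(
    r"^gc\s+(?P<mode>cr|current)\s+(?P<kind>block|grant)\s+"
    r"(?P<qualifier>2-way|3-way|busy|congested)$",
    re.IGNORECASE,
)


def classify_gc_event(name):
    """Split a gc wait event name into mode, kind and qualifier.

    Returns None when the name does not match the assumed pattern.
    """
    match = GC_EVENT.match(name.strip())
    if match is None:
        return None
    return {key: value.lower() for key, value in match.groupdict().items()}


def needs_investigation(name):
    """True for gc events tagged as busy or congested."""
    parts = classify_gc_event(name)
    return parts is not None and parts["qualifier"] in ("busy", "congested")


if __name__ == "__main__":
    for event in ("gc current block 2-way", "gc cr block congested",
                  "gc current grant busy", "db file sequential read"):
        print(event, "->", classify_gc_event(event), needs_investigation(event))
```

A flagged event only matters if it also accounts for a significant share of database time, so this kind of check is a filter on the Top 5 list, not a diagnosis.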

How do we interpret the AWR report further for GC wait event problems?

    • Check where most of the database time is spent (“Top 5 Timed Events”)
    • Check whether the gc events are “busy” or “congested”
    • Check the average wait time
    • Drill down further into the AWR report: the top wait events, the SQL with the highest cluster wait time, and the segment statistics with the highest block transfers

Interconnect issues must be fixed first


If IO wait time is dominant, fix IO issues

At this point, performance may already be good

Fix “bad” plans

Fix serialization

Fix schema 

Thursday, April 21, 2011

AWR Analysis - Parse CPU to Parse Elapsd %

Under AWR’s Instance Efficiency Percentages, Parse CPU to Parse Elapsd % is one of the more confusing figures, and clear information about it is hard to find. The notes below explain how to interpret the ratio.

• If you spend 1 second of CPU time parsing but the total elapsed (wall clock) time is 5 seconds, you are waiting on some resource to complete the parse.

• Ideally, Parse Elapsed equals Parse CPU, i.e. only CPU time is used for parsing. In that case the ratio is 100%. The more time is spent waiting, the lower the ratio.


 
Parse CPU / Parse Elapsed: (8879/110582)*100 = 8.03%
What does it mean and how is it interpreted?
• Parse CPU to Parse Elapsd %: 8.03
• It is a percentage: 8.03% means 0.0803
• Take the reciprocal: 1/0.0803 = 12.45
• This means 12.45 seconds of wall clock time elapse for every CPU second spent parsing. That is not good.
• It represents resource contention while parsing. A low value for this ratio is an indicator of a latching problem. Investigate the latch sections of the AWR report for contention on the library cache and shared pool latches.
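
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in this post, using the two figures from the example (8879 for parse CPU and 110582 for parse elapsed, in the same time unit); the function names are hypothetical and are not part of any Oracle tool.

```python
def parse_cpu_to_elapsed_pct(parse_cpu, parse_elapsed):
    """Parse CPU as a percentage of parse elapsed time (same unit for both)."""
    if parse_elapsed <= 0:
        raise ValueError("parse_elapsed must be positive")
    return 100.0 * parse_cpu / parse_elapsed


def elapsed_per_cpu_second(parse_cpu, parse_elapsed):
    """Wall clock time that elapses for each unit of CPU time spent parsing."""
    if parse_cpu <= 0:
        raise ValueError("parse_cpu must be positive")
    return parse_elapsed / parse_cpu


if __name__ == "__main__":
    # Figures taken from the example in this post.
    pct = parse_cpu_to_elapsed_pct(8879, 110582)
    ratio = elapsed_per_cpu_second(8879, 110582)
    print(f"Parse CPU to Parse Elapsd %: {pct:.2f}")
    print(f"Elapsed seconds per CPU second of parsing: {ratio:.2f}")
```

A value near 100 means parsing is almost pure CPU work; the further the percentage drops, the more of the parse time is spent waiting.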





Thursday, April 14, 2011

AWR TOP 5 Timed Event Analysis - CPU Time


CPU time appears in the Top 5 list of most AWR reports. That raises a couple of questions: 1) Why is the wait information empty? 2) How do we interpret the time shown in the report?


1) The wait information is empty because “CPU time” is not a wait event. It is the time spent on CPU doing the actual work.

2) Interpreting a CPU Time (s) value of 1033

A single CPU provides 60*60 = 3600 CPU seconds in a 1-hour snapshot interval.


In the example we have 8 CPUs (NUM_CPUS under the Operating System Statistics section of AWR), which gives 60*60*8 = 28800 CPU seconds available in a 1-hour interval (assuming a single database is running on the machine).

(1033/28800)*100 = 3.59% of total CPU

So we are not CPU bound, and things look good from a CPU point of view.
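
The same calculation can be written as a small Python sketch. It assumes, as the example does, that the database has the whole machine to itself and that the snapshot interval is exactly one hour; the function name and parameters are made up for illustration.

```python
def cpu_time_pct(cpu_time_s, num_cpus, interval_s):
    """CPU time used as a percentage of the CPU seconds available.

    cpu_time_s  -- "CPU time" from the Top 5 Timed Events section, in seconds
    num_cpus    -- NUM_CPUS from the Operating System Statistics section
    interval_s  -- length of the snapshot interval, in seconds
    """
    available = num_cpus * interval_s
    if available <= 0:
        raise ValueError("num_cpus and interval_s must be positive")
    return 100.0 * cpu_time_s / available


if __name__ == "__main__":
    # Figures from the example: 1033 CPU seconds, 8 CPUs, 1-hour snapshot.
    print(f"{cpu_time_pct(1033, 8, 60 * 60):.2f}% of total CPU")
```

For snapshots that are not exactly one hour apart, pass the real elapsed time of the interval instead of 3600.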

Another way to look at the CPU information is to drill down to the Operating System Statistics and look at Busy Time and Idle Time. If the Idle Time is high, there is not much contention for the CPU.
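
A minimal Python sketch of that check follows. It treats Busy Time and Idle Time as two counters in the same unit and reports the busy share of their sum. The sample values are invented for illustration and do not come from the report discussed above.

```python
def cpu_busy_pct(busy_time, idle_time):
    """Busy time as a percentage of busy + idle time (both in the same unit)."""
    total = busy_time + idle_time
    if total <= 0:
        raise ValueError("busy_time + idle_time must be positive")
    return 100.0 * busy_time / total


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real AWR report.
    busy, idle = 120_000, 2_760_000
    print(f"CPU busy: {cpu_busy_pct(busy, idle):.2f}%")
```

Unlike the CPU time figure from the Top 5 list, this covers everything running on the host, so it is the better figure to look at when the machine is shared with other databases or applications.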