Tuesday, February 28, 2012

Top 5 Timed Events - gc cr failure

gc cr failure – This wait event is triggered when a CR (Consistent Read) block is requested from the holder of the block and a failure status message is received. This happens after unforeseen events such as a lost block, a checksum error, or an invalid block request, or when the holder cannot process the request. A session will typically show multiple timeouts on the placeholder wait, gc cr request, before the gc cr failure event is recorded. To check for lost blocks, query the system statistics view v$sysstat for the statistics gc blocks lost and gc claim blocks lost.
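As a rough sketch of that check, the Python snippet below reads both statistics per instance. It assumes a DB-API style cursor that is already connected to the database (the connection code is not shown), and it uses gv$sysstat rather than v$sysstat so that every RAC instance is reported. The function name lost_block_stats is hypothetical, chosen only for this illustration.

```python
# Sketch only: the cursor is assumed to come from an Oracle DB-API driver.
LOST_BLOCK_SQL = """
SELECT inst_id, name, value
  FROM gv$sysstat
 WHERE name IN ('gc blocks lost', 'gc claim blocks lost')
 ORDER BY inst_id, name
"""


def lost_block_stats(cursor):
    """Return {(inst_id, statistic_name): value} for the lost-block counters.

    The values are cumulative since instance startup, so compare two
    snapshots taken some time apart to see whether they are still growing.
    """
    cursor.execute(LOST_BLOCK_SQL)
    return {(inst_id, name): value
            for inst_id, name, value in cursor.fetchall()}
```

Counters that keep growing between two snapshots usually suggest looking at the interconnect before looking at the application.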

Failure is not an option in cluster communications, because lost messages or blocks can potentially trigger node evictions.

In the above case, this wait event is caused by gc buffer busy: the node holding the requested block is busy and cannot process the request.

Let us look at how Consistent Read (CR) requests are handled in RAC, to see why the nodes get busy fulfilling them. When an instance needs to generate a CR version of the current block, the block can be in either the local or a remote cache. In the remote case, the LMS (Lock Manager Server) process on the other instance tries to create the CR block; in the local case, the foreground process executing the query generates the CR block itself. To create a CR version, the instance or instances need to read the transaction table and undo blocks from the rollback/undo segment that are referenced in the active transaction table of the block. Sometimes this cleanout/rollback process requires several lookups of remote undo headers and undo blocks. Each remote undo header or undo block lookup results in a gc cr request. Because undo headers are frequently accessed, a buffer wait may also occur.
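To see how much time these waits actually cost, one option is to read the cumulative figures from gv$system_event. The Python sketch below assumes a DB-API style cursor that is already connected (not shown); the function name summarize_gc_cr_waits is hypothetical. It matches gc buffer busy with a LIKE pattern because the exact event names differ between releases.

```python
# Sketch only: the cursor is assumed to come from an Oracle DB-API driver.
GC_CR_EVENTS_SQL = """
SELECT inst_id, event, total_waits, total_timeouts, time_waited_micro
  FROM gv$system_event
 WHERE event LIKE 'gc cr%'
    OR event LIKE 'gc buffer busy%'
 ORDER BY time_waited_micro DESC
"""


def summarize_gc_cr_waits(cursor):
    """Return one dict per (instance, event) with the average wait in ms."""
    cursor.execute(GC_CR_EVENTS_SQL)
    summary = []
    for inst_id, event, waits, timeouts, micros in cursor.fetchall():
        # Guard against division by zero for events that never waited.
        avg_ms = (micros / waits / 1000.0) if waits else 0.0
        summary.append({
            "inst_id": inst_id,
            "event": event,
            "total_waits": waits,
            "total_timeouts": timeouts,
            "avg_wait_ms": avg_ms,
        })
    return summary
```

A high total_timeouts count on gc cr request next to a nonzero gc cr failure count would match the pattern described earlier in this post.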

We got rid of these wait events by reducing the traffic between the nodes: applications that depend on specific tables were pointed to specific nodes.