Search Oracle Related Sites

Thursday, August 25, 2011

Top 5 Timed Foreground Events - cursor: pin S wait on X

Top 5 Timed Foreground Events - cursor: pin S wait on X

Cursor: pin S wait on X.
A session waits on this event when requesting a mutex for shareable operations related to pins (such as executing a cursor), but the mutex cannot be granted because it is being held exclusively by another session (which is most likely parsing the cursor). 



This wait event will occur in mulitple scenarios, In 10g we have observes this is due to the frequent Automatic Memory Management Resize operations. Setting the SGA_TARGET to zero solved the problem. There are many bugs in versions of 10g versions. (The bug is tracked as Bug 6528336 – LARGE NUMBER OF SESSIONS WAITING ON CURSOR: PIN S WAIT ON X)

This has been fixed in further releases of 10g and 11g.

We received the above error while performing the load test in our environment in 11g R2 and the below query has helped us in identifing sql responsible for this.

In 11g R2, the blocking session can be found directly using the following sql:
SQL> select sid,serial#,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_STATUS,EVENT
from v$session where event ='cursor: pin S wait on X'

SID SERIAL# SQL_ID BLOCKING_SESSION BLOCKING_SESSION_STATUS EVENT
---- ------- ------------- ---------------- ----------------------- ----------
125 8190 3d3pd7g7dwuf6 135 VALID cursor: pin S wait on X

One of the most likely causes of cursor: pin S wait on X is high parsing time. Therefore the reason for the high parse time should be investigated.

11g R1 Below Diagonosis


As a result of Bug 7568642 BLOCKING_SESSION EMPTY FOR "CURSOR: PIN S WAIT ON X" the blocking_session is not populated in 10.2.The bug is fixed in 11g R1. Below approach can be used to idetify the necessary sql causing this wait.
The column P2RAW in v$session or v$session_wait gives the blocking session for wait event cursor: pin S wait on X.
The top bytes of p2raw is the blocker. It is in hex so needs to be converted in decimal.
SQL> select p2raw from v$session where event = 'cursor: pin S wait on X';

P2RAW
----------------
0000001F00000000
The top bytes of p2raw is the blocker.
Taking 0000001F (the first 8 bytes) and converting to decimal gives session id 31.
Or simply:
SQL> select p2raw,to_number(substr(to_char(rawtohex(p2raw)),1,8),'XXXXXXXX') sid
from v$session
where event = 'cursor: pin S wait on X';

P2RAW SID
---------------- ---
0000001F00000000 31
SQL> select p1, p2raw, count(*) from v$session
where event ='cursor: pin S wait on X'
and wait_time = 0
group by p1, p2raw;
p1 = the mutex Id
This has the same definition as v$mutex_sleep_history.mutex_identifier
    p2raw = holding Session Id | Ref Count The most significant bytes always store the Holding Session Id (Holding SId). The least significant bytes always store the Ref Count.
The blocking session can be queried to see what it is doing and if anyone is blocking it.
SQL> select sid,serial#,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_STATUS,EVENT
from v$session where SID=31

Tuesday, August 9, 2011

AWR Top 5 Timed Event - SQL*Net more data to client

SQL*Net more data to client -The server process is sending more data/messages to the client



The SQL*Net more data to client Oracle metric indicates the server process is sending more data/messages to the client. The previous operation to the client was also a send.

The first session data unit (SDU) buffer of return data is written to TCP socket buffer under “SQL*Net message to client” wait event. If Oracle needs to write more data for a call than fits into the “first” SDU buffer, then further writes for that call are done under SQL*Net more data to client event.

One can check with the Network Team on the overall network performance between Application Server and Database Server and also try to increase the SDU parameter at Listener and Tnsnames Levels. In our case after increasing the SDU parameter we have seen significant reduction in the wait event.


 
SID_LIST_LISTENER =

  (SID_LIST =

    (SID_DESC =

      (SDU=32640)

      (SID_NAME = ORAKHOJ)

      (ORACLE_HOME = /export/home/oracle10)

    ))

Tnsnames.ora

ORAKHOJ = (DESCRIPTION =

    (SDU=32640)

    (ADDRESS = (PROTOCOL = TCP)(HOST = ORAKHOJServer)(PORT = 1521))

    (CONNECT_DATA =

      (SERVER = DEDICATED) (SERVICE_NAME = ORAKHOJ)

    ) )


Network bottlenecks are very common in distributed systems and those with high network traffic. They are manifested as SQL*Net wait events, below are more wait events discussed below.

SQL*Net more data from client

Def: The server is waiting on the client to send more data to its client shadow process, in an already initiated operation

Solution: Usually OK, reduce data transferred, possible Network problems.

SQL*Net more data to client

Def: The server process is sending more data/messages to the client. The previous operation to the client was also a send.

Wait Time: The actual time it took for the send to complete

Solution: Usually OK, reduce amount of data transferred, possible Network tuning needed ( SDU)

SQL*Net break/reset to client

Def: The server sends a break or reset message to the client. The session running on the server waits for a reply from the client.

Wait Time: The actual time it takes for the break or reset message to return from the client


Solution: Check for errors in sql statement


SQL*Net more data from dblink

Def :The foreground process is expecting more data from a data base link.

Wait Time: The total time it takes to read the data from the database link (including the waiting time for the data to arrive)

Solution: Reduce data transfer, check net response

SQL*Net break/reset to dblink

Def: The server sends a break or reset message to the client. The session running on the server waits for a reply from the client.

Wait Time: The actual time it takes for the break or reset message to return from the client

Solution: Check for errors in sql statement sent


Monday, August 1, 2011

DB Time & AWR TOP 5 Timed Events

Proactive Monitoring of DB Time for better Analysis

Database time is total time spent by user processes either actively working or actively waiting in a database call. To make it clearer, DB Time is the time spent in the database by foreground sessions which includes CPU time, IO time and wait time for a particular snap interval. It excludes idle wait time.
DB Time and Wait Events are directly proportional. The first thing one will be looking while reviewing the AWR is the DB time, if it is more and multiples of elapsed time then one has to jump to the TOP 5 Timed Events sessions to investigate  further.

Ideally DBAs will be reactively looking at AWR  for performance problems, but one has to Proactively look at the AWR reports periodically to see if things are fine or not. But it is very difficult to generate AWR or review AWRs for all the snap intervals of Production database as we will be having 7 X 24 AWR reports for a week (1 hour snap interval).
The following single query will help the DBAs to generate the DB time of all Snap Intervals of a week. Based on the data generated, one can identify the high DB time snaps and generate AWR reports to investigate proactively before it gets into worst.

We have automated the query which will generate Graph and mail it to the DBAs who owns the databases using the Ploticaus tool to plot graph in linux. (http://ploticus.sourceforge.net/doc/welcome.html)













In the above graph we see there is a spike , one can generate AWR and look into what went wrong at a particular snap time and correct it before it becomes major problem. This helps to monitor DB 24 X 7 .

Fine the Query  below which gets the DB time for the whole week.
col snap_id format A8
 col begin_interval_time_N format A30 trunc
 col end_interval_time_N format A30 trunc
 col HOST_NAME format A15 trunc
 col INSTANCE_NUMBER format 9 heading “I”
 col begin_interval_time_N noprint
 col end_interval_time_N noprint

 select  A.SNAP_ID , A.INSTANCE_NUMBER , C.HOST_NAME  , MIN(A.begin_interval_time) OVER (partition by A.dbid, A.snap_id) AS begin_interval_time_N  , MIN(A.end_interval_time) OVER (partition by A.dbid, A.snap_id) AS end_interval_time_N  , (B. VALUE – LAG(B.VALUE,1,0) OVER (ORDER BY A.SNAP_ID))/1000000/60 as DB_TIME  from dba_hist_snapshot A , DBA_HIST_SYS_TIME_MODEL B,dba_hist_database_instance C  where A.SNAP_ID=B.SNAP_ID  and A.DBID=B.DBID  And A.INSTANCE_NUMBER=B.INSTANCE_NUMBER  and B.STAT_NAME Like ‘DB time’  and A.DBID=C.DBID  and A.INSTANCE_NUMBER=C.INSTANCE_NUMBER  and A.startup_time = C.startup_time  and A.begin_interval_time >= trunc(sysdate) – 7 and A.begin_interval_time <= trunc(sysdate) + 20/(26*60)