by Satyendra Kumar
Thursday, June 11, 2009
AWR / Statspack Analysis
Problem : Performance and High CPU
Solution / Interpretation : The top wait event is ‘log file sync’. This wait event is seen in most high-transaction databases where commits are very frequent. Let us look closely at the mechanism and then see how the waits can be reduced; they cannot be avoided entirely.
At commit time, the process creates a redo record (containing the commit opcodes) and copies it into the log buffer. The process then signals LGWR to write the contents of the log buffer. LGWR writes from the log buffer to the log file and signals the user process back, completing the commit. The commit is considered successful only after the LGWR write succeeds.
In other words, a commit is not complete until LGWR has written the log buffer contents, including the commit redo records, to the log files. In a nutshell, after posting LGWR to write, the user or background process waits for LGWR to signal back, with a 1-second timeout. The user process charges this wait time to the ‘log file sync’ event.
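The handshake described above can be mimicked with a toy model. This is only a sketch in Python, not Oracle code: the class ToyLogWriter, the function commit and the write_seconds parameter are hypothetical names, and the redo write is replaced by a sleep. It shows that the committing process is charged for the whole interval between posting the writer and being signalled back.

```python
import threading
import time


class ToyLogWriter:
    """Hypothetical stand-in for LGWR: waits for a post, 'writes', signals back."""

    def __init__(self, write_seconds):
        self.write_seconds = write_seconds
        self.posted = threading.Event()  # set by the committing process
        self.done = threading.Event()    # set by the writer after the write
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        self.posted.wait()               # sleep until a commit posts us
        time.sleep(self.write_seconds)   # stand-in for the redo write to disk
        self.done.set()                  # signal the waiting process back


def commit(lgwr):
    """Post the writer, then wait for its signal; return the time waited."""
    start = time.perf_counter()
    lgwr.posted.set()
    # Wait in 1-second slices, mirroring the 1-second timeout mentioned above.
    while not lgwr.done.wait(timeout=1.0):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    waited = commit(ToyLogWriter(write_seconds=0.05))
    print(f"toy 'log file sync' wait: {waited * 1000:.1f} ms")
```

In this model the wait grows with anything that delays the writer thread, whether the write itself is slow or the thread is slow to get scheduled, which is the theme of the reasons discussed in this post.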
‘log file sync’ waits grow when LGWR cannot complete its writes, or signal back, fast enough, for one of the following reasons:
1) Disk I/O performance to the log files is not good enough.
2) LGWR is starved of CPU. If the server is very busy, LGWR may not get enough CPU time. This slows LGWR's response and increases ‘log file sync’ waits.
3) LGWR is unable to complete writes fast enough because of file system or Unix buffer cache limitations.
4) LGWR is unable to post the waiting processes fast enough because of excessive commits. It is quite possible that there is no CPU or memory starvation and that I/O performance is decent. Even so, with excessive commits LGWR has to perform many writes and semctl calls, and this can increase ‘log file sync’ waits. It can also cause a sharp increase in the ‘redo wastage’ statistic.
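One common way to narrow down which of these reasons applies is to compare the average ‘log file sync’ wait with the average wait of LGWR's own write event, ‘log file parallel write’ (an event this post does not otherwise discuss). The sketch below is only an illustration under assumptions: the function names and the 0.7 cut-off are hypothetical, not Oracle-defined, and the input figures would be read from the wait event section of an AWR or Statspack report.

```python
def avg_wait_ms(total_wait_seconds, waits):
    """Average wait in milliseconds from a report's total wait time and wait count."""
    if waits <= 0:
        raise ValueError("waits must be positive")
    return 1000.0 * total_wait_seconds / waits


def classify_log_file_sync(sync_avg_ms, write_avg_ms, io_share=0.7):
    """Rough triage of 'log file sync' time.

    io_share is an assumed, hypothetical cut-off: if the redo write itself
    accounts for at least this share of the sync wait, suspect the I/O path
    (reasons 1 and 3); otherwise suspect CPU starvation or the commit rate
    (reasons 2 and 4).
    """
    if sync_avg_ms <= 0:
        raise ValueError("sync_avg_ms must be positive")
    share = write_avg_ms / sync_avg_ms
    return "io" if share >= io_share else "cpu_or_commit_rate"


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    sync_ms = avg_wait_ms(total_wait_seconds=120.0, waits=60000)
    write_ms = avg_wait_ms(total_wait_seconds=30.0, waits=60000)
    print(sync_ms, write_ms, classify_log_file_sync(sync_ms, write_ms))
```

The result is a starting point for investigation, not a verdict; the cut-off should be adjusted to the system being analysed.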
This event is also the major contributor to overall CPU usage.
In the above scenario, we cannot avoid the commits because the application requires them. What I did was reduce CPU contention so that LGWR does not have to wait for CPU.
The other aspect is where the online redo log files are located. We placed these files on raw devices so that the writes complete faster.
There is also a misconception that increasing the redo log buffer will solve the problem. This is not the case: fundamentally, the log buffer is flushed on every commit, when it is 1/3 full, or when 1 MB of redo has been generated. In this case a flush happens on every commit anyway, so there is no need for a redo buffer larger than 3 MB.
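The 3 MB figure follows from the two size-based rules alone: one third of a 3 MB buffer is exactly 1 MB, so any larger buffer still triggers a flush at 1 MB. A minimal sketch of that arithmetic, with a hypothetical function name and ignoring the per-commit flush:

```python
MB = 1024 * 1024


def size_flush_threshold_bytes(log_buffer_bytes):
    """Redo volume at which the size rules (1/3 full or 1 MB) trigger a flush."""
    return min(log_buffer_bytes // 3, 1 * MB)


if __name__ == "__main__":
    for size_mb in (1, 3, 6, 64):
        threshold = size_flush_threshold_bytes(size_mb * MB)
        print(f"{size_mb:>3} MB buffer -> flush at {threshold} bytes")
```

Under these rules the threshold stops growing once the buffer reaches 3 MB, and on a commit-heavy system the per-commit flush usually fires well before either size rule does.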
Also try to place the major hot tables and indexes on separate disks to spread the I/O load and improve disk throughput.
Team ORAKHOJ
Get free AWR / Statspack Analysis by sending mail to tuning@orakhoj.com