Wednesday, October 10, 2012

JRockit Flight Recorder: Analysing OIM11G environment


Today I am looking at JRockitMissionControl analysis into a JFR(JRockit Flight Recorder) file. Using this we can do several analysis and find what could be a cause of high CPU usage or memory leaks , for example.
Contextualizing:
To get these information(.jfr file) below. I started JRMC Flight Recording using JROCKIT_HOME/bin/jrcmd. Command examples:
1-Starting a Recording: jrcmd start_flightrecording duration=60s filename=myrecording.jfr
2-Checking Recording status: jrcmd check_flightrecording
3-Stopping a Recording: jrcmd stop_flightrecording recording=1
So, opening the generated jfr file using JRMC into my OIM11G(WLS 10.3.2) environment:

Fig 1: This tool has so many possibilities (monitor, manage, profile, and eliminate memory leaks in your Java application) and you also don’t need introduce the performance overhead normally associated with tools of this type. It’s functionality is always available on-demand, and the small performance overhead is only in effect while the tools are running.

Fig 2: For this selected time we can see into CPU/threads the ‘famous’ Socket Muxers in thread dumps, 5 times. Costing 2m 00s 769.455ms total.

And Looking into weblogic application logs:
=======================================
“ExecuteThread: ’2′ for queue: ‘weblogic.socket.Muxer’” daemon prio=5 tid=0x070b2148 nid=0×42 waiting for monitor entry [5887f000..5887fc28]
at weblogic.socket.PosixSocketMuxer.processSockets(PosixSocketMuxer.java:92)
- waiting to lock (a java.lang.String)
at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:33)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:220)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:179)
======================================

Fig 3: Muxer threads are shown as being Stuck since they’re all locked on the same object. This is probably treated as a deadlock condition. But usually they are not deadlocks just regular locks. this thread gains an exclusive lock on an object(my example Quartz as you can see into Fig5.) to perform some action, then frees it allowing the next thread to gain access.

Fig 4: And from memory analysis in this period we could see ‘weblogic.socket.Muxer’ allocation 32.61% of the memory.

Fig 5.
According to the analysis during this selection time, the thread events executed the most was QuartzSchedulerThread (more details about this class can be found on the scheduled tasks article from a few months ago). This thread is part of OIM into Scheduler jobs(Scheduled tasks).
And also from Thread perspective. You will see something like this below:

Fig 6:Thread waiting for another thread to finish.
So in this specific case and time, the object that is ‘creating some locks, thread wait and consuming more memory’ is one OIM scheduled job. Now, you need to analyse your schedule task. So, it’s very important tool analysis because with this resource we can go to the right logs and see exactly which object or event is requesting more resources at this time, from the server perspective.
Key benefits of fligh Recorder:
–> The overhead capturing these events are minimize ar maximun. THis is very important into systems that are almost 100% of usage.
–> This will just capture weblogic info per requests. And will keep high performance in cases that system activity under load.
–> The time interval in JFR flight recording buffer is maximized. This will give you a accurate history of your data.
Donwnload JRMC: http://www.oracle.com/technetwork/middleware/jrockit/downloads/index.html
Docs and Reference: 1.0- http://docs.oracle.com/cd/E15289_01/index.htm
2.0- http://docs.oracle.com/cd/E15289_01/doc.40/e15070/usingjfr.htm
I hope this helps,
Thiago Leoncio.