Saturday, January 31, 2015

WebLogic STUCK & Hogging Threads: How to Deal with STUCK & Hogging Threads



What is a STUCK thread?

How to deal with a STUCK thread?

What is a Hogging thread?

How does WebLogic determine that a thread is Hogging?


What is a STUCK thread?

A STUCK thread is a thread that has been processing a request for longer than the maximum time allowed for a thread to complete a request. This limit defaults to 600 seconds and can be configured from the admin console. Depending on the circumstances, for example an intermittent issue with the network, the database or the application server, a STUCK thread may be released after some time. In most cases, though, when one or more threads are declared STUCK there is an underlying problem, temporary or permanent, that needs a fix.
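As a rough illustration of the rule above, here is a minimal sketch that flags any thread whose current request has been running longer than the limit. It is not WebLogic code: the function name `find_stuck`, the thread names and the timestamps are all hypothetical, and only the 600-second default comes from this post.

```python
import time

# Default "stuck thread max time" quoted in this post, in seconds.
STUCK_THREAD_MAX_TIME_SECS = 600


def find_stuck(request_start_times, now=None, max_time=STUCK_THREAD_MAX_TIME_SECS):
    """Return the names of threads whose current request has run longer than max_time.

    request_start_times maps a (hypothetical) thread name to the epoch time
    at which the thread started working on its current request.
    """
    if now is None:
        now = time.time()
    return sorted(
        name
        for name, started in request_start_times.items()
        if now - started > max_time
    )


# Made-up sample data: thread 1 has been busy for 700 s, thread 2 for 150 s.
starts = {"ExecuteThread-1": 1000.0, "ExecuteThread-2": 1550.0}
print(find_stuck(starts, now=1700.0))  # ['ExecuteThread-1']
```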

How to deal with a STUCK thread?

End users sometimes ask whether there is a way to kill a STUCK thread. As of now there is not: a STUCK thread cannot be killed or released on demand. What you can do is find out why it got stuck.

When you see STUCK threads:

  • Take multiple thread dumps immediately.
  • Review the thread dumps, or check the console (managed server > monitoring > threads), to see exactly where each thread got stuck.
  • See how many threads are stuck.
  • Check whether the stuck thread count is increasing or constant.
  • If it is constant, check whether the threads are stuck in the same area (application code etc.) or in different places.
  • If it is increasing, there is probably a serious problem. Do a quick health check of your application server, your database and any other integrated systems your application reaches, such as an LDAP server for authentication or other APIs and web services. In parallel, review the thread dumps for STUCK threads and share them with your developers so they can analyze them quickly.
  • If you have one, two or a few constant STUCK threads and the count is not increasing, you can monitor them for some more time to see whether they clear. If they do not, the only way to clear them is to restart your managed server(s), and it is better to restart and clear them before they cause any further impact.
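The "how many are stuck, and are they stuck in the same place?" questions from the checklist can be answered quickly with a small script over a saved thread dump. The sketch below assumes a jstack-style text dump in which WebLogic marks stuck execute threads with a `[STUCK]` prefix in the thread name. The function names and the sample dump are made up for illustration, and real dumps may need a more forgiving parser.

```python
import re
from collections import Counter

# A thread entry starts with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]*)"')


def stuck_top_frames(dump_text):
    """Map each [STUCK] thread name to its topmost stack frame (None if no frame)."""
    result = {}
    current = None  # name of the stuck thread being parsed, if any
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            name = header.group("name")
            current = name if "[STUCK]" in name else None
            if current:
                result[current] = None
            continue
        stripped = line.strip()
        # Only the first "at ..." line after the header is the top frame.
        if current and result[current] is None and stripped.startswith("at "):
            result[current] = stripped[3:]
    return result


def summarize(dump_text):
    """Return (number of stuck threads, Counter of their top frames)."""
    frames = stuck_top_frames(dump_text)
    return len(frames), Counter(frames.values())


# Made-up sample dump, trimmed to the lines the parser cares about.
SAMPLE_DUMP = '''\
"[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x1 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at com.example.OrderDao.load(OrderDao.java:42)

"[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x3 nid=0x4 waiting
    at java.lang.Object.wait(Native Method)

"[STUCK] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x5 nid=0x6 runnable
    at java.net.SocketInputStream.socketRead0(Native Method)
'''

count, frames = summarize(SAMPLE_DUMP)
print("stuck threads:", count)
for frame, n in frames.most_common():
    print(n, frame)
```

Running the same summary over several dumps taken a few seconds apart shows whether the count is growing and whether the threads keep sitting on the same frame.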

  
What is a Hogging thread?


If you are reading this post, you are probably already aware of what a hogging thread is. Still, let me define it again in a single line: "A hogging thread is a thread which is taking more than the usual time to complete a request and may go on to be declared STUCK."


How does WebLogic determine that a thread is Hogging?

As we know, a thread is declared STUCK if it runs for more than 600 seconds (the default, which you can increase or decrease from the admin console).

So how does WebLogic decide that a thread is hogging? Here is the logic, which I learned from a note on an Oracle internal portal.

  1. There is an internal WebLogic poller which runs every 2 seconds (the default, which can be changed).
  2. It checks the number of requests completed in the last 2 seconds.
  3. Then it checks how much time each of them took to complete.
  4. Then it takes the average time of all the requests completed in the last 2 seconds.
  5. Then it multiplies that average by 7, and the result is treated as the "usual time to complete a request".
  6. Finally, WebLogic checks each thread that is currently executing a request and compares its running time with the value from step 5. Any thread running longer than that value is declared a hogging thread.


For example

  1. At a particular moment, the total number of requests completed in the last two seconds is 4.
  2. The total time taken by all 4 requests is 16 secs.
  3. Req1 took 5 secs, Req2 took 3 secs, Req3 took 7 secs and Req4 took 1 sec.
  4. Average time = 16/4 = 4 secs.
  5. 7 * 4 = 28 secs.
  6. WebLogic now checks all executing threads to see which have been running for more than 28 secs. Any such thread is declared a hogging thread.
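The same arithmetic can be written as a few lines of Python. This is only a sketch of the logic as described in this post, not WebLogic's actual implementation. The function names, thread names and running times are hypothetical, while the multiplier of 7 and the four request times come from the example above.

```python
# Multiplier described in this post.
HOG_MULTIPLIER = 7


def hogging_threshold(completed_secs, multiplier=HOG_MULTIPLIER):
    """Return multiplier * average completion time, or None if nothing completed."""
    if not completed_secs:
        return None  # no completed requests in the window, so nothing to compare against
    return multiplier * sum(completed_secs) / len(completed_secs)


def find_hogging(running_secs, completed_secs):
    """Return the names of running threads that exceed the hogging threshold."""
    threshold = hogging_threshold(completed_secs)
    if threshold is None:
        return []
    return [name for name, elapsed in running_secs.items() if elapsed > threshold]


# Request times from the worked example: 5 + 3 + 7 + 1 = 16 secs over 4 requests.
completed = [5, 3, 7, 1]

# Made-up running times (in seconds) for threads that are still busy.
running = {"ExecuteThread-1": 12.0, "ExecuteThread-2": 31.5, "ExecuteThread-3": 28.0}

print(hogging_threshold(completed))      # 28.0
print(find_hogging(running, completed))  # ['ExecuteThread-2']
```

Note that a thread running for exactly 28 secs is not flagged here, because the description says "more than" the computed value.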



The only thing you can change in the hogging thread configuration is the poller interval (the Stuck Thread Timer Interval parameter), which is 2 secs by default. You can set it to a different value, for example 4 secs, if you want the poller to run every 4 secs instead of every 2 secs.