WebLogic-Coherence slow performance – ClearStone to the rescue

By Bill Nigh

I recently heard of a use case involving Oracle WebLogic and Coherence that benefited from Evident’s performance monitoring. An enterprise was running an application that used WebLogic for its application server tier and Coherence for its distributed data caching tier. The application was running on 8 Web servers, and WebLogic had been configured with about 130 service threads allocated for incoming requests.

The company sent out a promotional email blast. It got a huge response, and the Web site underperformed; it could not serve the content fast enough. The WebLogic service threads were “taking their own sweet time” to finish a request, in the endearing phraseology of one of our guys, and this was creating a big backlog. Performance bottlenecks began to form, and eventually the system collapsed a number of times.

The WebLogic system failures prompted a number of safeguards. The ops staff would monitor the number of threads. When the number of available threads began to get low, the working assumption was that the system would soon collapse, so the ops staff took action based on that threshold, taking thread dumps to try to analyze what was hanging and restarting the Web server. Only when the number of users dropped off would the system stabilize. (And some of the users dropping off were probably customers frustrated that the link they clicked on led to a site with poor performance.)
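
To make the ops routine above concrete, here is a minimal Java sketch of the "threads are running low, grab a dump" check. It is an illustration under assumptions, not what this team actually ran: the class name ThreadPoolWatch, the method names, and the threshold are all hypothetical, and the idle-thread count is passed in as a plain number because the post does not say how it was collected. The dump itself uses only the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: names and threshold are assumptions for illustration.
class ThreadPoolWatch {

    // True when the number of idle service threads is at or below the alert threshold.
    static boolean isRunningLow(int idleThreads, int threshold) {
        return idleThreads <= threshold;
    }

    // Captures the name, state, and stack frames of every live thread in this JVM.
    static String captureThreadDump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int idleThreads = 4;   // assumed reading from whatever monitors the pool
        int threshold = 10;    // assumed alert level
        if (isRunningLow(idleThreads, threshold)) {
            System.out.println(captureThreadDump());
        }
    }
}
```

A dump like this shows where each thread is stuck at that moment, which is why it helps to take one before restarting anything. On its own, though, it only describes one tier.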

There was a relationship between WebLogic Server and Coherence in the application’s behavior; when the thread count dropped, monitoring of Coherence revealed an increase of activity in the cache, such as GETs.

As part of a troubleshooting consult, we used Evident ClearStone to monitor WebLogic and Coherence. By looking at performance metrics on both the application and caching tiers, we were able to identify the root cause as the application source code. Part of the application code was unintentionally retrieving all the information from the caching tier, as seen by a big spike in requests going to the caching tier. The load on the caching tier was causing it to give up. The inter-tier relationship was key to discovering the problem, which is where ClearStone came to the rescue. In the future, ClearStone’s advanced heuristics will make it even easier to discover interplays and causal relationships between tiers in complex NoSQL systems.
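
The post does not say exactly how the application ended up pulling everything from the cache, so the following Java sketch shows just one plausible shape of that kind of bug, under stated assumptions. CountingCache is a hypothetical in-memory stand-in for a cache tier (it is not the Coherence API), and lookupByScan and lookupByKey are made-up names. The counter makes visible how a lookup written as a full scan multiplies the work the caching tier has to do per request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a cache tier; not the Coherence API.
class CacheAccessSketch {

    static final class CountingCache {
        private final Map<String, String> store = new HashMap<>();
        // Counts how many entries the cache has had to serve.
        final AtomicLong entriesRead = new AtomicLong();

        void put(String key, String value) {
            store.put(key, value);
        }

        // Targeted read: the cache serves exactly one entry.
        String get(String key) {
            entriesRead.incrementAndGet();
            return store.get(key);
        }

        // Bulk read: the cache serves every entry it holds.
        Map<String, String> getAll() {
            entriesRead.addAndGet(store.size());
            return new HashMap<>(store);
        }
    }

    // The costly pattern: fetch everything, then filter in the application.
    static String lookupByScan(CountingCache cache, String key) {
        for (Map.Entry<String, String> entry : cache.getAll().entrySet()) {
            if (entry.getKey().equals(key)) {
                return entry.getValue();
            }
        }
        return null;
    }

    // The intended pattern: ask the cache for the one key that is needed.
    static String lookupByKey(CountingCache cache, String key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        CountingCache cache = new CountingCache();
        for (int i = 0; i < 1000; i++) {
            cache.put("k" + i, "v" + i);
        }
        lookupByScan(cache, "k42");
        System.out.println("entries read after scan: " + cache.entriesRead.get());
        lookupByKey(cache, "k42");
        System.out.println("entries read after keyed get: " + cache.entriesRead.get());
    }
}
```

Both lookups return the same value, so nothing looks wrong from the application side. The difference only shows up as load on the caching tier, which is why watching both tiers together mattered here.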

Learn more about our performance monitoring solution for Java, NoSQL and web servers

Categories: General
Date: March 1st, 2011