Articles tagged ClearStone

The views expressed in this blog are strictly personal, and do not necessarily represent the views of Evident Software.

By Bill Nigh

What special challenges are involved in pursuing useful NoSQL DB performance comparisons? Some of the challenges are: single point solutions, the lack of a common query language or API to integrate metrics from different tiers into a common view, the heterogeneous nature of the technology, hard-to-predict lifecycles for objects such as nodes and caches, and other challenges, outlined here.

One approach to NoSQL DB performance comparison is to do time-on-time comparisons of single or multiple metrics with charts representing different monitored resources.  Don Jeffery puts it this way: “Depending on what your objective is when you are benchmarking, one of the things that you might do is to take a quiescent system, one that’s spun up but not incurring any load, and introduce some load to it over a period of time. For example, in Coherence, you establish the cluster, introduce, say, eight nodes, maybe eight JVMs over four loads, with caches introduced, but no clients.”

“You would monitor this quiescent system and inspect certain key metrics and expect them to be relatively flat. You’d then introduce loads and keep track of the times and loads introduced.” (Since we store history as time series data in Cassandra, getting this information for potentially long stretches of time is not an issue.)

Don continues: “If you keep track of the time and the load, then you can introduce time-on-time comparisons where you look at a number of key metrics. So now I have a benchmark; what I might want to do is vary that load in a certain way over a different time period, and perhaps compare against the initial benchmark and the previous set of measurements, to see if I can establish any kind of a trend. For example, I’ve attempted to increase the GET burden against the cache by 50%; does that translate into an equivalent translation of certain key metrics, or is there no linear relationship?”
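To make Don's question concrete, here is a minimal Python sketch of a time-on-time comparison. It is not ClearStone code: the sample values, the latency metric, and the helper names `align_by_offset` and `percent_change` are all hypothetical. It only illustrates the idea of aligning two runs by elapsed time and checking whether a 50% increase in GET load produces a proportional change in a key metric.

```python
from statistics import mean

def align_by_offset(samples, start):
    """Re-key (timestamp, value) samples by seconds elapsed since the run started."""
    return {ts - start: value for ts, value in samples}

def percent_change(baseline, comparison):
    """Percent change of the comparison run's mean over the baseline's mean,
    using only the offsets present in both runs."""
    shared = sorted(set(baseline) & set(comparison))
    if not shared:
        raise ValueError("runs share no common offsets")
    base_mean = mean(baseline[o] for o in shared)
    comp_mean = mean(comparison[o] for o in shared)
    return (comp_mean - base_mean) / base_mean * 100.0

# Hypothetical data: average GET latency (ms) sampled every 30 seconds.
baseline_run = [(1000, 2.0), (1030, 2.0), (1060, 2.0)]  # original load
heavier_run = [(5000, 2.5), (5030, 2.5), (5060, 2.5)]   # GET load raised by 50%

baseline = align_by_offset(baseline_run, start=1000)
heavier = align_by_offset(heavier_run, start=5000)

load_change = 50.0
latency_change = percent_change(baseline, heavier)
print(f"load +{load_change:.0f}%, latency {latency_change:+.1f}%")
print("roughly linear" if abs(latency_change - load_change) < 5 else "not linear")
```

Keying both runs by offset rather than wall-clock time is what lets two different time periods be laid on top of each other; the 5-point tolerance for "roughly linear" is an arbitrary choice for the example.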

We are about to introduce time-on-time measurements, and thereby support this type of analysis, with some of the visualizations we are planning for future releases of ClearStone. Multiple perspectives in the ClearStone real-time dashboard, shared and customized, already support the presentation of charts from various resources in a free-form, easy-to-create manner.

Performance comparisons of different NoSQL DB technologies may be possible, Don said, if they are similar, say, two data grid technologies, with a deliberate and documented use of load and benchmarks. “You might look at response times, for example, or GETs, or look at cache hits.” This could enable a useful technology choice in situations where you know what kind, what size and what elasticity you anticipate in your data.

“Take the example of ehcache performance. The first consideration is to define what the key metrics are. One way to do that even with our system today is to establish reasonable thresholds and to use our thresholding policy tools to set those up and capture them and create events. That way we see if we violate any of those thresholds. We start with a quiescent system and introduce a load over an hour; during that time, ECS is running, so our collectors are doing their job. Line charts we assemble are being annotated with the events we’ve defined at the points of transition to the threshold.”
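The thresholding idea in the quote above can be sketched in a few lines of Python. This is an illustration only, not ClearStone's policy engine: the function name `threshold_events`, the heap-usage samples, and the 80% threshold are assumptions made up for the example. It emits an event at each point where a metric transitions across the threshold, which is the kind of annotation one would place on a line chart.

```python
def threshold_events(samples, threshold):
    """Return (timestamp, kind) events at the points where the series
    crosses the threshold: 'violated' going above, 'cleared' coming back."""
    events = []
    above = False
    for ts, value in samples:
        if value > threshold and not above:
            events.append((ts, "violated"))
            above = True
        elif value <= threshold and above:
            events.append((ts, "cleared"))
            above = False
    return events

# Hypothetical heap-usage samples (percent), one per minute during a load run.
heap_used = [(0, 40), (60, 55), (120, 82), (180, 91), (240, 70), (300, 85)]
events = threshold_events(heap_used, threshold=80)
for ts, kind in events:
    print(f"t={ts}s threshold {kind}")
```

Only the transitions generate events, so a metric that stays above the threshold for several samples produces one annotation rather than one per sample.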

“We can do this with our product today, but not as conveniently as with 5.0, where we could use a single chart. We can do another test run and vary the load. For a data caching product, for example, we introduce additional caches or perhaps retune the network or cluster and start it up again with larger or smaller Java heap sizes or any number of other parameters we want to investigate; then we look at the behavior; then we run another load test; or perhaps same load, but vary the heap sizes. What we can do is to create a perspective in the real time dashboard; create two charts and compare them by using a metric that correlates the two time periods. What we hope to do soon is to have this comparison appear in the same chart. While we can do a better job of integrating the visualization, we can support those use cases today with two separate charts that are visually aligned in such a way as to allow comparison.”

Learn more about our performance monitoring solution for Java, NoSQL and web servers.

By Bill Nigh

Organizations using the cloud model of computing have several ways of handling cloud performance management. In most cases, with so-called ‘public’ cloud providers, management is encapsulated in the contract and is not an issue.

For those considering a ‘private’ implementation of cloud technology within the firewall, cloud performance management can be crucial. And whether the cloud is public, private, or hybrid, this is not something legacy management can necessarily glide into. As Bernd Harzog writes here: “It is clear that ‘cloud enabled’ APM is becoming a category into its own and that legacy APM products like CA/Wily, IBM ITCAM, and HP Diagnostics will have to be reinvented to meet this use case” ... or passed in the fast lane by tools vendors that are cloud-savvy from day one?

First, what exactly is Cloud Performance Management, or cloud performance monitoring?
Harzog talks about:

  • Configuration Management, deemed critical due to the rate of change and degree of resource sharing, which “makes a continuous self-discovered understanding of configuration essential”.
  • Monitoring utilization of the “key physical and virtual resources that .. support workloads in virtual and cloud based environments. These resources include virtual and physical instances of CPU, memory, network I/O and storage I/O”. Harzog says that these resources “need to be monitored in near real time for performance management purposes, and trended over time for the purpose of capacity planning and management.”
  • Focusing on “the response time of the infrastructure to requests for work placed upon it by workloads”. This is important because “in virtualized and cloud based environments one can no longer infer the performance of an infrastructure from its resource utilization. Therefore a new category of tools are needed which focus upon measuring the response time of all or portions of the infrastructure on behalf of its supported applications.”
  • Applications Performance Management, which focuses upon “discovering the topology of an applications system (and keeping that topology map up to date in a dynamic private/public cloud environment), and measuring the round trip and per hop response times for each application running on the virtual or cloud based infrastructure”.
  • Service Assurance, “an emerging category that will combine elements of all of the above disciplines with configuration management and the ability to guarantee the performance of the most important applications by measuring their response times, and then automatically allocating the correct levels of resources to these applications (and denying them to less critical workloads).”

(I’m pleased and proud to mention that Mr. Harzog subsequently gave us a favorable mention in this post.)

How good is cloud performance management without useful and complete cloud performance monitoring? From the above, it appears that cloud performance management without the best possible cloud performance monitoring will not be very good at all. We have specialized in application performance monitoring for many years at Evident. We are working on making ClearStone even more cloud-friendly, and are excited about the possibilities; stay tuned.

By Bill Nigh

SQL Performance Monitoring has a well-understood model underlying some very mature monitoring and management tools. The RDBMS standard, of which SQL is a part, has a number of strengths that have earned it its central place in so many enterprises: features such as triggers, views, stored procedures, and declarative referential integrity. The salient RDBMS feature for SQL performance monitoring, however, is the presence, in deference to the RDBMS standard, of a large set of system tables, variously called the CATalog in Oracle, or the SYS-prefixed family of system tables in Sybase and MS SQL Server. The Relational Model, promulgated and championed by E. F. Codd, specified that a compliant RDBMS expose its internals to reporting using the same language and data model that was used for actual data. For this reason, even a bare-bones command line interface to a relational database can, through SQL queries, harvest useful system data, thank Codd.
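As a small illustration of that point, here is a hedged Python sketch using SQLite, chosen only because it ships with Python's standard library. SQLite's schema table `sqlite_master` stands in for the much richer system tables of Oracle, Sybase, or SQL Server, which have different names and columns; the `orders` table and its index are hypothetical. The same `SELECT` syntax reads both user data and the catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
conn.executemany("INSERT INTO orders (customer) VALUES (?)", [("a",), ("b",), ("c",)])

# The catalog is queried with ordinary SQL, just like user data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
index_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index' AND tbl_name = 'orders'"
).fetchone()[0]

print(catalog)
print(f"orders: {row_count} rows, {index_count} index(es)")
conn.close()
```

Nothing beyond a SQL prompt is needed to collect figures such as row counts or the number of indexes, which is exactly what the NoSQL products discussed next do not offer in a common form.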

The performance monitoring area for NoSQL DBs differs in two major respects:

  • There are many products that come under that ‘elastic’ label; the Wikipedia article on NoSQL lists twelve types in its taxonomy (!)
  • As the name suggests, NoSQL has no SQL: no powerful and, more importantly, standard language that can be used to query the system for performance metrics or other useful information about its resources, of the kind available within an RDBMS, such as table size, number of indexes, percentage full of logical devices, and so forth.

NoSQL performance monitoring has to rely on tools that use an API, when present. Some NoSQL products have associated query languages, but, again, there is no common, well-known, and well-documented query language for metrics, which puts a burden on organizations that have to monitor multiple NoSQL implementations.

This is where we come in.

Learn more about our performance monitoring solution for NoSQL, web apps and web servers: http://www.evidentsoftware.com

By Scott Barnett

I just happened to be in California last week when Membase and CouchOne announced their merger. First, this is excellent news for the NoSQL movement, and it seems happy times at the new Couchbase. I happened to be in Palo Alto when I saw Bob’s blog, so I wandered over to Tied House and shared a few pints with the folks. I had a chance to meet Bob and several members of the Membase team (Melinda, Perry and of course James, who could not wipe the smile off his face!). I also had a chance to meet Damien (and his lovely wife) from the CouchOne team, and we got serenaded with a new merger song which was penned at the party. There were also several folks from companies that were using NoSQL in their environments, including one guy whose name I have forgotten (sorry!), but I do remember he worked in the same building as Membase but was using MongoDB for his application! Shame :-). I met folks from Facebook, Canonical, and Battery Ventures, among others – it’s always great to feel the Valley vibe.

Beyond drinks and laughing, there was cause for real celebration. Both the Membase and Couch folks are seeing significant traction, and they had some great positioning in mind for the combined company. The combined ability to do caching and clustering with a document database is the consolidation we predicted would happen in the NoSQL market. CouchOne’s positioning with mobile gives them yet another growing channel for usage of Web and cloud applications where performance (and not transactions) is paramount.

Of course, we feel that this makes ClearStone’s positioning even more important as the leading APM tool for NoSQL – developers and operations will need tools that provide deep visibility into the NoSQL “stack”, which more often than not includes Couchbase, Membase or Memcached, as well as stats, correlation and relationship-mapping with the other tiers such as RDBMS, Web, Application, System, Network, etc., etc.

So, while we just released 5.0 Beta with Management Packs for Memcached stats and performance optimization and Membase, we will be adding support for CouchOne/Couchbase as they expose more monitoring capabilities.  In fact, 5.0 comes with a RESTful API that allows developers to build their own Management Packs, so if somebody wants to take a crack at the first Couchbase adapter for ClearStone, we’re ready for you!!

Congratulations again to both the Membase and CouchOne teams, and best wishes for continued success.

Learn more about our solution for Memcached stats, monitoring and optimization

By Scott Barnett

DevOps might just be a buzzword to some people, but we’ve been seeing the need for integrating development and operations firsthand. Customers tell us they start off needing ClearStone (our monitoring and management platform for NoSQL DBs, data caching, Java and web applications) in development and testing, sometimes many months before an application makes its way into production.

In the past, the idea of developers using ClearStone presented a bit of a challenge for us, because our business model was developed with production applications in mind.  But as we’ve continued talking to prospects and customers, it’s been clear that we need not only to make ClearStone affordable and accessible to developers, we need to continue to innovate and add features specifically for developers (features like cache browsing, load testing, management automation, integration with build/deploy tools). Whether or not the operations team uses ClearStone, developers need ClearStone’s real-time monitoring, detailed analytics, and rich visualizations.

We are proud to announce our first official product for developers: the ClearStone Development Pack. 

We have been quietly offering Development Packs for about a month and are now ready to formalize the product.  A Development Pack is sold on a per-developer (per-seat) basis.  This is a fully functional version of ClearStone – and it can be run on an unlimited number of servers as long as they are not production servers.  This allows a developer to use ClearStone on their own workstation for unit testing, but also on their Q/A and staging servers for load/stress testing, capacity planning, and active/passive monitoring in development.  Once the application is ready for production and onto production servers, customers will be required to purchase production licenses of the product.  The production pricing model will be changing too in early 2011, but I’ll save the details about that for another post next month.

The Dev Pack will be priced at $995/user/year – and comes with email support, free upgrades/updates to the software, and access to the ClearStone wiki. To celebrate our launch, we’re offering a 50% discount on Dev Packs through the end of December – so pick one up (or more!), get the value of ClearStone in development, and contact us to learn how to best leverage ClearStone in both development and then production.

We’re confident you will find the tool more than valuable for the price, and look forward to working with you and your team.  Please be sure to tell your friends as well!

By Ivan Ho

I was invited by Oracle to talk about how Evident Software leverages Oracle Coherence as an in-memory distributed cache within Evident ClearStone, Evident’s tool for Oracle Coherence monitoring.

There are many advantages that Coherence brings to the table for any real-time system. Within Evident ClearStone, the two primary requirements for Oracle Coherence monitoring were:

  • To serve as an elastic in-memory data store capable of storing and accessing large volumes of performance data at very high rates without compromising the performance of the monitored resources or of ClearStone.
  • Provide us with real-time “callback” interfaces for triggering our pipelines for data processing.

Through our experience with Coherence monitoring, we’ve become one of the leading APM experts on Oracle Coherence, which is another reason why the Evident ClearStone product is also a management and monitoring tool for Oracle Coherence data grids.

For additional commentary about customer benefits and product introduction, please listen to the 3-minute podcast with Oracle here:

http://streaming.oracle.com/ebn/podcasts/media/943261_Ivan_Ho_111610.mp3

Learn more about our solution for Oracle Coherence monitoring

By jclark

The folks over at RightScale wrote up a great post on Cluster Monitoring with some new visualizations they’ve been using. The visualizations were certainly interesting (especially to me), but there is a key message in the post. The big takeaway is the importance of having a consolidated time series view of multiple entities so the effect of an anomaly can easily be seen across an environment. In a clustered environment, individual entities are inevitably related, and an entity in trouble can, over time, affect other entities. The RightScale folks clearly showed this in their use case.

The difficulty arises when the density of information becomes too great to discern a problem. Imagine 1000 servers represented in a heatmap with 600,000 data points. The amount of information to be shown could exceed the number of pixels you have available to show it! But an environment of that size would not likely be in the hands of a developer, for whom detail is critical. Production environments can conceivably reach hundreds or thousands of entities, and monitoring needs change along with the change in roles between Developer and Operations (DevOps). Information critical to developers can become “noise” to Operations folks. The detailed information needs to be distilled down to exceptions that are leading to, or clearly indicate, a problem in the environment.

In Evident ClearStone, we bridge the gap between developers and operators by providing exception-focused views backed by the detail used to evaluate the exception condition. Furthermore, there is value in showing dissimilar but related entities, like the constituent components of a NoSQL or Data Caching Platform (DCP) fabric, to see the cause and effect of related cascading failures. Below is a summarized view of various Oracle Coherence components showing health indicators for every 30 seconds of the last 15 minutes. The color of each health indicator represents the severity of the health condition.

Using the exception-focused view above, you can drill into the specific details of a particular entity or entities to attempt to determine a root cause.
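For readers who want to picture how such a strip of health indicators might be produced, here is a rough Python sketch. It is an assumption-laden illustration, not how ClearStone computes health: the severity names, the `health_strip` function, and the sample events are all hypothetical. It rolls raw events up into one indicator per 30-second bucket over the last 15 minutes, keeping the worst severity seen in each bucket.

```python
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}
NAMES = {rank: name for name, rank in SEVERITY.items()}

def health_strip(events, now, window=15 * 60, bucket=30):
    """Summarize (timestamp, severity) events into one indicator per bucket,
    keeping the worst severity seen in each bucket; empty buckets are 'ok'."""
    start = now - window
    worst = [0] * (window // bucket)
    for ts, severity in events:
        if start <= ts < now:
            i = (ts - start) // bucket
            worst[i] = max(worst[i], SEVERITY[severity])
    return [NAMES[rank] for rank in worst]

# Hypothetical events for one cache node; timestamps are in seconds.
now = 900
events = [(10, "warning"), (25, "critical"), (40, "warning"), (899, "warning")]
strip = health_strip(events, now)
print(len(strip), strip[:3], strip[-1])
```

Fifteen minutes at 30-second resolution is only 30 indicators per entity, which is the kind of distillation that keeps a view of many entities readable while the underlying detail stays available for drill-down.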

By Scott Barnett

Several of the Evident team members had an opportunity to attend Hadoop World 2010 yesterday in NYC. The event was very well attended – reports have it that attendance went from 400 last year to over 900 this year. It’s hard for me to compare since I wasn’t at last year’s event, but I can report that this year’s event had excellent speakers and material, and the day flew by.

Anyone questioning the use of Hadoop in production environments would have been well served to hear some of this year’s talks. The Evident team covered practically every talk between us – I spent most of my time in the Grand Ballroom listening to customer case studies around Hadoop. There were the expected players (Twitter, AOL, eBay, Yahoo) but also some perhaps unexpected guests (Bank of America, Chicago Mercantile Exchange, GE, HP, Orbitz). From our biased perspective, it was great to hear that most of the users were looking for better ways to manage/monitor their Hadoop grids (as well as the overall infrastructure). Exactly what we wanted to hear!

While a handful of the talks were disappointing, most of them had very useful information and relevant statistics about the benefits and implementation of Hadoop. It isn’t all about massive scale – rather, the ability to create a simple elastic grid is a great reason to get started by itself – the fact that Hadoop can scale linearly is gravy. Tim O’Reilly had an interesting (if a bit meandering) keynote regarding the consequences of living in a world of data. Mike Olson from Cloudera kicked off the event and did a good job praising the Hadoop community and bashing Oracle. And the Cloudera team overall was quite good keeping things flowing and making their presence known. The vendor kiosks were packed during the breaks, and I met several interesting folks throughout the day, one of whom was downloading ClearStone while we were talking – thanks!

I hope we can come back next year, perhaps as one of the sponsors of the event. It was exciting and full of energy.

By Scott Barnett

I attended the Boston Big Data Summit last week, which was extremely well attended and an excellent session. Moderated by Fred Holahan (who announced his new position as VP, Marketing at VoltDB, congrats!), the panel discussion included folks from 10gen (the “MongoDB guys”), Cloudera (the “Hadoop guys”), Infobright and VoltDB. (As an aside, it seems that Cloudera has done a great job branding their company around Hadoop, but 10gen seems to want to sit behind the Mongo brand?)

Anyway, the topic was real-world problems that each of the vendors’ solutions can address. Each vendor got 10 minutes to talk about a use case that was relevant to their solution. Without going into the details of each use case, what struck me was how similar the excitement of this space is to what happened in the mid-’90s with Java Application Servers.

Questions from the audience focused on scalability and “hardening” of each solution – as well as limitations of each solution as it pertained to certain use cases. Each of the vendors tried to defer to the others with regard to specific use cases for specific products – I’m not sure how long that will last, as there is significant overlap of the solutions – while each one does focus on a specific area, it will be impossible for them not to explore expansion into other capabilities.

There was also a discussion regarding management/monitoring of applications written with these tools. Both Cloudera and VoltDB mentioned that they are releasing management functionality in the next releases of their respective products – this is a good sign, as no serious development shop is going to put applications in production without basic management, so it indicates that these vendors are getting these types of questions from their community. There was also an excellent comment from the audience regarding DevOps – that their operations team has a detailed checklist that development is required to fill out before their applications can be put in production – and part of the checklist is to identify what tools are needed to manage and monitor the application, and how those tools can interoperate with the “standards” already in place. This had personal resonance to what we are seeing at Evident – where it is the developers and architects who are the initial users of ClearStone, and they then recommend production use of our tool to Operations once the applications are ready to go live. It also confirmed our belief that we need to continue to make ClearStone easier and useful for developers during their build/deploy process.

Overall the session was excellent. Since it was my first Big Data summit and I didn’t know the “rules”, I recommend they publish the following guidelines (assuming they are standard):

  • Dinner is served and it’s really good, so don’t snack beforehand :-)
  • Networking is primarily done before the event, and it ends later than stated, so if you have someplace else to be, get there a little early

By Scott Barnett

Our friends at Shopzilla talked about their application environment recently, and particularly around how they measure and visualize metrics from a wide variety of sources. This goes under the general category of DevOps – bringing development and operations closer together, and giving operations engineers the highly configurable, agile tools that developers have been enjoying for some time. To realize the vision of DevOps, an organization must be able to collect metrics and events from a variety of sources, bring that data together in an intelligent way (correlate it), and then present it in the best possible way for each of the various audiences in the organization. And all these steps must occur in near real time–in other words, the application environment must be able to analyze a vast amount of data fast enough for developers, operations staff, and business stake-holders to take whatever action might be necessary to optimize IT operations. No small feat – Juan Paul talked about half a dozen tools they use (including ClearStone) to manage this task.

So, we’ve been thinking here at Evident about an ambitious task – what if ClearStone could manage more of the DevOps problem? It’s clear that we are already in use within the development and operations community when it comes to managing and monitoring Oracle Coherence – but that’s just the first step. What about managing the application tier, other data caching/NoSQL environments, perhaps even database and system level metrics? That is the direction that Evident is moving in – witness our expansion in DCP to support memcached, with others coming soon. Also, AppServer containers such as WebLogic, JBoss, and Tomcat – out-of-the-box capabilities to manage, monitor and instrument these containers in conjunction with the DCP layer. Also, grid technologies such as Hadoop, GridGain, and DataSynapse.

Do you have custom code or your own containers? No problem, how about we give you a toolkit to bring those in yourself? And management plug-ins so that operations can leverage their existing tools for managing their environment and providing more automated alerting and control to the production environment.

Ambitious it is, but the feedback we’re getting and progress we’re making has been very encouraging. And we hear more people doing what Juan Paul and Shopzilla are doing, so we’re pleased with that path.