What special challenges are involved in pursuing useful NoSQL DB performance comparisons? Some of the challenges are: single point solutions, lack of a common query language/API to integrate metrics from different tiers into a common view, the heterogeneous nature of the technology, hard to predict lifecycles for objects such as nodes and caches, and other challenges, outlined here.
One approach to NoSQL DB performance comparison is to do time-on-time comparisons of single or multiple metrics with charts representing different monitored resources. Don Jeffery puts it this way: “Depending on what your objective is when you are benchmarking, one of the things that you might do is to take a quiescent system, one that’s spun up but not incurring any load, and introduce some load to it over a period of time. For example, in Coherence, you establish the cluster, introduce, say, eight nodes, maybe eight JVMs over four loads, with caches introduced, but no clients.”
“You would monitor this quiescent system and inspect certain key metrics and expect them to be relatively flat. You’d then introduce loads and keep track of the times and loads introduced.” (Since we store history as time series data in Cassandra, getting this information for potentially long stretches of time is not an issue.)
Don continues: “If you keep track of the time and the load, then you can introduce time-on-time comparisons where you look at a number of key metrics. So now I have a benchmark; what I might want to do is vary that load in a certain way over a different time period, and perhaps compare against the initial benchmark and the previous set of measurements, to see if I can establish any kind of a trend. For example, I’ve attempted to increase the GET burden against the cache by 50%; does that translate into an equivalent translation of certain key metrics, or is there no linear relationship?”
We are about to introduce time on time measurements and thereby support this type of analysis, with some of the visualizatons we are planning in future releases of ClearStone. Multiple perspectives in the ClearStone real time dashboard, shared and customized, already support the presentation of charts from various resources in a free-form and easy to create manner.
Performance comparisons of different NoSQL DB technologies may be possible, Don said, if they are similar, say, two data grid technologies, with a deliberate and documented use of load and benchmarks. “You might look at response times, for example, or GETs, or look at cache hits.” This could enable a useful technology choice in situations where you know what kind, what size and what elasticity you anticipate in your data.
“Take the example of ehcache performance. The first consideration is to define what the key metrics are. One way to do that even with our system today is to establish reasonable thresholds and to use our thresholding policy tools to set those up and capture them and create events. That way we see if we violate any of those thresholds. We start with a quiescent system and introduce a load over an hour; during that time, ECS is running, so our collectors are doing their job. Line charts we assemble are being annotated with the events we’ve defined at the points of transition to the threshold.”
“We can do this with our product today, but not as conveniently as with 5.0 where we could use a single chart, we can do another test run and vary the load. For a data caching product, for example, we introduce additional caches or perhaps retune the network or cluster and start it up again with larger or smaller Java heap sizes or any number of other parameters we want to investigate; then we look at the behavior; then we run another load test; or perhaps same load, but vary the heap sizes. What we can do is to create a perspective in the real time dashboard; create two charts and compare them by using a metric that correlates the two time periods. What we hope to do soon is to have this comparison appear in the same chart. While we can do a better job of integrating the visualization, we can support those use cases today with two separate charts that are visually aligned in such as way as to allow comparison.”
Learn more about our performance monitoring solution for Java, NoSQL and web servers.

View Comments