Articles tagged nosql reporting

The views expressed in this blog are strictly personal, and do not necessarily represent the views of Evident Software.

By Bill Nigh

NoSQL DB logging and reporting are challenges, for reasons discussed in this post. I recently spoke with Evident Software veteran Don Jeffery (@drjmun on Twitter) about those challenges and how Evident ClearStone (ECS) addresses NoSQL DB logging and reporting.

Metrics Collection

ECS collection (using JMX and ODI, our RESTful API over HTTP) creates Neo4j graph nodes from harvested and derived data: resource metrics, resources, relationships and events. A unique identifier is generated for each Neo4j graph node, regardless of type. This identifier can be used to retrieve information from Apache Cassandra 0.7, which we employ as a time-series database, and so supports current and historical performance monitoring visualizations within customizable perspectives. This design allows ECS to chart performance metrics and display events such as threshold violations. (Our CTO, Ivan Ho, wrote a nice blog post on our use of Neo4j and Cassandra, which got good play on DZone.)
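To make the identifier-as-key idea concrete, here is a minimal sketch. It is not ECS code and does not use the Neo4j or Cassandra APIs: `MetricStore` and `new_node` are hypothetical in-memory stand-ins, and the metric name and values are made up for illustration.

```python
import uuid
from collections import defaultdict


class MetricStore:
    """Hypothetical in-memory stand-in for a time-series store keyed by node ID."""

    def __init__(self):
        # (node_id, metric_name) -> list of (timestamp, value) samples
        self._series = defaultdict(list)

    def record(self, node_id, metric, timestamp, value):
        self._series[(node_id, metric)].append((timestamp, value))

    def query(self, node_id, metric, start, end):
        """Return samples in [start, end], ordered by timestamp."""
        samples = self._series.get((node_id, metric), [])
        return [(t, v) for t, v in sorted(samples) if start <= t <= end]


def new_node(kind, **attributes):
    """Create a graph-node-like record with a generated unique identifier."""
    return {"id": str(uuid.uuid4()), "kind": kind, "attributes": attributes}


store = MetricStore()
host = new_node("host", ip="10.0.0.5")  # illustrative attribute
for ts, cpu in [(100, 0.42), (160, 0.55), (220, 0.91)]:
    store.record(host["id"], "cpu_load", ts, cpu)

# The node's identifier is all that is needed to fetch its history.
recent = store.query(host["id"], "cpu_load", 150, 250)
```

The point of the sketch is only that the graph side and the time-series side share nothing but the identifier, so either can evolve independently.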

In ClearStone, a Neo4j node represents an instance of any entity we choose to track. A node may carry attributes that describe, say, a host, such as an IP address, or data from which heuristics let us estimate how many processors a piece of hardware has. Other resources, possibly from other technologies, may also supply information that helps us confirm a new host entity, create an instance of it, and populate it opportunistically as more information comes in. The free-form nature of nodes stored in Neo4j makes this easy to support.

Any time there are events associated with a resource, we keep a timeline of such events married to a snapshot of the associated resource(s) in the inventory at the time of the event occurrence.
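One way to picture an event timeline "married" to resource snapshots is sketched below. The `inventory` dictionary, the `record_event` helper and the sample resource are all hypothetical; the only idea taken from the text is that each event stores a copy of the resource as it was when the event occurred.

```python
import copy

# Hypothetical inventory: resource ID -> current attributes.
inventory = {"cache-1": {"kind": "cache", "host": "10.0.0.5", "heap_mb": 512}}
timeline = []


def record_event(resource_id, timestamp, description):
    """Append an event together with a frozen copy of the resource's state."""
    timeline.append({
        "resource_id": resource_id,
        "timestamp": timestamp,
        "description": description,
        # deepcopy so later inventory changes do not rewrite history
        "snapshot": copy.deepcopy(inventory[resource_id]),
    })


record_event("cache-1", 100, "heap threshold violated")
inventory["cache-1"]["heap_mb"] = 1024  # the resource changes afterwards
record_event("cache-1", 200, "heap resized")
```

Because each snapshot is a copy rather than a reference, the first event still shows the 512 MB heap that was in effect when the threshold was violated.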

Challenges

In the realm of NoSQL logging and reporting, consider the problems involved in monitoring a dynamic distributed environment with tools that are usually specific to a single technology. As Don put it, “what we need to understand is that the virtual and physical resources these products run on often overlap. At the very least there are server farms and networks that are shared”. NoSQL logging and reporting tools need to be able to identify patterns and relationships. They need an “elastic cross-technology solution that gets information on how [the technologies] impinge on one another in a common fabric”.

Another issue: in an environment where nodes are coming and going, a monitoring tool has to keep track of which nodes are current and which are not. As Don said, “If we don’t get a report from a node, does that mean it’s just offline, or it has a problem? We also have to know at any given time in the history of our collection what nodes are available. Sampling a number of times helps you get a picture”. Sufficient samples over time can help ascertain whether a node’s state fluctuates a lot, with the caveat that one may never be completely certain of even that. One rule could be that if a node is always ‘on’, and we get no reports on it for a [fill in the time frame], then we can conclude it has a problem. I think you see the challenge here.
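The rule of thumb above could be sketched roughly as follows. This is an illustration, not how ECS decides: the function name, the labels, the "no gap longer than twice the expected interval" test for an always-on node, and the `silence_limit` parameter are all assumptions, and the right time frame is exactly the open question the text raises.

```python
def classify_node(report_times, now, expected_interval, silence_limit):
    """Guess a node's state from the timestamps of the reports it has sent.

    Returns 'unknown', 'current', 'suspect' or 'intermittent'.
    """
    if not report_times:
        return "unknown"
    times = sorted(report_times)
    if now - times[-1] <= silence_limit:
        return "current"
    # The node has gone quiet. Was it reporting steadily before that?
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    always_on = bool(gaps) and max(gaps) <= 2 * expected_interval
    # A steady reporter going silent is suspicious; a node that has
    # come and gone before may simply be offline again.
    return "suspect" if always_on else "intermittent"


steady = [0, 60, 120, 180]
flaky = [0, 60, 900, 960]
states = (
    classify_node(steady, now=200, expected_interval=60, silence_limit=300),
    classify_node(steady, now=1000, expected_interval=60, silence_limit=300),
    classify_node(flaky, now=2000, expected_interval=60, silence_limit=300),
)
```

Even in this toy form, the answer is only a guess based on past samples, which is the caveat Don raises.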

Opportunities

Don understands the challenge and opportunity of NoSQL logging and reporting well; he says that “keeping the best snapshot” of a monitored resource is what we are striving for with ECS. We try to “identify principal players” in a customer installation, be they caches, nodes in a cluster, or anything else, and record them in what we call our inventory. We can’t rely on current state alone, nor on history alone, but on a combination of the two: “so that’s some of the stuff we’ve been looking at. If we can usefully compare what is in inventory now to what was there in the past, we’ll discover things we hadn’t even thought about, such as usage patterns and virtual and physical resources in that environment. I’m not sure we’ll be initially able to assess causality, but I think we can establish a footprint and allow the user to be able to explore and draw conclusions; I think we can give them the basis for that information.”
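Comparing what is in inventory now with what was there in the past can start as a simple diff of two snapshots. The sketch below assumes, purely for illustration, that a snapshot is a dictionary from resource ID to attributes; `diff_inventory` and the sample data are hypothetical.

```python
def diff_inventory(past, present):
    """Report which resources appeared, disappeared, or changed attributes."""
    shared = past.keys() & present.keys()
    return {
        "added": sorted(present.keys() - past.keys()),
        "removed": sorted(past.keys() - present.keys()),
        "changed": sorted(k for k in shared if past[k] != present[k]),
    }


last_week = {
    "cache-1": {"host": "host-a"},
    "cache-2": {"host": "host-a"},
    "node-9": {"host": "host-b"},
}
today = {
    "cache-1": {"host": "host-a"},
    "cache-2": {"host": "host-c"},  # moved to another host
    "node-12": {"host": "host-b"},  # new node
}
delta = diff_inventory(last_week, today)
```

A diff like this says what changed, not why, which matches Don's caution about assessing causality.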

“It’s challenging to pinpoint cause and effect. For example, it’s difficult to determine that your publisher success rate is low because your CPU is maxed out; maybe your CPU is maxed out because you are attempting to do so much publishing that you’ve saturated that machine, leading to a low success rate; we can at least give them hints. We can also begin, with ECS 5.0, to give them projections, maybe presented graphically and in a number of visual perspectives; maybe an incident matrix.”

Regardless of what we deliver, Don says that “we want to give them something navigable, so they can begin to see where things are ‘lighting up’, then move distances away. So if a Cassandra cache was causing problems I could inspect the host. Oh, and now that I’m at that host, I can see there’s some other technology on there that’s beginning to have a lot of events.” Maybe this second technology is the actual root cause of the problem that first surfaced in the original monitored resource. It may sit in a different tier, be a different NoSQL or caching technology, or maybe a servlet… you get the picture.
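The navigation Don describes (from a troubled resource to its host, then to whatever else on that host is "lighting up") can be sketched with plain dictionaries. Nothing here is the ECS data model: `runs_on`, `events`, `neighbours_by_events` and the resource names are made up for illustration.

```python
from collections import Counter

# Hypothetical relationships: resource -> host it runs on.
runs_on = {
    "cassandra-1": "host-a",
    "servlet-7": "host-a",
    "coherence-2": "host-b",
}
# Hypothetical event log: one entry per event, naming the resource involved.
events = ["cassandra-1", "servlet-7", "servlet-7", "servlet-7", "coherence-2"]


def neighbours_by_events(resource_id):
    """Return the resource's host and its co-resident resources, busiest first."""
    host = runs_on[resource_id]
    counts = Counter(events)
    peers = [r for r, h in runs_on.items() if h == host and r != resource_id]
    ranked = sorted(((r, counts[r]) for r in peers), key=lambda p: p[1], reverse=True)
    return host, ranked


host, ranked_peers = neighbours_by_events("cassandra-1")
```

Starting from the Cassandra cache, the walk surfaces the servlet on the same host as the busiest neighbour, which is a hint for the user to explore rather than a verdict on root cause.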

Don says, “Being able to provide those hints and that navigation becomes even more important as the size and scale of these systems becomes such an issue that it becomes really difficult to monitor and manage the new environment without some event information or other heuristics that we’ve applied, a view that limits the scope of what they’re seeing to a space that we believe is related to problems that can explore causality. So, that’s one of the things we want to introduce in 5.0, some interesting visualizations, to help them navigate around.”

Weaving powerful semantics among tiers and domains, based on an ever-growing and better-understood inventory of resources, will provide a platform for discovering interrelationships whose understanding can serve both root cause analysis and enforcement of SLAs. This is the future of Evident ClearStone, as well as its present.

Learn more about our performance monitoring solution for Java, NoSQL and web servers

By Bill Nigh

NoSQL DB logging and reporting are challenging for a number of reasons:

  1. NoSQL is relatively new, so there are not many experienced practitioners of the technology, which most now understand to encompass a wide variety of architectures; because the products are so different, one definition of ‘NoSQL expert’ might be simply ‘one who has product knowledge of one or more products’.
  2. Unlike legacy RDBMS technology, NoSQL DB logging and reporting cannot benefit from an ANSI-standard common application and system query capability; while the RDBMS model exposes excellent system metrics via virtual ‘system’ tables that can be queried with the rich SQL language, this is not the case with NoSQL.
  3. Single-point solutions are most likely the case, enabling a proliferation of monitoring and reporting consoles
  4. Entities participating in a cluster typically have separate logs on separate hosts in separate contexts, causing log correlation problems; for example cluster rebalancing, as found with Coherence, introduces load onto other nodes in that cluster, creating a ‘pathological’ situation and flurry of log entries that are not easy to interpret
  5. Collection methods can impact performance; an agent coresident in server memory with a NoSQL (or any) app will impinge on the performance of the very process you are trying to monitor
  6. NoSQL is often part of a much larger system (e.g. Hadoop was a key part, but only part, of IBM Watson), yet NoSQL logging and reporting tools are typically silos.
  7. The technologies mesh together in synergies that are hard to foresee and interactions that may not be anticipated; the performance implications of all combinations of NoSQL running in the enterprise have not yet been documented, but a typical use case spans multiple NoSQL technologies
  8. A holistic view of NoSQL performance monitoring data is lacking, as each technology is associated with different metrics and (typically) different point solutions.
  9. There is so much data that one has to apply heuristics to filter out the valuable information, to avoid ‘drinking from a fire hose’
  10. Reporting such as that found in JConsole does not persist report data for long enough timeframes, so seeing trends and doing capacity planning are difficult
  11. The elastic nature of the environment means that as resources are deployed in response to increased demand, nodes will be coming and going, with life cycles that are not easy to predict; logging and reporting tools need to somehow account for the fact that a node may not report within a given time frame; does that mean it is just offline or it has a problem?
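As a small illustration of the log correlation problem in item 4, the sketch below interleaves per-host logs into a single timeline. It assumes each host's log is already sorted and that host clocks are synchronized, which real clusters may not guarantee; `merge_logs` and the sample entries are hypothetical.

```python
import heapq


def merge_logs(logs_by_host):
    """Merge per-host logs (each sorted by timestamp) into one ordered timeline."""
    streams = (
        [(timestamp, host, message) for timestamp, message in entries]
        for host, entries in logs_by_host.items()
    )
    # heapq.merge lazily interleaves already-sorted inputs.
    return list(heapq.merge(*streams))


logs = {
    "host-a": [(100, "rebalance started"), (130, "partition moved")],
    "host-b": [(110, "load spike"), (125, "slow response")],
}
combined = merge_logs(logs)
```

Ordering the entries is the easy part; interpreting a flurry of rebalancing messages across hosts still needs knowledge of how the entities relate to one another.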

Bernd Harzog, writing recently in The Virtualization Practice, has furnished a very useful view of the state of the art and state of the industry here. It is well worth the time to read it (and I am proud to point out that he mentions Evident Software as an example of a company that is well positioned to make contributions toward addressing these challenges).

If you are interested in this subject, you may also want to read this post on challenges and opportunities in NoSQL DB logging and reporting.

By Scott Barnett

Last month, we launched Evident ClearStone 4.5, which includes NoSQL logging and NoSQL reporting features. This software release marks an important milestone in the evolution of Evident Software. Over the summer, we made a decision to aggressively go after the NoSQL DB market, expanding our previous support for compute-grid technologies such as DataSynapse and application-grid technologies such as Oracle Coherence by offering NoSQL reporting, NoSQL logging, management and performance monitoring.

Why make this change of course? There were several reasons:

  1. NoSQL might not still be called “NoSQL” in a few years, but it absolutely will be an important technology for enterprise applications. Think back to how the Java Application Server came of age in the mid-to-late 1990s. That technology required several iterations to become the Java Application Server. The market took a few years to coalesce and turn into something that the broad industry could understand, market, and build around. Today NoSQL is going through a similar evolution. We’re just starting to see forecasts of the size of the NoSQL market. We suspect it won’t be called NoSQL a year from now (some other people seem to agree), as technologies such as Hadoop, data caching platforms such as Coherence, GemFire and Terracotta, and hybrid in-memory databases such as VoltDB all vie for developer mind-share. Whatever it’s called, this is the “new” tier in the application stack, and it’s going to need focused and dedicated capabilities from a management/monitoring perspective, including NoSQL logging and NoSQL reporting. Here’s a great database (no pun intended) of systems that fall into the NoSQL realm.
  2. Correlating metrics and events between the NoSQL tier and the other existing tiers in the application (and system) stack will be a key capability for monitoring and managing NoSQL applications. Each tier cannot continue to have its own NoSQL logging and monitoring capabilities; monitoring needs to be integrated, so enterprises can get a holistic view of their applications. This is a hard problem to solve. It’s also a valuable problem to solve. We are already solving this problem for the caching technologies I listed above. Now we want to continue extending this capability across the different tiers of the application stack.
  3. Visualization is the key to success in APM. When you are gathering so many metrics and events in real time, it’s a challenge to determine what is really important to DevOps. We’ve been told we’ve done a great job of figuring this one out: our user interface is intuitive, attractive, and meaningful. Making sense of all that data is hard to do, and without visualization you have lots of great data with no insight. You need insight to make good decisions.
  4. Our goal is to support every NoSQL system out there. Meeting this goal requires a change of strategy, so you will see us open up our platform so that people can build and deploy their own “Management Packs.” We currently have Management Packs for DataSynapse GridServer, Oracle Coherence, Apache Cassandra, Memcached (with Membase coming very soon), WebLogic Server, and JBoss. We are working on many more, but we want to move even faster. So you will see a Management Pack framework that allows developers to build their own Management Packs (we can help you too!). It is not hard to do this, and we will roll out a developer site shortly for people to share, collaborate and contribute. We will start by contributing our own Management Packs to the site.

So, 4.5 is the next step in our evolution and a hearty step forward in our embrace of all things NoSQL as the latest, greatest participant in the application stack. From our conversations with customers and prospects over the past few months, we know many of you agree with our vision of NoSQL reporting, monitoring and management. We look forward to working with you on this initiative in the months and years ahead. We are very interested in your thoughts and suggestions on how to continue this process, so please share your ideas with us!