Articles tagged ClearStone 5.0

The views expressed in this blog are strictly personal, and do not necessarily represent the views of Evident Software.

By Bill Nigh

ClearStone 5.0 is now in GA, with compelling new functionality and a long, clear growth path due to good architecture choices. The main architectural change, with significant implications: ClearStone 5.0 was re-architected around the NoSQL databases Cassandra and Neo4j.

We faced several challenges in re-architecting ClearStone, and we’re proud to have taken them in stride.

  1. We were accustomed to using an RDBMS for storage of application and server performance metrics. In an RDBMS, triggers are a bulletproof way of detecting changes to data and optionally taking some action, such as sending notifications. While other NoSQL technologies offer an equivalent to triggers, Cassandra does not, so we had to come up with a way to emulate this functionality. (We and others have requested this feature from the Apache Cassandra project, so it may show up in the future.) We changed ClearStone’s data model to accommodate this new reality.
  2. Neo4j presented a fascinating challenge. We store and model data as a series of connected nodes, and because Neo4j matches the way we designed our model so closely, it looked like adopting it would be fairly easy. What is challenging is the wide-open nature of Neo4j as an implementation of a graph DB: you can define nodes and edges (relationships) any way you want; they can have or not have properties, and relationships can have different types of directionality. So the engineering challenge was to decide how to represent all the data without fully knowing in advance how we would want to traverse it and find correlations.
  3. Another design challenge: we are consuming arbitrary time series data structures. ODI (Open Data Interface), our RESTful API for instrumenting most any IT resource, opens up the product to accept any data from any IT resource. Because of this open API, we can’t know in advance what data model will characterize the data being instrumented, so we needed the maximum possible flexibility. An RDBMS would require not only a formal schema but also normalizing everything, or at least starting from that. Cassandra combines that needed flexibility with high accessibility; Cassandra’s column family implementation is “very forgiving”, as Ivan Ho, Evident’s CTO, has said; columns can be added on the fly. (Another feature of Cassandra was especially persuasive: in prior versions of ClearStone, due to the use of an RDBMS, there was a discontinuity in the data, and a different application was used for history. Cassandra solved that problem, allowing display of current and historical data together.)
  4. Cassandra has no query language; all access is through its API. (Neo4j, in contrast, offers several methods of retrieval; we query it with Lucene.) Using a globally unique key assigned to each node when it is created in Neo4j, we can retrieve the corresponding metric, event, and entity information from Cassandra.
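
The post does not say how ClearStone emulates triggers, so the following is only a minimal sketch of one common approach: route every write through a thin wrapper that fires callbacks after storing the value. The class `NotifyingStore`, the listener `notify_if_high`, the row key format, and the 0.9 limit are all hypothetical names and values chosen for illustration; nothing here is Cassandra's or ClearStone's API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical listener signature: (row_key, column, value).
Listener = Callable[[str, str, float], None]


class NotifyingStore:
    """In-memory stand-in for a column-family store whose writes fire callbacks."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, float]] = defaultdict(dict)
        self._listeners: List[Listener] = []

    def on_write(self, listener: Listener) -> None:
        # Register a callback that plays the role of an RDBMS trigger.
        self._listeners.append(listener)

    def put(self, row_key: str, column: str, value: float) -> None:
        # Columns are added on the fly; no schema is declared up front.
        self._rows[row_key][column] = value
        for listener in self._listeners:
            listener(row_key, column, value)

    def get(self, row_key: str) -> Dict[str, float]:
        return dict(self._rows[row_key])


notifications: List[Tuple[str, str, float]] = []


def notify_if_high(row_key: str, column: str, value: float) -> None:
    # Assumed rule: record a notification when a value exceeds 0.9.
    if value > 0.9:
        notifications.append((row_key, column, value))


store = NotifyingStore()
store.on_write(notify_if_high)
store.put("host-1:cpu", "t0", 0.42)
store.put("host-1:cpu", "t1", 0.97)
```

The trade-off of this style is that only writes passing through the wrapper are seen, which is one reason a change to the data model or write path is needed when the database itself offers no triggers.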

Future versions will benefit from the flexibility we architected into the product with these NoSQL DBs, as will our customers. Expect to see additional Management Packs over the next few weeks; when we ‘sprint’, we take it literally :)

By Bill Nigh

NoSQL DB logging is characterized mainly by point solutions, reflecting the varied points of origin of relatively new technologies. The way is clear for a monitoring platform that can integrate logging and monitoring of distributed caches and NoSQL databases, and wouldn’t it be nice to monitor practically any resource anywhere in the enterprise?

Evident ClearStone 5.0, with DevOps-friendly pricing, gives you the ability to monitor resources through channels such as Linux SAR, PHP, Perl, Ruby and other scripted feeds using ODI (our new REST API), joining our existing JMX-based approach for monitoring JVMs. The performance metrics are captured and formed in Neo4j as graph objects, then stored in Cassandra as time series data. This rich storage model opens up the possibility of more, and more interesting, visualizations of events, relationships and entities.

The new release of Evident ClearStone continues our efforts toward a more integrated approach to monitoring application performance throughout the entire enterprise application stack and operating environment.

Learn more about our performance monitoring solution for Java, NoSQL and web servers.

By Bill Nigh

NoSQL DB logging and reporting are challenges, for reasons discussed in this post. I recently spoke with Evident Software veteran Don Jeffery (@drjmun on Twitter) about those challenges and how Evident ClearStone (ECS) addresses NoSQL DB logging and reporting.

Metrics Collection

ECS collection (using JMX and ODI, our RESTful API over HTTP) creates Neo4j graph nodes from harvested and derived data in the form of resource metrics, resources, relationships and events. A unique identifier is generated for each Neo4j graph node, regardless of type. This identifier can be used to retrieve information from Apache Cassandra 0.7, which we employ as a time-series database, thus supporting current and historical performance monitoring visualizations within customizable perspectives. This product design allows ECS to chart performance metrics and display events such as threshold violations. (A nice writeup of our use of Neo4j and Cassandra was done in a blog post by our CTO, Ivan Ho, and got good play on DZone.)
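
The idea of one identifier shared between the graph store and the time-series store can be sketched as follows. This is an illustration only, using plain dictionaries in place of Neo4j and Cassandra; the function names (`create_node`, `record_sample`, `history`), the use of a UUID as the identifier, and the sample values are assumptions, not details from ClearStone.

```python
import uuid
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for the graph store and the time-series store.
graph_nodes: Dict[str, dict] = {}
time_series: Dict[str, List[Tuple[int, float]]] = {}


def create_node(node_type: str, **props) -> str:
    """Create a graph node of any type and return its unique identifier."""
    node_id = str(uuid.uuid4())
    graph_nodes[node_id] = {"type": node_type, **props}
    time_series[node_id] = []  # the same key addresses the node's samples
    return node_id


def record_sample(node_id: str, timestamp: int, value: float) -> None:
    time_series[node_id].append((timestamp, value))


def history(node_id: str, start: int, end: int) -> List[Tuple[int, float]]:
    """Return samples in [start, end]; current and historical data share one store."""
    return [(t, v) for t, v in time_series[node_id] if start <= t <= end]


heap_id = create_node("metric", name="heap_used_mb", host="app-01")
for t, v in [(100, 210.0), (160, 245.5), (220, 301.2)]:
    record_sample(heap_id, t, v)

recent = history(heap_id, 150, 250)
```

Because the key is generated once at node creation, a chart only needs the node's identifier to fetch any time window of its data.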

In ClearStone, a Neo4j node represents an instance of any entity we choose to track. The Neo4j node may have attributes that relate to, say, a host, such as an IP address, or data that lets us estimate, through heuristics, how many processors a piece of hardware has. Other resources, possibly from other technologies, may also have information that helps us confirm that new host entity, create an instance of it, and populate it opportunistically as more information comes in. The free-form nature of nodes stored in Neo4j makes this an easy capability to support.

Any time there are events associated with a resource, we keep a timeline of such events married to a snapshot of the associated resource(s) in the inventory at the time of the event occurrence.

Challenges

In the realm of NoSQL logging and reporting, consider the problems involved in monitoring a dynamic distributed environment with tools that are usually specific to a single technology. As Don put it, “what we need to understand is that the virtual and physical resources these products run on often overlap. At the very least there are server farms and networks that are shared”. NoSQL logging and reporting tools need to be able to identify patterns and relationships. They need an “elastic cross-technology solution that gets information on how [the technologies] impinge on one another in a common fabric”.

Another issue: in an environment where nodes are coming and going, a monitoring tool has to keep track of which nodes are current and which are not. As Don said, “If we don’t get a report from a node, does that mean it’s just offline, or it has a problem? We also have to know at any given time in the history of our collection what nodes are available. Sampling a number of times helps you get a picture”. Sufficient samples over time can help ascertain whether a node’s state fluctuates a lot, with the caveat that one may never be completely certain of even that. One rule could be that if a node is always ‘on’, and we get no reports on it for a [fill in the time frame], then we can conclude it has a problem; I think you see the challenge here.
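
A rule of the kind described above might be sketched like this. It is a hypothetical heuristic, not ClearStone's logic: the function name `classify_node`, the state labels, and the `max_missed` parameter are invented for illustration, and the time frame is deliberately left as a caller-supplied value.

```python
from typing import List


def classify_node(report_times: List[float], now: float,
                  expected_interval: float, max_missed: int = 3) -> str:
    """Classify a node by how long it has been silent.

    Assumed rule: a node that normally reports every `expected_interval`
    seconds is 'current' within one interval, 'late' until `max_missed`
    intervals have passed, and 'suspect' after that.
    """
    if not report_times:
        return "unknown"  # never seen, so there is no history to reason from
    silence = now - max(report_times)
    if silence <= expected_interval:
        return "current"
    if silence <= expected_interval * max_missed:
        return "late"
    return "suspect"


reports = [0.0, 60.0, 120.0]
states = [classify_node(reports, now, expected_interval=60.0)
          for now in (150.0, 250.0, 400.0)]
```

Even this simple rule shows the difficulty: "suspect" cannot distinguish a node that is deliberately offline from one that has failed, which is why history and repeated sampling matter.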

Opportunities

Don understands the challenge and opportunity of NoSQL logging and reporting well; he says that “keeping the best snapshot” of a monitored resource is what we are striving to do with ECS, trying to “identify principal players” that a customer installation consists of, be they caches, nodes in a cluster, whatever they may be, in what is called our inventory. We can’t simply rely on current state, nor rely on history, but rather a combination of the two; “so that’s some of the stuff we’ve been looking at. If we can usefully compare what is in inventory now to what was there in the past, we’ll discover things we hadn’t even thought about, such as usage patterns and virtual and physical resources in that environment. I’m not sure we’ll be initially able to assess causality, but I think we can establish a footprint and allow the user to be able to explore and draw conclusions; I think we can give them the basis for that information.”

“It’s challenging to pinpoint cause and effect. For example, it’s difficult to determine that your publisher success rate is low because your CPU is maxed out; maybe your CPU is maxed out because you are attempting to do so much publishing that you’ve saturated that machine, leading to a low success rate; we can at least give them hints. We can also begin, with ECS 5.0, to give them projections, maybe presented graphically and in a number of visual perspectives; maybe an incident matrix.”

Regardless of what we deliver, Don says that “we want to give them something navigable, so they can begin to see where things are ‘lighting up’, then move distances away. So if a Cassandra cache was causing problems I could inspect the host. Oh, and now that I’m at that host, I can see there’s some other technology on there that’s beginning to have a lot of events.” Maybe this second technology is the actual root cause of the problem that first surfaced in the original monitored resource: a technology that is possibly in a different tier, a different NoSQL or caching technology, or maybe a servlet… you get the picture.

Don says “Being able to provide those hints and that navigation becomes even more important as the size and scale of these systems becomes such an issue that it becomes really difficult to monitor and manage the new environment without some event information or other heuristics that we’ve applied, a view that limits the scope of what they’re seeing to a space that we believe is related to problems that can explore causality. So, that’s one of the things we want to introduce in 5.0, some interesting visualizations, to help them navigate around.”

Weaving powerful semantics among tiers and domains, based on an ever-growing and better-understood inventory of resources, will provide a platform for discovering interrelationships whose understanding can serve both root cause analysis and enforcement of SLAs. This is the future of Evident ClearStone, as well as its present.


By Bill Nigh

We’re still in our beta period, but I asked if I could blog about ClearStone 5.0 collectors and the other methods we support to instrument and monitor pretty much any resource in any tier of your ever-changing application stack. More detail about collectors is forthcoming, but for now, let’s jump directly into a representative user experience with our ClearStone admin console. There, you will learn how to set up collections for NoSQL DB champ Cassandra and for the JVM. (See related discussions of JConsole.)

Here’s the Tutorial

By Bill Nigh

While venerable and valuable, JConsole does not offer historical metrics and trending; the field is open to a JConsole alternative. Evident Software is proud to offer such an alternative: its ClearStone application and server performance monitoring suite. Developers are no doubt familiar with JConsole, a tool first introduced with Java 5.0. Part of the JDK, JConsole is built on the Java Management Extensions (JMX) and the java.lang.management API. Tapping directly into the internals of the JVM, this utility provides a graphical view of the JMX MBeans in any local or remote platform MBeanServer.

I recently had the opportunity to shoehorn an interview into the very busy schedule of Evident’s sprint project manager Tim Sneed about the ClearStone Management Pack for Java, the Evident JConsole alternative. Some tidbits:

The collection configuration utility in the administration console allows one to browse and connect to an MBean server, much like one can do with the standard JConsole. One salient difference, however—a difference that may be crucial for developers—is that charting custom MBeans is possible in ClearStone, whereas in JConsole it is not. Developers rightly are concerned with understanding the dynamics of an elastic and distributed ecosystem as much as possible; the advent and rapid rollout of Big Data technology puts a premium on being able to see and integrate metrics on every IT asset of significance (from back-end data stores to web request response times) within one monitoring platform.

With ClearStone, developers are able to see important metrics by exposing them through their own custom MBeans. Doing so provides developers with macro-to-micro visibility of their entire deployment stack to see how their application performs over time; who would ever turn down a greater and richer volume of wide-ranging metrics like this?

In addition to passive capture, ClearStone is a worthy JConsole alternative in that threshold detection and notification are available. For example, assume that you have been performing JConsole heap dumps and JConsole thread dumps. You constrain your app to never exceed 300 MB of heap. If that threshold is ever breached, an automatic email can be sent or an event logged (visible in ClearStone’s Event Viewer) to make it clear that some code has to be adjusted.
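
The 300 MB heap example can be sketched as a small threshold monitor. This is an assumption-laden illustration, not ClearStone's implementation: the names `ThresholdMonitor` and `ThresholdEvent` are hypothetical, the notifier is a plain callback standing in for an email or an event log entry, and the choice to fire only when the value first crosses the limit (rather than on every sample above it) is a design assumption made here to avoid repeated alerts.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ThresholdEvent:
    metric: str
    value: float
    limit: float
    timestamp: int


class ThresholdMonitor:
    """Fire a notification each time a metric crosses above its limit."""

    def __init__(self, metric: str, limit: float,
                 notify: Callable[[ThresholdEvent], None]) -> None:
        self.metric = metric
        self.limit = limit
        self.notify = notify
        self._breached = False  # remembers whether the last sample was over the limit

    def observe(self, value: float, timestamp: int) -> Optional[ThresholdEvent]:
        breached = value > self.limit
        event = None
        if breached and not self._breached:
            # Upward crossing: report once, not on every sample above the limit.
            event = ThresholdEvent(self.metric, value, self.limit, timestamp)
            self.notify(event)
        self._breached = breached
        return event


events: List[ThresholdEvent] = []
monitor = ThresholdMonitor("heap_used_mb", 300.0, events.append)
for ts, heap_mb in enumerate([250.0, 310.0, 320.0, 280.0, 305.0]):
    monitor.observe(heap_mb, ts)

crossings = [(e.timestamp, e.value) for e in events]
```

Swapping `events.append` for a function that sends mail or writes to a log changes the delivery channel without touching the detection logic.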

Another shortcoming of JConsole is the lack of historical perspective. Once the user exits JConsole, or if the target server is restarted for some unknown reason, all the previous stats are gone! Imagine running JConsole for several days, then making that innocent mistake. Evident’s JConsole alternative offers history in addition to real-time information.

And how many instances of JConsole are sometimes needed to make proper sense of performance, stability and consistency in a testing environment? As native functionality, ClearStone’s JConsole alternative offers the ability to create perspectives, with each perspective containing multiple charts.

ClearStone provides the perfect transition from ‘Dev’ to ‘Ops’ with its functionality and pricing models, making this the best ‘DevOps’ tool for monitoring your deployments.

In addition to its JMX-based Flex Management Pack for Java, Evident offers Management Packs for Oracle Coherence, Memcached, Cassandra, JBoss, and WebLogic. The new version, 5.0, soon to undergo Beta testing (see below for a sign-up link), will also support easy instrumenting via the RESTful Evident ODI (Open Data Interface); sample file formats are CSV and XML. The ODI option allows not only streaming of data but, in the future, support for synthetic event injection, which will give the developer a much finer level of detail to support more informed decision-making.

Learn more about our performance monitoring solution for Custom JMX monitoring

By Bill Nigh

One of the major design decisions going into ClearStone 5.0 was the selection of Neo4j, an open source graph DB, as the data store for the most recently collected data.

Neo4j allows designers to store entities as nodes and to define relationships between them as edges. Neo4j has a reference node, which serves as the starting point for navigation through the structures. Structures are not limited to hierarchical topology. Arbitrary relationships can be defined and discovered.

Neo4j allows ClearStone to store events in association with entities that we consider resources, and more. A resource is any entity that we collect information on that can be qualified by properties. Neo4j allows ClearStone to capture more sophisticated correlations. ClearStone users can then ‘walk around’ and find second- and third-order relationships. Neo4j lets us express the relationships in a very convenient way. The DB supports a number of query backends. The one we’re using is Apache Lucene (a Java library for building search applications).

Neo4j is good for us not only because of relationships that we already expect in the data, such as when a threshold is violated, but also because we can apply other relationships, which could be generated through heuristics, to the data.

Relationships are not only free-form, but a relationship can itself have properties, thus allowing more sophisticated forms of correlation. “We’re just beginning” to work in this area, one of our developers pointed out. Look for even more advanced, multi-dimensional monitoring capabilities in future releases of ClearStone.

The bottom line: By using Neo4j, ClearStone 5.0 is able to correlate application events with a very rich set of metrics.

One implication is the ability to traverse from one resource to another within the UI. At present, one can already see charts from various parts of the multi-tier enterprise ecology, but being able to ‘follow your nose’ for this type of analysis will echo the natural curiosity of an interested troubleshooter.

Ivan Ho, our CTO, gave an example of one of the use cases that has been part of the design of ClearStone: Suppose you’re interested in a particular process and you want to know where it’s running. When you inspect the host, you might want to see what other resources the host depends on—a database for example—or what other processes are also running on the host, or you might want to assess the host’s health overall.

Using ClearStone’s ability to traverse data (stored in Neo4j), you could traverse the processes running on the host and make an assessment as to the cause of the host’s poor health. This type of rich inspection requires the ability to explore different visualizations of relationships, and Neo4j makes this type of exploration eminently feasible.
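
The use case above (start at a process, move to its host, then look at what else runs there and what the host depends on) can be illustrated with a tiny property graph. This sketch uses plain Python lists rather than Neo4j, and every node name, relationship type (`RUNS_ON`, `DEPENDS_ON`), and property shown is a made-up example rather than ClearStone's actual model.

```python
from typing import Dict, List, Tuple

# Each edge: (source, relationship type, target, relationship properties).
Edge = Tuple[str, str, str, dict]

nodes: Dict[str, dict] = {
    "proc-a": {"type": "process", "events": 0},
    "proc-b": {"type": "process", "events": 7},
    "host-1": {"type": "host"},
    "db-1": {"type": "database"},
}
edges: List[Edge] = [
    ("proc-a", "RUNS_ON", "host-1", {"since": 100}),
    ("proc-b", "RUNS_ON", "host-1", {"since": 140}),
    ("host-1", "DEPENDS_ON", "db-1", {}),
]


def outgoing(node: str, rel: str) -> List[str]:
    """Follow relationships of one type in their stored direction."""
    return [t for s, r, t, _ in edges if s == node and r == rel]


def incoming(node: str, rel: str) -> List[str]:
    """Follow relationships of one type against their stored direction."""
    return [s for s, r, t, _ in edges if t == node and r == rel]


def neighbours_on_same_host(process: str) -> List[str]:
    """Second-order traversal: process -> host -> other processes on that host."""
    result = []
    for host in outgoing(process, "RUNS_ON"):
        for other in incoming(host, "RUNS_ON"):
            if other != process:
                result.append(other)
    return result


neighbours = neighbours_on_same_host("proc-a")
noisy = [p for p in neighbours if nodes[p]["events"] > 0]
host_dependencies = outgoing("host-1", "DEPENDS_ON")
```

Here `noisy` picks out co-located processes that have recorded events, the kind of hint that lets a troubleshooter 'follow their nose' from one resource to a likely culprit.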

Interested in ClearStone 5.0? Sign up for the Beta program here.