Data Processing via the Pipeline Server
The Evident ClearStone Pipeline Server* aggregates and correlates raw data and maps business context to it (for example, by applying a rate or cost), based on the configured system information and the type of usage supplied by the data collection adapters. The internal workflow is structured as a data processing “pipeline” assembled from smaller, independent, reusable processing units. These units transform raw UDR data into processed information that is aggregated and correlated according to a defined mapping to the domain/container, service, application, and/or business process associated with the collected data, as well as to the end reporting requirements.
The Evident ClearStone Pipeline Server architecture allows the processing units to be configured in many different ways, so the pipeline can be adapted quickly to changing user requirements or to new data collection sources. The Evident ClearStone Pipeline is implemented using best-of-breed open source web tools and frameworks, including a web-services environment, a J2EE application framework (Spring), an Enterprise Service Bus (Mule), an event processing engine (Esper), an object/relational persistence and query service, and general data interface components.
* Patent Pending
Pipeline Functionality
The Evident ClearStone product uses modular, generic components that can be assembled in different ways to support reporting and analytics for a variety of IT RTI technologies, including virtualized applications running on virtualized application fabrics, compute grids, distributed data caches, and the enterprise network. The pipeline performs the following functions:
- Data aggregation from multiple instances of a compute and/or data grid, an EMS tool for physical and virtual servers, and the network infrastructure
- Data normalization
- Correlation or mapping of data to a virtualized application or service
- Transformation of data to meet reporting requirements, including invocation of regular expressions and/or algorithms to process the data based on reporting needs
- Enrichment of data with additional attributes associated with a specific metric or Key Performance Indicator (KPI)
- Time correlation of the data
- Data processing for both near real-time (performance) and long-term trending analysis
- Application of monetary rates to RTI usage for service accounting reporting (chargeback)
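As an illustration of the last function in the list, applying a monetary rate to usage reduces to multiplying a usage quantity by a configured rate. The sketch below is a hypothetical example only: the class name ChargebackSketch, the metric name, and the rate table are assumptions made for illustration and are not part of the Evident ClearStone product.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;

// Hypothetical sketch of rate application for chargeback reporting.
// Names and structure are illustrative assumptions, not the product's API.
final class ChargebackSketch {
    // Assumed rate table: currency units per unit of usage, keyed by metric name.
    private final Map<String, BigDecimal> ratesByMetric;

    ChargebackSketch(Map<String, BigDecimal> ratesByMetric) {
        this.ratesByMetric = ratesByMetric;
    }

    // cost = usage x rate, rounded to two decimal places.
    BigDecimal cost(String metric, BigDecimal usage) {
        BigDecimal rate = ratesByMetric.get(metric);
        if (rate == null) {
            throw new IllegalArgumentException("No rate configured for metric: " + metric);
        }
        return usage.multiply(rate).setScale(2, RoundingMode.HALF_UP);
    }
}
```

With an assumed rate of 0.12 per CPU hour, 250 CPU hours of usage would be charged at 30.00. BigDecimal is used rather than double so that monetary amounts are not subject to binary floating-point rounding.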
Pipeline Server Components and Data Flow
The main components of the Evident ClearStone Pipeline Server are implemented as Java classes and objects (Plain Old Java Objects, or POJOs). These objects form the building blocks required to deliver the pipeline functionality. The Pipeline Server is structured to process the UDR files that are produced by the various data collection adapters. The diagram on the right illustrates a typical pipeline server configuration that supports two grid applications (Application “A” and Application “B”).
The availability of a UDR file produced by one of the Evident ClearStone Adapters triggers the start of pipeline processing, which continues until no more UDR files are available for processing. The pipeline server components perform the aggregation, correlation, mapping, and other data enrichment functions that are applied to the raw UDR data, as driven by the end reporting and analytics requirements.
The Pipeline Server architecture includes a “routing” function that decides which component pipeline to use for processing a specific UDR record. This decision is based on:
- how the pipelines are configured,
- the information contained in the UDR files,
- the end reporting and analytics requirements.
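A routing decision of this kind can be sketched as an ordered set of rules, each pairing a pipeline name with a condition on the UDR contents. The example below is a minimal illustration under stated assumptions: the class UdrRouterSketch, the representation of a UDR as an attribute map, and the pipeline names are all hypothetical and do not describe the product's actual routing logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of a routing function. A UDR is modeled as a simple
// attribute map; the real UDR structure is not described in this document.
final class UdrRouterSketch {
    // Ordered rules: the first predicate that matches a UDR selects the pipeline.
    private final Map<String, Predicate<Map<String, String>>> rules = new LinkedHashMap<>();
    private final String defaultPipeline;

    UdrRouterSketch(String defaultPipeline) {
        this.defaultPipeline = defaultPipeline;
    }

    // Registers a rule; rules are evaluated in the order they were added.
    UdrRouterSketch rule(String pipelineName, Predicate<Map<String, String>> matches) {
        rules.put(pipelineName, matches);
        return this;
    }

    // Returns the name of the pipeline that should process the given UDR.
    String route(Map<String, String> udr) {
        for (Map.Entry<String, Predicate<Map<String, String>>> e : rules.entrySet()) {
            if (e.getValue().test(udr)) {
                return e.getKey();
            }
        }
        return defaultPipeline;
    }
}
```

In this sketch the rules stand in for the pipeline configuration, and the predicates inspect the information carried by each UDR. A UDR that matches no rule falls through to a default pipeline, which is one possible design choice rather than documented behavior.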
Once routing is complete, the UDR is processed sequentially by each component in the pipeline. Each component is designed to perform a specific function on the data. The order of the components, and thus the structure of a specific pipeline, is configurable via XML configuration files within the Pipeline Server. This allows pipeline components to be combined in many different ways to support vastly different collected data, application structures, and reporting/analytics requirements.
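The idea of a pipeline assembled from reusable components in a configured order can be sketched as follows. This is an illustrative assumption, not the product's implementation: the class ComponentPipelineSketch, the component names, and the host-to-application mapping are hypothetical, and a plain list of component names stands in for the XML configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: a pipeline is an ordered list of components, each of
// which transforms a UDR (modeled here as a simple attribute map).
final class ComponentPipelineSketch {
    // Assumed registry of reusable components, keyed by the name a configuration would reference.
    static final Map<String, UnaryOperator<Map<String, String>>> REGISTRY = new HashMap<>();
    static {
        // Normalization: trim and lower-case the host attribute, if present.
        REGISTRY.put("normalize", udr -> {
            Map<String, String> out = new HashMap<>(udr);
            out.computeIfPresent("host", (k, v) -> v.trim().toLowerCase(Locale.ROOT));
            return out;
        });
        // Mapping: derive an application attribute from the host name (illustrative rule).
        REGISTRY.put("mapApplication", udr -> {
            Map<String, String> out = new HashMap<>(udr);
            String host = out.getOrDefault("host", "");
            out.put("application", host.startsWith("grid-a") ? "A" : "unmapped");
            return out;
        });
    }

    private final List<UnaryOperator<Map<String, String>>> components = new ArrayList<>();

    // The list of names plays the role of the configured component order.
    ComponentPipelineSketch(List<String> componentOrder) {
        for (String name : componentOrder) {
            UnaryOperator<Map<String, String>> component = REGISTRY.get(name);
            if (component == null) {
                throw new IllegalArgumentException("Unknown component: " + name);
            }
            components.add(component);
        }
    }

    // Applies each component in order, passing the output of one to the next.
    Map<String, String> process(Map<String, String> udr) {
        Map<String, String> current = udr;
        for (UnaryOperator<Map<String, String>> component : components) {
            current = component.apply(current);
        }
        return current;
    }
}
```

Because each component only sees the output of the one before it, the configured order matters. In this sketch, running "normalize" before "mapApplication" maps a host such as " GRID-A-01 " to application "A", while the reverse order leaves it "unmapped" because the mapping step sees the raw host name.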
Benefits
The Evident ClearStone Pipeline Server offers the following benefits:
- Web-based implementation using standard Java application frameworks and technologies – fits into today’s application utility environments
- Support for near real-time alerting to address utility service performance and longer-term data aggregation for business, operational and trending reports and analysis – designed to support vastly different user reporting requirements
- Independent, general-purpose, and re-configurable pipeline components – quickly address new application and reporting requirements
- Event correlation to trigger on Key Performance Indicators (KPIs) – quickly investigate and correct SLA violations
- Bulk load processing capabilities – high-performance database loading