Grid Utility Data Collection via Adapters
Data collection is key to obtaining the raw information required to determine the operational parameters and performance characteristics of a specific environment and its hosted services and/or applications. The data collection process gathers virtualized application, grid service, server, and network consumption data, along with other metrics, via a component called an “Adapter”. The Evident ClearStone Adapters collect the raw data and other metrics from specific virtualized application and grid management systems, Enterprise Management Systems (EMS), and the network.
The Evident ClearStone solution supports the following adapters (others are being developed):
| Data Source | Evident ClearStone Adapter |
| --- | --- |
| Virtualized Application Fabrics | DataSynapse FabricServer Adapter |
| Compute Grids | DataSynapse GridServer Adapter |
| EMS | BMC Performance Assurance (Visualizer Database) Adapter |
| Data Grids | Oracle Coherence Adapter |
| Network | Cisco Netflow and RMON Adapter |
Each Evident ClearStone Adapter interfaces with a specific collection device using one of several technologies: extracting data from management databases via a JDBC interface, accessing data cache information via a JMX interface, or capturing network conversation data from a network router and/or switch via a Cisco Netflow interface.
Once the data is captured, the Evident ClearStone Adapters transform and normalize the raw collected data into an Evident Software internal format called a Universal Data Record (UDR). The UDR data is further processed and enriched by the Evident ClearStone Pipeline Server.
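The transformation step can be pictured with a small sketch. The UDR field layout is not described here, so the `UniversalDataRecord` fields, the `normalize_engine_row` helper, and the raw column names below are all hypothetical. The sketch only illustrates the idea of mapping a source-specific row onto one common record shape with normalized names, units, and timestamps.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class UniversalDataRecord:
    """Hypothetical stand-in for a UDR; the real field layout is an assumption."""
    source: str          # which adapter produced the record
    resource: str        # host, engine, cache, or router the metric refers to
    metric: str          # normalized metric name
    value: float         # numeric value in a normalized unit
    timestamp: datetime  # stored as timezone-aware UTC
    attributes: Dict[str, Any] = field(default_factory=dict)


def normalize_engine_row(row: Dict[str, Any]) -> UniversalDataRecord:
    """Map one raw row (assumed column names) onto the common record shape."""
    return UniversalDataRecord(
        source="grid-adapter",
        resource=str(row["ENGINE_HOST"]).lower(),
        metric="cpu_utilization_pct",
        # Assumption: the source reports CPU as a 0..1 fraction; convert to percent.
        value=round(float(row["CPU_FRACTION"]) * 100.0, 2),
        timestamp=datetime.fromtimestamp(int(row["SAMPLE_EPOCH"]), tz=timezone.utc),
        attributes={"domain": row.get("DOMAIN", "unknown")},
    )


# Example raw row as it might come back from a management database query.
raw = {
    "ENGINE_HOST": "Node-07",
    "CPU_FRACTION": "0.425",
    "SAMPLE_EPOCH": 1200000000,
    "DOMAIN": "pricing",
}
udr = normalize_engine_row(raw)
print(udr.resource, udr.metric, udr.value)
```

Keeping every adapter's output in one record type is what would let a downstream stage, such as the Pipeline Server, enrich records without knowing which source they came from.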
Typical Virtualized Application Features
Deploying and scaling IT applications across the Real-Time Infrastructure (RTI) can be accomplished using an application virtualization platform (application fabric) that dynamically configures, activates, and scales applications based on business policies and user demand. These platforms support the packaging of web, ISV/packaged, or custom IT applications into “virtualized containers” that can be dynamically provisioned and deployed across a shared server environment. This results in increased application performance (on-demand scaling), increased utilization (sharing computing resources), and lower costs (no dedicated, idle resources).
To manage these applications and the RTI that they run on, performance and usage reporting and analytics are critical for understanding and optimizing the application fabric. The Evident ClearStone Adapter for virtualized application fabrics collects performance and usage information pertaining to:
- Application domains – results in reports that show how application containers are utilizing the shared RTI
- Engine statistics – results in reports that provide detailed information (CPU and memory utilization, and free disk storage space) regarding RTI resource usage, both for optimizing or maintaining the RTI and for reporting the supply and demand of engines across all applications running within the RTI
- Broker allocation – results in reports pertaining to any alerts that occurred during the allocation of an engine to an application domain by the fabric controller (i.e. the Fabric Broker)
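As an illustration of how engine statistics could roll up into a supply-and-demand report, the sketch below groups engine samples by application domain. The `EngineSample` fields and the `summarize_by_domain` helper are hypothetical names chosen for this example, not part of the product.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class EngineSample(NamedTuple):
    engine: str
    domain: str          # application domain the engine serves ("" if unallocated)
    cpu_pct: float
    mem_pct: float
    free_disk_gb: float


def summarize_by_domain(samples: Iterable[EngineSample]) -> Dict[str, Dict[str, float]]:
    """Aggregate engine count and resource usage for each application domain."""
    grouped: Dict[str, List[EngineSample]] = defaultdict(list)
    for s in samples:
        grouped[s.domain or "(unallocated)"].append(s)

    summary: Dict[str, Dict[str, float]] = {}
    for domain, rows in grouped.items():
        summary[domain] = {
            "engines": len(rows),
            "avg_cpu_pct": sum(r.cpu_pct for r in rows) / len(rows),
            "avg_mem_pct": sum(r.mem_pct for r in rows) / len(rows),
            "min_free_disk_gb": min(r.free_disk_gb for r in rows),
        }
    return summary


samples = [
    EngineSample("e1", "pricing", 80.0, 60.0, 12.0),
    EngineSample("e2", "pricing", 60.0, 40.0, 8.0),
    EngineSample("e3", "", 5.0, 10.0, 20.0),
]
report = summarize_by_domain(samples)
for domain, stats in sorted(report.items()):
    print(domain, stats)
```

Unallocated engines are kept as their own group so that idle supply stays visible next to per-domain demand.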
Typical Compute Grid Metrics
A compute grid can be implemented as a software component or a framework that distributes and executes the workload in a scalable manner across heterogeneous compute resources. The grid computing resources are distributed compute “engines”, with a grid controller managing execution. The compute grid adds and removes computing nodes “on-demand” based on the computing workload requirements. This leads to near-linear performance gains when application compute demands increase.
The type of raw metrics that are collected from a compute grid can depend on the application type and how it is designed to operate on the grid utility. Although there are many different types of grid-enabled applications and their behavior varies, the majority typically falls into one of the following categories or profiles:
- Interactive
- Scheduled
- Long Running
- Scheduled Batch
The following table illustrates the typical metrics that are collected by the Evident ClearStone Adapter from the grid controller (broker) management server for different types of applications/service requests. This data is enriched with configuration information regarding the service and jobs/tasks association and other analytics in order to deliver grid service reporting including SLA exceptions and performance metrics.
| Application Profile | Typical Collected Metrics/SLA Thresholds |
| --- | --- |
| Interactive | Service request duration (how long did the service request take to complete?); Service request task count (how many tasks were executed?); Service request status (did the request complete successfully?) |
| Scheduled | Service request start time (did the request start on time?); Service request end time (did the request complete on time?); Service request duration (how long did the service request take to complete?); Service request task count (how many tasks were executed?); Service request status (did the request complete successfully?) |
| Long Running | Task duration (how long did the task take to complete?); Task status (did the task complete successfully?); Task computation times for a particular transaction (how much compute time?), custom to each application |
| Scheduled Batch | Batch start time (did the batch start on time?); Batch end time (did the batch complete on time?); Batch duration (how much time did it take to complete the batch?); Batch service request count (how many scheduled jobs were executed?); Batch engine count (how many engines processed the batch?) |
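The questions attached to the Scheduled profile above map naturally onto threshold checks. The following is a minimal sketch of such an SLA exception test; the `ScheduledRequest` and `ScheduledSla` types, their fields, and the exception labels are assumptions made for illustration and do not reflect the product's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ScheduledRequest:
    name: str
    start: datetime
    end: datetime
    task_count: int
    succeeded: bool


@dataclass
class ScheduledSla:
    latest_start: datetime   # the request must have started by this time
    latest_end: datetime     # the request must have finished by this time
    max_duration: timedelta  # upper bound on end - start


def sla_exceptions(req: ScheduledRequest, sla: ScheduledSla) -> List[str]:
    """Return the list of SLA thresholds that the request violated."""
    problems: List[str] = []
    if req.start > sla.latest_start:
        problems.append("late start")
    if req.end > sla.latest_end:
        problems.append("late end")
    if req.end - req.start > sla.max_duration:
        problems.append("duration exceeded")
    if not req.succeeded:
        problems.append("failed")
    return problems


request = ScheduledRequest(
    name="nightly-risk",
    start=datetime(2008, 1, 15, 2, 5),
    end=datetime(2008, 1, 15, 2, 40),
    task_count=1200,
    succeeded=True,
)
sla = ScheduledSla(
    latest_start=datetime(2008, 1, 15, 2, 0),
    latest_end=datetime(2008, 1, 15, 3, 0),
    max_duration=timedelta(minutes=30),
)
violations = sla_exceptions(request, sla)
print(violations)
```

In this example the request started five minutes late and ran for 35 minutes against a 30-minute limit, yet still finished before the end deadline, so two of the four checks fire.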
Typical EMS Metrics
Enterprise Management Systems (EMS) monitor the server infrastructure’s availability and performance in order to ensure application availability. These systems are used throughout the enterprise and in some cases even within the grid utility. To provide a holistic view of all virtualized application resource use, including off-grid server resources, Evident ClearStone supports the integration and correlation of information provided by system management tools with the data collected directly from the grid.
The Evident ClearStone solution uses server performance data available from the EMS and integrates that data with data collected from the compute grid, the data cache and the network to provide a holistic view of the virtual application environment. Integration with an existing EMS can be accomplished via the Evident EMS Adapter.
The following are the typical metrics collected from the EMS for off-grid servers:
- CPU utilization
- Memory utilization
- Free memory
- Total memory
- Network packets in
- Network packets out
- Network bytes in
- Network bytes out
- Network collisions
- Network errors
- Storage utilization
- Free storage
- Total storage
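To make the correlation idea concrete, the sketch below derives utilization percentages from free and total figures and merges off-grid EMS data with grid data keyed by host name. The dictionary keys, the `utilization_pct` and `combined_view` helpers, and the choice of host name as the join key are all assumptions for illustration.

```python
from typing import Dict, Mapping


def utilization_pct(free: float, total: float) -> float:
    """Percent used, derived from the free and total figures."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * (total - free) / total, 1)


def combined_view(
    ems: Mapping[str, Mapping[str, float]],
    grid: Mapping[str, Mapping[str, float]],
) -> Dict[str, Dict[str, object]]:
    """Merge EMS server metrics and grid metrics into one per-host view."""
    view: Dict[str, Dict[str, object]] = {}
    for host in sorted(set(ems) | set(grid)):
        row: Dict[str, object] = {"on_grid": host in grid}
        if host in ems:
            m = ems[host]
            row["cpu_pct"] = m["cpu_pct"]
            row["mem_pct"] = utilization_pct(m["free_mem_mb"], m["total_mem_mb"])
            row["storage_pct"] = utilization_pct(m["free_storage_gb"], m["total_storage_gb"])
        if host in grid:
            row["grid_tasks"] = grid[host]["tasks"]
        view[host] = row
    return view


ems_metrics = {
    "db01": {
        "cpu_pct": 35.0,
        "free_mem_mb": 2048,
        "total_mem_mb": 8192,
        "free_storage_gb": 150,
        "total_storage_gb": 500,
    }
}
grid_metrics = {"eng01": {"tasks": 120}}

view = combined_view(ems_metrics, grid_metrics)
for host, row in view.items():
    print(host, row)
```

A real deployment would also need to align sample timestamps across sources; that step is omitted here to keep the sketch short.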
Typical Data Grid Features
Access to large amounts of application data can be a bottleneck in the overall performance of a grid-enabled application. To overcome this bottleneck, solution providers have developed software that addresses the data latency issues inherent in data-intensive applications. This solution enables an in-memory data cache (data grid) across a clustered environment, which results in very low-latency access to the data. The following is a list of some of the features and capabilities that are supported by Evident ClearStone based on the raw data collected from the data grid:
| Data Grid Metrics | ClearStone Processed Data Results |
| --- | --- |
| Profile Caching Performance | Determine cache hit/miss performance by named cache and member; Determine put/get response times by named cache and member |
| Inspect Data Grid Profiles | Determine count and size of named caches across the data grid; Determine memory consumption (by JVM/cache) for partitioned caches; Determine member counts by named cache; Determine object counts across a partitioned cache |
| Profile Data Grid Events/Activities | Determine number of member join events and join times; Determine repartition counts by named cache |
| Provide Various Perspectives of Affinities Across the Data Grid | Graph network relationships among data grid components and clients; Graph data grid nodes grouped by named caches (organize by “size”); Graph data grid nodes grouped by roles within the data grid |
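As one example of the caching performance results listed above, the sketch below computes a hit ratio for each named cache and member from raw hit and miss counters. The tuple layout of the samples and the `hit_ratios` helper are hypothetical; the counters collected from a real data grid may be shaped differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (named cache, member, hits, misses) for one collection interval
Sample = Tuple[str, str, int, int]


def hit_ratios(samples: Iterable[Sample]) -> Dict[Tuple[str, str], float]:
    """Accumulate counters per (cache, member) and return hits / (hits + misses)."""
    totals: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])
    for cache, member, hits, misses in samples:
        totals[(cache, member)][0] += hits
        totals[(cache, member)][1] += misses
    return {
        key: (hits / (hits + misses) if hits + misses else 0.0)
        for key, (hits, misses) in totals.items()
    }


cache_samples = [
    ("quotes", "member-1", 900, 100),
    ("quotes", "member-1", 90, 10),
    ("quotes", "member-2", 50, 50),
    ("orders", "member-1", 0, 0),   # no traffic yet; ratio reported as 0.0
]
ratios = hit_ratios(cache_samples)
for (cache, member), ratio in sorted(ratios.items()):
    print(f"{cache}/{member}: {ratio:.2f}")
```

Summing counters before dividing, rather than averaging per-interval ratios, keeps busy intervals from being under-weighted.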