Distributed caching platforms (DCPs) are becoming essential to reliably scaling out data-intensive and transactional processing systems. In essence, DCPs have become in-memory databases for Internet-scale applications.
Now, instead of adding servers or virtual machines, or re-architecting applications, enterprises are deploying DCPs in applications ranging from reservations to online betting; from Web-based education to loyalty programs; from inventory management to real-time portfolio-risk calculations. DCPs are especially important in cloud computing, where developers pay for computing resources used, and data is often the bottleneck.
Poorly optimized resources or infrastructure failures can lead to lost trades, reservations or orders, missed regulatory requirements or customer churn. To avoid these risks, enterprises need a comprehensive, real-time view of cluster, cache and node performance, the ability to play back and analyze events, and the management tools to maintain peak operating levels, even under extreme loads and across dozens of clusters. This requires more than simplistic monitoring. Enterprises need an intelligent platform that can easily aggregate, correlate, analyze and respond to thousands of performance metrics and events according to KPIs of their choosing.
Evident’s automated, 24×7 risk mitigation identifies DCP issues in real time before they escalate to outages or lost revenue.
- Simple, configurable collection, aggregation and roll-up for any metric
- Out-of-the-box dashboards that can be customized for use during development, test and production support
- Alert notification
- Threshold-based event triggering of automated management functions or other external customer scripts
- A highly scalable, real-time architecture for monitoring, analysis and visualization
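To make the roll-up and threshold-triggering ideas above concrete, here is a minimal sketch in Python. It is an illustration only and not ClearStone's API: the names `roll_up`, `ThresholdRule` and the metric key `heap_used_pct` are hypothetical, and a real deployment would likely use its own collection agents and rule syntax.

```python
from statistics import mean
from typing import Callable, Dict, List


def roll_up(samples: List[float], window: int) -> List[float]:
    """Average consecutive, non-overlapping windows of raw samples."""
    return [mean(samples[i:i + window]) for i in range(0, len(samples), window)]


class ThresholdRule:
    """Hypothetical rule: run an action when a metric exceeds a limit."""

    def __init__(self, metric: str, limit: float,
                 action: Callable[[str, float], None]) -> None:
        self.metric = metric
        self.limit = limit
        self.action = action

    def evaluate(self, values: Dict[str, float]) -> bool:
        """Return True (and run the action) if the metric breaches the limit."""
        value = values.get(self.metric)
        if value is not None and value > self.limit:
            self.action(self.metric, value)
            return True
        return False


# Usage: collect alerts in a list instead of calling an external script.
alerts: List[str] = []
rule = ThresholdRule("heap_used_pct", 85.0,
                     lambda name, v: alerts.append(f"{name}={v}"))
rule.evaluate({"heap_used_pct": 91.0})  # breaches the limit, action runs
rule.evaluate({"heap_used_pct": 40.0})  # below the limit, nothing happens
```

In practice the action would be an alert notification or a management script rather than a list append; the callback shape is what matters here.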
A sampling of KPIs derived using ClearStone’s analytics engine includes the following:
- Cluster performance metrics such as cache effectiveness (hits, misses, gets and puts), node ‘hot spots,’ inadequate memory overhead, publish success rates and variance from historical averages
- Health metrics such as long garbage collection recovery by server over time, endangered caches or data, JVM heap or CPU utilization
- Network-induced performance issues
- Capacity limitations driven by the number of client connections, thread pool exhaustion, growth in object sizes without a corresponding increase in nodes, etc.
- Resiliency measures such as cache partitioning
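As a rough illustration of how two of the KPIs above might be computed, the sketch below derives a cache hit ratio from hit and miss counts and expresses a current reading as a deviation from its historical average. The function names are hypothetical, and the use of a z-score (standard deviations from the mean) is an assumption about what "variance from historical averages" could mean, not a description of ClearStone's analytics engine.

```python
from statistics import mean, pstdev
from typing import Sequence


def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache; 0.0 if there were none."""
    total = hits + misses
    return hits / total if total else 0.0


def deviation_from_history(current: float, history: Sequence[float]) -> float:
    """Distance of `current` from the historical mean, in standard deviations."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        return 0.0  # flat history: no meaningful scale to compare against
    return (current - baseline) / spread


# Usage with made-up numbers
ratio = hit_ratio(hits=900, misses=100)
score = deviation_from_history(14.0, [8.0, 12.0])
```

A large positive or negative score suggests the metric has moved well away from its usual range, which is the kind of signal a threshold rule could act on.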
To learn more about Evident Software’s solution for DCPs, contact us.
