Cassandra

Apache Cassandra server monitoring in DESK provides information about database exceptions, failed requests, performance, and more. If Cassandra is underperforming or a problem occurs, DESK lets you know immediately and shows you which nodes are affected.

Prerequisites

  • Cassandra 2.xx
  • Linux or Windows

Enabling Cassandra monitoring globally

With Cassandra monitoring enabled globally, DESK automatically collects Cassandra metrics whenever a new host running Cassandra is detected in your environment.

  1. In the navigation menu, select Settings.
  2. Select Monitoring > Monitored technologies.
  3. In the Supported technologies list, find the Cassandra JMX row.
  4. Set the Cassandra JMX switch to the On position.

Monitoring Cassandra in DESK

  1. In the navigation menu, select Technologies.
  2. Click the Apache Cassandra tile on the Technology overview page.
  3. To view Cassandra cluster metrics, select the cluster in the Process group table under the tiles.
    The chart displays the selected process group (cluster) metric over time. You can select a different metric from the list.
  4. In the expanded row, click the Process group details button to see details on the selected Cassandra cluster.
  5. On the Process group details page, select the Technology-specific metrics tab to identify any problematic nodes.
  6. To display node-specific metrics, select a node from the Process list under the chart.
  7. Select the Cassandra metrics tab to see node-specific Cassandra metrics.
    • The Exceptions and Failed requests charts show you if there’s a problem with the node. Pay particular attention to the Unavailable - Read, Unavailable - Write, and Unavailable - RangeSlice counts in Failed requests.
    • The Operation count and Latency 95th percentile charts can help you monitor performance. Increased latency while the number of operations remains stable typically indicates a performance issue.
  8. Select the Further details tab to see charts on a variety of additional Cassandra metrics.
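
The rule of thumb from step 7 (latency rising while the operation count stays stable) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of DESK: the function name `latency_regression`, the two thresholds, and the baseline/recent split are assumptions chosen for the example, and the input series are assumed to be values you have read from the Operation count and Latency 95th percentile charts.

```python
from statistics import mean


def latency_regression(ops, latency_ms, stable_tolerance=0.10, latency_increase=0.50):
    """Return True when latency rose while the operation count stayed stable.

    ops and latency_ms are equal-length samples, oldest first. The first half
    of each series is the baseline, the second half is the recent window.
    Both thresholds are hypothetical defaults, expressed as fractions.
    """
    if len(ops) != len(latency_ms) or len(ops) < 4:
        raise ValueError("need two equal-length series with at least 4 samples")

    half = len(ops) // 2
    base_ops, recent_ops = mean(ops[:half]), mean(ops[half:])
    base_lat, recent_lat = mean(latency_ms[:half]), mean(latency_ms[half:])

    # Without a non-zero baseline there is nothing to compare against.
    if base_ops == 0 or base_lat == 0:
        return False

    ops_change = abs(recent_ops - base_ops) / base_ops   # relative change, either direction
    lat_change = (recent_lat - base_lat) / base_lat      # relative increase only

    return ops_change <= stable_tolerance and lat_change >= latency_increase
```

For example, a read count that stays near 1,000 operations per second while the 95th percentile latency climbs from about 10 ms to about 19 ms is flagged, whereas latency that doubles together with the load is not.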

Cassandra cluster metrics

Select the Technology-specific metrics tab on the Process group details page to display aggregated Cassandra cluster metrics. Use the Show chart for list to select a different chart to display. All metrics are plotted against the number of process group instances. Hover your pointer over the chart to see the instance count and the minimum, maximum, and average of the selected metric at that time.

  • Suspension
  • JVM threads
  • Java memory pool commits
  • Java memory pool used
  • GC time (garbage collection time)
  • Exception count
  • Files open
  • RangeSlice latency
  • RangeSlices
  • Read latency
  • Reads
  • Storage load
  • Write latency
  • Writes
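
To make the hover values concrete, the following Python sketch computes the same kind of summary (instance count, minimum, maximum, and average) for one metric at one point in time. It is a minimal illustration, not DESK code: the function name `summarize_instances`, the node names, and the extra `max_instance` field (added here to point at the node worth inspecting first) are assumptions made for the example.

```python
def summarize_instances(values_by_instance):
    """Summarize one metric across all instances of a process group.

    values_by_instance maps a (hypothetical) instance name to the metric
    value reported by that instance at a single point in time.
    """
    if not values_by_instance:
        raise ValueError("no instances to summarize")

    values = list(values_by_instance.values())
    # Name of the instance reporting the highest value.
    worst = max(values_by_instance, key=values_by_instance.get)

    return {
        "instances": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "max_instance": worst,
    }


# Example: read latency in milliseconds on three nodes.
summary = summarize_instances({"node-1": 4.0, "node-2": 6.0, "node-3": 11.0})
```

A large gap between the maximum and the average, as in this example, suggests that a single node rather than the whole cluster is responsible.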

Cassandra node metrics

Cassandra metrics tab

The Cassandra metrics tab shows key metrics for Cassandra on the node level.

Chart | Metric | Description
Exceptions | Exception count | Number of internal Cassandra exceptions detected. Under normal conditions, this metric should be zero.
Failed requests | Unavailable – Read | Number of Unavailable – Read exceptions encountered.
Failed requests | Unavailable – Write | Number of Unavailable – Write exceptions encountered.
Failed requests | Unavailable – RangeSlice | Number of Unavailable – RangeSlice exceptions encountered.
Failed requests | Timeout – Read | Number of Timeout – Read exceptions encountered.
Failed requests | Timeout – Write | Number of Timeout – Write exceptions encountered.
Failed requests | Timeout – RangeSlice | Number of Timeout – RangeSlice exceptions encountered.
Failed requests | Failure – Read | Number of Failure – Read exceptions encountered.
Failed requests | Failure – Write | Number of Failure – Write exceptions encountered.
Failed requests | Failure – RangeSlice | Number of Failure – RangeSlice exceptions encountered.
Operation count | Read | Average number of reads per second.
Operation count | Write | Average number of writes per second.
Operation count | RangeSlice | Average number of RangeSlices per second.
Latency 95th percentile | Read | Average 95th percentile of transaction read latency.
Latency 95th percentile | Write | Average 95th percentile of transaction write latency.
Latency 95th percentile | RangeSlice | Average 95th percentile of transaction RangeSlice latency.
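
As a reading aid for the Exceptions and Failed requests charts, the sketch below lists every counter that is above zero for a node. It is a hypothetical helper, not a DESK API: the function name `node_warnings` and the dictionary keyed by (kind, operation) pairs are assumptions, and the counts are assumed to be values copied from the charts.

```python
# Failed request kinds and operations, in the order used on the Cassandra metrics tab.
FAILED_REQUEST_KINDS = ("Unavailable", "Timeout", "Failure")
OPERATIONS = ("Read", "Write", "RangeSlice")


def node_warnings(exception_count, failed_requests):
    """Return a list of human-readable warnings for one node.

    exception_count is the Exception count metric, expected to be zero.
    failed_requests maps (kind, operation) tuples, such as
    ("Unavailable", "Write"), to the number of exceptions seen.
    Missing keys are treated as zero.
    """
    warnings = []
    if exception_count > 0:
        warnings.append(f"Exception count: {exception_count} (expected 0)")

    for kind in FAILED_REQUEST_KINDS:
        for operation in OPERATIONS:
            count = failed_requests.get((kind, operation), 0)
            if count > 0:
                warnings.append(f"{kind} - {operation}: {count}")
    return warnings


# Example: a node with two internal exceptions and some failed requests.
example = node_warnings(2, {("Unavailable", "Write"): 3, ("Timeout", "Read"): 1})
```

An empty list means that none of the counters shown on the two charts is above zero.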

Further details tab

The Further details tab shows additional metrics for Cassandra on the node level: Cache, Disk usage, Hints, Java managed memory, Load, and Pending tasks.

Chart | Metric | Description
Cache: Hit rate | Row cache hit rate | 2m row cache hit rate.
Cache: Hit rate | Key cache hit rate | 2m key cache hit rate.
Disk usage: Storage load | Load | Size, in bytes, of the on-disk data the node manages.
Disk usage: Bytes compacted | Bytes compacted | Total number of bytes compacted since server start.
Disk usage: Compaction tasks pending | Pending tasks | Estimated number of compactions remaining to perform.
Disk usage: Compaction tasks completed | Completed tasks | Number of completed compactions since server start.
Disk usage: SSTable count | SSTable count | Number of SSTables on disk for this table.
Hints | Hints | Number of hint messages written to this node since start. Includes one entry for each host to be hinted per hint.
Java managed memory: poolname | Used memory | Java used memory.
Java managed memory: poolname | Committed memory | Java committed memory.
Java managed memory: poolname | Maximum memory | Java maximum memory.
Java managed memory: poolname | Garbage collection count | Java garbage collection count.
Java managed memory: poolname | Garbage collection time | Java garbage collection time.
Load: Read latency | Average | Average 95th percentile of transaction read latency.
Load: Read latency | Maximum | Maximum 95th percentile of transaction read latency.
Load: Write latency | Average | Average 95th percentile of transaction write latency.
Load: Write latency | Maximum | Maximum 95th percentile of transaction write latency.
Load: RangeSlice latency | Average | Average 95th percentile of transaction RangeSlice latency.
Load: RangeSlice latency | Maximum | Maximum 95th percentile of transaction RangeSlice latency.
Load: Read throughput | Average | Average number of reads per second.
Load: Read throughput | Maximum | Maximum number of reads per second.
Load: Write throughput | Average | Average number of writes per second.
Load: Write throughput | Maximum | Maximum number of writes per second.
Load: RangeSlice throughput | Average | Average number of RangeSlices per second.
Load: RangeSlice throughput | Maximum | Maximum number of RangeSlices per second.
Pending tasks: Read pending tasks | Read pending tasks | Number of queued read tasks.
Pending tasks: ReadRepair pending tasks | ReadRepair pending tasks | Number of queued ReadRepair tasks.
Pending tasks: Mutation pending tasks | Mutation pending tasks | Number of queued mutation tasks.
Pending tasks: Compaction pending tasks | Compaction tasks pending | Estimated number of compactions remaining to perform.
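
A cache hit rate is the number of cache hits divided by the number of cache requests over some interval. The Python sketch below shows one way to derive such a rate from cumulative counters. It rests on assumptions that the page does not confirm: that "2m" denotes a two-minute window, and that hits and requests are available as monotonically increasing counters. The function name `windowed_hit_rate` and the sample layout are hypothetical, and the sketch does not describe how DESK or Cassandra computes the metric internally.

```python
def windowed_hit_rate(samples, window_seconds=120):
    """Compute a cache hit rate over the most recent window.

    samples is a list of (timestamp_seconds, cumulative_hits,
    cumulative_requests) tuples, oldest first. The 120-second default
    reflects the assumed meaning of "2m". Returns None when the window
    contains no requests.
    """
    if not samples:
        return None

    latest_time, latest_hits, latest_requests = samples[-1]

    # Use the oldest sample that still falls inside the window as the start.
    start = samples[-1]
    for sample in samples:
        if latest_time - sample[0] <= window_seconds:
            start = sample
            break

    hits = latest_hits - start[1]
    requests = latest_requests - start[2]
    if requests <= 0:
        return None
    return hits / requests


# Example: counters sampled once a minute.
rate = windowed_hit_rate([(0, 0, 0), (60, 50, 100), (120, 140, 200), (180, 230, 300)])
```

In the example, the window spans the samples taken at 60 and 180 seconds, which gives 180 hits out of 200 requests, a hit rate of 0.9.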