Troubleshoot monitoring interruptions

A monitoring interruption occurs when the majority of your installed OneAgents lose their connection to the DESK server. It usually manifests as a loss of visibility into both availability and performance monitoring.

This doesn't necessarily mean that your servers are down. During a monitoring interruption, DESK automatically suppresses all Host unavailable problems and alerts you to the monitoring interruption instead. All hosts are set to the availability state Unmonitored for the duration of the interruption. Monitoring interruption alerts have their own severity filter within your alerting profiles: the Monitoring unavailable severity level allows you to create a filter and deliver these highly critical alerts to your monitoring operations teams.

Monitoring interruptions can have different root causes depending on the type of DESK deployment you're running. DESK SaaS environments are administered by the DESK DevOps team, who post all operational issues to desk.status.io. For environments running in DESK Managed deployments, a monitoring interruption is most likely caused by an issue within your own datacenter or network configuration.

Read on for details specific to each use case.

Monitoring interruption in a single DESK environment

This situation is detected whenever a single DESK SaaS environment loses the connection to its OneAgents. Because no other environments on the same DESK SaaS cluster are affected, we highly recommend that you check the following within your own network configuration:

  • Check whether a recent change in your network or firewall configuration blocks the outgoing monitoring traffic of your OneAgents.
  • If you route OneAgent traffic through an ActiveGate, check the operational status of your ActiveGates.
  • Finally, if you don't find any network issues within your own datacenter, check desk.status.io for a general issue in your region.
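
As a rough aid for the first two checks, the sketch below tests whether an outbound TCP connection can be opened from a monitored host. It is only an illustration and not a DESK tool. The function names, host names, and ports are placeholders chosen for this example, so substitute the endpoints your OneAgents and ActiveGates actually use. A successful TCP connection shows only that the network path is open. It does not confirm that monitoring traffic is being accepted.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_endpoints(endpoints, timeout: float = 5.0) -> dict:
    """Map each (host, port) pair to whether it is reachable from this machine."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

You might call it as `check_endpoints([("your-environment.example.com", 443), ("your-activegate.example.internal", 443)])`, where both addresses are hypothetical. If an endpoint that was reachable before a recent network or firewall change now returns False, that change is a good place to start investigating.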

The example alert below shows a monitoring interruption within a DESK SaaS environment.

[Image: monitoring interruption]

Monitoring interruption within a DESK cluster

If OneAgent communication is interrupted across a DESK SaaS cluster, an alert is sent to all affected monitoring environments within that cluster. The alert message states that the issue affects the entire DESK cluster and isn't limited to your own environment. Because the SaaS clusters in the different regions are operated by the DESK DevOps team, you can check the status of your own SaaS region on desk.status.io.

The example alert below shows a monitoring interruption affecting a DESK SaaS region.

[Image: monitoring interruption]