diff --git a/docs/sources/alerting/_index.md b/docs/sources/alerting/_index.md
index e420c23118d..bd4e93282cf 100644
--- a/docs/sources/alerting/_index.md
+++ b/docs/sources/alerting/_index.md
@@ -6,39 +6,34 @@ weight = 110
 
 # Alerts overview
 
-Alerts allow you to identify problems in your system moments after they occur. By quickly identifying unintended changes in your system, you can minimize disruptions to your services.
+Alerts allow you to know about problems in your systems moments after they occur. Robust and actionable alerts help you identify and resolve issues quickly, minimizing disruption to your services.
 
-Alerts consists of two parts:
+Alerts have four main components:
 
-- Alert rules - When the alert is triggered. Alert rules are defined by one or more conditions that are regularly evaluated by Grafana.
-- Notification channel - How the alert is delivered. When the conditions of an alert rule are met, the Grafana notifies the channels configured for that alert.
-
-Currently only the graph panel visualization supports alerts.
+- Alert rule - One or more conditions, the frequency of evaluation, and the (optional) duration that a condition must be met before notifying.
+- Contact point - A channel for sending notifications when the conditions of an alert rule are met.
+- Notification policy - A set of matching and grouping criteria used to determine where, and how frequently, to send notifications.
+- Silences - Date and matching criteria used to silence notifications.
 
 ## Alert tasks
 
 You can perform the following tasks for alerts:
 
-- [Add or edit an alert notification channel]({{< relref "notifications.md" >}})
 - [Create an alert rule]({{< relref "create-alerts.md" >}})
 - [View existing alert rules and their current state]({{< relref "view-alerts.md" >}})
 - [Test alert rules and troubleshoot]({{< relref "troubleshoot-alerts.md" >}})
+- [Add or edit an alert contact point]({{< relref "notifications.md" >}})
 
 ## Clustering
 
 Currently alerting supports a limited form of high availability. Since v4.2.0 of Grafana, alert notifications are deduped when running multiple servers. This means all alerts are executed on every server but no duplicate alert notifications are sent due to the deduping logic. Proper load balancing of alerts will be introduced in the future.
 
-## Notifications
-
-You can also set alert rule notifications along with a detailed message about the alert rule. The message can contain anything: information about how you might solve the issue, link to runbook, and so on.
-
-The actual notifications are configured and shared between multiple alerts.
+## Alert evaluation
 
-## Alert execution
+Grafana managed alerts are evaluated by the Grafana backend. Rule evaluations are scheduled, according to the alert rule configuration, and queries are evaluated by an engine that is part of core Grafana.
 
-Alert rules are evaluated in the Grafana backend in a scheduler and query execution engine that is part
-of core Grafana. Alert rules can query only backend data sources with alerting enabled. Such data sources are:
-- builtin or developed and maintained by grafana, such as: `Graphite`, `Prometheus`, `Loki`, `InfluxDB`, `Elasticsearch`,
+Alert rules can only query backend data sources with alerting enabled:
+- builtin or developed and maintained by grafana: `Graphite`, `Prometheus`, `Loki`, `InfluxDB`, `Elasticsearch`,
 `Google Cloud Monitoring`, `Cloudwatch`, `Azure Monitor`, `MySQL`, `PostgreSQL`, `MSSQL`, `OpenTSDB`, `Oracle`, and `Azure Data Explorer`
 - any community backend data sources with alerting enabled (`backend` and `alerting` properties are set in the [plugin.json]({{< relref "../developers/plugins/metadata.md" >}}))
@@ -46,9 +41,12 @@ of core Grafana. Alert rules can query only backend data sources with alerting e
 
 The alert engine publishes some internal metrics about itself. You can read more about how Grafana publishes [internal metrics]({{< relref "../administration/view-server/internal-metrics.md" >}}).
 
-Description | Type | Metric name
+Metric Name | Type | Description
 ---------- | ----------- | ----------
-Total number of alerts | counter | `alerting.active_alerts`
-Alert execution result | counter | `alerting.result`
-Notifications sent counter | counter | `alerting.notifications_sent`
-Alert execution timer | timer | `alerting.execution_time`
+`alerting.alerts` | gauge | How many alerts by state
+`alerting.request_duration_seconds` | histogram | Histogram of requests to the Alerting API
+`alerting.active_configurations` | gauge | The number of active, non default alertmanager configurations for grafana managed alerts
+`alerting.rule_evaluations_total` | counter | The total number of rule evaluations
+`alerting.rule_evaluation_failures_total` | counter | The total number of rule evaluation failures
+`alerting.rule_evaluation_duration_seconds` | summary | The duration for a rule to execute
+`alerting.rule_group_rules` | gauge | The number of rules
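
Reviewer note, not part of the patch: the metrics table added in the last hunk could be checked with a small script along the following lines. This is a minimal sketch under stated assumptions. It assumes the metrics are read in the Prometheus text exposition format, and that the dotted names from the table appear with underscores and a `grafana_` prefix (for example `grafana_alerting_rule_evaluations_total`). The prefix, the label names, and the sample payload are all hypothetical, so verify the names against real output before relying on them.

```python
from collections import defaultdict

# Hypothetical sample payload in Prometheus text exposition format.
# Metric names, labels, and values are illustrative assumptions only.
SAMPLE = """\
# HELP grafana_alerting_rule_evaluations_total The total number of rule evaluations
# TYPE grafana_alerting_rule_evaluations_total counter
grafana_alerting_rule_evaluations_total{org="1"} 120
grafana_alerting_rule_evaluation_failures_total{org="1"} 3
grafana_alerting_alerts{state="alerting"} 2
grafana_alerting_alerts{state="normal"} 7
grafana_http_request_total 99
"""


def parse_samples(text, needle="alerting"):
    """Sum sample values per metric name, keeping names that contain `needle`.

    Assumes each sample line ends with the value (no trailing timestamp).
    """
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]  # drop the label set, if any
        if needle in name:
            totals[name] += float(value)
    return dict(totals)


def failure_ratio(totals, prefix="grafana_alerting_"):
    """Return failed evaluations divided by total evaluations (0.0 if none)."""
    evaluations = totals.get(prefix + "rule_evaluations_total", 0.0)
    failures = totals.get(prefix + "rule_evaluation_failures_total", 0.0)
    return failures / evaluations if evaluations else 0.0


if __name__ == "__main__":
    totals = parse_samples(SAMPLE)
    for name, value in sorted(totals.items()):
        print(name, value)
    print("failure ratio:", failure_ratio(totals))
```

Summing across label sets keeps the sketch short; a real check would more likely keep the `state` label on the alerts gauge rather than collapse it.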