docs(alerting): Add two common examples in `Learn` section (#105325)

* docs(alerting): Add two common examples in `Learn` section

* Update docs/sources/alerting/learn/examples/multi-dimensional-alerts.md

Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>

* Update docs/sources/alerting/learn/examples/multi-dimensional-alerts.md

Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>

* mention `summary` annotation in multi-dimensional alerts example

* Remove note about alert grouping

* minor edits to section: `Differences with time series`

* minor grammar change

---------

Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
pull/105414/head
Pepe Cano 6 days ago committed by GitHub
parent 0c699d4a72
commit 4ae91715df
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
  1. 70
      docs/sources/alerting/fundamentals/alert-rules/queries-conditions.md
  2. 20
      docs/sources/alerting/learn/examples/_index.md
  3. 156
      docs/sources/alerting/learn/examples/multi-dimensional-alerts.md
  4. 128
      docs/sources/alerting/learn/examples/table-data.md

@ -47,6 +47,11 @@ refs:
destination: /docs/grafana/<GRAFANA_VERSION>/panels-visualizations/query-transform-data/expression-queries/#reduce
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/visualizations/panels-visualizations/query-transform-data/expression-queries/#reduce
table-data-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/learn/examples/table-data/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/learn/examples/table-data/
---
# Queries and conditions
@ -70,7 +75,7 @@ Queries in Grafana can be applied in various ways, depending on the data source
Alerting can work with two types of data:
1. **Time series data** — The query returns a collection of time series, where each series must be [reduced](#reduce) to a single numeric value for evaluating the alert condition.
1. **Tabular data** — The query must return data in a table format with only one numeric column. Each row must have a value in that column, used to evaluate the alert condition. See a [tabular data example](#alert-example-on-tabular-data).
1. **Tabular data** — The query must return data in a table format with only one numeric column. Each row must have a value in that column, used to evaluate the alert condition. See a [tabular data example](ref:table-data-example).
Each time series or table row is evaluated as a separate [alert instance](ref:alert-instance).
@ -209,66 +214,3 @@ The following aggregation functions are also available to further refine your qu
| `count_non_null` | Displays a count of values in the result set that aren't `null` |
{{< /collapse >}}
## Alert example on tabular data
Grafana Alerting supports backend data sources that return data in table format, including:
- SQL-based data sources, such as MySQL, PostgreSQL, MSSQL, and Oracle.
- Data sources that expose structured data via query languages or APIs.
- Formats like CSV or JSON, accessed through plugins such as the TestData or Infinity data source.
For Alerting to process tabular data, the data must be structured like:
1. Rows must contain only one column with a single numeric values (e.g. int, double, float).
1. Rows can also include additional columns with string values, which become labels.
The name of the column becomes the label name, and the value in each row becomes the value of the corresponding label. If multiple rows are returned, each row should be uniquely identified by its labels.
**Example**
For a MySQL table called "DiskSpace":
| Time | Host | Disk | PercentFree |
| ----------- | ---- | ---- | ----------- |
| 2021-June-7 | web1 | /etc | 3 |
| 2021-June-7 | web2 | /var | 4 |
| 2021-June-7 | web3 | /var | 8 |
You can query the data by filtering on time, but without returning the time series to Grafana. For example, a query that calculate the free space per Host and Disk:
```sql
SELECT
Host,
Disk,
AVG(PercentFree) AS PercentFree
FROM DiskSpace
WHERE __timeFilter(Time)
GROUP BY
Host,
Disk
```
This query returns the following Table response to Grafana:
| Host | Disk | PercentFree |
| ---- | ---- | ----------- |
| web1 | /etc | 3 |
| web2 | /var | 4 |
| web3 | /var | 8 |
When Alerting evaluates the query response, the data is transformed into time series data, producing three alert instances as follows:
| Alert instance | Value |
| --------------------- | ----- |
| {Host=web1,disk=/etc} | 3 |
| {Host=web2,disk=/var} | 4 |
| {Host=web3,disk=/var} | 8 |
Finally, an alert condition that checks for less than 5% of free space (`$A < 5`) results in two alert instances firing:
| Alert instance | State |
| --------------------- | ---------- |
| {Host=web1,disk=/etc} | 1 `Firing` |
| {Host=web2,disk=/var} | 1 `Firing` |
| {Host=web3,disk=/var} | 0 `Normal` |

@ -0,0 +1,20 @@
---
canonical: https://grafana.com/docs/grafana/latest/alerting/learn/examples/
description: This section provides practical examples of alert rules for common monitoring scenarios.
keywords:
- grafana
labels:
products:
- cloud
- enterprise
- oss
menuTitle: Examples
title: Grafana Alerting Examples
weight: 1100
---
# Grafana Alerting Examples
This section provides practical examples of alert rules for common monitoring scenarios. Each example focuses on a specific use case, showing how to structure queries, evaluate conditions, and understand how Grafana generates alert instances.
{{< section >}}

@ -0,0 +1,156 @@
---
canonical: https://grafana.com/docs/grafana/latest/alerting/learn/examples/multi-dimensional-alerts/
description: This example shows how a single alert rule can generate multiple alert instances using time series data.
keywords:
- grafana
labels:
products:
- cloud
- enterprise
- oss
menuTitle: Multi-dimensional alerts
title: Example of multi-dimensional alerts on time series data
weight: 1101
refs:
testdata-data-source:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/datasources/testdata/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/connect-externally-hosted/data-sources/testdata/
table-data-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/learn/examples/table-data/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/learn/examples/table-data/
annotations:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/annotation-label/#annotations
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/annotation-label/#annotations
reduce-expression:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/queries-conditions/#reduce
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/queries-conditions/#reduce
alert-grouping:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/notifications/group-alert-notifications/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/notifications/group-alert-notifications/
---
# Example of multi-dimensional alerts on time series data
This example shows how a single alert rule can generate multiple alert instances — one for each label set (or time series). This is called **multi-dimensional alerting**: one alert rule, many alert instances.
In Prometheus, each unique combination of labels defines a distinct time series. Grafana Alerting uses the same model: each label set is evaluated independently, and a separate alert instance is created for each series.
This pattern is common in dynamic environments when monitoring a group of components like multiple CPUs, containers, or per-host availability. Instead of defining individual alert rules or aggregated alerts, you alert on _each dimension_ — so you can detect particular issues and include that level of detail in notifications.
For example, a query returns one series per CPU:
| `cpu` label value | CPU percent usage |
| :---------------- | :---------------- |
| cpu-0 | 95 |
| cpu-1 | 30 |
| cpu-2 | 85 |
With a threshold of `> 80`, this would trigger two alert instances: one for `cpu-0` and one for `cpu-2`.
## Example overview
Imagine you want to trigger alerts when CPU usage goes above 80%, and you want to track each CPU core independently.
You can use a Prometheus query like this:
```
sum by(cpu) (
rate(node_cpu_seconds_total{mode!="idle"}[1m])
)
```
This query returns the active CPU usage rate per CPU core, averaged over the past minute.
| CPU core | Active usage rate |
| :------- | :---------------- |
| cpu-0 | 95 |
| cpu-1 | 30 |
| cpu-2 | 85 |
This produces one series for each existing CPU.
When Grafana Alerting evaluates the query, it creates an individual alert instance for each returned series.
| Alert instance | Value |
| :------------- | :---- |
| {cpu="cpu-0"} | 95 |
| {cpu="cpu-1"} | 30 |
| {cpu="cpu-2"} | 85 |
With a threshold condition like `$A > 80`, Grafana evaluates each instance separately and fires alerts only where the condition is met:
| Alert instance | Value | State |
| :------------- | :---- | :----- |
| {cpu="cpu-0"} | 95 | Firing |
| {cpu="cpu-1"} | 30 | Normal |
| {cpu="cpu-2"} | 85 | Firing |
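The per-instance evaluation in the table above can be modeled with a short Python sketch. This is only an illustration of the idea, not how Grafana implements it; the `evaluate` function and the data layout are hypothetical.

```python
# Illustrative model only: this is not Grafana's evaluation engine.
# Each series is identified by its label set and carries one (already reduced) value.
series = [
    ({"cpu": "cpu-0"}, 95),
    ({"cpu": "cpu-1"}, 30),
    ({"cpu": "cpu-2"}, 85),
]


def evaluate(series, threshold):
    """Return one (labels, value, state) tuple per series, mirroring `$A > threshold`."""
    return [
        (labels, value, "Firing" if value > threshold else "Normal")
        for labels, value in series
    ]


for labels, value, state in evaluate(series, 80):
    print(labels, value, state)
```

Each label set is checked on its own, so one rule yields three independent states.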
Multi-dimensional alerts help you surface issues on individual components—problems that might be missed when alerting on aggregated data (like total CPU usage).
Each alert instance targets a specific component, identified by its unique label set. This makes alerts more specific and actionable. For example, you can set a [`summary` annotation](ref:annotations) in your alert rule that identifies the affected CPU:
```
High CPU usage on {{$labels.cpu}}
```
In the previous example, the two firing alert instances would display summaries indicating the affected CPUs:
- High CPU usage on `cpu-0`
- High CPU usage on `cpu-2`
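To make the per-instance summaries concrete, here is a rough Python stand-in for the template expansion. It handles only `{{ $labels.<name> }}` references and is an assumption-laden simplification, not Grafana's template engine; `render_summary` is a hypothetical helper.

```python
import re

# Simplified stand-in for template expansion: only `{{ $labels.<name> }}` is supported.
LABEL_REF = re.compile(r"\{\{\s*\$labels\.(\w+)\s*\}\}")


def render_summary(template, labels):
    """Replace each `{{ $labels.<name> }}` reference with that label's value."""
    return LABEL_REF.sub(lambda match: labels[match.group(1)], template)


# Label sets of the two firing instances from the example.
firing = [{"cpu": "cpu-0"}, {"cpu": "cpu-2"}]
for labels in firing:
    print(render_summary("High CPU usage on {{$labels.cpu}}", labels))
```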
## Try it with TestData
You can quickly experiment with multi-dimensional alerts using the [**TestData** data source](ref:testdata-data-source), which can generate multiple random time series.
1. Add the **TestData** data source through the **Connections** menu.
1. Go to **Alerting** and create an alert rule.
1. Select **TestData** as the data source.
1. Configure the TestData scenario:
   1. Scenario: **Random Walk**
   1. Series count: 3
   1. Start value: 70, Max: 100
   1. Labels: `cpu=cpu-$seriesIndex`
{{< figure src="/media/docs/alerting/testdata-random-series.png" max-width="750px" alt="Generating random time series data using the TestData data source" >}}
## Reduce time series data for comparison
The example returns three time series, as shown above, each with values across the selected time range.
To alert on each series, you need to reduce it to a single value so that the alert condition can evaluate it and determine the state of the alert instance.
Grafana Alerting provides several ways to reduce time series data:
- **Data source query functions**. The earlier example used the Prometheus `sum` function to sum the rate results by `cpu`, producing a single value per CPU core.
- **Reduce expression**. In the query and condition section, Grafana provides the `Reduce` expression to aggregate time series data.
- In **Default mode**, the **When** input selects a reducer (like `last`, `mean`, or `min`), and the threshold compares that reduced value.
- In **Advanced mode**, you can add the [**Reduce** expression](ref:reduce-expression) (e.g., `last()`, `mean()`) before defining the threshold (alert condition).
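As a rough illustration of what a reducer does, the following Python sketch collapses each series into one number and then applies the threshold. The reducer names follow the options mentioned above, but the `REDUCERS` table, `reduce_series`, and the sample values are made up for this sketch and are not Grafana's implementation.

```python
from statistics import mean

# Hypothetical reducers named after the options mentioned above; not Grafana's code.
REDUCERS = {
    "last": lambda values: values[-1],
    "mean": mean,
    "min": min,
}


def reduce_series(values, reducer):
    """Collapse a list of samples into the single number the threshold compares."""
    return REDUCERS[reducer](values)


# Made-up samples for three series over the selected time range.
samples = {
    "cpu-0": [90, 96, 99],
    "cpu-1": [25, 30, 35],
    "cpu-2": [70, 85, 100],
}

for cpu, values in samples.items():
    reduced = reduce_series(values, "mean")
    print(cpu, reduced, "Firing" if reduced > 80 else "Normal")
```

The choice of reducer matters: with `min`, `cpu-2` reduces to 70 and would not exceed the threshold, while `last` gives 100.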
For demo purposes, this example uses the **Advanced mode** with a **Reduce** expression:
1. Toggle **Advanced mode** in the top right section of the query panel to enable adding additional expressions.
1. Add the **Reduce** expression using a function like `mean()` to reduce each time series to a single value.
1. Define the alert condition using a **Threshold** like `$reducer > 80`.
1. Click **Preview** to evaluate the alert rule.
{{< figure src="/media/docs/alerting/using-expressions-with-multiple-series.png" max-width="750px" caption="The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal." alt="Alert preview using a Reduce expression and a threshold condition" >}}
## Learn more
This example shows how Grafana Alerting implements a multi-dimensional alerting model (one rule, many alert instances) and why time series data must be reduced to a single value for evaluation.
For additional learning resources, check out:
- [Get started with Grafana Alerting – Part 2](https://grafana.com/tutorials/alerting-get-started-pt2/)
- [Example of alerting on tabular data](ref:table-data-example)

@ -0,0 +1,128 @@
---
canonical: https://grafana.com/docs/grafana/latest/alerting/learn/examples/table-data
description: This example shows how to create an alert rule using table data.
keywords:
- grafana
labels:
products:
- cloud
- enterprise
- oss
menuTitle: Table data
title: Example of alerting on tabular data
weight: 1102
refs:
testdata-data-source:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/datasources/testdata/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/connect-externally-hosted/data-sources/testdata/
multi-dimensional-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/learn/examples/multi-dimensional-alerts/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/learn/examples/multi-dimensional-alerts/
---
# Example of alerting on tabular data
Not all data sources return time series data. SQL databases, CSV files, and some APIs often return results as rows or arrays of columns or fields — commonly referred to as tabular data.
This example shows how to create an alert rule using data in table format. Grafana treats each row as a separate alert instance, as long as the data meets the expected format.
## How Grafana Alerting evaluates tabular data
When a query returns data in table format, Grafana transforms each row into a separate alert instance.
To evaluate each row (alert instance), it expects:
1. **Only one numeric column.** This is the value used for evaluating the alert condition.
1. **Non-numeric columns.** These columns define the label set. The column name becomes the label name, and the cell value becomes the label value.
1. **Unique label sets per row.** Each row must be uniquely identifiable by its labels. This ensures each row represents a distinct alert instance.
{{< admonition type="caution" >}}
These three conditions must be met—otherwise, Grafana can’t evaluate the table data and the rule will fail.
{{< /admonition >}}
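The three conditions can be expressed as a small validation routine. The Python sketch below is a hypothetical check written for illustration; `validate_rows` and the row-as-dict layout are assumptions, and Grafana's actual validation and error messages may differ.

```python
from numbers import Number


def validate_rows(rows):
    """Check the three conditions above on a list of row dicts; raise ValueError if unmet."""
    seen = set()
    for row in rows:
        # Condition 1: exactly one numeric column per row (booleans are not counted).
        numeric = [
            name for name, cell in row.items()
            if isinstance(cell, Number) and not isinstance(cell, bool)
        ]
        if len(numeric) != 1:
            raise ValueError(f"expected exactly one numeric column, found {numeric}")
        # Condition 2: every other column contributes a label (name -> value).
        labels = tuple(sorted((name, str(cell)) for name, cell in row.items() if name != numeric[0]))
        # Condition 3: label sets must be unique across rows.
        if labels in seen:
            raise ValueError(f"duplicate label set: {dict(labels)}")
        seen.add(labels)


validate_rows([
    {"Host": "web1", "Disk": "/etc", "PercentFree": 3},
    {"Host": "web2", "Disk": "/var", "PercentFree": 4},
])
```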
## Example overview
Imagine you store disk usage in a `DiskSpace` table and you want to trigger alerts when the available space drops below 5%.
| Time | Host | Disk | PercentFree |
| ---------- | ---- | ---- | ----------- |
| 2021-06-07 | web1 | /etc | 3 |
| 2021-06-07 | web2 | /var | 4 |
| 2021-06-07 | web3 | /var | 8 |
To calculate the free space per Host and Disk, you can use `$__timeFilter` to filter by time without returning the time column to Grafana:
```sql
SELECT
Host,
Disk,
AVG(PercentFree) AS PercentFree
FROM DiskSpace
WHERE $__timeFilter(Time)
GROUP BY Host, Disk
```
This query returns the following table response:
| Host | Disk | PercentFree |
| ---- | ---- | ----------- |
| web1 | /etc | 3 |
| web2 | /var | 4 |
| web3 | /var | 8 |
When Alerting evaluates the query response, the data is transformed into three alert instances as previously detailed:
- The numeric column becomes the value for the alert condition.
- Additional columns define the label set for each alert instance.
| Alert instance | Value |
| ---------------------------- | ----- |
| `{Host="web1", Disk="/etc"}` | 3 |
| `{Host="web2", Disk="/var"}` | 4 |
| `{Host="web3", Disk="/var"}` | 8 |
Finally, an alert condition that checks for less than 5% of free space (`$A < 5`) would result in two alert instances firing:
| Alert instance | Value | State |
| ---------------------------- | ----- | ------ |
| `{Host="web1", Disk="/etc"}` | 3 | Firing |
| `{Host="web2", Disk="/var"}` | 4 | Firing |
| `{Host="web3", Disk="/var"}` | 8 | Normal |
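The row-to-instance transformation above can be sketched in Python as follows. This is an illustrative model under the assumption that the value column is known in advance; `to_instances` is a hypothetical helper, not part of Grafana.

```python
# Rows as returned by the example query (one dict per table row).
rows = [
    {"Host": "web1", "Disk": "/etc", "PercentFree": 3},
    {"Host": "web2", "Disk": "/var", "PercentFree": 4},
    {"Host": "web3", "Disk": "/var", "PercentFree": 8},
]


def to_instances(rows, value_column):
    """Split each row into (labels, value): the numeric column is the value, the rest are labels."""
    instances = []
    for row in rows:
        labels = {name: cell for name, cell in row.items() if name != value_column}
        instances.append((labels, row[value_column]))
    return instances


# Apply the alert condition `$A < 5` to each instance independently.
for labels, value in to_instances(rows, "PercentFree"):
    print(labels, value, "Firing" if value < 5 else "Normal")
```

No reduce step appears here because each row already carries a single number.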
## Try it with TestData
To test this quickly, you can simulate the table using the [**TestData** data source](ref:testdata-data-source):
1. Add the **TestData** data source through the **Connections** menu.
1. Go to **Alerting** and create an alert rule.
1. Select **TestData** as the data source.
1. From **Scenario**, select **CSV Content** and paste this CSV:
```csv
host, disk, percentFree
web1, /etc, 3
web2, /var, 4
web3, /var, 8
```
1. Set a condition like `$A < 5` and **Preview** the alert.
Grafana evaluates the table data and fires the first two alert instances.
{{< figure src="/media/docs/alerting/example-table-data-preview.png" max-width="750px" alt="Alert preview with tabular data using the TestData data source" >}}
## Differences with time series data
Working with time series is similar—each series is treated as a separate alert instance, based on its label set.
The key difference is the data format:
- **Time series data** contains multiple values over time, each with its own timestamp.
To evaluate the alert condition, alert rules **must reduce each series to a single number** using a function like `last()`, `mean()`, or `max()`.
- **Tabular data** doesn’t require reduction, as each row contains only a single numeric value used to evaluate the alert condition.
For comparison, see the [multi-dimensional time series data example](ref:multi-dimensional-example).