[release-12.0.2] docs(alerting): Examples of dynamic labels and dynamic thresholds (#106087)

docs(alerting): Examples of dynamic labels and dynamic thresholds (#105776)

* docs(alerting): Add dynamic thresholds example

* update intro

* docs(alerting): Example of dynamic labels

* fix template example

* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-thresholds.md



* Update docs/sources/alerting/best-practices/dynamic-thresholds.md



* Update docs/sources/alerting/best-practices/dynamic-thresholds.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* Update docs/sources/alerting/best-practices/dynamic-labels.md



* fix typo

* fix typo

* Add section `CSV data with Infinity` in Table data example

* Link dynamic threshold example to tabular data requirements

* minor copy changes

* minor heading fix

* Add links (admonition) to Grafana Play examples

* Use `Caveat` instead of `Gotcha`

* Dynamic thresholds: caution message when Math operates on missing series

* Example of latency threshold based on traffic

---------


(cherry picked from commit c84388f550)

Co-authored-by: Pepe Cano <825430+ppcano@users.noreply.github.com>
Co-authored-by: Johnny Kartheiser <140559259+JohnnyK-Grafana@users.noreply.github.com>
parent 54200ca1cc
commit 22b6da78a8
1. docs/sources/alerting/alerting-rules/templates/_index.md (8 lines changed)
2. docs/sources/alerting/alerting-rules/templates/examples.md (8 lines changed)
3. docs/sources/alerting/best-practices/dynamic-labels.md (328 lines changed)
4. docs/sources/alerting/best-practices/dynamic-thresholds.md (229 lines changed)
5. docs/sources/alerting/best-practices/multi-dimensional-alerts.md (13 lines changed)
6. docs/sources/alerting/best-practices/table-data.md (17 lines changed)
7. docs/sources/alerting/fundamentals/alert-rules/queries-conditions.md (10 lines changed)
8. docs/sources/shared/alerts/math-example.md (10 lines changed)
9. docs/sources/shared/alerts/note-dynamic-labels.md (8 lines changed)

@ -18,11 +18,11 @@ labels:
title: Template annotations and labels
weight: 500
refs:
shared-stale-alert-instances:
shared-dynamic-label-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/dynamic-labels/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/dynamic-labels/
reference-labels:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/alerting-rules/templates/reference/#labels
@ -198,7 +198,7 @@ In this example, the value of the `severity` label is determined by the query va
> **Note:** An alert instance is uniquely identified by its set of labels.
>
> - Avoid displaying query values in labels, as this can create numerous alert instances—one for each distinct label set. Instead, use annotations for query values.
> - If a templated label's value changes, it maps to a different alert instance, and the previous instance is considered [stale (MissingSeries)](ref:shared-stale-alert-instances) when its label value is no longer present.
> - If a templated label's value changes, it maps to a different alert instance, and the previous instance is considered **stale**. Learn all the details in this [example using dynamic labels](ref:shared-dynamic-label-example).
[//]: <> ({{< docs/shared lookup="alerts/note-dynamic-labels.md" source="grafana" version="<GRAFANA_VERSION>" >}})

@ -16,11 +16,11 @@ title: Labels and annotations template examples
menuTitle: Examples
weight: 102
refs:
shared-stale-alert-instances:
shared-dynamic-label-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/dynamic-labels/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/dynamic-labels/
labels:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/annotation-label/#labels
@ -217,7 +217,7 @@ You can then use the `severity` label to control how alerts are handled. For ins
> **Note:** An alert instance is uniquely identified by its set of labels.
>
> - Avoid displaying query values in labels, as this can create numerous alert instances—one for each distinct label set. Instead, use annotations for query values.
> - If a templated label's value changes, it maps to a different alert instance, and the previous instance is considered [stale (MissingSeries)](ref:shared-stale-alert-instances) when its label value is no longer present.
> - If a templated label's value changes, it maps to a different alert instance, and the previous instance is considered **stale**. Learn all the details in this [example using dynamic labels](ref:shared-dynamic-label-example).
[//]: <> ({{< docs/shared lookup="alerts/note-dynamic-labels.md" source="grafana" version="<GRAFANA_VERSION>" >}})

@ -0,0 +1,328 @@
---
canonical: https://grafana.com/docs/grafana/latest/alerting/best-practices/dynamic-labels
description: This example shows how to define dynamic labels based on query values, along with important behavior to keep in mind when using them.
keywords:
- grafana
- alerting
- examples
labels:
products:
- cloud
- enterprise
- oss
menuTitle: Examples of dynamic labels
title: Example of dynamic labels in alert instances
weight: 1104
refs:
missing-data-guide:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/missing-data/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/missing-data/
alert-rule-evaluation:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/
pending-period:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/notifications/notification-policies/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/notifications/notification-policies/
view-alert-state-history:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/monitor-status/view-alert-state-history/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/monitor-status/view-alert-state-history/
stale-alert-instances:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/
notification-policies:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/notifications/notification-policies/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/notifications/notification-policies/
templating-labels-annotations:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/alerting-rules/templates/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/alerting-rules/templates/
labels:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/annotation-label/#labels
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/alert-rules/annotation-label/#labels
testdata-data-source:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/datasources/testdata/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/connect-externally-hosted/data-sources/testdata/
multi-dimensional-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/multi-dimensional-alerts/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/multi-dimensional-alerts/
---
# Example of dynamic labels in alert instances
Labels are essential for scaling your alerting setup. They define metadata like `severity`, `team`, `category`, or `environment`, which you can use for alert routing.
A label like `severity="critical"` can be set statically in the alert rule configuration, or dynamically based on a query value such as the current free disk space. Dynamic labels **adjust label values at runtime**, allowing you to reuse the same alert rule across different scenarios.
This example shows how to define dynamic labels based on query values, along with key behavior to keep in mind when using them.
First, it's important to understand how Grafana Alerting treats [labels](ref:labels).
## Alert instances are defined by labels
Each alert rule creates a separate alert instance for every unique combination of labels.
This is called [multi-dimensional alerts](ref:multi-dimensional-example): one rule, many instances—**one per unique label set**.
For example, a rule that queries CPU usage per host might return multiple series (or dimensions):
- `{alertname="ServerHighCPU", instance="prod-server-1" }`
- `{alertname="ServerHighCPU", instance="prod-server-2" }`
- `{alertname="ServerHighCPU", instance="prod-server-3" }`
Each unique label combination defines a distinct alert instance, with its own evaluation state and potential notifications.
The full label set of an alert instance can include:
- Labels from the query result (e.g., `instance`)
- Auto-generated labels (e.g., `alertname`)
- User-defined labels from the rule configuration
## User-defined labels
As shown earlier, alert instances automatically include labels from the query result, such as `instance` or `job`. To add more context or control alert routing, you can define _user-defined labels_ in the alert rule configuration:
{{< figure src="/media/docs/alerting/example-dynamic-labels-edit-labels-v3.png" max-width="750px" alt="Edit labels UI in the alert rule configuration." >}}
User-defined labels can be either:
- **Fixed labels**: These have the same value for every alert instance. They are often used to include common metadata, such as team ownership.
- **Templated labels**: These calculate their values based on the query result at evaluation time.
## Templated labels
Templated labels evaluate their values dynamically, based on the query result. This allows the label value to vary per alert instance.
Use templated labels to inject additional context into alerts. To learn about syntax and use cases, refer to [Template annotations and labels](ref:templating-labels-annotations).
You can define templated labels that produce either:
- A fixed value per alert instance.
- A dynamic value per alert instance that changes based on the last query result.
### Fixed values per alert instance
You can use a known label value to enrich the alert with additional metadata not present in existing labels. For example, you can map the `instance` label to an `env` label that represents the deployment environment:
```go
{{- if eq $labels.instance "prod-server-1" -}}production
{{- else if eq $labels.instance "stag-server-1" -}}staging
{{- else -}}development
{{- end -}}
```
This produces alert instances like:
- `{alertname="ServerHighCPU", instance="prod-server-1", env="production"}`
- `{alertname="ServerHighCPU", instance="stag-server-1", env="staging"}`
In this example, the `env` label is fixed for each alert instance and does not change during its lifecycle.
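If the environment is encoded in the host name itself, you don't have to enumerate every instance. The following is a minimal sketch, assuming the `reReplaceAll` template function is available in label templates and that instance names follow an `<env>-server-N` naming pattern (both assumptions for illustration):
```go
{{/* Sketch: derive the env value from the instance name prefix, e.g., "prod-server-1" -> "prod" */}}
{{- reReplaceAll "^([a-z]+)-.*$" "$1" $labels.instance -}}
```
This yields the raw prefix (`prod`, `stag`) rather than a full environment name, so adjust the pattern or mapping to your naming scheme.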
### Dynamic values per alert instance
You can define a label whose value depends on the numeric result of a query—mapping it to a predefined set of options. This is useful for representing `severity` levels within a single alert rule.
Instead of defining three separate rules like:
- _CPU ≥ 90_ → `severity=critical`
- _CPU ≥ 80_ → `severity=warning`
- _CPU ≥ 70_ → `severity=minor`
You can define a single rule and assign `severity` dynamically using a template:
```go
{{/* $values.B.Value refers to the numeric result from query B */}}
{{- if gt $values.B.Value 90.0 -}}critical
{{- else if gt $values.B.Value 80.0 -}}warning
{{- else if gt $values.B.Value 70.0 -}}minor
{{- else -}}none
{{- end -}}
```
This pattern lets you express multiple alerting scenarios in a single rule, while still routing based on the `severity` label value.
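To surface the measured value alongside the dynamic `severity` label without creating additional alert instances, pair the label with a templated annotation. A minimal sketch for a hypothetical `summary` annotation, assuming query `B` returns CPU usage as a percentage:
```go
{{/* Sketch: show the reduced value from query B in an annotation, not in a label */}}
CPU usage on {{ $labels.instance }} is {{ printf "%.1f" $values.B.Value }}%
```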
## Example overview
In the previous severity template, you can set the alert condition to `$B > 70` to prevent firing when `severity=none`, and then use the `severity` label to route distinct alert instances to different contact points.
For example, configure a [notification policy](ref:notification-policies) that matches `alertname="ServerHighCPU"` with the following child policies:
- `severity=critical` → escalate to an incident response and management solution (IRM).
- `severity=warning` → send to the team's Slack channel.
- `severity=minor` → send to a non-urgent queue or log-only dashboard.
The resulting alerting flow might look like this:
| Time | $B query | Alert instance | Routed to |
| :--- | :------- | :------------------------------------------------- | :------------------- |
| t1 | 65 | `{alertname="ServerHighCPU", severity="none"}` | `Not firing` |
| t2 | 75 | `{alertname="ServerHighCPU", severity="minor"}` | Non-urgent queue |
| t3 | 85 | `{alertname="ServerHighCPU", severity="warning"}` | Team Slack channel |
| t4 | 95 | `{alertname="ServerHighCPU", severity="critical"}` | IRM escalation chain |
This alerting setup allows you to:
- Use a single rule for multiple severity levels.
- Route alerts dynamically using the label value.
- Simplify alert rule maintenance and avoid duplication.
However, dynamic labels can introduce unexpected behavior when label values change. The next section explains this.
## Caveat: a label change affects a distinct alert instance
Remember: **alert instances are defined by their labels**.
If a dynamic label changes between evaluations, this new value affects a separate alert instance.
Here's what happens if `severity` changes from `minor` to `warning`:
1. The instance with `severity="minor"` disappears → it becomes a missing series.
1. A new instance with `severity="warning"` appears → it starts from scratch.
1. After two evaluations without data, the `minor` instance is **resolved and evicted**.
Here’s a sequence example:
| Time | Query value | Instance `severity="none"` | Instance `severity="minor"` | Instance `severity="warning"` |
| :--- | :---------- | :------------------------- | :-------------------------- | :---------------------------- |
| t0 | | | | |
| t1 | 75 | | 🔴 📩 | |
| t2 | 85 | | ⚠ MissingSeries | 🔴 📩 |
| t3 | 85 | | ⚠ MissingSeries | 🔴 |
| t4 | 50 | 🟢 | 📩 Resolved and evicted | ⚠ MissingSeries |
| t5 | 50 | 🟢 | | ⚠ MissingSeries |
| t6 | 50 | 🟢 | | 📩 Resolved and evicted |
Learn more about this behavior in [Stale alert instances](ref:stale-alert-instances).
In this example, the `minor` and `warning` alerts likely represent the same underlying issue, but Grafana treats them as distinct alert instances. As a result, this scenario generates two firing notifications and two resolved notifications, one for each instance.
This behavior is important to keep in mind when dynamic label values change frequently.
It can lead to multiple notifications firing and resolving in short intervals, resulting in **noisy and confusing notifications**.
## Try it with TestData
You can replicate this scenario using the [TestData data source](ref:testdata-data-source) to simulate an unstable signal—like monitoring a noisy sensor.
This setup reproduces label flapping and shows how dynamic label values affect alert instance behavior.
1. Add the **TestData** data source through the **Connections** menu.
1. Create an alert rule.
Navigate to **Alerting** → **Alert rules** and click **New alert rule**.
1. Simulate a query (`$A`) that returns a noisy signal.
Select **TestData** as the data source and configure the scenario.
- Scenario: Random Walk
- Series count: 1
- Start value: 51
- Min: 50, Max: 100
- Spread: 100 (ensures large changes between consecutive data points)
1. Add an expression.
- Type: Reduce
- Input: A
- Function: Last (to get the most recent value)
- Name: B
1. Define the alert condition.
Use a threshold like `$B >= 50` (it always fires).
1. Click **Edit Labels** to add a dynamic label.
Create a new label `severity` and set its value to the following:
```go
{{/* $values.B.Value refers to the numeric result from query B */}}
{{- if gt $values.B.Value 90.0 -}}P1
{{- else if gt $values.B.Value 80.0 -}}P2
{{- else if gt $values.B.Value 70.0 -}}P3
{{- else if gt $values.B.Value 60.0 -}}P4
{{- else if gt $values.B.Value 50.0 -}}P5
{{- else -}}none
{{- end -}}
```
1. Set evaluation behavior.
Set a short evaluation interval (e.g., `10s`) to quickly observe label flapping and alert instance transitions in the history.
1. Preview alert routing to verify the label template.
In **Configure notifications**, toggle **Advanced options**.
Click **Preview routing** and check the value of the `severity` label:
{{< figure src="/media/docs/alerting/example-dynamic-labels-preview-label.png" max-width="750px" caption="Preview routing multiple times to verify how label values change over time." >}}
1. Observe alert state changes.
Click **Save rule and exit**, and open the [alert history view](ref:view-alert-state-history) to see how changes in `severity` affect the state of distinct alert instances.
{{< figure src="/media/docs/alerting/example-dynamic-labels-alert-history-page.png" max-width="750px" caption="You can find multiple transitions over time as the label value fluctuates." >}}
{{< docs/play title="this alert example" url="https://play.grafana.org/alerting/grafana/femr0gkp9vsowe/view" >}}
## Considerations
Dynamic labels let you reuse a single alert rule across multiple escalation scenarios, but they also introduce complexity. When the label value depends on a noisy metric and changes frequently, it can lead to flapping alert instances and excessive notifications.
These alerts often require tuning to stay reliable and benefit from continuous review. To get the most out of this pattern, consider the following:
- **Tune evaluation settings and queries for stability**
Increase the [evaluation interval and pending period](ref:alert-rule-evaluation) to reduce the frequency of state changes. Additionally, consider smoothing metrics with functions like `avg_over_time` to reduce flapping.
- **Use wider threshold bands**
Define broader ranges in your label template logic to prevent label switching caused by small value changes, as shown in the sketch after this list.
- **Disable resolved notifications**
When labels change frequently and alerts resolve quickly, you can reduce the number of notifications by disabling resolved notifications at the contact point.
- **Disable the Missing series evaluations setting**
The [Missing series evaluations setting](ref:stale-alert-instances) (default: 2) defines how many intervals without data are allowed before resolving an instance. Consider disabling it if it's unnecessary for your use case, as it can complicate alert troubleshooting.
- **Preserve context across related alerts**
Ensure alert metadata includes enough information to help correlate related alerts during investigation.
- **Use separate alert rules and static labels when simpler**
In some cases, defining separate rules with static labels may be easier to manage than one complex dynamic rule. This also allows you to customize alert queries for each specific case.
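As an example of wider threshold bands, the following sketch keeps only two severity levels with a broad gap between them, so small fluctuations around a single boundary are less likely to flip the label (the boundaries are illustrative; tune them for your service):
```go
{{/* Sketch: two broad severity bands instead of several narrow ones */}}
{{- if gt $values.B.Value 90.0 -}}critical
{{- else if gt $values.B.Value 60.0 -}}warning
{{- else -}}none
{{- end -}}
```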
## Learn more
Here's a list of additional resources related to this example:
- [Multi-dimensional alerting example](ref:multi-dimensional-example) – Explore how Grafana creates separate alert instances for each unique set of labels.
- [Labels](ref:labels) – Learn about the different types of labels and how they define alert instances.
- [Template labels in alert rules](ref:templating-labels-annotations) – Use templating to set label values dynamically based on query results.
- [Stale alert instances](ref:stale-alert-instances) – Understand how Grafana resolves and removes stale alert instances.
- [Handle missing data](ref:missing-data-guide) – Learn how Grafana distinguishes between missing series and `NoData`.
- [Notification policies and routing](ref:notification-policies) – Create multiple notification policies to route alerts based on label values like `severity` or `team`.
- [Dynamic label example in Grafana Play](https://play.grafana.org/alerting/grafana/femr0gkp9vsowe/view) – View this example in Grafana Play to explore alert instances and state transitions with dynamic labels.

@ -0,0 +1,229 @@
---
canonical: https://grafana.com/docs/grafana/latest/alerting/best-practices/dynamic-thresholds
description: This example shows how to use a distinct threshold value per dimension using multi-dimensional alerts and a Math expression.
keywords:
- grafana
- alerting
- examples
labels:
products:
- cloud
- enterprise
- oss
menuTitle: Examples of dynamic thresholds
title: Example of dynamic thresholds per dimension
weight: 1103
refs:
testdata-data-source:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/datasources/testdata/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/connect-externally-hosted/data-sources/testdata/
math-expression:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/alert-rules/queries-conditions/#math
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/fundamentals/alert-rules/queries-conditions/#math
table-data-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/table-data/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/table-data/
multi-dimensional-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/multi-dimensional-alerts/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/multi-dimensional-alerts/
recording-rules:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/alerting-rules/create-recording-rules/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/alerting-rules/create-recording-rules/
---
# Example of dynamic thresholds per dimension
In Grafana Alerting, each alert rule supports only one condition expression.
That's enough in many cases—most alerts use a fixed numeric threshold like `latency > 3s` or `error_rate > 5%` to determine their state.
As your alerting setup grows, you may find that different targets require different threshold values.
Instead of duplicating alert rules, you can assign a **different threshold value to each target**—while keeping the same condition. This simplifies alert maintenance.
This example shows how to do that using [multi-dimensional alerts](ref:multi-dimensional-example) and a [Math expression](ref:math-expression).
## Example overview
You're monitoring latency across multiple API services. Initially, you want to get alerted if the 95th percentile latency (`p95_api_latency`) exceeds 3 seconds, so your alert rule uses a single static threshold:
```
p95_api_latency > 3
```
But the team quickly finds that some services require stricter thresholds. For example, latency for payment APIs should stay under 1.5s, while background jobs can tolerate up to 5s. The team establishes different thresholds per service:
- `p95_api_latency{service="checkout-api"}`: must stay under `1.5s`.
- `p95_api_latency{service="auth-api"}`: also strict, `1.5s`.
- `p95_api_latency{service="catalog-api"}`: less critical, `3s`.
- `p95_api_latency{service="async-tasks"}`: background jobs can tolerate up to `5s`.
You want to avoid creating one alert rule per service—this is harder to maintain.
In Grafana Alerting, you can define one alert rule that monitors multiple similar components like this scenario. This is called [multi-dimensional alerts](ref:multi-dimensional-example): one alert rule, many alert instances—**one per unique label set**.
But there's an issue: Grafana supports only **one alert condition per rule**.
```
One alert rule
├─ One condition (e.g., $A > 3)
│ └─ Applies to all returned series in $A
│ ├─ {service="checkout-api"}
│ ├─ {service="auth-api"}
│ ├─ {service="catalog-api"}
│ └─ {service="async-tasks"}
```
To evaluate per-service thresholds, you need a distinct threshold value for each returned series.
## Dynamic thresholds using a Math expression
You can create a dynamic alert condition by operating on two queries with a [Math expression](ref:math-expression).
- `$A` for query results (e.g., `p95_api_latency`).
- `$B` for per-service thresholds (from CSV data or another query).
- `$A > $B` is the _Math_ expression that defines the alert condition.
Grafana evaluates the _Math_ expression **per series**, by joining series from `$A` and `$B` based on their shared labels before applying the expression.
Here’s an example of an arithmetic operation:
{{< docs/shared lookup="alerts/math-example.md" source="grafana" version="<GRAFANA_VERSION>" >}}
In practice, you must align your threshold input with the label sets returned by your alert query.
The following table illustrates how a per-service threshold is evaluated in the previous example:
| $A: p95 latency query | $B: threshold value | $C: $A\>$B | State |
| :--------------------------- | :----------------------------- | :--------------------------- | :--------- |
| `{service="checkout-api"} 3` | `{service="checkout-api"} 1.5` | `{service="checkout-api"} 1` | **Firing** |
| `{service="auth-api"} 1` | `{service="auth-api"} 1.5` | `{service="auth-api"} 0` | **Normal** |
| `{service="catalog-api"} 2` | `{service="catalog-api"} 3` | `{service="catalog-api"} 0` | **Normal** |
| `{service="sync-work"} 3` | `{service="sync-work"} 5` | `{service="sync-work"} 0` | **Normal** |
In this example:
- `$A` comes from the `p95_api_latency` query.
- `$B` is manually defined with a threshold value for each series in `$A`.
- The alert condition compares `$A>$B` using a _Math_ relational operator (e.g., `>`, `<`, `>=`, `<=`, `==`, `!=`) that joins series by matching labels.
- Grafana evaluates the alert condition and sets the firing state where the condition is true.
The _Math_ expression works as long as each series in `$A` can be matched with exactly one series in `$B`; the two queries must align to produce a one-to-one match between their series.
{{% admonition type="caution" %}}
If a series in one query doesn’t match any series in the other, it’s excluded from the result and a warning message is displayed:
_1 items **dropped from union(s)**: ["$A > $B": ($B: {service=payment-api})]_
{{% /admonition %}}
**Labels in both series don’t need to be identical**. If the labels of one series are a subset of the other's, the series can join. For example:
- `$A` returns series `{host="web01", job="event"}` 30 and `{host="web02", job="event"}` 20.
- `$B` returns series `{host="web01"}` 10 and `{host="web02"}` 0.
- `$A` + `$B` returns `{host="web01", job="event"}` 40 and `{host="web02", job="event"}` 20.
## Try it with TestData
You can use the [TestData data source](ref:testdata-data-source) to replicate this example:
1. Add the **TestData** data source through the **Connections** menu.
1. Create an alert rule.
Navigate to **Alerting** → **Alert rules** and click **New alert rule**.
1. Simulate a query (`$A`) that returns latencies for each service.
Select **TestData** as the data source and configure the scenario.
- Scenario: Random Walk
- Alias: latency
- Labels: service=api-$seriesIndex
- Series count: 4
- Start value: 1
- Min: 1, Max: 4
This uses `$seriesIndex` to assign unique service labels: `api-0`, `api-1`, etc.
{{< figure src="/media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png" max-width="750px" alt="TestData data source returns 4 series to simulate latencies for distinct API services." >}}
1. Define per-service thresholds with static data.
Add a new query (`$B`) and select **TestData** as the data source.
From **Scenario**, select **CSV Content** and paste this CSV:
```
service,value
api-0,1.5
api-1,1.5
api-2,3
api-3,5
```
The `service` column must match the labels from `$A`.
The `value` column is a numeric value used for the alert comparison.
For details on CSV format requirements, see [table data examples](ref:table-data-example).
1. Add a new **Reduce** expression (`$C`).
- Type: Reduce
- Input: A
- Function: Mean
- Name: C
This calculates the average latency for each service: `api-0`, `api-1`, etc.
1. Add a new **Math** expression.
- Type: Math
- Expression: `$C > $B`
- Set this expression as the **alert condition**.
This fires if the average latency (`$C`) exceeds the threshold from `$B` for any service.
1. **Preview** the alert.
{{< figure src="/media/docs/alerting/example-dynamic-thresholds-preview-v3.png" max-width="750px" caption="Alert preview evaluating multiple series with distinct threshold values" >}}
{{< docs/play title="this alert example" url="https://play.grafana.org/alerting/grafana/demqzyodxrd34e/view" >}}
## Other use cases
This example showed how to build a single alert rule with different thresholds per series using [multi-dimensional alerts](ref:multi-dimensional-example) and [Math expressions](ref:math-expression).
This approach scales well when monitoring similar components with distinct reliability goals.
By aligning series from two queries, you can apply a dynamic threshold—one value per label set—without duplicating rules.
While this example uses static CSV content to define thresholds, the same technique works in other scenarios:
- **Dynamic thresholds from queries or recording rules**: Fetch threshold values from a real-time query, or from [custom recording rules](ref:recording-rules).
- **Combine multiple conditions**: Build more advanced threshold logic by combining multiple conditions—such as latency, error rate, or traffic volume.
For example, you can define a PromQL expression that sets a latency threshold that adjusts based on traffic, allowing higher response times during periods of high load.
```
(
  # Fires when p95 latency > 2s during usual traffic (≤ 1000 req/s)
  service:latency:p95 > 2 and service:request_rate:rate1m <= 1000
)
or
(
  # Fires when p95 latency > 4s during high traffic (> 1000 req/s)
  service:latency:p95 > 4 and service:request_rate:rate1m > 1000
)
```

@ -117,12 +117,13 @@ You can quickly experiment with multi-dimensional alerts using the [**TestData**
1. Select **TestData** as the data source.
1. Configure the TestData scenario
1. Scenario: **Random Walk**
1. Series count: 3
1. Start value: 70, Max: 100
1. Labels: `cpu=cpu-$seriesIndex`
- Scenario: **Random Walk**
- Labels: `cpu=cpu-$seriesIndex`
- Series count: 3
- Min: 70, Max: 100
- Spread: 2
{{< figure src="/media/docs/alerting/testdata-random-series.png" max-width="750px" alt="Generating random time series data using the TestData data source" >}}
{{< figure src="/media/docs/alerting/testdata-random-series-v2.png" max-width="750px" alt="Generating random time series data using the TestData data source" >}}
## Reduce time series data for comparison
@ -146,6 +147,8 @@ For demo purposes, this example uses the **Advanced mode** with a **Reduce** exp
{{< figure src="/media/docs/alerting/using-expressions-with-multiple-series.png" max-width="750px" caption="The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal." alt="Alert preview using a Reduce expression and a threshold condition" >}}
{{< docs/play title="this alert example" url="https://play.grafana.org/alerting/grafana/cemqwfn334npce/view" >}}
## Learn more
This example shows how Grafana Alerting implements a multi-dimensional alerting model (one rule, many alert instances) and why reducing time series data to a single value is required for evaluation.

@ -22,6 +22,9 @@ refs:
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/multi-dimensional-alerts/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/multi-dimensional-alerts/
infinity-csv:
- pattern: /docs/grafana/
destination: /docs/plugins/yesoreyeram-infinity-datasource/latest/csv/
---
# Example of alerting on tabular data
@ -115,7 +118,19 @@ To test this quickly, you can simulate the table using the [**TestData** data so
{{< figure src="/media/docs/alerting/example-table-data-preview.png" max-width="750px" alt="Alert preview with tabular data using the TestData data source" >}}
## **Differences with time series data**
{{< docs/play title="this alert example" url="https://play.grafana.org/alerting/grafana/eemqylh1l8tfkf/view" >}}
## CSV data with Infinity
Note that when the [Infinity plugin fetches CSV data](ref:infinity-csv), all the columns are parsed and returned as strings. By default, this causes the query expression to fail in Alerting.
To make it work, you need to format the CSV data as [expected by Grafana Alerting](#how-grafana-alerting-evaluates-tabular-data).
In the query editor, specify the column names and their types to ensure that only one column is treated as a number.
{{< figure src="/media/docs/alerting/example-table-data-infinity-csv-data.png" max-width="750px" alt="Using the Infinity data source plugin to fetch CSV data in Alerting" >}}
## Differences with time series data
Working with time series is similar—each series is treated as a separate alert instance, based on its label set.

@ -17,6 +17,11 @@ labels:
title: Queries and conditions
weight: 104
refs:
dynamic-threshold-example:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/best-practices/dynamic-thresholds/
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/alerting-and-irm/alerting/best-practices/dynamic-thresholds/
alert-instance:
- pattern: /docs/grafana/
destination: /docs/grafana/<GRAFANA_VERSION>/alerting/fundamentals/#alert-instances
@ -117,9 +122,7 @@ Performs free-form math functions/operations on time series data and numbers. Fo
If queries being compared have **multiple series in their results**, series from different queries are matched (joined) if they have the same labels. For example:
- `$A` returns series `{host=web01} 30` and `{host=web02} 20`
- `$B` returns series `{host=web01} 10` and `{host=web02} 0`
- `$A + $B` returns `{host=web01} 40` and `{host=web02} 20`.
{{< docs/shared lookup="alerts/math-example.md" source="grafana" version="<GRAFANA_VERSION>" >}}
In this case, only series with matching labels are joined, and the operation is calculated between them.
@ -129,6 +132,7 @@ You can also use a Math expression to define the **alert condition**. For exampl
- `$B > 70` should fire if the value of B (query or expression) is more than 70.
- `$B < $C * 100` should fire if the value of B is less than the value of C multiplied by 100.
- Compare matching series from two queries, as shown in the [dynamic threshold example](ref:dynamic-threshold-example).
### Resample

@ -0,0 +1,10 @@
---
labels:
products:
- oss
title: 'Math example'
---
- `$A` returns series `{host="web01"} 30` and `{host="web02"} 20`.
- `$B` returns series `{host="web01"} 10` and `{host="web02"} 0`.
- `$A + $B` returns `{host="web01"} 40` and `{host="web02"} 20`.

@ -5,10 +5,4 @@ labels:
title: 'Note Dynamic labels'
---
{{% admonition type="note" %}}
An alert instance is uniquely identified by its set of labels.
- Avoid displaying query values in labels, as this can create numerous alert instances—one for each distinct label set. Instead, use annotations for query values.
- If a templated label's value changes, it maps to a different alert instance, and the previous instance transitions to the `No data` state when its label value is no longer present.
{{% /admonition %}}