View a history of all alert events generated by your Grafana-managed alert rules.
An alert event is displayed each time an alert instance changes its state over a period of time. All alert events are displayed regardless of whether silences or mute timings are set, so you’ll see a complete set of your data history even if you’re not necessarily being notified.
{{< admonition type="note" >}}
Grafana OSS and Grafana Enterprise users must [configure alert state history in Loki](/docs/grafana/<GRAFANA_VERSION>/alerting/set-up/configure-alert-state-history/) to view the **History page** and **State history view**.
{{< /admonition >}}
## View from the History page
The History page shows the history and state changes of all Grafana-managed alert rules. You can filter by labels and alert states.

Users can only view the history and state transitions of alert rules they have permission to access (RBAC).

{{< admonition type="note" >}}
For Grafana OSS and Grafana Enterprise users, this feature is available starting with Grafana 11.2.

To try out the new alert history page, enable the [`alertingCentralAlertHistory`](/docs/grafana/<GRAFANA_VERSION>/setup-grafana/configure-grafana/feature-toggles/) feature toggle and configure [alert state history in Loki](https://grafana.com/docs/grafana/<GRAFANA_VERSION>/alerting/set-up/configure-alert-state-history/).
{{< /admonition >}}

To access the History page, complete the following steps.
1. Navigate to **Alerts & IRM** -> **Alerting** -> **History**.
## View from the State history view

Use the State history view to get insight into how your individual alert instances have changed state over time.
View information on when a state change occurred, the previous and current state, any other alert instances that changed their state at the same time, and the query value that triggered the change.
{{< admonition type="note" >}}
Grafana OSS and Grafana Enterprise users must [configure alert state history](/docs/grafana/<GRAFANA_VERSION>/alerting/set-up/configure-alert-state-history/) to access this view.
{{< /admonition >}}
To access the State history view, complete the following steps.
Alerting can record all alert rule state changes for your Grafana-managed alert rules in a Loki or Prometheus instance, or in both.
This allows you to explore the behavior of your alert rules over time and enhances the existing State history view:
- With Prometheus, you can query the `GRAFANA_ALERTS` metric for alert state changes in **Grafana Explore**.
- With Loki, you can query and view alert state changes in **Grafana Explore** and the [Grafana Alerting History views](/docs/grafana/<GRAFANA_VERSION>/alerting/monitor-status/view-alert-state-history/).
## Configure Loki for alert state
To set up alert state history, make sure to have a Loki instance Grafana can write data to. The following steps describe a basic configuration:

1. **Configure Loki**

   The default Loki settings might need some tweaking, as the state history view might query up to 30 days of data.

   The following change to the default configuration should work for most instances, but review the full Loki configuration settings and adjust them according to your needs.

   ```yaml
   limits_config:
     split_queries_by_interval: '24h'
     max_query_parallelism: 32
   ```

   As this might impact the performance of an existing Loki instance, use a separate Loki instance for the alert state history.
1. **Configure Grafana**
   The following Grafana configuration instructs Alerting to write alert state history to a Loki instance:

   ```toml
   [unified_alerting.state_history]
   enabled = true
   backend = loki
   # The URL of the Loki server
   loki_remote_url = http://localhost:3100
   ```
1. **Configure the Loki data source in Grafana**
   Add the [Loki data source](/docs/grafana/<GRAFANA_VERSION>/datasources/loki/) to Grafana.

If everything is set up correctly, you can access the [History page and State history view](/docs/grafana/<GRAFANA_VERSION>/alerting/monitor-status/view-alert-state-history/) to view and filter alert state history. You can also use **Grafana Explore** to query the Loki instance. See [Alerting Meta monitoring](/docs/grafana/<GRAFANA_VERSION>/alerting/monitor/) for details.
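As a quick check that state history data is being written to Loki, you can run a query like the following in **Grafana Explore**. This is a minimal sketch; it assumes the state history log stream carries a `from="state-history"` label, so adjust the selector to the labels present in your instance.

```logql
{from="state-history"} | json
```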
## Configure Prometheus for alert state (GRAFANA_ALERTS metric)
You can also configure a Prometheus instance to store alert state changes for your Grafana-managed alert rules. However, unlike Loki, this setup does not enable the **Grafana Alerting History views**.
Instead, Grafana Alerting writes alert state data to the `GRAFANA_ALERTS` metric, similar to how Prometheus Alerting writes to the `ALERTS` metric.
The following steps describe a basic configuration:

1. **Configure Prometheus**

   Enable the remote write receiver in your Prometheus instance by setting the `--web.enable-remote-write-receiver` command-line flag. This enables the endpoint to receive alert state data from Grafana Alerting.
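   For example, a minimal way to start Prometheus with the receiver enabled might look like the following. The configuration file path is an assumption; use your own.

   ```bash
   # --web.enable-remote-write-receiver exposes the endpoint that Grafana Alerting writes alert state data to
   prometheus --config.file=/etc/prometheus/prometheus.yml --web.enable-remote-write-receiver
   ```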
1. **Configure the Prometheus data source in Grafana**

   Add the [Prometheus data source](/docs/grafana/<GRAFANA_VERSION>/datasources/prometheus/) to Grafana.

   In the [Prometheus data source configuration options](/docs/grafana/<GRAFANA_VERSION>/datasources/prometheus/configure/), set the **Prometheus type** to match your Prometheus instance type. Grafana Alerting uses this option to identify the remote write endpoint.
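   If you provision data sources from YAML instead of the UI, a minimal sketch might look like the following. The `jsonData.prometheusType` key and the URL are assumptions here; check the provisioning reference for your Grafana version.

   ```yaml
   apiVersion: 1
   datasources:
     - name: Prometheus
       type: prometheus
       # Assumed local Prometheus URL; replace with your instance.
       url: http://localhost:9090
       jsonData:
         # Assumed key for the Prometheus type option (for example Prometheus, Mimir, or Thanos).
         prometheusType: Prometheus
   ```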
1. **Configure Grafana**
   The following Grafana configuration instructs Alerting to write alert state history to a Prometheus instance:

   ```toml
   [unified_alerting.state_history]
   enabled = true
   backend = prometheus
   # Target data source UID for writing alert state changes.
   # (Optional) Metric name for the alert state metric. Default is "GRAFANA_ALERTS".
   # prometheus_metric_name = GRAFANA_ALERTS
   # (Optional) Timeout for writing alert state data to the target data source. Default is 10s.
   # prometheus_write_timeout = 10s
   ```
You can then use **Grafana Explore** to query the alert state metric. For details, refer to [Alerting Meta monitoring](/docs/grafana/<GRAFANA_VERSION>/alerting/monitor/).
```promql
GRAFANA_ALERTS{alertstate='firing'}
```
## Configure Loki and Prometheus for alert state

You can also configure both Loki and Prometheus to record alert state changes for your Grafana-managed alert rules.

Start with the same setup steps as shown in the previous [Loki](#configure-loki-for-alert-state) and [Prometheus](#configure-prometheus-for-alert-state-grafana_alerts-metric) sections. Then, adjust your Grafana configuration as follows:

```toml
[unified_alerting.state_history]
enabled = true
backend = multiple
primary = loki
# URL of the Loki server.
loki_remote_url = http://localhost:3100
secondaries = prometheus
# Target data source UID for writing alert state changes.
```
You can use meta-monitoring metrics to understand the health of your alerting system.
## Metrics for Grafana-managed alerts
To meta monitor Grafana-managed alerts, you can collect two types of metrics in a Prometheus instance:

- **State history metric (`GRAFANA_ALERTS`)**: Exported by Grafana Alerting as part of alert state history.
- **Scraped metrics**: Exported by Grafana's `/metrics` endpoint to monitor alerting activity and performance.

You need a Prometheus-compatible server to collect and store these metrics.
### `GRAFANA_ALERTS` metric
If you have configured [Prometheus for alert state history](/docs/grafana/<GRAFANA_VERSION>/alerting/set-up/configure-alert-state-history/), Grafana writes alert state changes to the `GRAFANA_ALERTS` metric.
This `GRAFANA_ALERTS` metric is compatible with the `ALERTS` metric used by Prometheus Alerting and includes two additional labels:
1. A new `grafana_rule_uid` label for the UID of the Grafana rule.
2. A new `grafana_alertstate` label for the Grafana alert state, which differs slightly from the equivalent Prometheus state included in the `alertstate` label.
| Grafana state | `alertstate` | `grafana_alertstate` |
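For example, you can query the metric together with these labels in **Grafana Explore** or Prometheus. The rule UID below is an illustrative placeholder.

```promql
GRAFANA_ALERTS{alertstate="firing", grafana_rule_uid="<rule UID>"}
```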
### Scraped metrics

To collect scraped Alerting metrics, configure Prometheus to scrape metrics from Grafana.
```yaml
- job_name: grafana
  honor_timestamps: true
  metrics_path: /metrics
  static_configs:
    - targets:
        - grafana:3000
```
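To confirm that the scrape job is working, a quick check is to query the standard `up` metric for the job name used above:

```promql
up{job="grafana"}
```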
### List of available metrics
The Grafana ruler, which is responsible for evaluating alert rules, and the Grafana Alertmanager, which is responsible for sending notifications of firing and resolved alerts, provide a number of metrics that let you observe them.
#### grafana_alerting_alerts
This metric is a histogram that shows you the number of seconds taken to send notifications for firing and resolved alerts. This metric lets you observe slow or over-utilized integrations, such as an SMTP server that is being given emails faster than it can send them.
## Logs for Grafana-managed alerts
If you have configured [Loki for alert state history](/docs/grafana/<GRAFANA_VERSION>/alerting/set-up/configure-alert-state-history/), logs related to state changes in Grafana-managed alerts are stored in the Loki data source.
You can use **Grafana Explore** and the Loki query editor to search for alert state changes.
In the **Logs** view, you can review details for individual alerts by selecting fields such as:
- `previous`: previous alert instance state.
- `current`: current alert instance state.
- `ruleTitle`: alert rule title.
- `ruleID` and `ruleUID`.
- `labels_alertname`, `labels_new_label`, and `labels_grafana_folder`.
- Additional available fields.
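For example, a query along these lines filters for state changes whose current state is `Alerting`. It's a sketch: it assumes the state history stream carries a `from="state-history"` label and that the JSON fields match those listed above.

```logql
{from="state-history"} | json | current="Alerting"
```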
Alternatively, you can access the [History page](/docs/grafana/<GRAFANA_VERSION>/alerting/monitor-status/view-alert-state-history/) in Grafana to visualize and filter state changes for individual alerts or all alerts.
## Metrics for Mimir-managed alerts
To meta monitor Grafana Mimir-managed alerts, open source and on-premise users need a Prometheus/Mimir server, or another metrics database to collect and store metrics exported by the Mimir ruler.
## Metrics for the Alertmanager

To meta monitor the Alertmanager, you need a Prometheus/Mimir server, or another metrics database to collect and store metrics exported by the Alertmanager.
For example, if you are using Prometheus, add a `scrape_config` to Prometheus to scrape metrics from your Alertmanager.
### Example
```yaml
- job_name: alertmanager
  honor_timestamps: true
  metrics_path: /metrics
  static_configs:
    - targets:
        - alertmanager:9093
```
### List of available metrics
The following is a list of available metrics for Alertmanager.