Loki cloud integration instructions (and necessary mixin changes) (#8492)

**What this PR does / why we need it**:

This PR is a first stab at documentation for setting up the Grafana
Cloud Loki integration to monitor a self-hosted Loki cluster installed
in Kubernetes with the Helm chart.

In addition to instructions on how to collect the necessary Kubernetes
metrics, there were also some issues with the Loki mixin (which provides
the dashboards, alerts, and rules for the integration) that needed to be
fixed. For example, our dashboards rely on a `cluster` label being
present on our recording rules, yet our mixin does not add one. We were
also still including `cortex-gw` panels even when the mixin is compiled
with `internal_components: false`.

**Special notes for your reviewer**:

The changes to the mixin will not be reflected in the current version of
the integration. We need to merge this PR first and then update the
integration.

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`

---------

Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
Branch: pull/8636/head
Trevor Whitney committed (via GitHub)
parent 9f8aa4b98a, commit dac3b84d08
Changed files (lines changed):
  1. docs/sources/installation/helm/monitor-and-alert/_index.md (19)
  2. docs/sources/installation/helm/monitor-and-alert/with-grafana-cloud.md (100)
  3. docs/sources/installation/helm/monitor-and-alert/with-local-monitoring.md (29)
  4. docs/sources/installation/helm/reference.md (4)
  5. production/helm/loki/values.yaml (4)
  6. production/loki-mixin-compiled-ssd/dashboards/loki-operational.json (106)
  7. production/loki-mixin-compiled/dashboards/loki-operational.json (10)
  8. production/loki-mixin/config.libsonnet (5)
  9. production/loki-mixin/dashboards/loki-operational.libsonnet (37)
  10. production/loki-mixin/mixin-ssd.libsonnet (7)
  11. tools/dev/k3d/Makefile (12)
  12. tools/dev/k3d/environments/helm-cluster/empty.jsonnet (34)
  13. tools/dev/k3d/environments/helm-cluster/spec.json (2)
  14. tools/dev/k3d/environments/helm-cluster/values/enterprise-logs-cloud-monitoring.yaml (43)
  15. tools/dev/k3d/environments/helm-cluster/values/kube-state-metrics.yaml (1)

@@ -0,0 +1,19 @@
---
title: Monitoring
description: Monitoring Loki installed with the Helm chart
weight: 200
aliases:
- /docs/installation/helm/monitor-and-alert
keywords:
- helm
- scalable
- simple-scalable
- monitor
---
# Monitoring
There are two common ways to monitor Loki:
- [Monitor using Grafana Cloud (recommended)]({{<relref "with-grafana-cloud">}})
- [Monitor using local monitoring]({{<relref "with-local-monitoring">}})

@@ -0,0 +1,100 @@
---
title: Configure Monitoring and Alerting of Loki Using Grafana Cloud
menuTitle: Monitor Loki with Grafana Cloud
description: Set up monitoring and alerts for Loki using Grafana Cloud
aliases:
- /docs/installation/helm/monitoring/with-grafana-cloud
weight: 100
keywords:
- monitoring
- alert
- alerting
- grafana cloud
---
# Configure Monitoring and Alerting of Loki Using Grafana Cloud
This topic will walk you through using Grafana Cloud to monitor a Loki installation deployed with the Helm chart. This approach leverages many of the chart's _self monitoring_ features, but instead of sending logs back to Loki itself, it sends them to a Grafana Cloud Logs instance. It also does not require the installation of the Prometheus Operator, and instead sends metrics to a Grafana Cloud Metrics instance. Using Grafana Cloud to monitor Loki has the added benefit that you can troubleshoot problems with Loki even when the Helm-installed Loki is down, as the logs will still be available in the Grafana Cloud Logs instance.
**Before you begin:**
- Helm 3 or above. See [Installing Helm](https://helm.sh/docs/intro/install/).
- A Grafana Cloud account and stack (including Cloud Grafana, Cloud Metrics, and Cloud Logs)
- [Grafana Kubernetes Monitoring using Agent](/docs/grafana-cloud/kubernetes-monitoring/configuration/config-k8s-agent-guide/) configured for the Kubernetes cluster
- A running Loki deployment installed in that Kubernetes cluster via the Helm chart
**Prerequisites for monitoring Loki:**
You must set up the Grafana Kubernetes integration by following the instructions in [Grafana Kubernetes Monitoring using Agent](/docs/grafana-cloud/kubernetes-monitoring/configuration/config-k8s-agent-guide/), as this installs the components necessary for collecting metrics about your Kubernetes cluster and sending them to Grafana Cloud. Many of the dashboards installed as part of the Loki integration rely on these metrics.
Walking through this installation will create two Grafana Agent configurations, one for metrics and one for logs, that add the external label `cluster: cloud`. In order for the dashboards in the self-hosted Grafana Loki integration to work, the cluster name needs to match your Helm installation name. If you installed Loki using the command `helm install best-loki-cluster grafana/loki`, you would need to change the `cluster` value in both Grafana Agent configurations from `cloud` to `best-loki-cluster` when setting up the Grafana Kubernetes integration, as in the sketch below.
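A minimal sketch of that change in the Agent's metrics configuration (the logs configuration takes the same `external_labels` block; `best-loki-cluster` is just the example release name from above):
```yaml
metrics:
  global:
    external_labels:
      # Must match the Helm release name instead of the default "cloud"
      cluster: best-loki-cluster
```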
**To set up the Loki integration in Grafana Cloud:**
1. Get valid Push credentials for your Cloud Metrics and Cloud Logs instances.
1. Create a secret in the same namespace as Loki to store your Cloud Logs credentials.
```bash
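# The data values must be base64-encoded, e.g. generate them with: echo -n "$CLOUD_LOGS_USERNAME" | base64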
cat <<'EOF' | NAMESPACE=loki /bin/sh -c 'kubectl apply -n $NAMESPACE -f -'
apiVersion: v1
data:
  password: <BASE64_ENCODED_CLOUD_LOGS_PASSWORD>
  username: <BASE64_ENCODED_CLOUD_LOGS_USERNAME>
kind: Secret
metadata:
  name: grafana-cloud-logs-credentials
type: Opaque
EOF
```
1. Create a secret to store your Cloud Metrics credentials.
```bash
cat <<'EOF' | NAMESPACE=loki /bin/sh -c 'kubectl apply -n $NAMESPACE -f -'
apiVersion: v1
data:
  password: <BASE64_ENCODED_CLOUD_METRICS_PASSWORD>
  username: <BASE64_ENCODED_CLOUD_METRICS_USERNAME>
kind: Secret
metadata:
  name: grafana-cloud-metrics-credentials
type: Opaque
EOF
```
1. Enable monitoring of the Loki installation and send the resulting metrics and logs to your Grafana Cloud instances by adding the following to your Helm `values.yaml` file (the `helm upgrade` command to apply it is shown after these steps):
```yaml
---
monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: false
  selfMonitoring:
    logsInstance:
      clients:
        - url: <CLOUD_LOGS_URL>
          basicAuth:
            username:
              name: grafana-cloud-logs-credentials
              key: username
            password:
              name: grafana-cloud-logs-credentials
              key: password
  serviceMonitor:
    metricsInstance:
      remoteWrite:
        - url: <CLOUD_METRICS_URL>
          basicAuth:
            username:
              name: grafana-cloud-metrics-credentials
              key: username
            password:
              name: grafana-cloud-metrics-credentials
              key: password
```
1. Install the self-hosted Grafana Loki integration by going to your hosted Grafana instance, clicking the lightning bolt icon labeled **Integrations and Connections**, and then searching for and installing the **Self-hosted Grafana Loki** integration.
1. Once the self-hosted Grafana Loki integration is installed, click the **View Dashboards** button to see the installed dashboards.
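To apply the updated `values.yaml`, a sketch of the upgrade command, assuming the release from the earlier example (`best-loki-cluster`) installed in the `loki` namespace:
```bash
helm upgrade best-loki-cluster grafana/loki -n loki --values values.yaml
```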

@@ -3,7 +3,7 @@ title: Configure monitoring and alerting
menuTitle: Configure monitoring and alerting
description: Set up monitoring and alerts for the Helm chart
aliases:
- /docs/installation/helm/monitoring
- /docs/installation/helm/monitoring/with-local-monitoring
weight: 100
keywords:
- monitoring
@@ -13,9 +13,9 @@ keywords:
# Configure monitoring and alerting
By default this Helm Chart configures meta-monitoring of metrics (service monitoring) and logs (self monitoring). This topic will walk you through configuring monitoring using a monitoring solution that is local to the same cluster where Loki is installed.
The `ServiceMonitor` resource works with either the Prometheus Operator or the Grafana Agent Operator, and defines how Loki's metrics should be scraped. Scraping this Loki cluster using the scrape config defined in the `ServiceMonitor` resource is required for the included dashboards to work. A `MetricsInstance` can be configured to write the metrics to a remote Prometheus instance such as Grafana Cloud Metrics.
_Self monitoring_ is enabled by default. This will deploy a `GrafanaAgent`, `LogsInstance`, and `PodLogs` resource which will instruct the Grafana Agent Operator (installed separately) on how to scrape this Loki cluster's logs and send them back to itself. Scraping this Loki cluster using the scrape config defined in the `PodLogs` resource is required for the included dashboards to work.
@@ -80,25 +80,6 @@ prometheus:
targetLabel: cluster
```
In order to make sure the Prometheus Operator discovers the `ServiceMonitor` resources deployed by the `loki` chart, you will need to make sure those resources have the labels the Prometheus Operator is configured to look for. By default this is the key-value label pair `release: prometheus`. Make sure the Loki `ServiceMonitor`s have this label by adding the following to the `values.yaml` for your Loki helm chart:
```yaml
monitoring:
  serviceMonitor:
    labels:
      release: prometheus
```
This is also true for the `PrometheusRule` resource deployed by the Helm chart, which, in addition to a label, needs to be in the same namespace as the Prometheus Operator. For example, if you installed the Prometheus Operator in the `monitoring` namespace, you would also need to add the following to the `values.yaml` for your Loki helm chart to ensure recording rules are properly discovered:
```yaml
monitoring:
  rules:
    namespace: monitoring
    labels:
      release: prometheus
```
The `kube-prometheus-stack` installs `ServiceMonitor` and `PrometheusRule` resources for monitoring Kubernetes, and it depends on the `kube-state-metrics` and `prometheus-node-exporter` helm charts, which also install `ServiceMonitor` resources for collecting `kubelet` and `node-exporter` metrics. The above values file adds the additional labels required for these metrics to work with the included dashboards.
If you are using this helm chart in an environment which does not allow for the installation of `kube-prometheus-stack` or custom CRDs, you should run `helm template` on the `kube-prometheus-stack` helm chart with the above values file, and review all generated `ServiceMonitor` and `PrometheusRule` resources. These resources may have to be modified with the correct ports and selectors to find the various services such as `kubelet` and `node-exporter` in your environment.
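A sketch of that review step (the chart reference and values file name are illustrative):
```bash
helm template kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --values kube-prometheus-stack-values.yaml > rendered.yaml
# Inspect the ServiceMonitor and PrometheusRule resources in rendered.yaml and
# adjust ports/selectors for kubelet and node-exporter as needed.
```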
@@ -131,7 +112,7 @@ If you are using this helm chart in an environment which does not allow for the
type: file
```
**To add additional Prometheus rules:**
1. Modify the configuration file `values.yaml`:
@@ -229,4 +210,4 @@ If you are using this helm chart in an environment which does not allow for the
enabled: false
```
5. Install the `Loki meta-monitoring` connection on Grafana Cloud.

@@ -2404,9 +2404,9 @@ true
<tr>
<td>monitoring.serviceMonitor.interval</td>
<td>string</td>
<td>ServiceMonitor scrape interval. Default is 15s because the included recording rules use a 1m rate, and the scrape interval must be at most 1/4 of the rate interval (15s = 1m / 4).</td>
<td><pre lang="json">
"15s"
</pre>
</td>
</tr>

@@ -550,7 +550,9 @@ monitoring:
    # -- Additional ServiceMonitor labels
    labels: {}
    # -- ServiceMonitor scrape interval
    # Default is 15s because the included recording rules use a 1m rate, and the scrape interval
    # must be at most 1/4 of the rate interval (15s = 1m / 4).
    interval: 15s
    # -- ServiceMonitor scrape timeout in Go duration format (e.g. 15s)
    scrapeTimeout: null
    # -- ServiceMonitor relabel configs to apply to samples before scraping

@@ -87,7 +87,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_query|api_prom_label|api_prom_label_name_values|loki_api_v1_query|loki_api_v1_query_range|loki_api_v1_label|loki_api_v1_label_name_values\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\")\n)",
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-read\", route=~\"api_prom_query|api_prom_label|api_prom_label_name_values|loki_api_v1_query|loki_api_v1_query_range|loki_api_v1_label|loki_api_v1_label_name_values\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\")\n)",
"legendFormat": "{{status}}",
"refId": "A"
}
@@ -183,7 +183,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\"))",
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"($namespace)/(loki|enterprise-logs)-write\", route=~\"api_prom_push|loki_api_v1_push\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\"))",
"legendFormat": "{{status}}",
"refId": "A"
}
@@ -229,102 +229,6 @@
"alignLevel": null
}
},
{
"aliasColors": { },
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"fieldConfig": {
"defaults": {
"custom": { }
},
"overrides": [ ]
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 5,
"w": 4,
"x": 8,
"y": 1
},
"hiddenSeries": false,
"id": 11,
"legend": {
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"show": false,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"nullPointMode": "null",
"options": {
"dataLinks": [ ]
},
"panels": [ ],
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [ ],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "topk(5, sum by (name,level) (rate(promtail_custom_bad_words_total{cluster=\"$cluster\", exported_namespace=\"$namespace\"}[$__interval])) - \nsum by (name,level) (rate(promtail_custom_bad_words_total{cluster=\"$cluster\", exported_namespace=\"$namespace\"}[$__interval] offset 1h)))",
"legendFormat": "{{name}}-{{level}}",
"refId": "A"
}
],
"thresholds": [ ],
"timeFrom": null,
"timeRegions": [ ],
"timeShift": null,
"title": "Bad Words",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": [ ]
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": { },
"bars": false,
@@ -662,17 +566,17 @@
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.99, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/(loki|enterprise-logs)-write\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".99",
"refId": "A"
},
{
"expr": "histogram_quantile(0.75, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.75, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/(loki|enterprise-logs)-write\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".9",
"refId": "B"
},
{
"expr": "histogram_quantile(0.5, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.5, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/(loki|enterprise-logs)-write\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".5",
"refId": "C"
}

@@ -87,7 +87,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_query|api_prom_label|api_prom_label_name_values|loki_api_v1_query|loki_api_v1_query_range|loki_api_v1_label|loki_api_v1_label_name_values\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\")\n)",
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"($namespace)/query-frontend\", route=~\"api_prom_query|api_prom_label|api_prom_label_name_values|loki_api_v1_query|loki_api_v1_query_range|loki_api_v1_label|loki_api_v1_label_name_values\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\")\n)",
"legendFormat": "{{status}}",
"refId": "A"
}
@@ -183,7 +183,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\"))",
"expr": "sum by (status) (\nlabel_replace(\n label_replace(\n rate(loki_request_duration_seconds_count{cluster=\"$cluster\", job=~\"($namespace)/distributor\", route=~\"api_prom_push|loki_api_v1_push\"}[5m]),\n \"status\", \"${1}xx\", \"status_code\", \"([0-9])..\"),\n\"status\", \"${1}\", \"status_code\", \"([a-z]+)\"))",
"legendFormat": "{{status}}",
"refId": "A"
}
@@ -662,17 +662,17 @@
"steppedLine": false,
"targets": [
{
"expr": "histogram_quantile(0.99, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.99, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/distributor\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".99",
"refId": "A"
},
{
"expr": "histogram_quantile(0.75, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.75, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/distributor\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".9",
"refId": "B"
},
{
"expr": "histogram_quantile(0.5, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"$namespace/cortex-gw(-internal)?\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"expr": "histogram_quantile(0.5, sum by (le) (job_route:loki_request_duration_seconds_bucket:sum_rate{job=~\"($namespace)/distributor\", route=~\"api_prom_push|loki_api_v1_push\", cluster=~\"$cluster\"})) * 1e3",
"legendFormat": ".5",
"refId": "C"
}

@@ -15,6 +15,11 @@
// Enable dashboard and panels for Grafana Labs internal components.
internal_components: false,
promtail: {
// Whether or not to include promtail-specific dashboards
enabled: true,
},
// SSD related configuration for dashboards.
ssd: {
// Support Loki SSD mode on dashboards.

@@ -19,11 +19,16 @@ local utils = import 'mixin-utils/utils.libsonnet';
'Ingester',
],
hiddenPanels:: if $._config.promtail.enabled then [] else [
'Bad Words',
],
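// Component job matchers; when SSD mode is enabled, the read-path and write-path
// components collapse onto the shared '-read'/'-write' job names.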
jobMatchers:: {
cortexgateway: [utils.selector.re('job', '($namespace)/cortex-gw(-internal)?')],
distributor: [utils.selector.re('job', '($namespace)/%s' % (if $._config.ssd.enabled then '%s-write' % $._config.ssd.pod_prefix_matcher else 'distributor'))],
ingester: [utils.selector.re('job', '($namespace)/%s' % (if $._config.ssd.enabled then '%s-write' % $._config.ssd.pod_prefix_matcher else 'ingester.*'))],
querier: [utils.selector.re('job', '($namespace)/%s' % (if $._config.ssd.enabled then '%s-read' % $._config.ssd.pod_prefix_matcher else 'querier'))],
queryFrontend: [utils.selector.re('job', '($namespace)/%s' % (if $._config.ssd.enabled then '%s-read' % $._config.ssd.pod_prefix_matcher else 'query-frontend'))],
},
podMatchers:: {
@@ -136,6 +141,7 @@ local utils = import 'mixin-utils/utils.libsonnet';
std.rstripChars(matcherStr('querier'), ',')
),
local replaceAllMatchers(expr) =
replaceMatchers(replaceClusterMatchers(expr)),
@@ -147,12 +153,33 @@ local utils = import 'mixin-utils/utils.libsonnet';
local isRowHidden(row) =
std.member(dashboards['loki-operational.json'].hiddenRows, row),
local isPanelHidden(panelTitle) =
std.member(dashboards['loki-operational.json'].hiddenPanels, panelTitle),
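// When internal components are excluded (internal_components: false), rewrite the
// cortex-gw job matcher to the user-facing component that serves the same traffic;
// see removeInternalComponents below for the per-panel mapping.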
local replaceCortexGateway(expr, replacement) = if $._config.internal_components then
expr
else
std.strReplace(
expr,
'job=~"$namespace/cortex-gw(-internal)?"',
matcherStr(replacement, matcher='job', sep='')
),
local removeInternalComponents(title, expr) = if (title == 'Queries/Second') then
replaceCortexGateway(expr, 'queryFrontend')
else if (title == 'Pushes/Second') then
replaceCortexGateway(expr, 'distributor')
else if (title == 'Push Latency') then
replaceCortexGateway(expr, 'distributor')
else
replaceAllMatchers(expr),
panels: [
p {
datasource: selectDatasource(super.datasource),
targets: if std.objectHas(p, 'targets') then [
e {
expr: removeInternalComponents(p.title, e.expr),
}
for e in p.targets
] else [],
@@ -161,7 +188,7 @@ local utils = import 'mixin-utils/utils.libsonnet';
datasource: selectDatasource(super.datasource),
targets: if std.objectHas(sp, 'targets') then [
e {
expr: removeInternalComponents(p.title, e.expr),
}
for e in sp.targets
] else [],
@@ -170,15 +197,17 @@ local utils = import 'mixin-utils/utils.libsonnet';
datasource: selectDatasource(super.datasource),
targets: if std.objectHas(ssp, 'targets') then [
e {
expr: removeInternalComponents(p.title, e.expr),
}
for e in ssp.targets
] else [],
}
for ssp in sp.panels
if !(isPanelHidden(ssp.title))
] else [],
}
for sp in p.panels
if !(isPanelHidden(sp.title))
] else [],
title: if !($._config.ssd.enabled && p.type == 'row') then p.title else
if p.title == 'Distributor' then 'Write Path'
@@ -186,7 +215,7 @@ local utils = import 'mixin-utils/utils.libsonnet';
else p.title,
}
for p in super.panels
if !(p.type == 'row' && isRowHidden(p.title)) && !(isPanelHidden(p.title))
],
} +
$.dashboard('Loki / Operational', uid='operational')

@@ -4,6 +4,13 @@
grafanaDashboardFolder: 'Loki SSD',
_config+:: {
internal_components: false,
// By default the helm chart uses the Grafana Agent instead of promtail
promtail+: {
enabled: false,
},
ssd+: {
enabled: true,
},

@@ -57,6 +57,9 @@ apply-enterprise-helm-cluster:

apply-loki-helm-cluster:
	tk apply --ext-code enterprise=false environments/helm-cluster

apply-empty-helm-cluster:
	tk apply --ext-code enterprise=false environments/helm-cluster/empty.jsonnet

down:
	k3d cluster delete helm-cluster
@@ -151,3 +154,12 @@ helm-upgrade-loki-ha-single-binary:

helm-uninstall-loki-binary:
	$(HELM) uninstall loki-single-binary -n loki

helm-install-kube-state-metrics:
	helm install kube-state-metrics --create-namespace --values "$(CURDIR)/environments/helm-cluster/values/kube-state-metrics.yaml"

helm-install-enterprise-logs-cloud-monitoring:
	helm install enterprise-logs-test-fixture "$(HELM_DIR)" -n loki --create-namespace --values "$(CURDIR)/environments/helm-cluster/values/enterprise-logs-cloud-monitoring.yaml"

helm-upgrade-enterprise-logs-cloud-monitoring:
	helm upgrade enterprise-logs-test-fixture "$(HELM_DIR)" -n loki --values "$(CURDIR)/environments/helm-cluster/values/enterprise-logs-cloud-monitoring.yaml"

@@ -0,0 +1,34 @@
local k = import 'github.com/grafana/jsonnet-libs/ksonnet-util/kausal.libsonnet';
local tanka = import 'github.com/grafana/jsonnet-libs/tanka-util/main.libsonnet';
local configMap = k.core.v1.configMap;
local spec = (import './spec.json').spec;
{
  _config+:: {
    namespace: spec.namespace,
  },

  lokiNamespace: k.core.v1.namespace.new('loki'),

  gelLicenseSecret: k.core.v1.secret.new('gel-license', {}, type='Opaque')
    + k.core.v1.secret.withStringData({
      'license.jwt': importstr '../../secrets/gel.jwt',
    })
    + k.core.v1.secret.metadata.withNamespace('loki'),

  local grafanaCloudCredentials = import '../../secrets/grafana-cloud-credentials.json',

  grafanaCloudMetricsCredentials: k.core.v1.secret.new('grafana-cloud-metrics-credentials', {}, type='Opaque')
    + k.core.v1.secret.withStringData({
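      // Grafana Cloud usernames are numeric instance IDs, so they are formatted
      // as strings here (and for the logs secret below).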
      username: '%d' % grafanaCloudCredentials.metrics.username,
      password: grafanaCloudCredentials.metrics.password,
    })
    + k.core.v1.secret.metadata.withNamespace('loki'),

  grafanaCloudLogsCredentials: k.core.v1.secret.new('grafana-cloud-logs-credentials', {}, type='Opaque')
    + k.core.v1.secret.withStringData({
      username: '%d' % grafanaCloudCredentials.logs.username,
      password: grafanaCloudCredentials.logs.password,
    })
    + k.core.v1.secret.metadata.withNamespace('loki'),
}

@@ -6,7 +6,7 @@
"namespace": "environments/helm-cluster/main.jsonnet"
},
"spec": {
"apiServer": "https://0.0.0.0:38539",
"apiServer": "https://0.0.0.0:33931",
"namespace": "k3d-helm-cluster",
"resourceDefaults": {},
"expectVersions": {}

@@ -0,0 +1,43 @@
---
loki:
  querier:
    multi_tenant_queries_enabled: true
enterprise:
  enabled: true
  adminToken:
    secret: "gel-admin-token"
  useExternalLicense: true
  externalLicenseName: gel-license
  provisioner:
    provisionedSecretPrefix: "provisioned-secret"
monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: false
  selfMonitoring:
    tenant:
      name: loki
    logsInstance:
      clients:
        - url: https://logs-prod-us-central1.grafana.net/loki/api/v1/push
          basicAuth:
            username:
              name: grafana-cloud-logs-credentials
              key: username
            password:
              name: grafana-cloud-logs-credentials
              key: password
  serviceMonitor:
    metricsInstance:
      remoteWrite:
        - url: https://prometheus-blocks-prod-us-central1.grafana.net/api/prom/push
          basicAuth:
            username:
              name: grafana-cloud-metrics-credentials
              key: username
            password:
              name: grafana-cloud-metrics-credentials
              key: password
minio:
  enabled: true