feat: Update Loki monitoring docs to new meta monitoring helm (#13176)

Co-authored-by: J Stickler <julie.stickler@grafana.com>
pull/13182/head
Jay Clifford 2 years ago committed by GitHub
parent a08ee68dea
commit b4d44f89f9
  1. 298
      docs/sources/setup/install/helm/monitor-and-alert/with-grafana-cloud.md
  2. 341
      docs/sources/setup/install/helm/monitor-and-alert/with-local-monitoring.md

@ -1,7 +1,7 @@
---
title: Configure monitoring and alerting of Loki using Grafana Cloud
title: Monitor Loki with Grafana Cloud
menuTitle: Monitor Loki with Grafana Cloud
description: Configuring monitoring and alerts for Loki using Grafana Cloud.
description: Configuring monitoring for Loki using Grafana Cloud.
aliases:
- ../../../../installation/helm/monitor-and-alert/with-grafana-cloud
weight: 200
@ -12,89 +12,255 @@ keywords:
- grafana cloud
---
# Configure monitoring and alerting of Loki using Grafana Cloud
# Monitor Loki with Grafana Cloud
This topic will walk you through using Grafana Cloud to monitor a Loki installation that is installed with the Helm chart. This approach leverages many of the chart's _self monitoring_ features, but instead of sending logs back to Loki itself, it sends them to a Grafana Cloud Logs instance. This approach also does not require the installation of the Prometheus Operator and instead sends metrics to a Grafana Cloud Metrics instance. Using Grafana Cloud to monitor Loki has the added benefit of being able to troubleshoot problems with Loki when the Helm installed Loki is down, as the logs will still be available in the Grafana Cloud Logs instance.
This guide will walk you through using Grafana Cloud to monitor a Loki installation set up with the `meta-monitoring` Helm chart. This method takes advantage of many of the chart's self-monitoring features, sending metrics, logs, and traces from the Loki deployment to Grafana Cloud. Monitoring Loki with Grafana Cloud offers the added benefit of troubleshooting Loki issues even when the Helm-installed Loki is down, as the telemetry data will remain available in the Grafana Cloud instance.
**Before you begin:**
These instructions are based on the [meta-monitoring-chart repository](https://github.com/grafana/meta-monitoring-chart/tree/main).
## Before you begin
- Helm 3 or above. See [Installing Helm](https://helm.sh/docs/intro/install/).
- A Grafana Cloud account and stack (including Cloud Grafana, Cloud Metrics, and Cloud Logs).
- [Grafana Kubernetes Monitoring using Agent Flow](/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/configuration/config-k8s-agent-flow/) configured for the Kubernetes cluster.
- A running Loki deployment installed in that Kubernetes cluster via the Helm chart.
**Prerequisites for Monitoring Loki:**
## Configure the meta namespace
The meta-monitoring stack will be installed in a separate namespace called `meta`. To create this namespace, run the following command:
```bash
kubectl create namespace meta
```
## Grafana Cloud Connection Credentials
You must set up the Grafana Kubernetes Integration following the instructions in [Grafana Kubernetes Monitoring using Agent Flow](/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/configuration/config-k8s-agent-flow/), as this will install the components needed to collect metrics about your Kubernetes cluster and send them to Grafana Cloud. Many of the dashboards installed as a part of the Loki integration rely on these metrics.
The meta-monitoring stack sends metrics, logs, and traces to Grafana Cloud. This requires that you know your connection credentials to Grafana Cloud. To obtain connection credentials, follow the steps below:
Walking through this installation will create two Grafana Agent configurations, one for metrics and one for logs, that will add the external label `cluster: cloud`. In order for the Dashboards in the self-hosted Grafana Loki integration to work, the cluster name needs to match your Helm installation name. If you installed Loki using the command `helm install best-loki-cluster grafana/loki`, you would need to change the `cluster` value in both Grafana Agent configurations from `cloud` to `best-loki-cluster` when setting up the Grafana Kubernetes integration.
1. Create a new Cloud Access Policy in Grafana Cloud.
1. Sign into [Grafana Cloud](https://grafana.com/auth/sign-in/).
1. In the main menu, select **Security > Access Policies**.
1. Click **Create access policy**.
1. Give the policy a **Name** and select the following permissions:
- Metrics: Write
- Logs: Write
- Traces: Write
1. Click **Create**.
**To set up the Loki integration in Grafana Cloud:**
1. Get valid Push credentials for your Cloud Metrics and Cloud Logs instances.
1. Create a secret in the same namespace as Loki to store your Cloud Logs credentials.
1. Once the policy is created, select the policy and click **Add token**.
1. Name the token, select an expiration date, then click **Create**.
1. Copy the token to a secure location as it will not be displayed again.
1. Navigate to the Grafana Cloud Portal **Overview** page.
1. Click the **Details** button for your Prometheus or Mimir instance.
1. From the **Using a self-hosted Grafana instance with Grafana Cloud Metrics** section, collect the instance **Name** and **URL**.
1. Navigate back to the **Overview** page.
1. Click the **Details** button for your Loki instance.
1. From the **Using Grafana with Logs** section, collect the instance **Name** and **URL**.
1. Navigate back to the **Overview** page.
1. Click the **Details** button for your Tempo instance.
1. From the **Using Grafana with Tempo** section, collect the instance **Name** and **URL**.
3. Finally, generate the secrets that store your credentials for each telemetry type (logs, metrics, and traces) within your Kubernetes cluster:
```bash
cat <<'EOF' | NAMESPACE=loki /bin/sh -c 'kubectl apply -n $NAMESPACE -f -'
apiVersion: v1
data:
  password: <BASE64_ENCODED_CLOUD_LOGS_PASSWORD>
  username: <BASE64_ENCODED_CLOUD_LOGS_USERNAME>
kind: Secret
metadata:
  name: grafana-cloud-logs-credentials
type: Opaque
EOF
kubectl create secret generic logs -n meta \
--from-literal=username=<USERNAME LOGS> \
--from-literal=password=<ACCESS POLICY TOKEN> \
--from-literal=endpoint='https://<LOG URL>/loki/api/v1/push'
kubectl create secret generic metrics -n meta \
--from-literal=username=<USERNAME METRICS> \
--from-literal=password=<ACCESS POLICY TOKEN> \
--from-literal=endpoint='https://<METRICS URL>/api/prom/push'
kubectl create secret generic traces -n meta \
--from-literal=username=<OTLP INSTANCE ID> \
--from-literal=password=<ACCESS POLICY TOKEN> \
--from-literal=endpoint='https://<OTLP URL>/otlp'
```
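The three endpoints above differ only in the path appended to each instance URL. As a minimal sketch, a small helper can assemble them so that each host is typed only once. `push_endpoint` is a hypothetical name introduced for illustration, not part of the chart or of `kubectl`.

```bash
# Hypothetical helper: build the push endpoint for a signal type from the
# instance URL shown in the Grafana Cloud portal.
push_endpoint() {
  kind=$1
  host=$2
  # Accept the host with or without a scheme or a trailing slash.
  host=${host#https://}
  host=${host#http://}
  host=${host%/}
  case "$kind" in
    logs)    path=/loki/api/v1/push ;;
    metrics) path=/api/prom/push ;;
    traces)  path=/otlp ;;
    *) echo "unknown signal type: $kind" >&2; return 1 ;;
  esac
  printf 'https://%s%s\n' "$host" "$path"
}
```

The result can then be passed to the commands above, for example `--from-literal=endpoint="$(push_endpoint logs "$LOGS_URL")"`, where `LOGS_URL` is an assumed variable holding the URL collected from the portal.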
1. Create a secret to store your Cloud Metrics credentials.
## Configuration and Installation
To install the `meta-monitoring` Helm chart, you must create a `values.yaml` file. At a minimum this file should contain the following:
* The namespace to monitor
* Enablement of cloud monitoring
This example `values.yaml` file provides the minimum configuration to monitor the `default` namespace:
```yaml
namespacesToMonitor:
- default
cloud:
  logs:
    enabled: true
    secret: "logs"
  metrics:
    enabled: true
    secret: "metrics"
  traces:
    enabled: true
    secret: "traces"
```
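If Loki runs in a namespace other than `default`, only the `namespacesToMonitor` entry needs to change. The following is a sketch of rendering the same minimal values for an arbitrary namespace. `render_values` is a hypothetical helper name, and the secret names are assumed to match the `logs`, `metrics`, and `traces` secrets created earlier.

```bash
# Hypothetical helper: print the minimal cloud-mode values for one namespace.
# Defaults to the "default" namespace when no argument is given.
render_values() {
  namespace=${1:-default}
  cat <<EOF
namespacesToMonitor:
- ${namespace}
cloud:
  logs:
    enabled: true
    secret: "logs"
  metrics:
    enabled: true
    secret: "metrics"
  traces:
    enabled: true
    secret: "traces"
EOF
}
```

For example, `render_values loki > values.yaml` would write a file that monitors a namespace called `loki`.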
For further configuration options, refer to the [sample values.yaml file](https://github.com/grafana/meta-monitoring-chart/blob/main/charts/meta-monitoring/values.yaml).
To install the `meta-monitoring` Helm chart, run the following commands:
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install meta-monitoring grafana/meta-monitoring -n meta -f values.yaml
```
or when upgrading the configuration:
```bash
helm upgrade meta-monitoring grafana/meta-monitoring -n meta -f values.yaml
```
To verify the installation, run the following command:
```bash
kubectl get pods -n meta
```
It should return the following pods:
```bash
NAME READY STATUS RESTARTS AGE
meta-alloy-0 2/2 Running 0 23h
meta-alloy-1 2/2 Running 0 23h
meta-alloy-2 2/2 Running 0 23h
```
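To check the listing automatically rather than by eye, the output can be filtered for pods that are not fully ready. This is a minimal sketch. `not_ready_pods` is a hypothetical helper, and it assumes the default column layout of `kubectl get pods` (name, ready count, status).

```bash
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name of every pod that is not Running or whose containers are not all ready.
not_ready_pods() {
  awk '$1 != "NAME" && NF >= 3 {
    split($2, ready, "/")          # "2/2" -> ready[1]=2, ready[2]=2
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}
```

Running `kubectl get pods -n meta | not_ready_pods` prints nothing when every pod is healthy. Pods in any other state, including `Completed`, are listed.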
## Enable Loki Tracing
By default, Loki does not have tracing enabled. To enable it, edit the Loki `values.yaml` file and set `tracing.enabled` to `true`:
```yaml
loki:
  tracing:
    enabled: true
```
Next, instrument each of the Loki components to send traces to the meta-monitoring stack. Add the `extraEnv` configuration to each of the Loki components:
```yaml
ingester:
  replicas: 3
  extraEnv:
    - name: JAEGER_ENDPOINT
      value: "http://mmc-alloy-external.default.svc.cluster.local:14268/api/traces"
      # This sets the Jaeger endpoint where traces will be sent.
      # The endpoint points to the mmc-alloy-external service in the default namespace at port 14268.
    - name: JAEGER_AGENT_TAGS
      value: 'cluster="prod",namespace="default"'
      # This specifies additional tags to attach to each span.
      # Here, the cluster is labeled as "prod" and the namespace as "default".
    - name: JAEGER_SAMPLER_TYPE
      value: "ratelimiting"
      # This sets the sampling strategy for traces.
      # "ratelimiting" means that traces will be sampled at a fixed rate.
    - name: JAEGER_SAMPLER_PARAM
      value: "1.0"
      # This sets the parameter for the sampler.
      # For ratelimiting, "1.0" typically means one trace per second.
```
Because the meta-monitoring stack is installed in the `meta` namespace, the Loki components need a way to reach it from their own namespace. To provide this, create an `ExternalName` service in the `default` namespace that points to the Alloy service in the `meta` namespace by running the following command:
```bash
kubectl create service externalname mmc-alloy-external --external-name meta-alloy.meta.svc.cluster.local -n default
```
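Both the `--external-name` target and the `JAEGER_ENDPOINT` value follow the in-cluster DNS pattern `<service>.<namespace>.svc.cluster.local`. The sketch below makes that relationship explicit. `svc_fqdn` and `jaeger_endpoint` are hypothetical helper names, and the default cluster domain `cluster.local` is assumed.

```bash
# Hypothetical helper: in-cluster DNS name of a Service, assuming the
# default cluster domain cluster.local.
svc_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}

# Hypothetical helper: URL on port 14268 (the port used in the example
# above) at which the given Service accepts Jaeger traces.
jaeger_endpoint() {
  printf 'http://%s:14268/api/traces\n' "$(svc_fqdn "$1" "$2")"
}
```

With these, `svc_fqdn meta-alloy meta` yields the ExternalName target used above, and `jaeger_endpoint mmc-alloy-external default` yields the `JAEGER_ENDPOINT` value set on the Loki components.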
Finally, upgrade the Loki installation with the new configuration:
```bash
helm upgrade --values values.yaml loki grafana/loki
```
## Import the Loki Dashboards to Grafana Cloud
The meta-monitoring stack includes a set of dashboards that can be imported into Grafana Cloud. These can be found in the [meta-monitoring repository](https://github.com/grafana/meta-monitoring-chart/tree/main/charts/meta-monitoring/src/dashboards).
## Installing Rules
The meta-monitoring stack includes a set of rules that can be installed to monitor the Loki installation. These rules can be found in the [meta-monitoring repository](https://github.com/grafana/meta-monitoring-chart/). To install the rules:
1. Clone the repository:
```bash
git clone https://github.com/grafana/meta-monitoring-chart/
```
1. Install `mimirtool` by following the [mimirtool documentation](https://grafana.com/docs/mimir/latest/manage/tools/mimirtool/).
1. Create a new access policy token in Grafana Cloud with the following permissions:
- Rules: Write
- Rules: Read
1. Create a token for the access policy and copy it to a secure location.
1. Install the rules:
```bash
cat <<'EOF' | NAMESPACE=loki /bin/sh -c 'kubectl apply -n $NAMESPACE -f -'
apiVersion: v1
data:
  password: <BASE64_ENCODED_CLOUD_METRICS_PASSWORD>
  username: <BASE64_ENCODED_CLOUD_METRICS_USERNAME>
kind: Secret
metadata:
  name: grafana-cloud-metrics-credentials
type: Opaque
EOF
mimirtool rules load --address=<your_cloud_prometheus_endpoint> --id=<your_instance_id> --key=<your_cloud_access_policy_token> *.yaml
```
1. Verify that the rules have been installed:
```bash
mimirtool rules list --address=<your_cloud_prometheus_endpoint> --id=<your_instance_id> --key=<your_cloud_access_policy_token>
```
It should return a list of rules that have been installed.
```bash
1. Enable monitoring metrics and logs for the Loki installation to be sent to your cloud database instances by adding the following to your Helm `values.yaml` file:
```yaml
---
monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: false
  selfMonitoring:
    logsInstance:
      clients:
        - url: <CLOUD_LOGS_URL>
          basicAuth:
            username:
              name: grafana-cloud-logs-credentials
              key: username
            password:
              name: grafana-cloud-logs-credentials
              key: password
  serviceMonitor:
    metricsInstance:
      remoteWrite:
        - url: <CLOUD_METRICS_URL>
          basicAuth:
            username:
              name: grafana-cloud-metrics-credentials
              key: username
            password:
              name: grafana-cloud-metrics-credentials
              key: password
loki-rules:
- name: loki_rules
rules:
- record: cluster_job:loki_request_duration_seconds:99quantile
expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job))
- record: cluster_job:loki_request_duration_seconds:50quantile
expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job))
- record: cluster_job:loki_request_duration_seconds:avg
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, job) / sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, job)
- record: cluster_job:loki_request_duration_seconds_bucket:sum_rate
expr: sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job)
- record: cluster_job:loki_request_duration_seconds_sum:sum_rate
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, job)
- record: cluster_job:loki_request_duration_seconds_count:sum_rate
expr: sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, job)
- record: cluster_job_route:loki_request_duration_seconds:99quantile
expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job, route))
- record: cluster_job_route:loki_request_duration_seconds:50quantile
expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job, route))
- record: cluster_job_route:loki_request_duration_seconds:avg
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, job, route) / sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, job, route)
- record: cluster_job_route:loki_request_duration_seconds_bucket:sum_rate
expr: sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, job, route)
- record: cluster_job_route:loki_request_duration_seconds_sum:sum_rate
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, job, route)
- record: cluster_job_route:loki_request_duration_seconds_count:sum_rate
expr: sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, job, route)
- record: cluster_namespace_job_route:loki_request_duration_seconds:99quantile
expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, namespace, job, route))
- record: cluster_namespace_job_route:loki_request_duration_seconds:50quantile
expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, namespace, job, route))
- record: cluster_namespace_job_route:loki_request_duration_seconds:avg
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, namespace, job, route) / sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, namespace, job, route)
- record: cluster_namespace_job_route:loki_request_duration_seconds_bucket:sum_rate
expr: sum(rate(loki_request_duration_seconds_bucket[5m])) by (le, cluster, namespace, job, route)
- record: cluster_namespace_job_route:loki_request_duration_seconds_sum:sum_rate
expr: sum(rate(loki_request_duration_seconds_sum[5m])) by (cluster, namespace, job, route)
- record: cluster_namespace_job_route:loki_request_duration_seconds_count:sum_rate
expr: sum(rate(loki_request_duration_seconds_count[5m])) by (cluster, namespace, job, route)
```
## Install kube-state-metrics
Metrics about Kubernetes objects are scraped from [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics). This needs to be installed in the cluster. The `kubeStateMetrics.endpoint` entry in the meta-monitoring `values.yaml` should be set to its address (without the `/metrics` part in the URL):
```yaml
kubeStateMetrics:
  # Scrape https://github.com/kubernetes/kube-state-metrics by default
  enabled: true
  # This endpoint is created when the helm chart from
  # https://artifacthub.io/packages/helm/prometheus-community/kube-state-metrics/
  # is used. Change this if kube-state-metrics is installed somewhere else.
  endpoint: kube-state-metrics.kube-state-metrics.svc.cluster.local:8080
```
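When kube-state-metrics is installed elsewhere, a full scrape URL is often what you have on hand. The following sketch trims such a URL to the `host:port` form shown in the example. `ksm_endpoint` is a hypothetical helper, and stripping the scheme is an assumption based only on the form of the example value, not on documented chart behavior.

```bash
# Hypothetical helper: reduce a kube-state-metrics scrape URL to host:port,
# dropping any scheme, trailing slash, and the /metrics path.
ksm_endpoint() {
  endpoint=$1
  endpoint=${endpoint#http://}
  endpoint=${endpoint#https://}
  endpoint=${endpoint%/}
  endpoint=${endpoint%/metrics}
  printf '%s\n' "$endpoint"
}
```

A value that is already in `host:port` form passes through unchanged.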
1. Install the self-hosted Grafana Loki integration by going to your hosted Grafana instance, selecting **Connections** from the Home menu, then search for and install the **Self-hosted Grafana Loki** integration.
1. Once the self-hosted Grafana Loki integration is installed, click the **View Dashboards** button to see the installed dashboards.

@ -1,7 +1,7 @@
---
title: Configure monitoring and alerting
menuTitle: Configure monitoring and alerting
description: Configuring monitoring and alerts using the Helm chart.
title: Monitor Loki using a local LGTM (Loki, Grafana, Tempo and Mimir) stack
menuTitle: Monitor Loki using a local LGTM stack
description: Monitor Loki using a local LGTM (Loki, Grafana, Tempo and Mimir) stack
aliases:
- ../../../../installation/helm/monitor-and-alert/with-local-monitoring/
weight: 100
@ -11,203 +11,162 @@ keywords:
- alerting
---
# Configure monitoring and alerting
# Monitor Loki using a local LGTM (Loki, Grafana, Tempo and Mimir) stack
By default this Helm Chart configures meta-monitoring of metrics (service monitoring) and logs (self monitoring). This topic will walk you through configuring monitoring using a monitoring solution local to the same cluster where Loki is installed.
This topic will walk you through using the meta-monitoring Helm chart to deploy a local stack to monitor your production Loki installation. This approach leverages many of the chart's _self monitoring_ features, but instead of sending logs back to Loki itself, it sends them to a small Loki, Grafana, Tempo, Mimir (LGTM) stack running within the `meta` namespace.
The `ServiceMonitor` resource works with either the Prometheus Operator or the Grafana Agent Operator, and defines how Loki's metrics should be scraped. Scraping this Loki cluster using the scrape config defined in the `ServiceMonitor` resource is required for the included dashboards to work. A `MetricsInstance` can be configured to write the metrics to a remote Prometheus instance such as Grafana Cloud Metrics.
_Self monitoring_ is enabled by default. This will deploy a `GrafanaAgent`, `LogsInstance`, and `PodLogs` resource which will instruct the Grafana Agent Operator (installed separately) on how to scrape this Loki cluster's logs and send them back to itself. Scraping this Loki cluster using the scrape config defined in the `PodLogs` resource is required for the included dashboards to work.
Rules and alerts are automatically deployed.
**Before you begin:**
## Before you begin
- Helm 3 or above. See [Installing Helm](https://helm.sh/docs/intro/install/).
- A running Kubernetes cluster with a running Loki deployment.
- A running Grafana instance.
- A running Prometheus Operator installed using the `kube-prometheus-stack` Helm chart.
**Prometheus Operator Prerequisites**
## Configure the meta namespace
The meta-monitoring stack will be installed in a separate namespace called `meta`. To create this namespace, run the following command:
```bash
kubectl create namespace meta
```
The dashboards require certain metric labels to display Kubernetes metrics. The best way to accomplish this is to install the `kube-prometheus-stack` Helm chart with the following values file, replacing `CLUSTER_NAME` with the name of your cluster. The cluster name is what you specify during the helm installation, so a cluster installed with the command `helm install loki-cluster grafana/loki` would be called `loki-cluster`.
## Configuration and Installation
The meta-monitoring stack is installed using the `meta-monitoring` Helm chart. The local mode deploys a small LGTM stack that includes Alloy, Grafana, Mimir, Loki, and Tempo. To configure the meta-monitoring stack, create a `values.yaml` file with the following content:
```yaml
kubelet:
  serviceMonitor:
    cAdvisorRelabelings:
      - action: replace
        replacement: <CLUSTER_NAME>
        targetLabel: cluster
      - targetLabel: metrics_path
        sourceLabels:
          - "__metrics_path__"
      - targetLabel: "instance"
        sourceLabels:
          - "node"
defaultRules:
  additionalRuleLabels:
    cluster: <CLUSTER_NAME>
"kube-state-metrics":
  prometheus:
    monitor:
      relabelings:
        - action: replace
          replacement: <CLUSTER_NAME>
          targetLabel: cluster
        - targetLabel: "instance"
          sourceLabels:
            - "__meta_kubernetes_pod_node_name"
"prometheus-node-exporter":
  prometheus:
    monitor:
      relabelings:
        - action: replace
          replacement: <CLUSTER_NAME>
          targetLabel: cluster
        - targetLabel: "instance"
          sourceLabels:
            - "__meta_kubernetes_pod_node_name"
prometheus:
  monitor:
    relabelings:
      - action: replace
        replacement: <CLUSTER_NAME>
        targetLabel: cluster
namespacesToMonitor:
- default
cloud:
  logs:
    enabled: false
  metrics:
    enabled: false
  traces:
    enabled: false
local:
  grafana:
    enabled: true
  logs:
    enabled: true
  metrics:
    enabled: true
  traces:
    enabled: true
  minio:
    enabled: true
```
For further configuration options, refer to the [sample values.yaml file](https://github.com/grafana/meta-monitoring-chart/blob/main/charts/meta-monitoring/values.yaml).
By default, local mode also enables MinIO, which acts as the object storage for the LGTM stack. To provide access to MinIO, you need to create a generic secret by running the following command:
```bash
kubectl create secret generic minio -n meta \
--from-literal=<INSERT USERNAME OF CHOICE> \
--from-literal=<INSERT PASSWORD OF CHOICE>
```
{{< admonition type="note" >}}
Username and password must have a minimum of 8 characters.
{{< /admonition >}}
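The length requirement in the note can be checked before the secret is created. This is a minimal sketch. `valid_minio_credentials` is a hypothetical helper name, and it checks only the 8-character minimum stated above.

```bash
# Hypothetical helper: succeed only if both the username and the password
# are at least 8 characters long.
valid_minio_credentials() {
  user=$1
  pass=$2
  [ "${#user}" -ge 8 ] && [ "${#pass}" -ge 8 ]
}
```

Calling `valid_minio_credentials "$MINIO_USER" "$MINIO_PASSWORD"` before the `kubectl create secret generic minio` command catches credentials that are too short. `MINIO_USER` and `MINIO_PASSWORD` are assumed variable names.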
To install the meta-monitoring stack, run the following commands:
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install meta-monitoring grafana/meta-monitoring -n meta -f values.yaml
```
or when upgrading the configuration:
```bash
helm upgrade meta-monitoring grafana/meta-monitoring -n meta -f values.yaml
```
To verify the installation, run the following command:
```bash
kubectl get pods -n meta
```
It should return the following pods:
```bash
grafana-59d664f55f-dtfqr 1/1 Running 2 (2m7s ago) 137m
loki-backend-0 2/2 Running 2 (2m7s ago) 137m
loki-backend-1 2/2 Running 4 (2m7s ago) 137m
loki-backend-2 2/2 Running 3 (2m7s ago) 137m
loki-read-6f775d8c5-6t749 1/1 Running 1 (2m7s ago) 137m
loki-read-6f775d8c5-kdd8m 1/1 Running 1 (2m7s ago) 137m
loki-read-6f775d8c5-tsw2r 1/1 Running 1 (2m7s ago) 137m
loki-write-0 1/1 Running 1 (2m7s ago) 137m
loki-write-1 1/1 Running 1 (2m7s ago) 137m
loki-write-2 1/1 Running 1 (2m7s ago) 137m
meta-alloy-0 2/2 Running 2 (2m7s ago) 137m
meta-alloy-1 2/2 Running 2 (2m7s ago) 137m
...
```
## Enable Loki Tracing
By default, Loki does not have tracing enabled. To enable it, edit the Loki `values.yaml` file and set `tracing.enabled` to `true`:
```yaml
loki:
  tracing:
    enabled: true
```
Next, instrument each of the Loki components to send traces to the meta-monitoring stack. Add the `extraEnv` configuration to each of the Loki components:
```yaml
ingester:
  replicas: 3
  extraEnv:
    - name: JAEGER_ENDPOINT
      value: "http://mmc-alloy-external.default.svc.cluster.local:14268/api/traces"
      # This sets the Jaeger endpoint where traces will be sent.
      # The endpoint points to the mmc-alloy-external service in the default namespace at port 14268.
    - name: JAEGER_AGENT_TAGS
      value: 'cluster="prod",namespace="default"'
      # This specifies additional tags to attach to each span.
      # Here, the cluster is labeled as "prod" and the namespace as "default".
    - name: JAEGER_SAMPLER_TYPE
      value: "ratelimiting"
      # This sets the sampling strategy for traces.
      # "ratelimiting" means that traces will be sampled at a fixed rate.
    - name: JAEGER_SAMPLER_PARAM
      value: "1.0"
      # This sets the parameter for the sampler.
      # For ratelimiting, "1.0" typically means one trace per second.
```
## Install kube-state-metrics
Metrics about Kubernetes objects are scraped from [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics). This needs to be installed in the cluster. The `kubeStateMetrics.endpoint` entry in the meta-monitoring `values.yaml` should be set to its address (without the `/metrics` part in the URL):
```yaml
kubeStateMetrics:
  # Scrape https://github.com/kubernetes/kube-state-metrics by default
  enabled: true
  # This endpoint is created when the helm chart from
  # https://artifacthub.io/packages/helm/prometheus-community/kube-state-metrics/
  # is used. Change this if kube-state-metrics is installed somewhere else.
  endpoint: kube-state-metrics.kube-state-metrics.svc.cluster.local:8080
```
## Accessing the meta-monitoring stack
To reach the Grafana instance in the meta-monitoring stack, forward its port to your local machine by running the following command:
```bash
kubectl port-forward -n meta svc/grafana 3000:3000
```
## Dashboards and Rules
The `kube-prometheus-stack` installs `ServiceMonitor` and `PrometheusRule` resources for monitoring Kubernetes, and it depends on the `kube-state-metrics` and `prometheus-node-exporter` helm charts which also install `ServiceMonitor` resources for collecting `kubelet` and `node-exporter` metrics. The above values file adds the necessary additional labels required for these metrics to work with the included dashboards.
If you are using this helm chart in an environment which does not allow for the installation of `kube-prometheus-stack` or custom CRDs, you should run `helm template` on the `kube-prometheus-stack` helm chart with the above values file, and review all generated `ServiceMonitor` and `PrometheusRule` resources. These resources may have to be modified with the correct ports and selectors to find the various services such as `kubelet` and `node-exporter` in your environment.
**To install the dashboards:**
1. Dashboards are enabled by default. Set `monitoring.dashboards.namespace` to the namespace of the Grafana instance if it is in a different namespace than this Loki cluster.
1. Dashboards must be mounted to your Grafana container. The dashboards are in `ConfigMap`s named `loki-dashboards-1` and `loki-dashboards-2` for Loki, and `enterprise-logs-dashboards-1` and `enterprise-logs-dashboards-2` for GEL. Mount them to `/var/lib/grafana/dashboards/loki-1` and `/var/lib/grafana/dashboards/loki-2` in your Grafana container.
1. Create a dashboard provisioning file called `dashboards.yaml` in `/etc/grafana/provisioning/dashboards` of your Grafana container with the following contents (_note_: you may need to edit the `orgId`):
```yaml
---
apiVersion: 1
providers:
  - disableDeletion: true
    editable: false
    folder: Loki
    name: loki-1
    options:
      path: /var/lib/grafana/dashboards/loki-1
    orgId: 1
    type: file
  - disableDeletion: true
    editable: false
    folder: Loki
    name: loki-2
    options:
      path: /var/lib/grafana/dashboards/loki-2
    orgId: 1
    type: file
```
**To add additional Prometheus rules:**
1. Modify the configuration file `values.yaml`:
```yaml
monitoring:
  rules:
    additionalGroups:
      - name: loki-rules
        rules:
          - record: job:loki_request_duration_seconds_bucket:sum_rate
            expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job)
          - record: job_route:loki_request_duration_seconds_bucket:sum_rate
            expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route)
          - record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
            expr: sum(rate(container_cpu_usage_seconds_total[1m])) by (node, namespace, pod, container)
```
**To disable monitoring:**
1. Modify the configuration file `values.yaml`:
```yaml
selfMonitoring:
  enabled: false
serviceMonitor:
  enabled: false
```
**To use a remote Prometheus and Loki instance such as Grafana Cloud**
1. Create a `secrets.yaml` file with credentials to access the Grafana Cloud services:
```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: primary-credentials-metrics
  namespace: default
stringData:
  username: "<instance ID>"
  password: "<API key>"
---
apiVersion: v1
kind: Secret
metadata:
  name: primary-credentials-logs
  namespace: default
stringData:
  username: "<instance ID>"
  password: "<API key>"
```
2. Add the secrets to Kubernetes with `kubectl create -f secrets.yaml`.
3. Add a `remoteWrite` section to `serviceMonitor` in `values.yaml`:
```yaml
monitoring:
  ...
  serviceMonitor:
    enabled: true
    ...
    metricsInstance:
      remoteWrite:
        - url: <metrics remote write endpoint>
          basicAuth:
            username:
              name: primary-credentials-metrics
              key: username
            password:
              name: primary-credentials-metrics
              key: password
```
4. Add a client to `monitoring.selfMonitoring.logsInstance.clients`:
```yaml
monitoring:
---
selfMonitoring:
enabled: true
logsInstance:
clients:
- url: <logs remote write endpoint>
basicAuth:
username:
name: primary-credentials-logs
key: username
password:
name: primary-credentials-logs
key: password
lokiCanary:
enabled: false
```
5. Install the `Loki meta-monitoring` connection on Grafana Cloud.
The local meta-monitoring stack comes with a set of pre-configured dashboards and alerting rules. These can be accessed via
[http://localhost:3000](http://localhost:3000) using the default credentials `admin` and `admin`.