# Observing Loki
Both Loki and Promtail expose a `/metrics` endpoint that serves Prometheus
metrics. You will need a local Prometheus instance with Loki and Promtail added
as scrape targets.
See [configuring
Prometheus](https://prometheus.io/docs/prometheus/latest/configuration/configuration)
for more information.
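
Before wiring up Prometheus, it can help to see what a `/metrics` response
contains. The sketch below parses a small sample of the Prometheus text
exposition format. The sample payload, its label names, and its values are made
up for illustration, and `parse_metrics` is a hypothetical helper rather than
part of Loki or Promtail. It handles only simple `name{labels} value` lines.

```python
import re

# Hypothetical sample of a /metrics response (values are illustrative only).
SAMPLE = """\
# HELP promtail_read_lines_total Number of lines read.
# TYPE promtail_read_lines_total counter
promtail_read_lines_total{path="/var/log/app.log"} 120
promtail_sent_entries_total{host="loki:3100"} 118
promtail_files_active_total 1
"""

# Metric name, optional {label="value",...} block, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Return {(name, labels): value} for every sample line in the payload."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name = match.group(1)
        labels = match.group(2) or ""
        samples[(name, labels)] = float(match.group(3))
    return samples


if __name__ == "__main__":
    for (name, labels), value in parse_metrics(SAMPLE).items():
        print(name, labels, value)
```

In practice, Prometheus does this parsing for you once the endpoints are
configured as targets; the sketch is only meant for quick manual inspection.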

All components of Loki expose the following metrics:

| Metric Name | Metric Type | Description |
| ------------------------------- | ----------- | ---------------------------------------- |
| `log_messages_total` | Counter | Total number of messages logged by Loki. |
| `loki_request_duration_seconds` | Histogram | Duration of received HTTP requests, in seconds. Its `_count` series gives the number of received HTTP requests. |
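
Because `loki_request_duration_seconds` is a histogram, it is exposed with
`_sum` and `_count` series alongside its buckets. As a rough illustration, the
mean request duration over an interval can be derived from two scrapes of those
series. The function name and the sample values below are assumptions made for
this sketch.

```python
def mean_duration(sum_before, count_before, sum_after, count_after):
    """Mean seconds per request between two scrapes.

    Returns None when no new requests were observed (or a counter reset
    made the count go backwards).
    """
    requests = count_after - count_before
    if requests <= 0:
        return None
    return (sum_after - sum_before) / requests


if __name__ == "__main__":
    # Hypothetical _sum and _count values taken from two consecutive scrapes.
    print(mean_duration(12.0, 100, 13.5, 130))
```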

The Loki Distributors expose the following metrics:

| Metric Name | Metric Type | Description |
| ------------------------------------------------- | ----------- | ----------------------------------------------------------- |
| `loki_distributor_ingester_appends_total` | Counter | The total number of batch appends sent to ingesters. |
| `loki_distributor_ingester_append_failures_total` | Counter | The total number of failed batch appends sent to ingesters. |
| `loki_distributor_bytes_received_total` | Counter | The total number of uncompressed bytes received per tenant. |
| `loki_distributor_lines_received_total` | Counter | The total number of lines received per tenant. |

The Loki Ingesters expose the following metrics:

| Metric Name | Metric Type | Description |
| ----------------------------------------- | ----------- | ------------------------------------------------------------------------------------------- |
| `cortex_ingester_flush_queue_length` | Gauge | The total number of series pending in the flush queue. |
| `loki_ingester_chunk_age_seconds` | Histogram | Distribution of chunk ages when flushed. |
| `loki_ingester_chunk_encode_time_seconds` | Histogram | Distribution of chunk encode times. |
| `loki_ingester_chunk_entries` | Histogram | Distribution of entries per chunk when flushed. |
| `loki_ingester_chunk_size_bytes` | Histogram | Distribution of chunk sizes when flushed. |
| `loki_ingester_chunk_stored_bytes_total` | Counter | Total bytes stored in chunks per tenant. |
| `loki_ingester_chunks_created_total` | Counter | The total number of chunks created in the ingester. |
| `loki_ingester_chunks_flushed_total` | Counter | The total number of chunks flushed by the ingester. |
| `loki_ingester_chunks_stored_total` | Counter | Total stored chunks per tenant. |
| `loki_ingester_received_chunks` | Counter | The total number of chunks received by this ingester whilst joining during the handoff process. |
| `loki_ingester_samples_per_chunk` | Histogram | The number of samples in a chunk. |
| `loki_ingester_sent_chunks` | Counter | The total number of chunks sent by this ingester whilst leaving during the handoff process. |
| `loki_ingester_streams_created_total` | Counter | The total number of streams created per tenant. |
| `loki_ingester_streams_removed_total` | Counter | The total number of streams removed per tenant. |

Promtail exposes these metrics:

| Metric Name | Metric Type | Description |
| ----------------------------------------- | ----------- | ------------------------------------------------------------------------------------------ |
| `promtail_read_bytes_total` | Gauge | Number of bytes read. |
| `promtail_read_lines_total` | Counter | Number of lines read. |
| `promtail_dropped_bytes_total` | Counter | Number of bytes dropped because they failed to be sent to the ingester after all retries. |
| `promtail_dropped_entries_total` | Counter | Number of log entries dropped because they failed to be sent to the ingester after all retries. |
| `promtail_encoded_bytes_total` | Counter | Number of bytes encoded and ready to send. |
| `promtail_file_bytes_total` | Gauge | Number of bytes read from files. |
| `promtail_files_active_total` | Gauge | Number of active files. |
| `promtail_log_entries_bytes` | Histogram | The total count of bytes read. |
| `promtail_request_duration_seconds_count` | Histogram | Number of send requests. |
| `promtail_sent_bytes_total` | Counter | Number of bytes sent. |
| `promtail_sent_entries_total` | Counter | Number of log entries sent to the ingester. |
| `promtail_targets_active_total` | Gauge | Number of total active targets. |
| `promtail_targets_failed_total` | Counter | Number of total failed targets. |

Most of these metrics are counters and should continuously increase during normal operations:

1. Your app emits a log line to a file that is tracked by Promtail.
2. Promtail reads the new line and increases its counters.
3. Promtail forwards the log line to a Loki distributor, where the received
counters should increase.
4. The Loki distributor forwards the log line to a Loki ingester, where the
request duration counter should increase.
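
The steps above can be checked by comparing two scrapes taken before and after
an app writes a log line. The sketch below assumes the counter values have
already been collected into plain dictionaries. The `increased_counters` helper
and the sample values are hypothetical.

```python
def increased_counters(before, after):
    """Return the sorted names of counters that grew between two scrapes."""
    return sorted(
        name
        for name, value in after.items()
        if value > before.get(name, 0.0)
    )


if __name__ == "__main__":
    # Hypothetical values scraped before and after one log line was written.
    before = {
        "promtail_read_lines_total": 120.0,
        "loki_distributor_lines_received_total": 118.0,
        "promtail_dropped_entries_total": 0.0,
    }
    after = {
        "promtail_read_lines_total": 121.0,
        "loki_distributor_lines_received_total": 119.0,
        "promtail_dropped_entries_total": 0.0,
    }
    print(increased_counters(before, after))
```

If the read and received counters both appear in the result while the dropped
counter does not, the log line most likely travelled from Promtail to the
distributor as expected.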

If Promtail uses any pipelines with metrics stages, those metrics will also be
exposed by Promtail at its `/metrics` endpoint. See Promtail's documentation on
[Pipelines](../clients/promtail/pipelines.md) for more information.
An example Grafana dashboard was built by the community and is available as
dashboard [10004](https://grafana.com/dashboards/10004).
## Mixins
The Loki repository has a [mixin](../../production/loki-mixin) that includes a
set of dashboards, recording rules, and alerts. Together, the mixin gives you a
comprehensive package for monitoring Loki in production.
For more information about mixins, take a look at the docs for the
[monitoring-mixins project](https://github.com/monitoring-mixins/docs).