Logs from each unique set of labels are built up into "chunks" in memory and
then flushed to the backing storage backend.
If an ingester process crashes or exits abruptly, all the data that has not yet
been flushed could be lost. To mitigate this risk, Loki is usually configured with a [Write Ahead Log](../../operations/storage/wal) that can be _replayed_ on restart, as well as a `replication_factor` (usually 3) so that each log line is held by multiple ingesters.
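Both protections are typically set in the Loki configuration; a minimal sketch, assuming the `ingester` block (the WAL directory is illustrative, and exact keys can vary between Loki versions):

```yaml
ingester:
  wal:
    # Write-ahead log that is replayed on restart so unflushed
    # chunks are not lost.
    enabled: true
    dir: /loki/wal
  lifecycler:
    ring:
      # Each stream is written to this many ingesters.
      replication_factor: 3
```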
When not configured to accept out-of-order writes,
all lines pushed to Loki for a given stream (unique combination of
labels) must have a newer timestamp than the line received before it. There
are, however, two cases for handling logs for the same stream with identical
nanosecond timestamps:

1. If the incoming line exactly matches the previously received line (matching
   both the previous timestamp and log text), the incoming line will be
   treated as an exact duplicate and ignored.

2. If the incoming line has the same timestamp as the previous line but
   different content, the log line is accepted. This means it is possible to
   have two different log lines for the same timestamp, as the sketch below
   shows.
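For instance, assuming the shape of the `/loki/api/v1/push` request body (the timestamps and log lines here are illustrative), a single batch could contain:

```json
{
  "streams": [
    {
      "stream": { "job": "example" },
      "values": [
        ["1700000000000000000", "first line"],
        ["1700000000000000000", "first line"],
        ["1700000000000000000", "second line"]
      ]
    }
  ]
}
```

The second entry is ignored as an exact duplicate of the first, while the third is accepted because its content differs, leaving two stored lines with the same timestamp.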
#### Handoff - Deprecated in favor of the [WAL](../../operations/storage/wal)
By default, when an ingester is shutting down and tries to leave the hash ring,
it will wait to see if a new ingester tries to enter before flushing and will
try to initiate a handoff. The handoff will transfer all of the tokens and
in-memory chunks owned by the leaving ingester to the new ingester.

Caching of log (filter, regexp) queries is under active development.
### Querier
The **querier** service handles queries using the [LogQL](../../logql/) query
language, fetching logs both from the ingesters and from long-term storage.
Queriers query all ingesters for in-memory data before falling back to
running the same query against the backend store. Because of the replication
factor, it is possible that the querier may receive duplicate data; to resolve
this, it internally deduplicates entries that have the same nanosecond
timestamp, label set, and log message.
This document builds upon the information in the [Loki Architecture](./) page.
Distributors are stateless and communicate with ingesters via [gRPC](https://grpc.io). The number of distributors can be scaled up or down as needed.
## Where does it live?
Currently the only way the distributor mutates incoming data is by normalizing labels: sorting them so that, for example, `{foo="bar", bazz="buzz"}` and `{bazz="buzz", foo="bar"}` are treated as the same stream.
The distributor can also rate limit incoming logs based on the maximum per-tenant data rate. It does this by checking a per-tenant limit and dividing it by the current number of distributors. This allows the rate limit to be specified per tenant at the cluster level and enables us to scale the distributors up or down and have the per-distributor limit adjust accordingly. For instance, say we have 10 distributors and tenant A has a 10MB/s rate limit. Each distributor will allow up to 1MB/s before limiting. Now, say another large tenant joins the cluster and we need to spin up 10 more distributors. The now 20 distributors will adjust their rate limits for tenant A to `(10MB/s / 20 distributors) = 500KB/s` each! This is how global limits allow much simpler and safer operation of the Loki cluster.
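As a sketch of how this is expressed in configuration (field names here follow the `limits_config` block; exact options can vary between Loki versions):

```yaml
limits_config:
  # "global" divides each tenant's limit by the number of active
  # distributors, so every distributor enforces its share locally.
  ingestion_rate_strategy: global
  # Default per-tenant ingest limit, in MB per second.
  ingestion_rate_mb: 10
```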
**Note: The distributor uses the `ring` component under the hood to register itself among its peers and get the total number of active distributors. This is a different "key" than the ingesters use in the ring and comes from the distributor's own [ring configuration](../../../configuration#distributor_config).**
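A minimal sketch of that distributor ring configuration (the `memberlist` kvstore is just one option; consul and etcd also work):

```yaml
distributor:
  ring:
    kvstore:
      # Key-value store where distributors register themselves so peers
      # can count the active distributors.
      store: memberlist
```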
Now let's talk about Loki, where the index is typically an order of magnitude smaller than your ingested log volume.
Loki will effectively keep your static costs as low as possible (index size and memory requirements as well as static log storage) and make the query performance something you can control at runtime with horizontal scaling.
To see how this works, let's look back at our example of querying your access log data for a specific IP address. We don't want to use a label to store the IP address. Instead we use a [filter expression](../../logql/log_queries#line-filter-expression) to query for it.
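With an illustrative `job` label and IP value, such a line filter looks like:

```logql
{job="apache"} |= "11.11.11.11"
```

Loki uses the label matcher (`{job="apache"}`) to decide which streams to fetch, then scans each line for the IP at query time, which is why the IP never has to be indexed as a label.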