diff --git a/docs/sources/architecture/_index.md b/docs/sources/architecture/_index.md index 8ef90b7dcf..132ca5bac3 100644 --- a/docs/sources/architecture/_index.md +++ b/docs/sources/architecture/_index.md @@ -159,10 +159,35 @@ if any write failed to one of the replicas, multiple differing chunk objects will be created in the backing store. See [Querier](#querier) for how data is deduplicated. -The ingesters validate timestamps for each log line received maintains a -strict ordering. See the [Loki -Overview](../overview#timestamp-ordering) for detailed documentation on -the rules of timestamp order. +#### Timestamp Ordering + +The ingester validates that ingested log lines are not out of order. When an +ingester receives a log line that doesn't follow the expected order, the line +is rejected and an error is returned to the user. + +The ingester validates that ingested log lines are received in +timestamp-ascending order (i.e., each log has a timestamp that occurs at a later +time than the log before it). When the ingester receives a log that does not +follow this order, the log line is rejected and an error is returned. + +Logs from each unique set of labels are built up into "chunks" in memory and +then flushed to the backing storage backend. + +If an ingester process crashes or exits abruptly, all the data that has not yet +been flushed could be lost. Loki is usually configured with a [Write Ahead Log](../operations/storage/wal) which can be _replayed_ on restart as well as with a `replication_factor` (usually 3) of each log to mitigate this risk. + +In general, all lines pushed to Loki for a given stream (unique combination of +labels) must have a newer timestamp than the line received before it. There are, +however, two cases for handling logs for the same stream with identical +nanosecond timestamps: + +1. If the incoming line exactly matches the previously received line (matching + both the previous timestamp and log text), the incoming line will be treated + as an exact duplicate and ignored. + +2. If the incoming line has the same timestamp as the previous line but + different content, the log line is accepted. This means it is possible to + have two different log lines for the same timestamp. #### Handoff - Deprecated in favor of the [WAL](../operations/storage/wal) @@ -179,6 +204,13 @@ set of tokens. This process is used to avoid flushing all chunks when shutting down, which is a slow process. +#### Filesystem Support + +While ingesters do support writing to the filesystem through BoltDB, this only +works in single-process mode as [queriers](#querier) need access to the same +back-end store and BoltDB only allows one process to have a lock on the DB at a +given time. + ### Query frontend The **query frontend** is an **optional service** providing the querier's API endpoints and can be used to accelerate the read path. When the query frontend is in place, incoming query requests should be directed to the query frontend instead of the queriers. The querier service will be still required within the cluster, in order to execute the actual queries. @@ -212,7 +244,7 @@ Caching log (filter, regexp) queries are under active development. ### Querier The **querier** service handles queries using the [LogQL](../logql/) query -language, fetching logs both from the ingesters and long-term storage. +language, fetching logs both from the ingesters and from long-term storage. Queriers query all ingesters for in-memory data before falling back to running the same query against the backend store. Because of the replication diff --git a/docs/sources/architecture/distributor.md b/docs/sources/architecture/distributor.md index 4a79642f6d..f0ed888c30 100644 --- a/docs/sources/architecture/distributor.md +++ b/docs/sources/architecture/distributor.md @@ -6,6 +6,8 @@ weight: 1000 This document builds upon the information in the [Loki Architecture](./) page. +Distributors are stateless and communicate with ingesters via [gRPC](https://grpc.io). The quantity of distributors can be increased or decreased as needed. + ## Where does it live? The distributor is the first component on Loki's write path downstream from any gateways providing auth or load balancing. It's responsible for validating, preprocessing, and applying a subset of rate limiting to incoming data before sending it to the ingester component. It is important that a load balancer sits in front of the distributor in order to properly balance traffic to them. diff --git a/docs/sources/overview/_index.md b/docs/sources/overview/_index.md index 322bf4cce1..44c2ceb08f 100644 --- a/docs/sources/overview/_index.md +++ b/docs/sources/overview/_index.md @@ -30,146 +30,3 @@ or for running it at a small scale. For horizontal scalability, the microservices of Loki can be broken out into separate processes, allowing them to scale independently of each other. -## Components - -### Distributor - -The **distributor** service is responsible for handling logs written by -[clients](../clients/). It's essentially the "first stop" in the write -path for log data. Once the distributor receives log data, it splits them into -batches and sends them to multiple [ingesters](#ingester) in parallel. - -Distributors communicate with ingesters via [gRPC](https://grpc.io). They are -stateless and can be scaled up and down as needed. - -#### Hashing - -Distributors use consistent hashing in conjunction with a configurable -replication factor to determine which instances of the ingester service should -receive log data. - -The hash is based on a combination of the log's labels and the tenant ID. - -A hash ring stored in [Consul](https://www.consul.io) is used to achieve -consistent hashing; all [ingesters](#ingester) register themselves into the -hash ring with a set of tokens they own. Distributors then find the token that -most closely matches the value of the log's hash and will send data to that -token's owner. - -#### Quorum consistency - -Since all distributors share access to the same hash ring, write requests can be -sent to any distributor. - -To ensure consistent query results, Loki uses -[Dynamo-style](https://www.cs.princeton.edu/courses/archive/fall15/cos518/studpres/dynamo.pdf) -quorum consistency on reads and writes. This means that the distributor will wait -for a positive response of at least one half plus one of the ingesters to send -the sample to before responding to the user. - -### Ingester - -The **ingester** service is responsible for writing log data to long-term -storage backends (DynamoDB, S3, Cassandra, etc.). - -The ingester validates that ingested log lines are not out of order. When an -ingester receives a log line that doesn't follow the expected order, the line -is rejected and an error is returned to the user. See the section on [Timestamp -ordering](#timestamp-ordering) for more information. - -The ingester validates that ingested log lines are received in -timestamp-ascending order (i.e., each log has a timestamp that occurs at a later -time than the log before it). When the ingester receives a log that does not -follow this order, the log line is rejected and an error is returned. - -Logs from each unique set of labels are built up into "chunks" in memory and -then flushed to the backing storage backend. - -If an ingester process crashes or exits abruptly, all the data that has not yet -been flushed will be lost. Loki is usually configured to replicate multiple -replicas (usually 3) of each log to mitigate this risk. - -#### Timestamp Ordering - -In general, all lines pushed to Loki for a given stream (unique combination of -labels) must have a newer timestamp than the line received before it. There are, -however, two cases for handling logs for the same stream with identical -nanosecond timestamps: - -1. If the incoming line exactly matches the previously received line (matching - both the previous timestamp and log text), the incoming line will be treated - as an exact duplicate and ignored. - -2. If the incoming line has the same timestamp as the previous line but - different content, the log line is accepted. This means it is possible to - have two different log lines for the same timestamp. - -#### Handoff - -By default, when an ingester is shutting down and tries to leave the hash ring, -it will wait to see if a new ingester tries to enter before flushing and will -try to initiate a handoff. The handoff will transfer all of the tokens and -in-memory chunks owned by the leaving ingester to the new ingester. - -This process is used to avoid flushing all chunks when shutting down, which is a -slow process. - -#### Filesystem Support - -While ingesters do support writing to the filesystem through BoltDB, this only -works in single-process mode as [queriers](#querier) need access to the same -back-end store and BoltDB only allows one process to have a lock on the DB at a -given time. - -### Querier - -The **querier** service handles the actual [LogQL](../logql/) evaluation of -logs stored in long-term storage. - -It first tries to query all ingesters for in-memory data before falling back to -loading data from the backend store. - -### Query frontend - -The **query-frontend** service is an optional component in front of a pool of queriers. It's responsible for fairly scheduling requests between them, paralleling them when possible, and caching. - -## Chunk Store - -The **chunk store** is Loki's long-term data store, designed to support -interactive querying and sustained writing without the need for background -maintenance tasks. It consists of: - -- An index for the chunks. This index can be backed by - [DynamoDB from Amazon Web Services](https://aws.amazon.com/dynamodb), - [Bigtable from Google Cloud Platform](https://cloud.google.com/bigtable), or - [Apache Cassandra](https://cassandra.apache.org). -- A key-value (KV) store for the chunk data itself, which can be DynamoDB, - Bigtable, Cassandra again, or an object store such as - [Amazon * S3](https://aws.amazon.com/s3) - -> Unlike the other core components of Loki, the chunk store is not a separate -> service, job, or process, but rather a library embedded in the two services -> that need to access Loki data: the [ingester](#ingester) and [querier](#querier). - -The chunk store relies on a unified interface to the -"[NoSQL](https://en.wikipedia.org/wiki/NoSQL)" stores (DynamoDB, Bigtable, and -Cassandra) that can be used to back the chunk store index. This interface -assumes that the index is a collection of entries keyed by: - -- A **hash key**. This is required for *all* reads and writes. -- A **range key**. This is required for writes and can be omitted for reads, -which can be queried by prefix or range. - -The interface works somewhat differently across the supported databases: - -- DynamoDB supports range and hash keys natively. Index entries are thus - modelled directly as DynamoDB entries, with the hash key as the distribution - key and the range as the range key. -- For Bigtable and Cassandra, index entries are modelled as individual column - values. The hash key becomes the row key and the range key becomes the column - key. - -A set of schemas are used to map the matchers and label sets used on reads and -writes to the chunk store into appropriate operations on the index. Schemas have -been added as Loki has evolved, mainly in an attempt to better load balance -writes and improve query performance.