Update labels.md (#10124)

**What this PR does / why we need it**:
- Reworded some phrasing to achieve a more formal tone
- Corrected some grammatical errors
- Modified sentence structure for improved readability
- Removed extra blank lines and made a pass at ensuring uniformity
through the document
willcramer 2 years ago committed by GitHub
parent 9bfe1d81d4
commit 7f48efb60a
GPG Key ID: 4AEE18F83AFDEB23
docs/sources/get-started/labels.md (18 changes)

@@ -27,7 +27,6 @@ Loki places the same restrictions on label naming as [Prometheus](https://promet
>
> Note: The colons are reserved for user defined recording rules. They should not be used by exporters or direct instrumentation.
## Loki labels demo
This series of examples will illustrate basic use cases and concepts for labeling in Loki.
@@ -46,14 +45,14 @@ scrape_configs:
__path__: /var/log/syslog
```
This config will tail one file and assign one label: `job=syslog`. You could query it like this:
This config will tail one file and assign one label: `job=syslog`. This will create one stream in Loki.
You could query it like this:
```
{job="syslog"}
```
This will create one stream in Loki.
Now let’s expand the example a little:
```yaml
@@ -76,18 +75,17 @@ scrape_configs:
__path__: /var/log/apache.log
```
Now we are tailing two files. Each file gets just one label with one value so Loki will now be storing two streams.
Now we are tailing two files. Each file gets just one label with one value, so Loki will now be storing two streams.
We can query these streams in a few ways:
```
{job="apache"} <- show me logs where the job label is apache
{job="syslog"} <- show me logs where the job label is syslog
{job=~"apache|syslog"} <- show me logs where the job is apache **OR** syslog
```
In that last example, we used a regex label matcher to log streams that use the job label with two values. Now consider how an additional label could also be used:
In that last example, we used a regex label matcher to view log streams that use the job label with one of two possible values. Now consider how an additional label could also be used:
```yaml
scrape_configs:
@@ -182,15 +180,15 @@ Imagine now if you set a label for `ip`. Not only does every request from a user
Doing some quick math, if there are maybe four common actions (GET, PUT, POST, DELETE) and maybe four common status codes (although there could be more than four!), this would be 16 streams and 16 separate chunks. Now multiply this by every user if we use a label for `ip`. You can quickly have thousands or tens of thousands of streams.
This is high cardinality. This can kill Loki.
This is high cardinality, and it can lead to significant performance degradation.
When we talk about _cardinality_ we are referring to the combination of labels and values and the number of streams they create. High cardinality is using labels with a large range of possible values, such as `ip`, **or** combining many labels, even if they have a small and finite set of values, such as using `status_code` and `action`.
High cardinality causes Loki to build a huge index (read: $$$$) and to flush thousands of tiny chunks to the object store (read: slow). Loki currently performs very poorly in this configuration and will be the least cost-effective and least fun to run and use.
High cardinality causes Loki to build a huge index and to flush thousands of tiny chunks to the object store. Loki currently performs very poorly in this configuration. If not accounted for, high cardinality will significantly reduce the operability and cost-effectiveness of Loki.
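The stream arithmetic in this hunk can be checked with a small Python sketch. It assumes the worst case, where every combination of label values actually occurs in the logs. The specific status codes, the figure of 1000 client IPs, and the helper name `stream_count` are illustrative assumptions, not values from the Loki documentation.

```python
def stream_count(label_values):
    """Upper bound on streams: one per distinct combination of label values."""
    count = 1
    for values in label_values.values():
        count *= len(values)
    return count


# Four actions and four status codes, as in the text above.
# The concrete status codes are assumed for illustration.
labels = {
    "action": ["GET", "PUT", "POST", "DELETE"],
    "status_code": ["200", "400", "404", "500"],
}
base = stream_count(labels)  # 4 * 4 combinations

# Hypothetical: add an `ip` label with 1000 distinct client addresses.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
labels_with_ip = dict(labels, ip=ips)
with_ip = stream_count(labels_with_ip)  # base * 1000 combinations

print(base, with_ip)
```

Because the count is a product, each additional label multiplies the number of possible streams rather than adding to it, which is why a single unbounded label such as `ip` dominates the total.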
## Optimal Loki performance with parallelization
Now you may be asking: If using lots of labels or labels with lots of values is bad, how am I supposed to query my logs? If none of the data is indexed, won't queries be really slow?
Now you may be asking: If using too many labels—or using labels with too many values—is bad, then how am I supposed to query my logs? If none of the data is indexed, won't queries be really slow?
As we see people using Loki who are accustomed to other index-heavy solutions, it seems like they feel obligated to define a lot of labels in order to query their logs effectively. After all, many other logging solutions are all about the index, and this is the common way of thinking.
