**What this PR does / why we need it**:
- Reworded some phrasing to achieve a more formal tone
- Corrected some grammatical errors
- Modified sentence structure for improved readability
- Removed extra blank lines and made a pass at ensuring uniformity
throughout the document
In that last example, we used a regex label matcher to log streams that use the job label with two values. Now consider how an additional label could also be used:
In that last example, we used a regex label matcher to view log streams that use the job label with one of two possible values. Now consider how an additional label could also be used:
```yaml
scrape_configs:
@@ -182,15 +180,15 @@ Imagine now if you set a label for `ip`. Not only does every request from a user
Doing some quick math, if there are maybe four common actions (GET, PUT, POST, DELETE) and maybe four common status codes (although there could be more than four!), this would be 16 streams and 16 separate chunks. Now multiply this by every user if we use a label for `ip`. You can quickly have thousands or tens of thousands of streams.
This is high cardinality. This can kill Loki.
This is high cardinality, and it can lead to significant performance degradation.
When we talk about _cardinality_ we are referring to the combination of labels and values and the number of streams they create. High cardinality is using labels with a large range of possible values, such as `ip`, **or** combining many labels, even if they have a small and finite set of values, such as using `status_code` and `action`.
High cardinality causes Loki to build a huge index (read: $$$$) and to flush thousands of tiny chunks to the object store (read: slow). Loki currently performs very poorly in this configuration and will be the least cost-effective and least fun to run and use.
High cardinality causes Loki to build a huge index and to flush thousands of tiny chunks to the object store. Loki currently performs very poorly in this configuration. If not accounted for, high cardinality will significantly reduce the operability and cost-effectiveness of Loki.
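The stream arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the multiplication, not anything Loki itself runs: the specific status code values and the figure of 1,000 distinct `ip` values are assumptions made for the example, and `stream_count` is a hypothetical helper.

```python
from itertools import product

# Actions come from the text; the four status codes are assumed values,
# since the text only says "maybe four common status codes".
ACTIONS = ["GET", "PUT", "POST", "DELETE"]
STATUS_CODES = ["200", "400", "404", "500"]


def stream_count(*label_values):
    """Upper bound on streams: one per unique combination of label values."""
    return sum(1 for _ in product(*label_values))


base_streams = stream_count(ACTIONS, STATUS_CODES)

# Assumed number of distinct `ip` label values (hypothetical figure).
hypothetical_users = 1000
streams_with_ip = base_streams * hypothetical_users

print(base_streams)     # 16
print(streams_with_ip)  # 16000
```

Each extra label multiplies the upper bound rather than adding to it, which is why a single unbounded label such as `ip` dominates the total.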
## Optimal Loki performance with parallelization
Now you may be asking: If using lots of labels or labels with lots of values is bad, how am I supposed to query my logs? If none of the data is indexed, won't queries be really slow?
Now you may be asking: If using too many labels—or using labels with too many values—is bad, then how am I supposed to query my logs? If none of the data is indexed, won't queries be really slow?
As we see people using Loki who are accustomed to other index-heavy solutions, it seems like they feel obligated to define a lot of labels in order to query their logs effectively. After all, many other logging solutions are all about the index, and this is the common way of thinking.