From 7d195a4cb907eb9cd154eca93dace8fa58c203eb Mon Sep 17 00:00:00 2001
From: Cyril Tovena <cyril.tovena@gmail.com>
Date: Sat, 24 Oct 2020 18:32:54 -0400
Subject: [PATCH] Logqlv2 docs (#2797)

* Working on the logql v2 documentation.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Wip

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* wip

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add informations about metrics queries.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* doc work

* small fixes

Co-authored-by: Owen Diehl <ow.diehl@gmail.com>
---
 docs/sources/logql/_index.md | 500 +++++++++++++++++++++++++++++++++--
 1 file changed, 472 insertions(+), 28 deletions(-)

diff --git a/docs/sources/logql/_index.md b/docs/sources/logql/_index.md
index 7b373d7f13..824a53b014 100644
--- a/docs/sources/logql/_index.md
+++ b/docs/sources/logql/_index.md
@@ -11,25 +11,39 @@ LogQL uses labels and operators for filtering.
 There are two types of LogQL queries:
 
 - *Log queries* return the contents of log lines.
-- *Metric queries* extend log queries and calculate values based on the counts of logs from a log query.
+- *Metric queries* extend log queries and calculate sample values based on the content of logs from a log query.
+
+## Log Queries
 
 A basic log query consists of two parts:
 
 - **log stream selector**
-- **filter expression**
+- **log pipeline**
 
-Due to Loki's design, all LogQL queries must contain a log stream selector.
+Due to Loki's design, all LogQL queries must contain a **log stream selector**.
 
 The log stream selector determines how many log streams (unique sources of log content, such as files) will be searched.
 A more granular log stream selector then reduces the number of searched streams to a manageable volume.
 This means that the labels passed to the log stream selector will affect the relative performance of the query's execution.
-The filter expression is then used to do a distributed `grep` over the aggregated logs from the matching log streams.
 
-To avoid escaping special characters you can use the `` ` ``(back-tick) instead of `"` when quoting strings.
+**Optionally** the log stream selector can be followed by a **log pipeline**. A log pipeline is a set of stage expressions chained together and applied to the selected log streams. Each expressions can filter out, parse and mutate log lines and their respective labels.
+
+The following example shows a full log query in action:
+
+```logql
+{container="query-frontend",namespace="tempo-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500
+```
+
+The query is composed of:
+
+- a log stream selector `{container="query-frontend",namespace="loki-dev"}` which targets the `query-frontend` container  in the `loki-dev`namespace.
+- a log pipeline `|= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500` which will filter out log that doesn't contains the word `metrics.go`, then parses each log line to extract more labels and filter with them.
+
+> To avoid escaping special characters you can use the `` ` ``(back-tick) instead of `"` when quoting strings.
 For example `` `\w+` `` is the same as `"\\w+"`.
 This is specially useful when writing a regular expression which contains multiple backslashes that require escaping.
 
-## Log Stream Selector
+### Log Stream Selector
 
 The log stream selector determines which log streams should be included in your query results.
 The stream selector is comprised of one or more key-value pairs, where each key is a **log label** and each value is that **label's value**.
@@ -60,7 +74,26 @@ Examples:
 
 The same rules that apply for [Prometheus Label Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) apply for Loki log stream selectors.
 
-### Filter Expression
+### Log Pipeline
+
+A log pipeline can be appended to a log stream selector to further process and filter log streams. It usually is composed of one or multiple expressions, each expressions is executed in sequence for each log line. If an expression filters out a log line, the pipeline will stop at this point and start processing the next line.
+
+Some expressions can mutate the log content and respective labels (e.g `| line_format "{{.status_code}}"`), which will be then available for further filtering and processing following expressions or metric queries.
+
+A log pipeline can be composed of:
+
+- [Line Filter Expression](#Line-Filter-Expression).
+- [Parser Expression](#Parser-Expression)
+- [Label Filter Expression](#Label-Filter-Expression)
+- [Line Format Expression](#Line-Format-Expression)
+- [Labels Format Expression](#Labels-Format-Expression)
+- [Unwrap Expression](#Unwrap-Expression)
+
+The [unwrap Expression](#Unwrap-Expression) is a special expression that should only be used within metric queries.
+
+#### Line Filter Expression
+
+The line filter expression is used to do a distributed `grep` over the aggregated logs from the matching log streams.
 
 After writing the log stream selector, the resulting set of logs can be further filtered with a search expression.
 The search expression can be just text or regex:
@@ -87,25 +120,355 @@ Filter operators can be chained and will sequentially filter down the expression
 When using `|~` and `!~`, Go (as in [Golang](https://golang.org/)) [RE2 syntax](https://github.com/google/re2/wiki/Syntax) regex may be used.
 The matching is case-sensitive by default and can be switched to case-insensitive prefixing the regex with `(?i)`.
 
+While line filter expressions could be placed anywhere in a pipeline, it is almost always better to have them at the beginning. This ways it will improve the performance of the query doing further processing only when a line matches.
+
+For example, while the result will be the same, the following query `{job="mysql"} |= "error" | json | line_format "{{.err}}"` will always run faster than  `{job="mysql"} | json | line_format "{{.message}}"` |= "error"`. Line filter expressions are the fastest way to filter logs after log stream selectors.
+
+#### Parser Expression
+
+Parser expression can parse and extract labels from the log content. Those extracted labels can then be used for filtering using [label filter expressions](#Label-Filter-Expression) or for [metric aggregations](#Metric-Queries).
+
+Extracted label keys are automatically sanitized by all parsers, to follow Prometheus metric name convention.(They can only contain ASCII letters and digits, as well as underscores and colons. They cannot start with a digit.)
+
+For instance, the pipeline `| json` will produce the following mapping:
+```json
+{ "a.b": {c: "d"}, e: "f" }
+```
+->
+```
+{a_b_c="d", e="f"}
+```
+
+In case of errors, for instance if the line is not in the expected format, the log line won't be filtered but instead will get a new `__error__` label added.
+
+If an extracted label key name already exists in the original log stream, the extracted label key will be suffixed with the `_extracted` keyword to make the distinction between the two labels. You can forcefully override the original label using a [label formatter expression](#Labels-Format-Expression). However if an extracted key appears twice, only the latest label value will be kept.
+
+We support currently support json, logfmt and regexp parsers.
+
+The **json** parsers take no parameters and can be added using the expression `| json` in your pipeline. It will extract all json properties as labels if the log line is a valid json document. Nested properties are flattened into label keys using the `_` separator. **Arrays are skipped**.
+
+For example the json parsers will extract from the following document:
+
+```json
+{
+	"protocol": "HTTP/2.0",
+	"servers": ["129.0.1.1","10.2.1.3"],
+	"request": {
+		"time": "6.032",
+		"method": "GET",
+		"host": "foo.grafana.net",
+		"size": "55",
+	},
+	"response": {
+		"status": 401,
+		"size": "228",
+		"latency_seconds": "6.031"
+	}
+}
+```
+
+The following list of labels:
+
+```kv
+"protocol" => "HTTP/2.0"
+"request_time" => "6.032"
+"request_method" => "GET"
+"request_host" => "foo.grafana.net"
+"request_size" => "55"
+"response_status" => "401"
+"response_size" => "228"
+"response_size" => "228"
+```
+
+The **logfmt** parser can be added using the `| logfmt` and will extract all keys and values from the [logfmt](https://brandur.org/logfmt) formatted log line.
+
+For example the following log line:
+
+```logfmt
+at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" connect=4ms service=8ms status=200
+```
+
+will get those labels extracted:
+
+```kv
+"at" => "info"
+"method" => "GET"
+"path" => "/"
+"host" => "grafana.net"
+"fwd" => "124.133.124.161"
+"service" => "8ms"
+"status" => "200"
+```
+
+Unlike the logfmt and json, which extract implicitly all values and takes no parameters, the **regexp** parser takes a single parameter `| regexp "<re>"` which is the regular expression using the [Golang](https://golang.org/) [RE2 syntax](https://github.com/google/re2/wiki/Syntax).
+
+The regular expression must contain a least one named sub-match (e.g `(?P<name>re)`), each sub-match will extract a different label.
+
+For example the parser `| regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"` will extract from the following line:
+
+```log
+POST /api/prom/api/v1/query_range (200) 1.5s
+```
+
+those labels:
+
+```kv
+"method" => "POST"
+"path" => "/api/prom/api/v1/query_range"
+"status" => "200"
+"duration" => "1.5s"
+```
+
+It's easier to use the predefined parsers like `json` and `logfmt` when you can, falling back to `regexp` when the log lines have unusual structure. Multiple parsers can be used during the same log pipeline which is useful when you want to parse complex logs. ([see examples](#Multiple-parsers))
+
+#### Label Filter Expression
+
+Label filter expression allows filtering log line using their original and extracted labels. It can contains multiple predicates.
+
+A predicate contains a **label identifier**, an **operation** and a **value** to compare the label with.
+
+For example with `cluster="namespace"` the cluster is the label identifier, the operation is `=` and the value is "namespace". The label identifier is always on the right side of the operation.
+
+We support multiple **value** types which are automatically inferred from the query input.
+
+- **String** is double quoted or backticked such as `"200"` or \``us-central1`\`.
+- **[Duration](https://golang.org/pkg/time/#ParseDuration)** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
+- **Number** are floating-point number (64bits), such as`250`, `89.923`.
+- **Bytes** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "42MB", "1.5Kib" or "20b". Valid bytes units are "b", "kib", "kb", "mib", "mb", "gib",  "gb", "tib", "tb", "pib", "pb", "eib", "eb".
+
+String type work exactly like Prometheus label matchers use in [log stream selector](#Log-Stream-Selector). This means you can use the same operations (`=`,`!=`,`=~`,`!~`).
+
+> The string type is the only one that can filter out a log line with a label `__error__`.
+
+Using Duration, Number and Bytes will convert the label value prior to comparision and support the following comparators:
+
+- `==` or `=` for equality.
+- `!=` for inequality.
+- `>` and `>=` for greater than and greater than or equal.
+- `<` and `<=` for lesser than and lesser than or equal.
+
+For instance, `logfmt | duration > 1m and bytes_consumed > 20MB`
+
+If the conversion of the label value fails, the log line is not filtered and an `__error__` label is added. To filters those errors see the [pipeline errors](#Pipeline-Errors) section.
+
+You can chain multiple predicates using `and` and `or` which respectively express the `and` and `or` binary operations. `and` can be equivalently expressed by a comma, a space or another pipe. Label filters can be place anywhere in a log pipeline.
+
+This means that all the following expressions are equivalent:
+
+```logql
+| duration >= 20ms or size == 20kb and method!~"2.."
+| duration >= 20ms or size == 20kb | method!~"2.."
+| duration >= 20ms or size == 20kb , method!~"2.."
+| duration >= 20ms or size == 20kb  method!~"2.."
+
+```
+
+By default the precedence of multiple predicates is right to left. You can wrap predicates with parenthesis to force a different precedence left to right.
+
+For example the following are equivalent.
+
+```logql
+| duration >= 20ms or method="GET" and size <= 20KB
+| ((duration >= 20ms or method="GET") and size <= 20KB)
+```
+
+It will evaluate first `duration >= 20ms or method="GET"`. To evaluate first `method="GET" and size <= 20KB`, make sure to use proper parenthesis as shown below.
+
+```logql
+| duration >= 20ms or (method="GET" and size <= 20KB)
+```
+
+> Label filter expressions are the only expression allowed after the [unwrap expression](#Unwrap-Expression). This is mainly to allow filtering errors from the metric extraction (see [errors](#Pipeline-Errors)).
+
+#### Line Format Expression
+
+The line format expression can rewrite the log line content by using the [text/template](https://golang.org/pkg/text/template/) format.
+It takes a single string parameter `| line_format "{{.label_name}}"`, which is the template format. All labels are injected variables into the template and are available to use with the `{{.label_name}}` notation.
+
+For example the following expression:
+
+```logql
+{container="frontend"} | logfmt | line_format "{{.query}} {{.duration}}"
+```
+
+Will extract and rewrite the log line to only contains the query and the duration of a request.
+
+You can use double quoted string for the template or single backtick \``\{{.label_name}}`\` to avoid the need to escape special characters.
+
+See [functions](#Template-functions) to learn about available functions in the template format.
+
+#### Labels Format Expression
+
+The `| label_format` expression can renamed, modify or add labels. It takes as parameter a comma separated list of equality operations, enabling multiple operations at once.
+
+When both side are label identifiers, for example `dst=src`, the operation will rename the `src` label into `dst`.
+
+The left side can alternatively be a template string (double quoted or backtick), for example `dst="{{.status}} {{.query}}"`, in which case the `dst` label value will be replace by the result of the [text/template](https://golang.org/pkg/text/template/) evaluation. This is the same template engine as the `| line_format` expression, this mean labels are available as variables and you can use the same list of [functions](#Template-functions).
+
+In both case if the destination label doesn't exist a new one will be created.
+
+The renaming form `dst=src` will _drop_ the `src` label after remapping it to the `dst` label. However, the _template_ form will preserve the referenced labels, such that  `dst="{{.src}}"` results in both `dst` and `src` having the same value.
+
+> A single label name can only appear once per expression. This means `| label_format foo=bar,foo="new"` is not allowed but you can use two expressions for the desired effect: `| label_format foo=bar | label_format foo="new"`
+
+#### Template functions
+
+The text template format used in `| line_format` and `| label_format` support functions the following list of functions.
+
+##### ToLower & ToUpper
+
+Convert the entire string to lowercase or uppercase:
+
+Examples:
+
+```template
+"{{.request_method | ToLower}}"
+"{{.request_method | ToUpper}}"
+`{{ToUpper "This is a string" | ToLower}}`
+```
+
+##### Replace
+
+Perform simple string replacement.
+
+It takes three arguments:
+
+- string to replace
+- string to replace with
+- source string
+
+Example:
+
+```template
+`"This is a string" | Replace " " "-"`
+```
+
+The above will produce `This-is-a-string`
+
+##### Trim
+
+`Trim` returns a slice of the string s with all leading and
+trailing Unicode code points contained in cutset removed.
+
+`TrimLeft` and `TrimRight` are the same as `Trim` except that it respectively trim only leading and trailing characters.
+
+```template
+`{{ Trim .query ",. " }}`
+`{{ TrimLeft .uri ":" }}`
+`{{ TrimRight .path "/" }}`
+```
+
+`TrimSpace` TrimSpace returns string s with all leading
+and trailing white space removed, as defined by Unicode.
+
+```template
+{{ TrimSpace .latency }}
+```
+
+`TrimPrefix` and `TrimSuffix` will trim respectively the prefix or suffix supplied.
+
+```template
+{{ TrimPrefix .path "/" }}
+```
+
+##### Regex
+
+`regexReplaceAll` returns a copy of the input string, replacing matches of the Regexp with the replacement string replacement. Inside string replacement, $ signs are interpreted as in Expand, so for instance $1 represents the text of the first sub-match. See the golang [docs](https://golang.org/pkg/regexp/#Regexp.ReplaceAll) for detailed examples.
+
+```template
+`{{ regexReplaceAllLiteral "(a*)bc" .some_label "${1}a" }}`
+```
+
+`regexReplaceAllLiteral` returns a copy of the input string, replacing matches of the Regexp with the replacement string replacement The replacement string is substituted directly, without using Expand.
+
+```template
+`{{ regexReplaceAllLiteral "(ts=)" .timestamp "timestamp=" }}`
+```
+
+You can combine multiple function using pipe, for example if you want to strip out spaces and make the request method in capital you would write the following template `{{ .request_method | TrimSpace | ToUpper }}`.
+
+### Log Queries Examples
+
+#### Multiple filtering
+
+Filtering should be done first using label matchers, then line filters (when possible) and finally using label filters. The following query demonstrate this.
+
+```logql
+{cluster="ops-tools1", namespace="loki-dev", job="loki-dev/query-frontend"} |= "metrics.go" !="out of order" | logfmt | duration > 30s or status_code!="200"
+```
+
+#### Multiple parsers
+
+To extract the method and the path of the following logfmt log line:
+
+```log
+level=debug ts=2020-10-02T10:10:42.092268913Z caller=logging.go:66 traceID=a9d4d8a928d8db1 msg="POST /api/prom/api/v1/query_range (200) 1.5s"
+```
+
+You can use multiple parsers (logfmt and regexp) like this.
+
+```logql
+{job="cortex-ops/query-frontend"} | logfmt | line_format "{{.msg}}" | regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"`
+```
+
+This is possible because the `| line_format` reformats the log line to become `POST /api/prom/api/v1/query_range (200) 1.5s` which can then be parsed with the `| regexp ...` parser.
+
+#### Formatting
+
+The following query shows how you can reformat a log line to make it easier to read on screen.
+
+```logql
+{cluster="ops-tools1", name="querier", namespace="loki-dev"}
+  |= "metrics.go" != "loki-canary"
+  | logfmt
+  | query != ""
+  | label_format query="{{ Replace .query \"\\n\" \"\" -1 }}"
+  | line_format "{{ .ts}}\t{{.duration}}\ttraceID = {{.traceID}}\t{{ printf \"%-100.100s\" .query }} "
+```
+
+Label formatting is used to sanitize the query while the line format reduce the amount of information and creates a tabular output.
+
+For those given log line:
+
+```log
+level=info ts=2020-10-23T20:32:18.094668233Z caller=metrics.go:81 org_id=29 traceID=1980d41501b57b68 latency=fast query="{cluster=\"ops-tools1\", job=\"cortex-ops/query-frontend\"} |= \"query_range\"" query_type=filter range_type=range length=15m0s step=7s duration=650.22401ms status=200 throughput_mb=1.529717 total_bytes_mb=0.994659
+level=info ts=2020-10-23T20:32:18.068866235Z caller=metrics.go:81 org_id=29 traceID=1980d41501b57b68 latency=fast query="{cluster=\"ops-tools1\", job=\"cortex-ops/query-frontend\"} |= \"query_range\"" query_type=filter range_type=range length=15m0s step=7s duration=624.008132ms status=200 throughput_mb=0.693449 total_bytes_mb=0.432718
+```
+
+The result would be:
+
+```log
+2020-10-23T20:32:18.094668233Z	650.22401ms	    traceID = 1980d41501b57b68	{cluster="ops-tools1", job="cortex-ops/query-frontend"} |= "query_range"
+2020-10-23T20:32:18.068866235Z	624.008132ms	traceID = 1980d41501b57b68	{cluster="ops-tools1", job="cortex-ops/query-frontend"} |= "query_range"
+```
+
 ## Metric Queries
 
-LogQL also supports wrapping a log query with functions that allows for counting entries per stream.
+LogQL also supports wrapping a log query with functions that allow for creating metrics out of the logs.
 
 Metric queries can be used to calculate things such as the rate of error messages, or the top N log sources with the most amount of logs over the last 3 hours.
 
+Combined with log [parsers](#Parser-Expression), metrics queries can also be used to calculate metrics from a sample value within the log line such latency or request size.
+Furthermore all labels, including extracted ones, will be available for aggregations and generation of new series.
+
 ### Range Vector aggregation
 
-LogQL shares the same [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept from Prometheus, except the selected range of samples include a value of 1 for each log entry.
-An aggregation can be applied over the selected range to transform it into an instance vector.
+LogQL shares the same [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept from Prometheus, except the selected range of samples is a range of selected log or label values.
+
+Loki supports two types of range aggregations. Log range and unwrapped range aggregations.
+
+#### Log Range Aggregations
 
-The currently supported functions for operating over are:
+A log range is a log query (with or without a log pipeline) followed by the range notation e.g [1m]. It should be noted that the range notation `[5m]` can be placed at end of the log pipeline or right after the log stream matcher.
 
-- `rate`: calculates the number of entries per second
-- `count_over_time`: counts the entries for each log stream within the given range.
-- `bytes_rate`: calculates the number of bytes per second for each stream.
-- `bytes_over_time`: counts the amount of bytes used by each log stream for a given range.
+The first type uses log entries to compute values and supported functions for operating over are:
 
-#### Examples
+- `rate(log-range)`: calculates the number of entries per second
+- `count_over_time(log-range)`: counts the entries for each log stream within the given range.
+- `bytes_rate(log-range)`: calculates the number of bytes per second for each stream.
+- `bytes_over_time(log-range)`: counts the amount of bytes used by each log stream for a given range.
+
+##### Log  Examples
 
 ```logql
 count_over_time({job="mysql"}[5m])
@@ -114,21 +477,67 @@ count_over_time({job="mysql"}[5m])
 This example counts all the log lines within the last five minutes for the MySQL job.
 
 ```logql
-rate({job="mysql"} |= "error" != "timeout" [10s] )
+sum by (host) (rate({job="mysql"} |= "error" != "timeout" | json | duration > 10s [1m]))
 ```
 
-This example demonstrates that a fully LogQL query can be wrapped in the aggregation syntax, including filter expressions.
-This example gets the per-second rate of all non-timeout errors within the last ten seconds for the MySQL job.
+This example demonstrates a LogQL aggregation which includes filters and parsers.
+It returns the per-second rate of all non-timeout errors within the last minutes per host for the MySQL job and only includes errors whose duration is above ten seconds.
+
+#### Unwrapped Range Aggregations
+
+Unwrapped ranges uses extracted labels as sample values instead of log lines. However to select which label will be use within the aggregation, the log query must end with an unwrap expression and optionally a label filter expression to discard [errors](#Pipeline-Errors).
 
-It should be noted that the range notation `[5m]` can be placed at end of the log stream filter or right after the log stream matcher.
-For example, the two syntaxes below are equivalent:
+The unwrap expression is noted `| unwrap label_identifier` where the label identifier is the label name to use for extracting sample values.
+
+Since label values are string, by default a conversion into a float (64bits) will be attempted, in case of failure the `__error__` label is added to the sample.
+Optionally the label identifier can be wrapped by a conversion function `| unwrap <function>(label_identifier)`, which will attempt to convert the label value from a specific format.
+
+We currently support only the function `duration_seconds` (or its short equivalent `duration`) which will convert the label value in seconds from the [go duration format](https://golang.org/pkg/time/#ParseDuration) (e.g `5m`, `24s30ms`).
+
+Supported function for operating over unwrapped ranges are:
+
+- `sum_over_time(unwrapped-range)`: the sum of all values in the specified interval.
+- `avg_over_time(unwrapped-range)`: the average value of all points in the specified interval.
+- `max_over_time(unwrapped-range)`: the maximum value of all points in the specified interval.
+- `min_over_time(unwrapped-range)`: the minimum value of all points in the specified interval
+- `stdvar_over_time(unwrapped-range)`: the population standard variance of the values in the specified interval.
+- `stddev_over_time(unwrapped-range)`: the population standard deviation of the values in the specified interval.
+- `quantile_over_time(scalar,unwrapped-range)`: the φ-quantile (0 ≤ φ ≤ 1) of the values in the specified interval.
+
+Except for `sum_over_time`, `min_over_time` and `max_over_time` unwrapped range aggregations support grouping.
 
 ```logql
-rate({job="mysql"} |= "error" != "timeout" [5m])
+<aggr-op>([parameter,] <unwrapped-range>) [without|by (<label list>)]
+```
 
-rate({job="mysql"}[5m] |= "error" != "timeout")
+Which can be used to aggregate over distinct labels dimensions by including a `without` or `by` clause.
+
+`without` removes the listed labels from the result vector, while all other labels are preserved the output. `by` does the opposite and drops labels that are not listed in the `by` clause, even if their label values are identical between all elements of the vector.
+
+#### Unwrapped Examples
+
+```logql
+quantile_over_time(0.99,
+  {cluster="ops-tools1",container="ingress-nginx"}
+    | json
+    | __error__ = ""
+    | unwrap request_time [1m])) by (path)
+```
+
+This example calculates the p99 of the nginx-ingress latency by path.
+
+```logql
+sum by (org_id) (
+  sum_over_time(
+  {cluster="ops-tools1",container="loki-dev"}
+      |= "metrics.go"
+      | logfmt
+      | unwrap bytes_processed [1m])
+  )
 ```
 
+This calculates the amount of bytes processed per organization id.
+
 ### Aggregation operators
 
 Like [PromQL](https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators), LogQL supports a subset of built-in aggregation operators that can be used to aggregate the element of a single vector, resulting in a new vector of fewer elements but with aggregated values:
@@ -156,7 +565,7 @@ The aggregation operators can either be used to aggregate over all label values
 The `without` clause removes the listed labels from the resulting vector, keeping all others.
 The `by` clause does the opposite, dropping labels that are not listed in the clause, even if their label values are identical between all elements of the vector.
 
-#### Examples
+#### Vector Aggregations Examples
 
 Get the top 10 applications by the highest log throughput:
 
@@ -171,10 +580,10 @@ by level:
 sum(count_over_time({job="mysql"}[5m])) by (level)
 ```
 
-Get the rate of HTTP GET requests from NGINX logs:
+Get the rate of HTTP GET of /home requests from NGINX logs by region:
 
 ```logql
-avg(rate(({job="nginx"} |= "GET")[10s])) by (region)
+avg(rate(({job="nginx"} |= "GET" | json | path="/home")[10s])) by (region)
 ```
 
 ### Binary Operators
@@ -202,7 +611,7 @@ The result is propagated into the result vector with the grouping labels becomin
 
 Pay special attention to [operator order](#operator-order) when chaining arithmetic operators.
 
-##### Examples
+##### Arithmetic Examples
 
 Implement a health check with a simple query:
 
@@ -238,7 +647,7 @@ Other elements are dropped.
 `vector1 unless vector2` results in a vector consisting of the elements of vector1 for which there are no elements in vector2 with exactly matching label sets.
 All matching elements in both vectors are dropped.
 
-##### Examples
+##### Binary Operators Examples
 
 This contrived query will return the intersection of these queries, effectively `rate({app="bar"})`:
 
@@ -305,3 +714,38 @@ More details can be found in the [Golang language documentation](https://golang.
 `1 + 2 / 3` is equal to `1 + ( 2 / 3 )`.
 
 `2 * 3 % 2` is evaluated as `(2 * 3) % 2`.
+
+### Pipeline Errors
+
+There are multiple reasons which cause pipeline processing errors, such as:
+
+- A numeric label filter may fail to turn a label value into a number
+- A metric conversion for a label may fail.
+- A log line is not a valid json document.
+- etc...
+
+When those failures happen, Loki won't filter out those log lines. Instead they are passed into the next stage of the pipeline with a new system label named `__error__`. The only way to filter out errors is by using a label filter expressions. The `__error__` label can't be renamed via the language.
+
+For example to remove json errors:
+
+```logql
+  {cluster="ops-tools1",container="ingress-nginx"}
+    | json
+    | __error__ != "JSONParserErr"
+```
+
+Alternatively you can remove all error using a catch all matcher such as `__error__ = ""` or even show only errors using `__error__ != ""`.
+
+The filter should be placed after the stage that generated this error. This means if you need to remove errors from an unwrap expression it needs to be placed after the unwrap.
+
+```logql
+quantile_over_time(
+	0.99,
+	{container="ingress-nginx",service="hosted-grafana"}
+	| json
+	| unwrap response_latency_seconds
+	| __error__=""[1m]
+	) by (cluster)
+```
+
+>Metric queries cannot contains errors, in case errors are found during execution, Loki will return an error and appropriate status code.