Over 130 PRs merged for this release, from 40 different contributors!! We continue to be humbled and thankful for the growing community of contributors and users of Loki. Thank you all so much.
### Important Notes
**Really, this is important**
Before we get into new features, version 1.4.0 brings with it the first (that we are aware of) upgrade dependency.
We have created a dedicated page for upgrading Loki in the [operations section of the docs](https://github.com/grafana/loki/tree/v1.4.0/docs/operations/upgrade.md).
The Docker image tag naming has changed: starting in 1.4.0, Docker images no longer have the `v` prefix, e.g. `grafana/loki:1.4.0`.
Also be aware that we are now pruning old `master-xxxxx` Docker images from Docker Hub; currently anything older than 90 days is removed. **We will never remove released versions of Loki.**
The API now returns a plethora of stats about the work Loki performed to execute your query. Eventually this will be displayed in some form in Grafana to help users better understand how "expensive" their queries are. Our initial goal was to better instrument the query parallelization work done in v1.3.0 and to better understand the performance of each part of Loki. In the future we are looking at additional ways to provide feedback to users so they can tailor their queries for better performance.
* [1649](https://github.com/grafana/loki/pull/1649) **cyriltovena**: Pipe data to Promtail
This is a long overdue addition to Promtail which can help set up and debug pipelines. With these new features you can pipe a single log line into Promtail and, combined with the `--dry-run` flag, see exactly how the pipeline processes it.
`-log.level=debug 2>&1 | sed 's/^.*stage/stage/g'` is appended to the command to enable debug output, direct it to stdout, and filter some noise from the log lines with sed.
The `stdin` functionality also works without `--dry-run`, allowing you to feed any logs into Promtail via `stdin` and send them to Loki.
* [1677](https://github.com/grafana/loki/pull/1677) **owen-d**: Literal Expressions in LogQL
* [1662](https://github.com/grafana/loki/pull/1662) **owen-d**: Binary operators in LogQL
These two extensions to LogQL now let you execute queries like this:
* `sum(rate({app="foo"}[5m])) * 2`
* `sum(rate({app="foo"}[5m]))/1e6`
* [1678](https://github.com/grafana/loki/pull/1678) **slim-bean**: promtail: metrics pipeline count all log lines
Now you can get per-stream line counts as a metric from Promtail, which is useful for seeing which applications log the most:
```yaml
- metrics:
    line_count_total:
      config:
        action: inc
        match_all: true
      description: A running counter of all lines with their corresponding labels
      type: Counter
```
* [1572](https://github.com/grafana/loki/pull/1572) **owen-d**: Feature/query ingesters within
These two configs, `max_chunk_age` and `query_ingesters_within`, work together: `max_chunk_age` sets the maximum time a chunk can stay in memory in Loki, which is useful to keep memory usage down as well as to limit potential loss of data if ingesters crash. Combine this with the `query_ingesters_within` config and you can have your queriers skip asking the ingesters for data which you know won't still be in memory (older than `max_chunk_age`).
**NOTE** Do not set `max_chunk_age` too small; the default of 1h is probably a good value for most people. Loki does not perform well when you flush many small chunks (such as when your logs have too much cardinality), so setting this lower than 1h risks flushing too many small chunks.
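For reference, here's a minimal sketch of where these two settings live in the Loki config file (key names per the Loki configuration docs; the values shown are illustrative, not recommendations):
```yaml
ingester:
  # Maximum time a chunk can sit in memory before it is flushed,
  # even if it is not yet full.
  max_chunk_age: 1h

querier:
  # Only ask ingesters for data within this window; anything older
  # is assumed to have already been flushed to the store.
  query_ingesters_within: 2h
```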
* [1581](https://github.com/grafana/loki/pull/1581) **slim-bean**: Add sleep to canary reconnect on error
This isn't a feature, but it's an important fix: this is the second time our canaries have tried to DDOS our Loki clusters, so you should update to prevent them from trying to attack you. Aggressive little things these canaries...
* [1845](https://github.com/grafana/loki/pull/1845) **wardbekker**: throw exceptions on HTTPTooManyRequests and HTTPServerError so FluentD will retry
These PRs change how 429 HTTP response codes (rate limiting) are handled. Previously these responses were dropped; now they will be retried by the following clients:
* Promtail
* Docker logging driver
* Fluent Bit
* Fluentd
This pushes the failure to send logs to two places. The first is the retry limits. The defaults in Promtail (and thus also the Docker logging driver and Fluent Bit, which share the same underlying code) will retry 429s (and 500s) on an exponential backoff for up to about 8.5 minutes with the default configuration. (This can be changed; see the [config docs](https://github.com/grafana/loki/blob/v1.4.0/docs/clients/promtail/configuration.md#client_config) for more info.)
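As a rough sketch, the retry behaviour is controlled by the `backoff_config` block of the Promtail client config. The values below are the defaults as we understand them (they work out to roughly that 8.5 minute window), and the URL is just a placeholder; check the config docs linked above for the authoritative values:
```yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
    backoff_config:
      # Initial backoff period between retries
      min_period: 500ms
      # Maximum backoff period between retries
      max_period: 5m
      # Give up (and drop the batch) after this many retries
      max_retries: 10
```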
The second place would be the log file itself. At some point, most log files roll based on size or time. Promtail makes an attempt to read a rolled log file but will only try once. If you are very sensitive to lost logs, give yourself really big log files with size-based rolling rules and increase those retry timeouts. This should protect you from Loki server outages or network issues.
### All Changes
There are many other important fixes and improvements to Loki, way too many to call out in individual detail, so take a look!
* [1731](https://github.com/grafana/loki/pull/1731) **billimek**: [promtail helm chart] - Expand promtail syslog svc to support values
* [1688](https://github.com/grafana/loki/pull/1688) **fredgate**: Loki stack helm chart can deploy datasources without Grafana
* [1632](https://github.com/grafana/loki/pull/1632) **lukipro**: Added support for imagePullSecrets in Loki Helm chart
* [1620](https://github.com/grafana/loki/pull/1620) **rsteneteg**: [promtail helm chart] option to set fs.inotify.max_user_instances with init container
* [1617](https://github.com/grafana/loki/pull/1617) **billimek**: [promtail helm chart] Enable support for syslog service
* [1590](https://github.com/grafana/loki/pull/1590) **polar3130**: Helm/loki-stack: refresh default grafana.image.tag to 6.6.0
* [1587](https://github.com/grafana/loki/pull/1587) **polar3130**: Helm/loki-stack: add template for the service name to connect to loki
# Upgrading Loki
Every attempt is made to keep Loki backwards compatible, such that upgrades should be low risk and low friction.
Unfortunately Loki is software and software is hard and sometimes things are not as easy as we want them to be.
On this page we will document any upgrade issues/gotchas/considerations we are aware of.
## 1.4.0
Loki 1.4.0 vendors Cortex v0.7.0-rc.0, which contains [several breaking config changes](https://github.com/cortexproject/cortex/blob/v0.7.0-rc.0/CHANGELOG.md). However, you would only be affected if you were configuring your schema via command-line arguments and not a config file; this is not something we had ever provided as an option in the docs and it is unlikely anyone is doing it, but it is worth mentioning. None of the other config changes are relevant to Loki.
### Required Upgrade Path
The newly vendored version of Cortex removes code related to de-normalized tokens in the ring. What you need to know is this:
*Note:* A "shared ring" as mentioned below refers to using *consul* or *etcd* for values in the following config:
```yaml
kvstore:
  # The backend storage to use for the ring. Supported values are
  # consul, etcd, inmemory
  store: <string>
```
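For orientation, here's a sketch of where that block typically sits in a Loki config file, assuming the consul backend (the host and nesting are shown for illustration only; check your own config for the exact layout):
```yaml
ingester:
  lifecycler:
    ring:
      kvstore:
        # Using consul makes this a "shared ring"
        store: consul
        consul:
          host: localhost:8500
```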
* Running without using a shared ring (inmemory): No action required
* Running with a shared ring and upgrading from v1.3.0 -> v1.4.0: No action required
* Running with a shared ring and upgrading from any version less than v1.3.0 (e.g. v1.2.0) -> v1.4.0: **ACTION REQUIRED**
There are two options for upgrade if you are not on version 1.3.0 and are using a shared ring:
* Upgrade first to v1.3.0 **BEFORE** upgrading to v1.4.0
OR
**Note:** If you are running a single binary you only need to add this flag to your single binary command.
1. Add the following configuration to your ingesters command: `-ingester.normalise-tokens=true`
1. Restart your ingesters with this config
1. Proceed with upgrading to v1.4.0
1. Remove the config option (only do this after everything is running v1.4.0)
**Note:** It's also possible to enable this flag via config file, see the [docs](https://github.com/grafana/loki/tree/v1.3.0/docs/configuration#lifecycler_config)
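As a heavily hedged sketch, assuming the lifecycler config keys documented for that version, the config-file equivalent would look something like this:
```yaml
ingester:
  lifecycler:
    # Config-file equivalent of the -ingester.normalise-tokens=true flag
    normalise_tokens: true
```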
If using the Helm Loki chart:
```yaml
extraArgs:
  ingester.normalise-tokens: true
```
If using the Helm Loki-Stack chart:
```yaml
loki:
  extraArgs:
    ingester.normalise-tokens: true
```
#### What will go wrong
If you attempt to add a v1.4.0 ingester to a ring created by Loki v1.2.0 or older which does not have the commandline argument `-ingester.normalise-tokens=true` (or configured via [config file](https://github.com/grafana/loki/tree/v1.3.0/docs/configuration#lifecycler_config)), the v1.4.0 ingester will remove all the entries in the ring for all the other ingesters as it cannot "see" them.
This will result in distributors failing to write and a general ingestion failure for the system.
If this happens to you, you will want to roll back your deployment immediately. You need to remove the v1.4.0 ingester from the ring ASAP; this should allow the existing ingesters to re-insert their tokens. You will also want to remove any v1.4.0 distributors, as they will not understand the old ring either and will fail to send traffic.