---
title: Upgrading
weight: 250
---
# Upgrading Grafana Loki

Every attempt is made to keep Grafana Loki backwards compatible, such that upgrades should be low risk and low friction.

Unfortunately Loki is software, and software is hard, and sometimes we are forced to make decisions between ease of use and ease of maintenance.

If we have any expectation of difficulty upgrading, we will document it here.

As more versions are released, it becomes more likely that unexpected problems will arise when moving between multiple versions at once. If possible, try to stay current and do sequential updates. If you want to skip versions, try it in a development environment before attempting to upgrade production.

# Checking for config changes

Using Docker you can check the changes between two versions of Loki with a command like this:

```
export OLD_LOKI=2.3.0
export NEW_LOKI=2.4.1
export CONFIG_FILE=loki-local-config.yaml
diff --color=always --side-by-side \
  <(docker run --rm -t -v "${PWD}":/config grafana/loki:${OLD_LOKI} -config.file=/config/${CONFIG_FILE} -print-config-stderr 2>&1 | sed '/Starting Loki/q' | tr -d '\r') \
  <(docker run --rm -t -v "${PWD}":/config grafana/loki:${NEW_LOKI} -config.file=/config/${CONFIG_FILE} -print-config-stderr 2>&1 | sed '/Starting Loki/q' | tr -d '\r') \
  | less -R
```

The `tr -d '\r'` is likely not necessary for most people; it seems WSL2 was sneaking in some Windows newline characters.

The output is incredibly verbose, as it shows the entire internal config struct used to run Loki; you can play around with the diff command if you prefer to only show changes or use a different output style.

## Main / Unreleased

### Loki

#### Engine query timeout is deprecated

Previously, we had two configurations to define a query timeout: `engine.timeout` and `querier.query-timeout`. As they were conflicting and `engine.timeout` isn't as expressive as `querier.query-timeout`, we're deprecating it in favor of relying on `querier.query-timeout` only.
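As a sketch of what the migration looks like in YAML (the nesting below assumes the standard reference configuration layout, and the `3m` value is just an example):

```yaml
# Before (deprecated): the engine-level timeout
# querier:
#   engine:
#     timeout: 3m

# After: rely on the querier-level query timeout only
querier:
  query_timeout: 3m
```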
#### Fifocache is deprecated

We introduced a new cache called `embedded-cache`, an in-process cache system that makes it possible to run Loki without the need for an external cache (like Memcached, Redis, etc). It can be run in two modes: `distributed: false` (the default, and the same as the old `fifocache`) and `distributed: true`, which runs the cache in a distributed fashion, sharding keys across peers if Loki is run in microservices or SSD mode.

Currently `embedded-cache` with `distributed: true` can be enabled only for the results cache.

#### Evenly spread distributors across kubernetes nodes

We now evenly spread distributors across the available kubernetes nodes, while allowing more than one distributor to be scheduled onto the same node. If you want to run at most a single distributor per node, set `$._config.distributors.use_topology_spread` to false.

While we attempt to schedule at most 1 distributor per Kubernetes node with the `topology_spread_max_skew: 1` field, if no more nodes are available then multiple distributors will be scheduled on the same node. This can potentially impact your service's reliability, so consider tuning these values according to your risk tolerance.

#### Evenly spread queriers across kubernetes nodes

We now evenly spread queriers across the available kubernetes nodes, while allowing more than one querier to be scheduled onto the same node. If you want to run at most a single querier per node, set `$._config.querier.use_topology_spread` to false.

While we attempt to schedule at most 1 querier per Kubernetes node with the `topology_spread_max_skew: 1` field, if no more nodes are available then multiple queriers will be scheduled on the same node. This can potentially impact your service's reliability, so consider tuning these values according to your risk tolerance.

#### Default value for `server.http-listen-port` changed

This value now defaults to 3100, so the Loki process doesn't require special privileges.
Previously, it had been set to port 80, which is a privileged port. If you need Loki to listen on port 80, you can set it back to the previous default using `-server.http-listen-port=80`.

#### docker-compose setup has been updated

The docker-compose [setup](https://github.com/grafana/loki/blob/main/production/docker) has been updated to **v2.6.0** and includes many improvements.

Notable changes include:

- authentication (multi-tenancy) is **enabled** by default; you can disable it in `production/docker/config/loki.yaml` by setting `auth_enabled: false`
- storage now uses MinIO instead of the local filesystem; move your current storage into `.data/minio` and it should work transparently
- a log-generator was added; if you don't need it, simply remove the service from `docker-compose.yaml` or don't start the service

#### Configuration for deletes has changed

The global `deletion_mode` option in the compactor configuration moved to runtime configurations.

- The `deletion_mode` option needs to be removed from your compactor configuration.
- The `deletion_mode` global override needs to be set to the desired mode: `disabled`, `filter-only`, or `filter-and-delete`. By default, `filter-and-delete` is enabled.
- Any `allow_delete` per-tenant overrides need to be removed or changed to `deletion_mode` overrides with the desired mode.

## 2.6.0

### Loki

#### Implementation of unwrapped `rate` aggregation changed

The implementation of the `rate()` aggregation function changed back to the previous implementation prior to [#5013](https://github.com/grafana/loki/pulls/5013). This means that the rate per second is calculated based on the sum of the extracted values, instead of the average increase over time.

If you want the extracted values to be treated as a [Counter](https://prometheus.io/docs/concepts/metric_types/#counter) metric, you should use the new `rate_counter()` aggregation function, which calculates the per-second average rate of increase of the vector.
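To illustrate the difference, here is a sketch of the two aggregations side by side (the label matcher and unwrapped fields are hypothetical):

```logql
# rate(): per-second rate computed from the sum of the extracted values
sum(rate({job="myapp"} | unwrap latency_ms [1m]))

# rate_counter(): per-second average rate of increase, treating the
# extracted values as a Counter metric
sum(rate_counter({job="myapp"} | unwrap requests_total [1m]))
```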
#### Default value for `azure.container-name` changed

This value now defaults to `loki`; it was previously set to `cortex`. If you are relying on this container name for your chunks or ruler storage, you will have to manually specify `-azure.container-name=cortex` or `-ruler.storage.azure.container-name=cortex`, respectively.

## 2.5.0

### Loki

#### `split_queries_by_interval` yaml configuration has moved

It was previously possible to define this value in two places:

```yaml
query_range:
  split_queries_by_interval: 10m
```

and/or

```yaml
limits_config:
  split_queries_by_interval: 10m
```

In 2.5.0 it can only be defined in the `limits_config` section. **Loki will fail to start if you do not remove the `split_queries_by_interval` config from the `query_range` section.**

Additionally, it has a new default value of `30m` rather than `0`.

The CLI flag is not changed and remains `querier.split-queries-by-interval`.

#### Dropped support for old Prometheus rules configuration format

Alerting rules could previously be specified in two formats: the 1.x format (the legacy one, named `v0` internally) and the 2.x format. We decided to drop support for the `1.x` format as it is fairly old and keeping support for it required a lot of code.

In case you're still using the legacy format, take a look at [Alerting Rules](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) for instructions on how to write alerting rules in the new format.

For reference, the newer format follows a structure similar to the one below:

```yaml
groups:
- name: example
  rules:
  - alert: HighErrorRate
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request latency
```

Meanwhile, the legacy format is a string in the following format:

```
ALERT <alert name>
  IF <expression>
  [ FOR <duration> ]
  [ LABELS