* [BUGFIX] TSDB: Fix panic on failed snapshot replay. #9438
* [BUGFIX] TSDB: Don't fail snapshot replay with exemplar storage disabled when the snapshot contains exemplars. #9438
## 2.30.2 / 2021-10-01
* [BUGFIX] TSDB: Don't error on overlapping m-mapped chunks during WAL replay. #9381
## 2.30.1 / 2021-09-28
* [ENHANCEMENT] Remote Write: Redact remote write URL when used for metric label. #9383
* [ENHANCEMENT] UI: Redact remote write URL and proxy URL passwords in the `/config` page. #9408
* [BUGFIX] promtool rules backfill: Prevent creation of data before the start time. #9339
* [BUGFIX] promtool rules backfill: Do not query after the end time. #9340
* [BUGFIX] Azure SD: Fix panic when no computername is set. #9387
## 2.30.0 / 2021-09-14
* [FEATURE] **experimental** TSDB: Snapshot in-memory chunks on shutdown for faster restarts. Behind `--enable-feature=memory-snapshot-on-shutdown` flag. #7229
* [FEATURE] **experimental** Scrape: Configure scrape interval and scrape timeout via relabeling using `__scrape_interval__` and `__scrape_timeout__` labels respectively. #8911
* [FEATURE] Scrape: Add `scrape_timeout_seconds` and `scrape_sample_limit` metrics. Behind `--enable-feature=extra-scrape-metrics` flag to avoid additional cardinality by default. #9247 #9295
* [ENHANCEMENT] Scrape: Add `--scrape.timestamp-tolerance` flag to adjust scrape timestamp tolerance when enabled via `--scrape.adjust-timestamps`. #9283
* [ENHANCEMENT] Remote Write: Improve throughput when sending exemplars. #8921
* [ENHANCEMENT] TSDB: Optimise WAL loading by removing extra map and caching min-time. #9160
* [ENHANCEMENT] promtool: Speed up checking for duplicate rules. #9262/#9306
* [ENHANCEMENT] Scrape: Reduce allocations when parsing the metrics. #9299
* [ENHANCEMENT] docker_sd: Support host network mode. #9125
* [BUGFIX] Exemplars: Fix panic when resizing exemplar storage from 0 to a non-zero size. #9286
* [BUGFIX] TSDB: Correctly decrement `prometheus_tsdb_head_active_appenders` when the append has no samples. #9230
* [BUGFIX] promtool rules backfill: Return 1 if backfill was unsuccessful. #9303
@@ -52,7 +52,7 @@ All our issues are regularly tagged so that you can also filter down the issues
* Commits should be as small as possible, while ensuring that each commit is correct independently (i.e., each commit should compile and pass tests).
* If your patch is not getting reviewed or you need a specific person to review it, you can @-reply a reviewer asking for a review in the pull request or a comment, or you can ask for a review on the IRC channel [#prometheus-dev](https://web.libera.chat/?channels=#prometheus-dev) on irc.libera.chat (for the easiest start, [join via Element](https://app.element.io/#/room/#prometheus-dev:matrix.org)).
* Add tests relevant to the fixed bug or new feature.
@@ -64,10 +64,10 @@ To add or update a new dependency, use the `go get` command:
| v2.31 | 2021-10-20 | **searching for volunteer** |
| v2.32 | 2021-12-01 | **searching for volunteer** |
| v2.33 | 2022-01-12 | **searching for volunteer** |
If you are interested in volunteering, please create a pull request against the [prometheus/prometheus](https://github.com/prometheus/prometheus) repository and propose yourself for the release series of your choice.
@@ -97,18 +99,21 @@ Either upgrade the dependencies within their existing version constraints as spe
```
cd web/ui/react-app
npm update
git add package.json package-lock.json
```
Or alternatively, update all dependencies to their latest major versions. This is potentially more disruptive and will require more follow-up fixes, but should be done from time to time (use your best judgement):
```
cd web/ui/react-app
npx npm-check-updates -u
npm install
git add package.json package-lock.json
```
You can find more details on managing npm dependencies and updates [in this blog post](https://www.carlrippon.com/upgrading-npm-dependencies/).
### 1. Prepare your release
At the start of a new major or minor release cycle, create the corresponding release branch based on the main branch. For example, if we're releasing `2.17.0` and the previous stable release is `2.16.0`, we need to create a `release-2.17` branch. Note that all releases are handled in protected release branches; see the `Branch management and versioning` section above. Release candidates and patch releases for any given major or minor release happen in the same `release-<major>.<minor>` branch. Do not create `release-<version>` for patch or release candidate releases.
a.Flag("storage.tsdb.retention.time","How long to retain samples in storage. When this flag is set it overrides \"storage.tsdb.retention\". If neither this flag nor \"storage.tsdb.retention\" nor \"storage.tsdb.retention.size\" is set, the retention time defaults to "+defaultRetentionString+". Units Supported: y, w, d, h, m, s, ms.").
SetValue(&newFlagRetentionDuration)
a.Flag("storage.tsdb.retention.size","Maximum number of bytes that can be stored for blocks. A unit is required, supported units: B, KB, MB, GB, TB, PB, EB. Ex: \"512MB\". This flag is experimental and can be changed in future releases.").
a.Flag("storage.tsdb.retention.size","Maximum number of bytes that can be stored for blocks. A unit is required, supported units: B, KB, MB, GB, TB, PB, EB. Ex: \"512MB\".").
BytesVar(&cfg.tsdb.MaxBytes)
a.Flag("storage.tsdb.no-lockfile","Do not create lockfile in data directory.").
@@ -292,9 +296,12 @@ func main() {
a.Flag("rules.alert.resend-delay","Minimum amount of time to wait before resending an alert to Alertmanager.").
Default("1m").SetValue(&cfg.resendDelay)
a.Flag("scrape.adjust-timestamps","Adjust scrape timestamps by up to 2ms to align them to the intended schedule. See https://github.com/prometheus/prometheus/issues/7846 for more context. Experimental. This flag will be removed in a future release.").
a.Flag("scrape.adjust-timestamps","Adjust scrape timestamps by up to `scrape.timestamp-tolerance` to align them to the intended schedule. See https://github.com/prometheus/prometheus/issues/7846 for more context. Experimental. This flag will be removed in a future release.").
a.Flag("scrape.timestamp-tolerance","Timestamp tolerance. See https://github.com/prometheus/prometheus/issues/7846 for more context. Experimental. This flag will be removed in a future release.").
a.Flag("query.max-samples","Maximum number of samples a single query can load into memory. Note that queries will fail if they try to load more samples than this into memory, so this also limits the number of samples a query can return.").
Default("50000000").IntVar(&cfg.queryMaxSamples)
a.Flag("enable-feature","Comma separated feature names to enable. Valid options: exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-at-modifier, promql-negative-offset, remote-write-receiver. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
a.Flag("enable-feature","Comma separated feature names to enable. Valid options: exemplar-storage, expand-external-labels, memory-snapshot-on-shutdown, promql-at-modifier, promql-negative-offset, remote-write-receiver, extra-scrape-metrics. See https://prometheus.io/docs/prometheus/latest/feature_flags/ for more details.").
{"run importer with dup name label",1,8,4,4,[]*model.SampleStream{{Metric:model.Metric{"__name__":"val1","name1":"val1"},Values:[]model.SamplePair{{Timestamp:testTime,Value:testValue}}}}},
# List of PuppetDB service discovery configurations.
puppetdb_sd_configs:
[ - <puppetdb_sd_config> ... ]
# List of Scaleway service discovery configurations.
scaleway_sd_configs:
[ - <scaleway_sd_config> ... ]
@@ -901,7 +905,7 @@ The following meta labels are available on targets during [relabeling](#relabel_
* `__meta_ec2_ami`: the EC2 Amazon Machine Image
* `__meta_ec2_architecture`: the architecture of the instance
* `__meta_ec2_availability_zone`: the availability zone in which the instance is running
* `__meta_ec2_availability_zone_id`: the [availability zone ID](https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html) in which the instance is running (requires `ec2:DescribeAvailabilityZones`)
* `__meta_ec2_instance_id`: the EC2 instance ID
* `__meta_ec2_instance_lifecycle`: the lifecycle of the EC2 instance, set only for 'spot' or 'scheduled' instances, absent otherwise
* `__meta_ec2_instance_state`: the state of the EC2 instance
@@ -1069,6 +1073,94 @@ tls_config:
[ <tls_config> ]
```
### `<puppetdb_sd_config>`
PuppetDB SD configurations allow retrieving scrape targets from [PuppetDB](https://puppet.com/docs/puppetdb/latest/index.html) resources.
See [this example Prometheus configuration file](/documentation/examples/prometheus-puppetdb.yml)
for a detailed example of configuring Prometheus with PuppetDB.
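As a rough sketch of what such a scrape configuration can look like (the URL, PQL query, and port below are placeholders, not recommendations):

```yaml
scrape_configs:
  - job_name: puppetdb-nodes          # hypothetical job name
    puppetdb_sd_configs:
      - url: https://puppetdb.example.com:8081   # placeholder PuppetDB endpoint
        # PQL query selecting the resources to turn into targets (illustrative).
        query: 'resources { type = "Package" and title = "node_exporter" }'
        port: 9100
        refresh_interval: 60s
```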
### `<file_sd_config>`
File-based service discovery provides a more generic way to configure static targets
@@ -1454,6 +1546,29 @@ Available meta labels:
* If the endpoints belong to a service, all labels of the `role: service` discovery are attached.
* For all targets backed by a pod, all labels of the `role: pod` discovery are attached.
#### `endpointslice`
The `endpointslice` role discovers targets from existing endpointslices. For each endpoint
address referenced in the endpointslice object, one target is discovered. If the endpoint is backed by a pod, all
additional container ports of the pod, not bound to an endpoint port, are discovered as targets as well.
Available meta labels:
* `__meta_kubernetes_namespace`: The namespace of the endpoints object.
* `__meta_kubernetes_endpointslice_name`: The name of the endpointslice object.
* For all targets discovered directly from the endpointslice list (those not additionally inferred
from underlying pods), the following labels are attached:
* `__meta_kubernetes_endpointslice_address_target_kind`: Kind of the referenced object.
* `__meta_kubernetes_endpointslice_address_target_name`: Name of the referenced object.
  * `__meta_kubernetes_endpointslice_address_type`: The IP protocol family of the address of the target.
* `__meta_kubernetes_endpointslice_endpoint_conditions_ready`: Set to `true` or `false` for the referenced endpoint's ready state.
* `__meta_kubernetes_endpointslice_endpoint_topology_kubernetes_io_hostname`: Name of the node hosting the referenced endpoint.
* `__meta_kubernetes_endpointslice_endpoint_topology_present_kubernetes_io_hostname`: Flag that shows if the referenced object has a kubernetes.io/hostname annotation.
* `__meta_kubernetes_endpointslice_port`: Port of the referenced endpoint.
* `__meta_kubernetes_endpointslice_port_name`: Named port of the referenced endpoint.
* `__meta_kubernetes_endpointslice_port_protocol`: Protocol of the referenced endpoint.
* If the endpoints belong to a service, all labels of the `role: service` discovery are attached.
* For all targets backed by a pod, all labels of the `role: pod` discovery are attached.
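For illustration, a minimal scrape configuration using this role might look like the following sketch (the job name and the readiness filter are assumptions for the example, not requirements):

```yaml
scrape_configs:
  - job_name: kubernetes-endpointslices   # hypothetical job name
    kubernetes_sd_configs:
      - role: endpointslice
    relabel_configs:
      # Keep only endpoints whose ready condition is true.
      - source_labels: [__meta_kubernetes_endpointslice_endpoint_conditions_ready]
        regex: "true"
        action: keep
```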
#### `ingress`
The `ingress` role discovers a target for each path of each ingress.
@@ -1487,7 +1602,7 @@ See below for the configuration options for Kubernetes discovery:
# One of endpoints, service, pod, node, or ingress.
role: <string>
# Optional path to a kubeconfig file.
# Note that api_server and kube_config are mutually exclusive.
[ kubeconfig_file: <filename> ]
@@ -1566,7 +1681,7 @@ inside a Prometheus-enabled mesh.
The following meta labels are available for each target:
* `__meta_kuma_mesh`: the name of the proxy's Mesh
* `__meta_kuma_dataplane`: the name of the proxy
* `__meta_kuma_service`: the name of the proxy's associated Service
* `__meta_kuma_label_<tagname>`: each tag of the proxy
@@ -2172,6 +2287,9 @@ it was not set during relabeling. The `__scheme__` and `__metrics_path__` labels
are set to the scheme and metrics path of the target respectively. The `__param_<name>`
label is set to the value of the first passed URL parameter called `<name>`.
The `__scrape_interval__` and `__scrape_timeout__` labels are set to the target's
interval and timeout. This is **experimental** and could change in the future.
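Since these labels are experimental, treat the following as a sketch rather than a guarantee; the job name, target, and the `5m` interval are placeholders. A relabel rule can override the effective interval for a whole job:

```yaml
scrape_configs:
  - job_name: slow-exporter            # hypothetical job name
    scrape_interval: 15s               # default interval for the job
    static_configs:
      - targets: ['host1:9100']        # placeholder target
    relabel_configs:
      # Rewrite the experimental __scrape_interval__ label so these
      # targets are scraped every 5 minutes instead.
      - target_label: __scrape_interval__
        replacement: 5m
```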
Additional labels prefixed with `__meta_` may be available during the
relabeling phase. They are set by the service discovery mechanism that provided
the target and vary between mechanisms.
@@ -2384,6 +2502,10 @@ nerve_sd_configs:
openstack_sd_configs:
[ - <openstack_sd_config> ... ]
# List of PuppetDB service discovery configurations.
puppetdb_sd_configs:
[ - <puppetdb_sd_config> ... ]
# List of Scaleway service discovery configurations.
@@ -34,7 +34,7 @@ that PromQL does not look ahead of the evaluation time for samples.
`--enable-feature=promql-negative-offset`
In contrast to the positive offset modifier, the negative offset modifier lets
one shift a vector selector into the future. An example in which one may want
to use a negative offset is reviewing past data and making temporal comparisons
with more recent data.
@@ -59,5 +59,15 @@ Exemplar storage is implemented as a fixed size circular buffer that stores exem
`--enable-feature=memory-snapshot-on-shutdown`
This takes a snapshot of the chunks that are in memory along with the series information when shutting down and stores
it on disk. This will reduce the startup time since the memory state can be restored with this snapshot and m-mapped
chunks without the need for WAL replay.
## Extra Scrape Metrics
`--enable-feature=extra-scrape-metrics`
When enabled, for each instance scrape, Prometheus stores a sample in the following additional time series:
- `scrape_timeout_seconds`. The configured `scrape_timeout` for a target. This allows you to measure how close each target is to timing out with `scrape_duration_seconds / scrape_timeout_seconds`.
- `scrape_sample_limit`. The configured `sample_limit` for a target. This allows you to measure how close each target is to reaching the limit with `scrape_samples_post_metric_relabeling / scrape_sample_limit`. Note that `scrape_sample_limit` can be zero if no limit is configured, in which case the query above returns `+Inf` for those targets (division by zero). To query only targets that do have a sample limit, use `scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0)`.
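For example, a hypothetical alerting rule built on these series (the group name, alert name, and 90% threshold are illustrative) could flag targets approaching their sample limit:

```yaml
groups:
  - name: scrape-limits                # hypothetical group name
    rules:
      - alert: ScrapeNearSampleLimit
        # Ratio of ingested samples to the configured limit; targets
        # without a limit are filtered out by the (scrape_sample_limit > 0) clause.
        expr: scrape_samples_post_metric_relabeling / (scrape_sample_limit > 0) > 0.9
        for: 10m
```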
@@ -27,7 +27,7 @@ replayed when the Prometheus server restarts. Write-ahead log files are stored
in the `wal` directory in 128MB segments. These files contain raw data that
has not yet been compacted; thus they are significantly larger than regular block
files. Prometheus will retain a minimum of three write-ahead log files.
High-traffic servers may retain more than three WAL files in order to keep at
least two hours of raw data.
A Prometheus server's data directory looks something like this:
@@ -82,7 +82,7 @@ Prometheus has several flags that configure local storage. The most important ar
* `--storage.tsdb.path`: Where Prometheus writes its database. Defaults to `data/`.
* `--storage.tsdb.retention.time`: When to remove old data. Defaults to `15d`. Overrides `storage.tsdb.retention` if this flag is set to anything other than default.
* `--storage.tsdb.retention.size`: The maximum number of bytes of storage blocks to retain. The oldest data will be removed first. Defaults to `0` or disabled. Units supported: B, KB, MB, GB, TB, PB, EB. Ex: "512MB". Only the persistent blocks are deleted to honor this retention although WAL and m-mapped chunks are counted in the total size. So the minimum requirement for the disk is the peak space taken by the `wal` (the WAL and Checkpoint) and `chunks_head` (m-mapped Head chunks) directory combined (peaks every 2 hours).
* `--storage.tsdb.retention`: Deprecated in favor of `storage.tsdb.retention.time`.
* `--storage.tsdb.wal-compression`: Enables compression of the write-ahead log (WAL). Depending on your data, you can expect the WAL size to be halved with little extra CPU load. This flag was introduced in 2.11.0 and enabled by default in 2.20.0. Note that once enabled, downgrading Prometheus to a version below 2.11.0 will require deleting the WAL.
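To make the flags above concrete, here is a hedged docker-compose sketch (the image tag, retention values, and volume name are placeholders, not recommendations):

```yaml
services:
  prometheus:
    image: prom/prometheus:v2.30.0               # placeholder image tag
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=15d'      # time-based retention
      - '--storage.tsdb.retention.size=512MB'    # size-based retention
    volumes:
      - prometheus-data:/prometheus
volumes:
  prometheus-data:
```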
// BenchmarkLabels_Get was written to check whether a binary search can improve the performance vs the linear search implementation
// The results have shown that binary search would only be better when searching last labels in scenarios with more than 10 labels.
// In the following list, `old` is the linear search while `new` is the binary search implementation (without calling sort.Search, which performs even worse here).