Import querying documentation from prometheus/docs

8 years ago · e6cdc2d355
parent 299802dfd0
commit e6cdc2d355
11 changed files with 1449 additions and 3 deletions
--- a/docs/configuration.md
+++ b/docs/configuration.md
@ -1,5 +1,6 @@
 ---
 title: Configuration
+sort_rank: 3
 ---

 # Configuration
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@ -1,6 +1,6 @@
 ---
 title: Getting started
-sort_rank: 10
+sort_rank: 1
 ---

 # Getting started
--- a/docs/index.md
+++ b/docs/index.md
@ -14,3 +14,4 @@ The documentation is available alongside all the project documentation at
 - [Installing](install.md)
 - [Getting started](getting_started.md)
 - [Configuration](configuration.md)
+- [Querying](querying/basics.md)
--- a/docs/installation.md
+++ b/docs/installation.md
@ -1,8 +1,9 @@
 ---
-title: Installing
+title: Installation
+sort_rank: 2
 ---

-# Installing
+# Installation

 ## Using pre-compiled binaries

--- a/docs/querying/api.md
+++ b/docs/querying/api.md
@ -0,0 +1,417 @@
+---
+title: HTTP API
+sort_rank: 7
+---
+
+# HTTP API
+
+The current stable HTTP API is reachable under `/api/v1` on a Prometheus
+server. Any non-breaking additions will be added under that endpoint.
+
+## Format overview
+
+The API response format is JSON. Every successful API request returns a `2xx`
+status code.
+
+Invalid requests that reach the API handlers return a JSON error object
+and one of the following HTTP response codes:
+
+- `400 Bad Request` when parameters are missing or incorrect.
+- `422 Unprocessable Entity` when an expression can't be executed
+  ([RFC4918](http://tools.ietf.org/html/rfc4918#page-78)).
+- `503 Service Unavailable` when queries time out or abort.
+
+Other non-`2xx` codes may be returned for errors occurring before the API
+endpoint is reached.
+
+The JSON response envelope format is as follows:
+
+```
+{
+  "status": "success" | "error",
+  "data": <data>,
+
+  // Only set if status is "error". The data field may still hold
+  // additional data.
+  "errorType": "<string>",
+  "error": "<string>"
+}
+```
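
A client would typically check the envelope before touching `data`. The following is a minimal sketch in Python, not part of Prometheus itself; the names `PrometheusAPIError` and `unwrap` are hypothetical and chosen only for illustration.

```python
import json


class PrometheusAPIError(Exception):
    """Raised when the envelope reports status "error" (hypothetical helper)."""

    def __init__(self, error_type, message, data=None):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        # The data field may still hold additional data on errors.
        self.data = data


def unwrap(body):
    """Return the `data` field of a JSON response body, or raise on an error status."""
    envelope = json.loads(body)
    if envelope.get("status") == "success":
        return envelope.get("data")
    raise PrometheusAPIError(
        envelope.get("errorType", "unknown"),
        envelope.get("error", ""),
        envelope.get("data"),
    )
```

Raising an exception keeps the success path free of status checks, while the `data` attribute preserves any partial data sent along with the error.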
+
+Input timestamps may be provided either in
+[RFC3339](https://www.ietf.org/rfc/rfc3339.txt) format or as a Unix timestamp
+in seconds, with optional decimal places for sub-second precision. Output
+timestamps are always represented as Unix timestamps in seconds.
+
+Names of query parameters that may be repeated end with `[]`.
+
+`<series_selector>` placeholders refer to Prometheus [time series
+selectors](basics.md#time-series-selectors) like `http_requests_total` or
+`http_requests_total{method=~"^GET|POST$"}` and need to be URL-encoded.
+
+`<duration>` placeholders refer to Prometheus duration strings of the form
+`[0-9]+[smhdwy]`. For example, `5m` refers to a duration of 5 minutes.
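
As an illustration, a client could convert such duration strings to seconds as sketched below. The function name `parse_duration` is hypothetical, and the lengths used for days, weeks and years (24 hours, 7 days, 365 days) are assumptions of this sketch rather than something stated above.

```python
import re

# Seconds per unit. Day, week and year lengths are assumptions of this sketch.
_UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 24 * 60 * 60,
    "w": 7 * 24 * 60 * 60,
    "y": 365 * 24 * 60 * 60,
}
_DURATION_RE = re.compile(r"([0-9]+)([smhdwy])")


def parse_duration(text):
    """Convert a string of the form [0-9]+[smhdwy] to a number of seconds."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
```

With these assumptions, `parse_duration("5m")` yields 300 seconds.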
+
+## Expression queries
+
+Query language expressions may be evaluated at a single instant or over a range
+of time. The sections below describe the API endpoints for each type of
+expression query.
+
+### Instant queries
+
+The following endpoint evaluates an instant query at a single point in time:
+
+```
+GET /api/v1/query
+```
+
+URL query parameters:
+
+- `query=<string>`: Prometheus expression query string.
+- `time=<rfc3339 | unix_timestamp>`: Evaluation timestamp. Optional.
+- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
+   is capped by the value of the `-query.timeout` flag.
+
+The current server time is used if the `time` parameter is omitted.
+
+The `data` section of the query result has the following format:
+
+```
+{
+  "resultType": "matrix" | "vector" | "scalar" | "string",
+  "result": <value>
+}
+```
+
+`<value>` refers to the query result data, which has varying formats
+depending on the `resultType`. See the [expression query result
+formats](#expression-query-result-formats).
+
+The following example evaluates the expression `up` at the time
+`2015-07-01T20:10:51.781Z`:
+
+```json
+$ curl 'http://localhost:9090/api/v1/query?query=up&time=2015-07-01T20:10:51.781Z'
+{
+   "status" : "success",
+   "data" : {
+      "resultType" : "vector",
+      "result" : [
+         {
+            "metric" : {
+               "__name__" : "up",
+               "job" : "prometheus",
+               "instance" : "localhost:9090"
+            },
+            "value": [ 1435781451.781, "1" ]
+         },
+         {
+            "metric" : {
+               "__name__" : "up",
+               "job" : "node",
+               "instance" : "localhost:9100"
+            },
+            "value" : [ 1435781451.781, "0" ]
+         }
+      ]
+   }
+}
+```
+
+### Range queries
+
+The following endpoint evaluates an expression query over a range of time:
+
+```
+GET /api/v1/query_range
+```
+
+URL query parameters:
+
+- `query=<string>`: Prometheus expression query string.
+- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
+- `end=<rfc3339 | unix_timestamp>`: End timestamp.
+- `step=<duration>`: Query resolution step width.
+- `timeout=<duration>`: Evaluation timeout. Optional. Defaults to and
+   is capped by the value of the `-query.timeout` flag.
+
+The `data` section of the query result has the following format:
+
+```
+{
+  "resultType": "matrix",
+  "result": <value>
+}
+```
+
+For the format of the `<value>` placeholder, see the [range-vector result
+format](#range-vectors).
+
+The following example evaluates the expression `up` over a 30-second range with
+a query resolution of 15 seconds.
+
+```json
+$ curl 'http://localhost:9090/api/v1/query_range?query=up&start=2015-07-01T20:10:30.781Z&end=2015-07-01T20:11:00.781Z&step=15s'
+{
+   "status" : "success",
+   "data" : {
+      "resultType" : "matrix",
+      "result" : [
+         {
+            "metric" : {
+               "__name__" : "up",
+               "job" : "prometheus",
+               "instance" : "localhost:9090"
+            },
+            "values" : [
+               [ 1435781430.781, "1" ],
+               [ 1435781445.781, "1" ],
+               [ 1435781460.781, "1" ]
+            ]
+         },
+         {
+            "metric" : {
+               "__name__" : "up",
+               "job" : "node",
+               "instance" : "localhost:9091"
+            },
+            "values" : [
+               [ 1435781430.781, "0" ],
+               [ 1435781445.781, "0" ],
+               [ 1435781460.781, "1" ]
+            ]
+         }
+      ]
+   }
+}
+```
+
+## Querying metadata
+
+### Finding series by label matchers
+
+The following endpoint returns the list of time series that match a certain label set.
+
+```
+GET /api/v1/series
+```
+
+URL query parameters:
+
+- `match[]=<series_selector>`: Repeated series selector argument that selects the
+  series to return. At least one `match[]` argument must be provided.
+- `start=<rfc3339 | unix_timestamp>`: Start timestamp.
+- `end=<rfc3339 | unix_timestamp>`: End timestamp.
+
+The `data` section of the query result consists of a list of objects that
+contain the label name/value pairs which identify each series.
+
+The following example returns all series that match either of the selectors
+`up` or `process_start_time_seconds{job="prometheus"}`:
+
+```json
+$ curl -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
+{
+   "status" : "success",
+   "data" : [
+      {
+         "__name__" : "up",
+         "job" : "prometheus",
+         "instance" : "localhost:9090"
+      },
+      {
+         "__name__" : "up",
+         "job" : "node",
+         "instance" : "localhost:9091"
+      },
+      {
+         "__name__" : "process_start_time_seconds",
+         "job" : "prometheus",
+         "instance" : "localhost:9090"
+      }
+   ]
+}
+```
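
When building such a request programmatically, the repeated `match[]` parameter and the URL-encoding of selectors can be left to a URL library. This is a minimal sketch using Python's standard library; `series_url` and its `base_url` argument are hypothetical names.

```python
from urllib.parse import urlencode


def series_url(base_url, selectors, start=None, end=None):
    """Build a /api/v1/series URL with one match[] parameter per selector."""
    params = [("match[]", selector) for selector in selectors]
    if start is not None:
        params.append(("start", start))
    if end is not None:
        params.append(("end", end))
    # urlencode percent-encodes both names and values, including {, }, = and ".
    return f"{base_url}/api/v1/series?{urlencode(params)}"
```

Because everything is percent-encoded, the resulting URL can be passed to `curl` without the `-g` flag.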
+
+### Querying label values
+
+The following endpoint returns a list of label values for a provided label name:
+
+```
+GET /api/v1/label/<label_name>/values
+```
+
+The `data` section of the JSON response is a list of string label values.
+
+This example queries for all label values for the `job` label:
+
+```json
+$ curl http://localhost:9090/api/v1/label/job/values
+{
+   "status" : "success",
+   "data" : [
+      "node",
+      "prometheus"
+   ]
+}
+```
+
+## Deleting series
+
+The following endpoint deletes matched series entirely from a Prometheus server:
+
+```
+DELETE /api/v1/series
+```
+
+URL query parameters:
+
+- `match[]=<series_selector>`: Repeated label matcher argument that selects the
+  series to delete. At least one `match[]` argument must be provided.
+
+The `data` section of the JSON response has the following format:
+
+```
+{
+  "numDeleted": <number of deleted series>
+}
+```
+
+The following example deletes all series that match either of the selectors
+`up` or `process_start_time_seconds{job="prometheus"}`:
+
+```json
+$ curl -XDELETE -g 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_start_time_seconds{job="prometheus"}'
+{
+   "status" : "success",
+   "data" : {
+      "numDeleted" : 3
+   }
+}
+```
+
+## Expression query result formats
+
+Expression queries may return the following response values in the `result`
+property of the `data` section. `<sample_value>` placeholders are numeric
+sample values. JSON does not support special float values such as `NaN`, `Inf`,
+and `-Inf`, so sample values are transferred as quoted JSON strings rather than
+raw numbers.
+
+### Range vectors
+
+Range vectors are returned as result type `matrix`. The corresponding
+`result` property has the following format:
+
+```
+[
+  {
+    "metric": { "<label_name>": "<label_value>", ... },
+    "values": [ [ <unix_time>, "<sample_value>" ], ... ]
+  },
+  ...
+]
+```
+
+### Instant vectors
+
+Instant vectors are returned as result type `vector`. The corresponding
+`result` property has the following format:
+
+```
+[
+  {
+    "metric": { "<label_name>": "<label_value>", ... },
+    "value": [ <unix_time>, "<sample_value>" ]
+  },
+  ...
+]
+```
+
+### Scalars
+
+Scalar results are returned as result type `scalar`. The corresponding
+`result` property has the following format:
+
+```
+[ <unix_time>, "<scalar_value>" ]
+```
+
+### Strings
+
+String results are returned as result type `string`. The corresponding
+`result` property has the following format:
+
+```
+[ <unix_time>, "<string_value>" ]
+```
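
The four result formats above can be decoded with a small dispatch on `resultType`. The sketch below is an illustration only; `decode_result` is a hypothetical helper. It relies on Python's `float()` accepting the strings `NaN`, `Inf`, `+Inf` and `-Inf`, which is why quoted sample values convert cleanly.

```python
def decode_result(data):
    """Convert the `data` section of a query response into Python values."""
    result_type = data["resultType"]
    result = data["result"]
    if result_type == "matrix":
        # One (labels, [(timestamp, value), ...]) pair per series.
        return [
            (series["metric"], [(t, float(v)) for t, v in series["values"]])
            for series in result
        ]
    if result_type == "vector":
        # One (labels, (timestamp, value)) pair per series.
        return [
            (series["metric"], (series["value"][0], float(series["value"][1])))
            for series in result
        ]
    if result_type == "scalar":
        return (result[0], float(result[1]))
    if result_type == "string":
        return (result[0], result[1])
    raise ValueError(f"unknown resultType: {result_type!r}")
```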
+
+## Targets
+
+> This API is experimental as it is intended to be extended with targets
+> dropped due to relabelling in the future.
+
+The following endpoint returns an overview of the current state of the
+Prometheus target discovery:
+
+```
+GET /api/v1/targets
+```
+
+Currently only the active targets are part of the response.
+
+```json
+$ curl http://localhost:9090/api/v1/targets
+{
+  "status": "success",
+  "data": {
+    "activeTargets": [
+      {
+        "discoveredLabels": {
+          "__address__": "127.0.0.1:9090",
+          "__metrics_path__": "/metrics",
+          "__scheme__": "http",
+          "job": "prometheus"
+        },
+        "labels": {
+          "instance": "127.0.0.1:9090",
+          "job": "prometheus"
+        },
+        "scrapeUrl": "http://127.0.0.1:9090/metrics",
+        "lastError": "",
+        "lastScrape": "2017-01-17T15:07:44.723715405+01:00",
+        "health": "up"
+      }
+    ]
+  }
+}
+```
+
+## Alertmanagers
+
+> This API is experimental as it is intended to be extended with Alertmanagers
+> dropped due to relabelling in the future.
+
+The following endpoint returns an overview of the current state of the
+Prometheus alertmanager discovery:
+
+```
+GET /api/v1/alertmanagers
+```
+
+Currently only the active Alertmanagers are part of the response.
+
+```json
+$ curl http://localhost:9090/api/v1/alertmanagers
+{
+  "status": "success",
+  "data": {
+    "activeAlertmanagers": [
+      {
+        "url": "http://127.0.0.1:9090/api/v1/alerts"
+      }
+    ]
+  }
+}
+```
--- a/docs/querying/basics.md
+++ b/docs/querying/basics.md
@ -0,0 +1,215 @@
+---
+title: Querying basics
+nav_title: Basics
+sort_rank: 1
+---
+
+# Querying Prometheus
+
+Prometheus provides a functional expression language that lets the user select
+and aggregate time series data in real time. The result of an expression can
+either be shown as a graph, viewed as tabular data in Prometheus's expression
+browser, or consumed by external systems via the [HTTP API](api.md).
+
+## Examples
+
+This document is meant as a reference. For learning, it might be easier to
+start with a couple of [examples](examples.md).
+
+## Expression language data types
+
+In Prometheus's expression language, an expression or sub-expression can
+evaluate to one of four types:
+
+* **Instant vector** - a set of time series containing a single sample for each time series, all sharing the same timestamp
+* **Range vector** - a set of time series containing a range of data points over time for each time series
+* **Scalar** - a simple numeric floating point value
+* **String** - a simple string value; currently unused
+
+Depending on the use-case (e.g. when graphing vs. displaying the output of an
+expression), only some of these types are legal as the result from a
+user-specified expression. For example, only an expression that returns an
+instant vector can be graphed directly.
+
+## Literals
+
+### String literals
+
+Strings may be specified as literals in single quotes, double quotes or
+backticks.
+
+PromQL follows the same [escaping rules as
+Go](https://golang.org/ref/spec#String_literals). In single or double quotes a
+backslash begins an escape sequence, which may be followed by `a`, `b`, `f`,
+`n`, `r`, `t`, `v` or `\`. Specific characters can be provided using octal
+(`\nnn`) or hexadecimal (`\xnn`, `\unnnn` and `\Unnnnnnnn`).
+
+No escaping is processed inside backticks. Unlike Go, Prometheus does not discard newlines inside backticks.
+
+Example:
+
+    "this is a string"
+    'these are unescaped: \n \\ \t'
+    `these are not unescaped: \n ' " \t`
+
+### Float literals
+
+Scalar float values can be literally written as numbers of the form
+`[-](digits)[.(digits)]`.
+
+    -2.43
+
+## Time series Selectors
+
+### Instant vector selectors
+
+Instant vector selectors allow the selection of a set of time series and a
+single sample value for each at a given timestamp (instant): in the simplest
+form, only a metric name is specified. This results in an instant vector
+containing elements for all time series that have this metric name.
+
+This example selects all time series that have the `http_requests_total` metric
+name:
+
+    http_requests_total
+
+It is possible to filter these time series further by appending a set of labels
+to match in curly braces (`{}`).
+
+This example selects only those time series with the `http_requests_total`
+metric name that also have the `job` label set to `prometheus` and their
+`group` label set to `canary`:
+
+    http_requests_total{job="prometheus",group="canary"}
+
+It is also possible to negatively match a label value, or to match label values
+against regular expressions. The following label matching operators exist:
+
+* `=`: Select labels that are exactly equal to the provided string.
+* `!=`: Select labels that are not equal to the provided string.
+* `=~`: Select labels that regex-match the provided string.
+* `!~`: Select labels that do not regex-match the provided string.
+
+For example, this selects all `http_requests_total` time series for `staging`,
+`testing`, and `development` environments and HTTP methods other than `GET`.
+
+    http_requests_total{environment=~"staging|testing|development",method!="GET"}
+
+Label matchers that match empty label values also select all time series that do
+not have the specific label set at all. Regex-matches are fully anchored.
+
+Vector selectors must either specify a name or at least one label matcher
+that does not match the empty string. The following expression is illegal:
+
+    {job=~".*"} # Bad!
+
+In contrast, these expressions are valid as they both have a selector that does not
+match empty label values.
+
+    {job=~".+"}              # Good!
+    {job=~".*",method="get"} # Good!
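
The matching rules above can be modelled in a few lines. The following Python sketch is only an approximation: it uses Python's `re` module, whose syntax differs in places from the regular expressions Prometheus accepts, and the helper names `matches` and `is_valid_selector` are hypothetical.

```python
import re


def matches(labels, name, op, pattern):
    """Apply one label matcher; a missing label is treated as an empty value."""
    value = labels.get(name, "")
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    if op == "=~":
        # fullmatch models fully anchored regex matching.
        return re.fullmatch(pattern, value) is not None
    if op == "!~":
        return re.fullmatch(pattern, value) is None
    raise ValueError(f"unknown operator: {op!r}")


def is_valid_selector(matchers):
    """Require at least one matcher that does not match the empty string."""
    return any(not matches({}, name, op, pattern) for name, op, pattern in matchers)
```

A metric name counts as a matcher on `__name__`, so `http_requests_total` corresponds to `[("__name__", "=", "http_requests_total")]` and passes the check.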
+
+Label matchers can also be applied to metric names by matching against the internal
+`__name__` label. For example, the expression `http_requests_total` is equivalent to
+`{__name__="http_requests_total"}`. Matchers other than `=` (`!=`, `=~`, `!~`) may also be used.
+The following expression selects all metrics that have a name starting with `job:`:
+
+    {__name__=~"^job:.*"}
+
+### Range Vector Selectors
+
+Range vector literals work like instant vector literals, except that they
+select a range of samples back from the current instant. Syntactically, a range
+duration is appended in square brackets (`[]`) at the end of a vector selector
+to specify how far back in time values should be fetched for each resulting
+range vector element.
+
+Time durations are specified as a number, followed immediately by one of the
+following units:
+
+* `s` - seconds
+* `m` - minutes
+* `h` - hours
+* `d` - days
+* `w` - weeks
+* `y` - years
+
+In this example, we select all the values we have recorded within the last 5
+minutes for all time series that have the metric name `http_requests_total` and
+a `job` label set to `prometheus`:
+
+    http_requests_total{job="prometheus"}[5m]
+
+### Offset modifier
+
+The `offset` modifier allows changing the time offset for individual
+instant and range vectors in a query.
+
+For example, the following expression returns the value of
+`http_requests_total` 5 minutes in the past relative to the current
+query evaluation time:
+
+    http_requests_total offset 5m
+
+Note that the `offset` modifier always needs to follow the selector
+immediately, i.e. the following would be correct:
+
+    sum(http_requests_total{method="GET"} offset 5m) // GOOD.
+
+While the following would be *incorrect*:
+
+    sum(http_requests_total{method="GET"}) offset 5m // INVALID.
+
+The same works for range vectors. This returns the 5-minute rate that
+`http_requests_total` had a week ago:
+
+    rate(http_requests_total[5m] offset 1w)
+
+## Operators
+
+Prometheus supports many binary and aggregation operators. These are described
+in detail in the [expression language operators](operators.md) page.
+
+## Functions
+
+Prometheus supports several functions to operate on data. These are described
+in detail in the [expression language functions](functions.md) page.
+
+## Gotchas
+
+### Interpolation and staleness
+
+When queries are run, timestamps at which to sample data are selected
+independently of the actual present time series data. This is mainly to support
+cases like aggregation (`sum`, `avg`, and so on), where multiple aggregated
+time series do not exactly align in time. Because of their independence,
+Prometheus needs to assign a value at those timestamps for each relevant time
+series. It does so by simply taking the newest sample before this timestamp.
+
+If no stored sample is found (by default) 5 minutes before a sampling timestamp,
+no value is assigned for this time series at this point in time. This
+effectively means that time series "disappear" from graphs at times where their
+latest collected sample is older than 5 minutes.
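
The lookup described above can be sketched as follows. This is an illustration, not Prometheus's implementation: `value_at` is a hypothetical name, and the exact boundary handling (a sample exactly at the timestamp, or exactly 5 minutes old, still counts) is an assumption.

```python
import bisect

STALENESS_DELTA = 5 * 60  # seconds; the default of 5 minutes described above


def value_at(samples, ts, staleness=STALENESS_DELTA):
    """Return the newest sample value at or before ts, or None if it is too old.

    samples is a list of (timestamp_seconds, value) sorted by timestamp.
    """
    timestamps = [t for t, _ in samples]
    i = bisect.bisect_right(timestamps, ts)
    if i == 0:
        return None  # no sample at or before ts
    t, value = samples[i - 1]
    if ts - t > staleness:
        return None  # the series "disappears" at this point in time
    return value
```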
+
+NOTE: <b>NOTE:</b> Staleness and interpolation handling might change. See
+https://github.com/prometheus/prometheus/issues/398 and
+https://github.com/prometheus/prometheus/issues/581.
+
+### Avoiding slow queries and overloads
+
+If a query needs to operate on a very large amount of data, graphing it might
+time out or overload the server or browser. Thus, when constructing queries
+over unknown data, always start building the query in the tabular view of
+Prometheus's expression browser until the result set seems reasonable
+(hundreds, not thousands, of time series at most).  Only when you have filtered
+or aggregated your data sufficiently, switch to graph mode. If the expression
+still takes too long to graph ad-hoc, pre-record it via a [recording
+rule](rules.md#recording-rules).
+
+This is especially relevant for Prometheus's query language, where a bare
+metric name selector like `api_http_requests_total` could expand to thousands
+of time series with different labels. Also keep in mind that expressions which
+aggregate over many time series will generate load on the server even if the
+output is only a small number of time series. This is similar to how it would
+be slow to sum all values of a column in a relational database, even if the
+output value is only a single number.
--- a/docs/querying/examples.md
+++ b/docs/querying/examples.md
@ -0,0 +1,83 @@
+---
+title: Querying examples
+nav_title: Examples
+sort_rank: 4
+---
+
+# Query examples
+
+## Simple time series selection
+
+Return all time series with the metric `http_requests_total`:
+
+    http_requests_total
+
+Return all time series with the metric `http_requests_total` and the given
+`job` and `handler` labels:
+
+    http_requests_total{job="apiserver", handler="/api/comments"}
+
+Return a whole range of time (in this case 5 minutes) for the same vector,
+making it a range vector:
+
+    http_requests_total{job="apiserver", handler="/api/comments"}[5m]
+
+Note that an expression resulting in a range vector cannot be graphed directly,
+but can be viewed in the tabular ("Console") view of the expression browser.
+
+Using regular expressions, you could select time series only for jobs whose
+names match a certain pattern, in this case, all jobs that end with `server`.
+Note that regex matches are fully anchored, so the pattern must cover the whole
+label value:
+
+    http_requests_total{job=~".*server"}
+
+To select all HTTP status codes except 4xx ones, you could run:
+
+    http_requests_total{status!~"^4..$"}
+
+## Using functions, operators, etc.
+
+Return the per-second rate for all time series with the `http_requests_total`
+metric name, as measured over the last 5 minutes:
+
+    rate(http_requests_total[5m])
+
+Assuming that the `http_requests_total` time series all have the labels `job`
+(fanout by job name) and `instance` (fanout by instance of the job), we might
+want to sum over the rate of all instances, so we get fewer output time series,
+but still preserve the `job` dimension:
+
+    sum(rate(http_requests_total[5m])) by (job)
+
+If we have two different metrics with the same dimensional labels, we can apply
+binary operators to them and elements on both sides with the same label set
+will get matched and propagated to the output. For example, this expression
+returns the unused memory in MiB for every instance (on a fictional cluster
+scheduler exposing these metrics about the instances it runs):
+
+    (instance_memory_limit_bytes - instance_memory_usage_bytes) / 1024 / 1024
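
To make the matching behaviour concrete, here is a minimal Python model of the expression above. It assumes each vector is represented as a dict keyed by the label set without the metric name; `key`, `binary_op` and `unused_mib` are hypothetical helper names.

```python
def key(**labels):
    """Build a hashable label set (metric name excluded)."""
    return frozenset(labels.items())


def binary_op(lhs, rhs, op):
    """Apply op to elements that have the same label set on both sides."""
    return {k: op(lhs[k], rhs[k]) for k in lhs.keys() & rhs.keys()}


def unused_mib(limit_bytes, usage_bytes):
    """Model of (limit - usage) / 1024 / 1024 on two instant vectors."""
    unused = binary_op(limit_bytes, usage_bytes, lambda a, b: a - b)
    return {k: v / 1024 / 1024 for k, v in unused.items()}
```

Elements that appear on only one side have no match and are absent from the output.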
+
+The same expression, but summed by application, could be written like this:
+
+    sum(
+      instance_memory_limit_bytes - instance_memory_usage_bytes
+    ) by (app, proc) / 1024 / 1024
+
+If the same fictional cluster scheduler exposed CPU usage metrics like the
+following for every instance:
+
+    instance_cpu_time_ns{app="lion", proc="web", rev="34d0f99", env="prod", job="cluster-manager"}
+    instance_cpu_time_ns{app="elephant", proc="worker", rev="34d0f99", env="prod", job="cluster-manager"}
+    instance_cpu_time_ns{app="turtle", proc="api", rev="4d3a513", env="prod", job="cluster-manager"}
+    instance_cpu_time_ns{app="fox", proc="widget", rev="4d3a513", env="prod", job="cluster-manager"}
+    ...
+
+...we could get the top 3 CPU users grouped by application (`app`) and process
+type (`proc`) like this:
+
+    topk(3, sum(rate(instance_cpu_time_ns[5m])) by (app, proc))
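
The aggregation part of this query can be modelled in Python as below, assuming the per-instance rates have already been computed. The helpers `sum_by` and `topk` are hypothetical stand-ins for the PromQL operators and ignore details such as tie-breaking.

```python
import heapq
from collections import defaultdict


def sum_by(vector, by):
    """Sum values of a list of (labels, value) pairs, grouped by the given labels."""
    groups = defaultdict(float)
    for labels, value in vector:
        groups[tuple(labels.get(name, "") for name in by)] += value
    return dict(groups)


def topk(k, grouped):
    """Return the k groups with the largest values, largest first."""
    return heapq.nlargest(k, grouped.items(), key=lambda item: item[1])
```

Grouping first and ranking afterwards mirrors the nesting of `sum(...) by (app, proc)` inside `topk(3, ...)`.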
+
+Assuming this metric contains one time series per running instance, you could
+count the number of running instances per application like this:
+
+    count(instance_cpu_time_ns) by (app)
--- a/docs/querying/functions.md
+++ b/docs/querying/functions.md
@ -0,0 +1,408 @@
+---
+title: Query functions
+nav_title: Functions
+sort_rank: 3
+---
+
+# Functions
+
+Some functions have default arguments, e.g. `year(v=vector(time())
+instant-vector)`. This means that there is one argument `v`, which is an
+instant vector and which defaults to the value of the expression
+`vector(time())` if it is not provided.
+
+## `abs()`
+
+`abs(v instant-vector)` returns the input vector with all sample values converted to
+their absolute value.
+
+## `absent()`
+
+`absent(v instant-vector)` returns an empty vector if the vector passed to it
+has any elements and a 1-element vector with the value 1 if the vector passed to
+it has no elements.
+
+This is useful for alerting on when no time series exist for a given metric name
+and label combination.
+
+```
+absent(nonexistent{job="myjob"})
+# => {job="myjob"}
+
+absent(nonexistent{job="myjob",instance=~".*"})
+# => {job="myjob"}
+
+absent(sum(nonexistent{job="myjob"}))
+# => {}
+```
+
+In the first two examples, `absent()` tries to be smart about deriving labels
+of the 1-element output vector from the input vector.
+
+## `ceil()`
+
+`ceil(v instant-vector)` rounds the sample values of all elements in `v` up to
+the nearest integer.
+
+## `changes()`
+
+For each input time series, `changes(v range-vector)` returns the number of
+times its value has changed within the provided time range as an instant
+vector.
+
+## `clamp_max()`
+
+`clamp_max(v instant-vector, max scalar)` clamps the sample values of all
+elements in `v` to have an upper limit of `max`.
+
+## `clamp_min()`
+
+`clamp_min(v instant-vector, min scalar)` clamps the sample values of all
+elements in `v` to have a lower limit of `min`.
+
+## `count_scalar()`
+
+`count_scalar(v instant-vector)` returns the number of elements in a time series
+vector as a scalar. This is in contrast to the `count()`
+[aggregation operator](operators.md#aggregation-operators), which
+always returns a vector (an empty one if the input vector is empty) and allows
+grouping by labels via a `by` clause.
+
+## `day_of_month()`
+
+`day_of_month(v=vector(time()) instant-vector)` returns the day of the month
+for each of the given times in UTC. Returned values are from 1 to 31.
+
+## `day_of_week()`
+
+`day_of_week(v=vector(time()) instant-vector)` returns the day of the week for
+each of the given times in UTC. Returned values are from 0 to 6, where 0 means
+Sunday etc.
+
+## `days_in_month()`
+
+`days_in_month(v=vector(time()) instant-vector)` returns the number of days in the
+month for each of the given times in UTC. Returned values are from 28 to 31.
+
+## `delta()`
+
+`delta(v range-vector)` calculates the difference between the
+first and last value of each time series element in a range vector `v`,
+returning an instant vector with the given deltas and equivalent labels.
+The delta is extrapolated to cover the full time range as specified in
+the range vector selector, so that it is possible to get a non-integer
+result even if the sample values are all integers.
+
+The following example expression returns the difference in CPU temperature
+between now and 2 hours ago:
+
+```
+delta(cpu_temp_celsius{host="zeus"}[2h])
+```
+
+`delta` should only be used with gauges.
+
+## `deriv()`
+
+`deriv(v range-vector)` calculates the per-second derivative of the time series in a range
+vector `v`, using [simple linear regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
+
+`deriv` should only be used with gauges.
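
For reference, the slope of a simple linear regression over one series can be computed as sketched below. This is an illustration of the technique, not Prometheus's code; the function name mirrors the PromQL function only for readability, and returning `None` for degenerate input is an assumption.

```python
def deriv(samples):
    """Least-squares slope (value per second) of [(timestamp_seconds, value), ...]."""
    n = len(samples)
    if n < 2:
        return None  # a slope needs at least two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        return None  # all samples share one timestamp
    return covariance / variance
```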
+
+## `drop_common_labels()`
+
+`drop_common_labels(instant-vector)` drops all labels that have the same name
+and value across all series in the input vector.
+
+## `exp()`
+
+`exp(v instant-vector)` calculates the exponential function for all elements in `v`.
+Special cases are:
+
+* `Exp(+Inf) = +Inf`
+* `Exp(NaN) = NaN`
+
+## `floor()`
+
+`floor(v instant-vector)` rounds the sample values of all elements in `v` down
+to the nearest integer.
+
+## `histogram_quantile()`
+
+`histogram_quantile(φ float, b instant-vector)` calculates the φ-quantile (0 ≤ φ
+≤ 1) from the buckets `b` of a
+[histogram](https://prometheus.io/docs/concepts/metric_types/#histogram). (See
+[histograms and summaries](https://prometheus.io/docs/practices/histograms) for
+a detailed explanation of φ-quantiles and the usage of the histogram metric type
+in general.) The samples in `b` are the counts of observations in each bucket.
+Each sample must have a label `le` where the label value denotes the inclusive
+upper bound of the bucket. (Samples without such a label are silently ignored.)
+The [histogram metric type](https://prometheus.io/docs/concepts/metric_types/#histogram)
+automatically provides time series with the `_bucket` suffix and the appropriate
+labels.
+
+Use the `rate()` function to specify the time window for the quantile
+calculation.
+
+Example: A histogram metric is called `http_request_duration_seconds`. To
+calculate the 90th percentile of request durations over the last 10m, use the
+following expression:
+
+    histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))
+
+The quantile is calculated for each label combination in
+`http_request_duration_seconds`. To aggregate, use the `sum()` aggregator
+around the `rate()` function. Since the `le` label is required by
+`histogram_quantile()`, it has to be included in the `by` clause. The following
+expression aggregates the 90th percentile by `job`:
+
+    histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (job, le))
+
+To aggregate everything, specify only the `le` label:
+
+    histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (le))
+
+The `histogram_quantile()` function interpolates quantile values by
+assuming a linear distribution within a bucket. The highest bucket
+must have an upper bound of `+Inf`. (Otherwise, `NaN` is returned.) If
+a quantile is located in the highest bucket, the upper bound of the
+second highest bucket is returned. A lower limit of the lowest bucket
+is assumed to be 0 if the upper bound of that bucket is greater than
+0. In that case, the usual linear interpolation is applied within that
+bucket. Otherwise, the upper bound of the lowest bucket is returned
+for quantiles located in the lowest bucket.
+
+If `b` contains fewer than two buckets, `NaN` is returned. For φ < 0, `-Inf` is
+returned. For φ > 1, `+Inf` is returned.
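
The rules above can be sketched in Python for a single set of buckets. This is a simplified model, not the Prometheus implementation: it assumes the bucket counts are cumulative, as exposed by the histogram metric type, and the handling of an empty histogram and of a `NaN` quantile (both return `NaN` here) is an assumption of the sketch.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from [(upper_bound, cumulative_count), ...]."""
    if math.isnan(q):
        return math.nan  # assumption of this sketch
    if q < 0:
        return -math.inf
    if q > 1:
        return math.inf
    buckets = sorted(buckets)
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # assumption: no observations, no quantile
    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    i = next(idx for idx, (_, count) in enumerate(buckets) if count >= rank)
    if i == len(buckets) - 1:
        return buckets[-2][0]  # upper bound of the second highest bucket
    upper, count = buckets[i]
    if i == 0:
        if upper <= 0:
            return upper
        lower, below = 0.0, 0.0
    else:
        lower, below = buckets[i - 1]
    if count == below:
        return math.nan  # degenerate empty bucket; assumption of this sketch
    # Linear interpolation within the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)
```

For example, with buckets `(0.1, 10)`, `(0.5, 30)`, `(1.0, 40)` and `(+Inf, 50)`, the median has rank 25, falls into the second bucket, and is interpolated to roughly 0.4.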
+
+## `holt_winters()`
+
+`holt_winters(v range-vector, sf scalar, tf scalar)` produces a smoothed value
+for time series based on the range in `v`. The lower the smoothing factor `sf`,
+the more importance is given to old data. The higher the trend factor `tf`, the
+more trends in the data are considered. Both `sf` and `tf` must be between 0 and
+1.
+
+`holt_winters` should only be used with gauges.
+
+## `hour()`
+
+`hour(v=vector(time()) instant-vector)` returns the hour of the day
+for each of the given times in UTC. Returned values are from 0 to 23.
+
+## `idelta()`
+
+`idelta(v range-vector)` calculates the difference between the last two samples
+in the range vector `v`, returning an instant vector with the given deltas and
+equivalent labels.
+
+`idelta` should only be used with gauges.
+
+## `increase()`
+
+`increase(v range-vector)` calculates the increase in the
+time series in the range vector. Breaks in monotonicity (such as counter
+resets due to target restarts) are automatically adjusted for. The
+increase is extrapolated to cover the full time range as specified
+in the range vector selector, so that it is possible to get a
+non-integer result even if a counter increases only by integer
+increments.
+
+The following example expression returns the number of HTTP requests as measured
+over the last 5 minutes, per time series in the range vector:
+
+```
+increase(http_requests_total{job="api-server"}[5m])
+```
+
+`increase` should only be used with counters. It is syntactic sugar
+for `rate(v)` multiplied by the number of seconds under the specified
+time range window, and should be used primarily for human readability.
+Use `rate` in recording rules so that increases are tracked consistently
+on a per-second basis.
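+
+To make the relationship concrete, the following expression should give the
+same result as the `increase` example above, assuming the same 5-minute window
+(300 seconds):
+
+```
+# rate() is per second, so multiply by the window length in seconds.
+rate(http_requests_total{job="api-server"}[5m]) * 300
+```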
+
+## `irate()`
+
+`irate(v range-vector)` calculates the per-second instant rate of increase of
+the time series in the range vector. This is based on the last two data points.
+Breaks in monotonicity (such as counter resets due to target restarts) are
+automatically adjusted for.
+
+The following example expression returns the per-second rate of HTTP requests
+looking up to 5 minutes back for the two most recent data points, per time
+series in the range vector:
+
+```
+irate(http_requests_total{job="api-server"}[5m])
+```
+
+`irate` should only be used when graphing volatile, fast-moving counters.
+Use `rate` for alerts and slow-moving counters, as brief changes
+in the rate can reset the `FOR` clause and graphs consisting entirely of rare
+spikes are hard to read.
+
+Note that when combining `irate()` with an
+[aggregation operator](operators.md#aggregation-operators) (e.g. `sum()`)
+or a function aggregating over time (any function ending in `_over_time`),
+always take an `irate()` first, then aggregate. Otherwise `irate()` cannot detect
+counter resets when your target restarts.
+
+## `label_join()`
+
+For each timeseries in `v`, `label_join(v instant-vector, dst_label string, separator string, src_label_1 string, src_label_2 string, ...)` joins all the values of all the `src_labels`
+using `separator` and returns the timeseries with the label `dst_label` containing the joined value.
+There can be any number of `src_labels` in this function.
+
+This example will return a vector with each time series having a `foo` label with the value `a,b,c` added to it:
+
+```
+label_join(up{job="api-server",src1="a",src2="b",src3="c"}, "foo", ",", "src1", "src2", "src3")
+```
+
+## `label_replace()`
+
+For each timeseries in `v`, `label_replace(v instant-vector, dst_label string,
+replacement string, src_label string, regex string)` matches the regular
+expression `regex` against the label `src_label`.  If it matches, then the
+timeseries is returned with the label `dst_label` replaced by the expansion of
+`replacement`. `$1` is replaced with the first matching subgroup, `$2` with the
+second etc. If the regular expression doesn't match then the timeseries is
+returned unchanged.
+
+This example will return a vector with each time series having a `foo`
+label with the value `a` added to it:
+
+```
+label_replace(up{job="api-server",service="a:c"}, "foo", "$1", "service", "(.*):.*")
+```
+
+## `ln()`
+
+`ln(v instant-vector)` calculates the natural logarithm for all elements in `v`.
+Special cases are:
+
+* `ln(+Inf) = +Inf`
+* `ln(0) = -Inf`
+* `ln(x < 0) = NaN`
+* `ln(NaN) = NaN`
+
+## `log2()`
+
+`log2(v instant-vector)` calculates the binary logarithm for all elements in `v`.
+The special cases are equivalent to those in `ln`.
+
+## `log10()`
+
+`log10(v instant-vector)` calculates the decimal logarithm for all elements in `v`.
+The special cases are equivalent to those in `ln`.
+
+## `minute()`
+
+`minute(v=vector(time()) instant-vector)` returns the minute of the hour for each
+of the given times in UTC. Returned values are from 0 to 59.
+
+## `month()`
+
+`month(v=vector(time()) instant-vector)` returns the month of the year for each
+of the given times in UTC. Returned values are from 1 to 12, where 1 means
+January etc.
+
+## `predict_linear()`
+
+`predict_linear(v range-vector, t scalar)` predicts the value of time series
+`t` seconds from now, based on the range vector `v`, using [simple linear
+regression](http://en.wikipedia.org/wiki/Simple_linear_regression).
+
+`predict_linear` should only be used with gauges.
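+
+A common use is estimating when a resource will run out. The sketch below
+assumes a hypothetical gauge `node_filesystem_free` that reports free bytes per
+filesystem; the one-hour range and four-hour horizon are likewise illustrative:
+
+```
+# Selects filesystems predicted to have less than 0 bytes free in 4 hours,
+# based on a linear fit over the last hour of samples.
+predict_linear(node_filesystem_free[1h], 4 * 3600) < 0
+```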
+
+## `rate()`
+
+`rate(v range-vector)` calculates the per-second average rate of increase of the
+time series in the range vector. Breaks in monotonicity (such as counter
+resets due to target restarts) are automatically adjusted for. Also, the
+calculation extrapolates to the ends of the time range, allowing for missed
+scrapes or imperfect alignment of scrape cycles with the range's time period.
+
+The following example expression returns the per-second rate of HTTP requests as measured
+over the last 5 minutes, per time series in the range vector:
+
+```
+rate(http_requests_total{job="api-server"}[5m])
+```
+
+`rate` should only be used with counters. It is best suited for alerting,
+and for graphing of slow-moving counters.
+
+Note that when combining `rate()` with an aggregation operator (e.g. `sum()`)
+or a function aggregating over time (any function ending in `_over_time`),
+always take a `rate()` first, then aggregate. Otherwise `rate()` cannot detect
+counter resets when your target restarts.
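+
+For instance, a per-job request rate could be written as follows; the `job`
+grouping is an assumption made for the example:
+
+```
+# rate() is applied to each raw counter series first, then the results are summed.
+sum(rate(http_requests_total[5m])) by (job)
+```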
+
+## `resets()`
+
+For each input time series, `resets(v range-vector)` returns the number of
+counter resets within the provided time range as an instant vector. Any
+decrease in the value between two consecutive samples is interpreted as a
+counter reset.
+
+`resets` should only be used with counters.
+
+## `round()`
+
+`round(v instant-vector, to_nearest=1 scalar)` rounds the sample values of all
+elements in `v` to the nearest integer. Ties are resolved by rounding up. The
+optional `to_nearest` argument allows specifying the nearest multiple to which
+the sample values should be rounded. This multiple may also be a fraction.
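+
+The expressions below sketch these rules on literal values, using `vector()`
+to build single-element input vectors. Each line is a separate query:
+
+```
+round(vector(2.5))        # 3, the tie is rounded up
+round(vector(-2.5))       # -2, ties round up rather than away from zero
+round(vector(7.3), 0.5)   # 7.5, the nearest multiple of 0.5
+```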
+
+## `scalar()`
+
+Given a single-element input vector, `scalar(v instant-vector)` returns the
+sample value of that single element as a scalar. If the input vector does not
+have exactly one element, `scalar` will return `NaN`.
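+
+One way this is used is to divide a vector by a single aggregate value. The
+sketch below assumes `http_requests_total` has at least one series, so that the
+inner `sum` yields exactly one element:
+
+```
+# Share of each series in the overall total; scalar() turns the
+# one-element sum into a plain number.
+http_requests_total / scalar(sum(http_requests_total))
+```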
+
+## `sort()`
+
+`sort(v instant-vector)` returns vector elements sorted by their sample values,
+in ascending order.
+
+## `sort_desc()`
+
+Same as `sort`, but sorts in descending order.
+
+## `sqrt()`
+
+`sqrt(v instant-vector)` calculates the square root of all elements in `v`.
+
+## `time()`
+
+`time()` returns the number of seconds since January 1, 1970 UTC. Note that
+this does not actually return the current time, but the time at which the
+expression is to be evaluated.
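+
+For example, `time()` can be combined with a metric that holds a Unix
+timestamp to compute an age. The metric `process_start_time_seconds` is an
+assumption here; it is only available if the monitored target exposes it:
+
+```
+# Seconds elapsed between process start and the evaluation time.
+time() - process_start_time_seconds{job="api-server"}
+```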
+
+## `vector()`
+
+`vector(s scalar)` returns the scalar `s` as a vector with no labels.
+
+## `year()`
+
+`year(v=vector(time()) instant-vector)` returns the year
+for each of the given times in UTC.
+
+## `<aggregation>_over_time()`
+
+The following functions allow aggregating each series of a given range vector
+over time and return an instant vector with per-series aggregation results:
+
+* `avg_over_time(range-vector)`: the average value of all points in the specified interval.
+* `min_over_time(range-vector)`: the minimum value of all points in the specified interval.
+* `max_over_time(range-vector)`: the maximum value of all points in the specified interval.
+* `sum_over_time(range-vector)`: the sum of all values in the specified interval.
+* `count_over_time(range-vector)`: the count of all values in the specified interval.
+* `quantile_over_time(scalar, range-vector)`: the φ-quantile (0 ≤ φ ≤ 1) of the values in the specified interval.
+* `stddev_over_time(range-vector)`: the population standard deviation of the values in the specified interval.
+* `stdvar_over_time(range-vector)`: the population standard variance of the values in the specified interval.
+
+Note that all values in the specified interval have the same weight in the
+aggregation even if the values are not equally spaced throughout the interval.
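+
+A brief sketch, assuming a hypothetical gauge `http_inprogress_requests` and an
+arbitrary 10-minute range. Each expression is a separate query:
+
+```
+# Average of all samples in the window, per series.
+avg_over_time(http_inprogress_requests[10m])
+
+# 0.9-quantile of the samples in the window, per series.
+quantile_over_time(0.9, http_inprogress_requests[10m])
+```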
--- a/docs/querying/index.md
+++ b/docs/querying/index.md
@ -0,0 +1,4 @@
+---
+title: Querying
+sort_rank: 4
+---
--- a/docs/querying/operators.md
+++ b/docs/querying/operators.md
@ -0,0 +1,250 @@
+---
+title: Operators
+sort_rank: 2
+---
+
+# Operators
+
+## Binary operators
+
+Prometheus's query language supports basic logical and arithmetic operators.
+For operations between two instant vectors, the [matching behavior](#vector-matching)
+can be modified.
+
+### Arithmetic binary operators
+
+The following binary arithmetic operators exist in Prometheus:
+
+* `+` (addition)
+* `-` (subtraction)
+* `*` (multiplication)
+* `/` (division)
+* `%` (modulo)
+* `^` (power/exponentiation)
+
+Binary arithmetic operators are defined between scalar/scalar, vector/scalar,
+and vector/vector value pairs.
+
+**Between two scalars**, the behavior is obvious: they evaluate to another
+scalar that is the result of the operator applied to both scalar operands.
+
+**Between an instant vector and a scalar**, the operator is applied to the
+value of every data sample in the vector. E.g. if a time series instant vector
+is multiplied by 2, the result is another vector in which every sample value of
+the original vector is multiplied by 2.
+
+**Between two instant vectors**, a binary arithmetic operator is applied to
+each entry in the left-hand-side vector and its [matching element](#vector-matching)
+in the right-hand vector. The result is propagated into the result vector and the metric
+name is dropped. Entries for which no matching entry in the right-hand vector can be
+found are not part of the result.
+
+### Comparison binary operators
+
+The following binary comparison operators exist in Prometheus:
+
+* `==` (equal)
+* `!=` (not-equal)
+* `>` (greater-than)
+* `<` (less-than)
+* `>=` (greater-or-equal)
+* `<=` (less-or-equal)
+
+Comparison operators are defined between scalar/scalar, vector/scalar,
+and vector/vector value pairs. By default they filter. Their behavior can be
+modified by providing `bool` after the operator, which will return `0` or `1`
+for the value rather than filtering.
+
+**Between two scalars**, the `bool` modifier must be provided and these
+operators result in another scalar that is either `0` (`false`) or `1`
+(`true`), depending on the comparison result.
+
+**Between an instant vector and a scalar**, these operators are applied to the
+value of every data sample in the vector, and vector elements between which the
+comparison result is `false` get dropped from the result vector. If the `bool`
+modifier is provided, vector elements that would be dropped instead have the value
+`0` and vector elements that would be kept have the value `1`.
+
+**Between two instant vectors**, these operators behave as a filter by default,
+applied to matching entries. Vector elements for which the expression is not
+true or which do not find a match on the other side of the expression get
+dropped from the result, while the others are propagated into a result vector
+with their original (left-hand-side) metric names and label values.
+If the `bool` modifier is provided, vector elements that would have been
+dropped instead have the value `0` and vector elements that would be kept have
+the value `1` with the left-hand-side metric names and label values.
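+
+The difference between filtering and the `bool` modifier can be sketched as
+follows; the threshold of `100` is arbitrary. Each expression is a separate
+query:
+
+```
+# Filter: keeps only series whose value is greater than 100.
+http_requests_total > 100
+
+# bool: keeps every series, with the value 1 where the comparison
+# holds and 0 where it does not.
+http_requests_total > bool 100
+
+# Between two scalars the bool modifier is required; this evaluates to 0.
+1 >= bool 2
+```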
+
+### Logical/set binary operators
+
+These logical/set binary operators are only defined between instant vectors:
+
+* `and` (intersection)
+* `or` (union)
+* `unless` (complement)
+
+`vector1 and vector2` results in a vector consisting of the elements of
+`vector1` for which there are elements in `vector2` with exactly matching
+label sets. Other elements are dropped. The metric name and values are carried
+over from the left-hand-side vector.
+
+`vector1 or vector2` results in a vector that contains all original elements
+(label sets + values) of `vector1` and additionally all elements of `vector2`
+which do not have matching label sets in `vector1`.
+
+`vector1 unless vector2` results in a vector consisting of the elements of
+`vector1` for which there are no elements in `vector2` with exactly matching
+label sets. All matching elements in both vectors are dropped.
+
+## Vector matching
+
+Operations between vectors attempt to find a matching element in the right-hand-side
+vector for each entry in the left-hand side. There are two basic types of
+matching behavior:
+
+**One-to-one** finds a unique pair of entries from each side of the operation.
+In the default case, that is an operation following the format `vector1 <operator> vector2`.
+Two entries match if they have the exact same set of labels and corresponding values.
+The `ignoring` keyword allows ignoring certain labels when matching, while the
+`on` keyword allows reducing the set of considered labels to a provided list:
+
+    <vector expr> <bin-op> ignoring(<label list>) <vector expr>
+    <vector expr> <bin-op> on(<label list>) <vector expr>
+
+Example input:
+
+    method_code:http_errors:rate5m{method="get", code="500"}  24
+    method_code:http_errors:rate5m{method="get", code="404"}  30
+    method_code:http_errors:rate5m{method="put", code="501"}  3
+    method_code:http_errors:rate5m{method="post", code="500"} 6
+    method_code:http_errors:rate5m{method="post", code="404"} 21
+
+    method:http_requests:rate5m{method="get"}  600
+    method:http_requests:rate5m{method="del"}  34
+    method:http_requests:rate5m{method="post"} 120
+
+Example query:
+
+    method_code:http_errors:rate5m{code="500"} / ignoring(code) method:http_requests:rate5m
+
+This returns a result vector containing the fraction of HTTP requests with status code
+of 500 for each method, as measured over the last 5 minutes. Without `ignoring(code)` there
+would have been no match as the metrics do not share the same set of labels.
+The entries with methods `put` and `del` have no match and will not show up in the result:
+
+    {method="get"}  0.04            //  24 / 600
+    {method="post"} 0.05            //   6 / 120
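+
+For this particular input, the same result could also be obtained by listing
+the label to match on instead of the label to ignore, since `method` is the
+only label the two vectors share:
+
+```
+method_code:http_errors:rate5m{code="500"} / on(method) method:http_requests:rate5m
+```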
+
+**Many-to-one** and **one-to-many** matchings refer to the case where each vector element on
+the "one"-side can match with multiple elements on the "many"-side. This has to
+be explicitly requested using the `group_left` or `group_right` modifier, where
+left/right determines which vector has the higher cardinality.
+
+    <vector expr> <bin-op> ignoring(<label list>) group_left(<label list>) <vector expr>
+    <vector expr> <bin-op> ignoring(<label list>) group_right(<label list>) <vector expr>
+    <vector expr> <bin-op> on(<label list>) group_left(<label list>) <vector expr>
+    <vector expr> <bin-op> on(<label list>) group_right(<label list>) <vector expr>
+
+The label list provided with the group modifier contains additional labels from
+the "one"-side to be included in the result metrics. For `on` a label can only
+appear in one of the lists. Every time series of the result vector must be
+uniquely identifiable.
+
+_Grouping modifiers can only be used for
+[comparison](#comparison-binary-operators) and
+[arithmetic](#arithmetic-binary-operators) operations. Operations such as `and`,
+`unless` and `or` match with all possible entries in the right vector by
+default._
+
+Example query:
+
+    method_code:http_errors:rate5m / ignoring(code) group_left method:http_requests:rate5m
+
+In this case the left vector contains more than one entry per `method` label
+value. Thus, we indicate this using `group_left`. The elements from the right
+side are now matched with multiple elements with the same `method` label on the
+left:
+
+    {method="get", code="500"}  0.04            //  24 / 600
+    {method="get", code="404"}  0.05            //  30 / 600
+    {method="post", code="500"} 0.05            //   6 / 120
+    {method="post", code="404"} 0.175           //  21 / 120
+
+_Many-to-one and one-to-many matching are advanced use cases that should be carefully considered.
+Often a proper use of `ignoring(<labels>)` provides the desired outcome._
+
+## Aggregation operators
+
+Prometheus supports the following built-in aggregation operators that can be
+used to aggregate the elements of a single instant vector, resulting in a new
+vector of fewer elements with aggregated values:
+
+* `sum` (calculate sum over dimensions)
+* `min` (select minimum over dimensions)
+* `max` (select maximum over dimensions)
+* `avg` (calculate the average over dimensions)
+* `stddev` (calculate population standard deviation over dimensions)
+* `stdvar` (calculate population standard variance over dimensions)
+* `count` (count number of elements in the vector)
+* `count_values` (count number of elements with the same value)
+* `bottomk` (smallest k elements by sample value)
+* `topk` (largest k elements by sample value)
+* `quantile` (calculate φ-quantile (0 ≤ φ ≤ 1) over dimensions)
+
+These operators can either be used to aggregate over **all** label dimensions
+or preserve distinct dimensions by including a `without` or `by` clause.
+
+    <aggr-op>([parameter,] <vector expression>) [without|by (<label list>)] [keep_common]
+
+`parameter` is only required for `count_values`, `quantile`, `topk` and
+`bottomk`. `without` removes the listed labels from the result vector, while
+all other labels are preserved in the output. `by` does the opposite and drops
+labels that are not listed in the `by` clause, even if their label values are
+identical between all elements of the vector. The `keep_common` clause allows
+keeping those extra labels (labels that are identical between elements, but not
+in the `by` clause).
+
+`count_values` outputs one time series per unique sample value. Each series has
+an additional label. The name of that label is given by the aggregation
+parameter, and the label value is the unique sample value.  The value of each
+time series is the number of times that sample value was present.
+
+`topk` and `bottomk` are different from other aggregators in that a subset of
+the input samples, including the original labels, are returned in the result
+vector. `by` and `without` are only used to bucket the input vector.
+
+Example:
+
+If the metric `http_requests_total` had time series that fan out by
+`application`, `instance`, and `group` labels, we could calculate the total
+number of seen HTTP requests per application and group over all instances via:
+
+    sum(http_requests_total) without (instance)
+
+If we are just interested in the total of HTTP requests we have seen in **all**
+applications, we could simply write:
+
+    sum(http_requests_total)
+
+To count the number of binaries running each build version we could write:
+
+    count_values("version", build_version)
+
+To get the 5 largest HTTP request counts across all instances we could write:
+
+    topk(5, http_requests_total)
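+
+Under the assumption stated above, that `application`, `instance`, and `group`
+are the only labels on `http_requests_total`, the first example could
+equivalently be written with a `by` clause:
+
+```
+# Keep only the listed labels instead of removing instance.
+sum(http_requests_total) by (application, group)
+```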
+
+## Binary operator precedence
+
+The following list shows the precedence of binary operators in Prometheus, from
+highest to lowest.
+
+1. `^`
+2. `*`, `/`, `%`
+3. `+`, `-`
+4. `==`, `!=`, `<=`, `<`, `>=`, `>`
+5. `and`, `unless`
+6. `or`
+
+Operators on the same precedence level are left-associative. For example,
+`2 * 3 % 2` is equivalent to `(2 * 3) % 2`. However, `^` is right-associative,
+so `2 ^ 3 ^ 2` is equivalent to `2 ^ (3 ^ 2)`.
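+
+A few scalar expressions illustrate these rules. Each line is a separate query:
+
+```
+1 + 2 * 3      # 7, since * binds tighter than +
+2 * 3 % 2      # 0, evaluated as (2 * 3) % 2
+2 * (3 % 2)    # 2, parentheses override the default grouping
+2 ^ 3 ^ 2      # 512, evaluated as 2 ^ (3 ^ 2)
+```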
--- a/docs/querying/rules.md
+++ b/docs/querying/rules.md
@ -0,0 +1,66 @@
+---
+title: Recording rules
+sort_rank: 6
+---
+
+# Defining recording rules
+
+## Configuring rules
+
+Prometheus supports two types of rules which may be configured and then
+evaluated at regular intervals: recording rules and [alerting
+rules](https://prometheus.io/docs/alerting/rules/). To include rules in
+Prometheus, create a file containing the necessary rule statements and have
+Prometheus load the file via the `rule_files` field in the [Prometheus
+configuration](../configuration.md).
+
+The rule files can be reloaded at runtime by sending `SIGHUP` to the Prometheus
+process. The changes are only applied if all rule files are well-formatted.
+
+## Syntax-checking rules
+
+To quickly check whether a rule file is syntactically correct without starting
+a Prometheus server, install and run Prometheus's `promtool` command-line
+utility:
+
+```bash
+go get github.com/prometheus/prometheus/cmd/promtool
+promtool check-rules /path/to/example.rules
+```
+
+When the file is syntactically valid, the checker prints a textual
+representation of the parsed rules to standard output and then exits with
+a `0` return status.
+
+If there are any syntax errors, it prints an error message to standard error
+and exits with a `1` return status. On invalid input arguments the exit status
+is `2`.
+
+## Recording rules
+
+Recording rules allow you to precompute frequently needed or computationally
+expensive expressions and save their result as a new set of time series.
+Querying the precomputed result will then often be much faster than executing
+the original expression every time it is needed. This is especially useful for
+dashboards, which need to query the same expression repeatedly every time they
+refresh.
+
+To add a new recording rule, add a line of the following syntax to your rule
+file:
+
+    <new time series name>[{<label overrides>}] = <expression to record>
+
+Some examples:
+
+    # Saving the per-job HTTP in-progress request count as a new set of time series:
+    job:http_inprogress_requests:sum = sum(http_inprogress_requests) by (job)
+
+    # Drop or rewrite labels in the result time series:
+    new_time_series{label_to_change="new_value",label_to_drop=""} = old_time_series
+
+Recording rules are evaluated at the interval specified by the
+`evaluation_interval` field in the Prometheus configuration. During each
+evaluation cycle, the right-hand-side expression of the rule statement is
+evaluated at the current instant in time and the resulting sample vector is
+stored as a new set of time series with the current timestamp and a new metric
+name (and perhaps an overridden set of labels).
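+
+As a further sketch in the same syntax, a rule that precomputes a per-job
+request rate might look like this. The rule name and the 5-minute range are
+illustrative assumptions:
+
+```
+# Hypothetical rule: record the summed per-second request rate for each job.
+job:http_requests:rate5m = sum(rate(http_requests_total[5m])) by (job)
+```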