Like Prometheus, but for logs.
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
loki/pkg/querier/queryrange/labels_cache.go

127 lines
4.2 KiB

feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
package queryrange
import (
"context"
"flag"
"fmt"
"time"
"github.com/go-kit/log"
"github.com/grafana/loki/pkg/querier/queryrange/queryrangebase"
"github.com/grafana/loki/pkg/storage/chunk/cache"
"github.com/grafana/loki/pkg/storage/chunk/cache/resultscache"
"github.com/grafana/loki/pkg/util/validation"
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
)
type cacheKeyLabels struct {
Limits
transformer UserIDTransformer
}
// metadataSplitIntervalForTimeRange returns split interval for series and label requests.
// If `recent_metadata_query_window` is configured and the query start interval is within this window,
// it returns `split_recent_metadata_queries_by_interval`.
// For other cases, the default split interval of `split_metadata_queries_by_interval` will be used.
func metadataSplitIntervalForTimeRange(limits Limits, tenantIDs []string, ref, start time.Time) time.Duration {
split := validation.MaxDurationOrZeroPerTenant(tenantIDs, limits.MetadataQuerySplitDuration)
recentMetadataQueryWindow := validation.MaxDurationOrZeroPerTenant(tenantIDs, limits.RecentMetadataQueryWindow)
recentMetadataQuerySplitInterval := validation.MaxDurationOrZeroPerTenant(tenantIDs, limits.RecentMetadataQuerySplitDuration)
// if either of the options are not configured, use the default metadata split interval
if recentMetadataQueryWindow == 0 || recentMetadataQuerySplitInterval == 0 {
return split
}
// if the query start is not before window start, it would be split using recentMetadataQuerySplitInterval
if windowStart := ref.Add(-recentMetadataQueryWindow); !start.Before(windowStart) {
split = recentMetadataQuerySplitInterval
}
return split
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
}
// GenerateCacheKey generates a cache key based on the userID, split duration and the interval of the request.
// It also includes the label name and the provided query for label values request.
func (i cacheKeyLabels) GenerateCacheKey(ctx context.Context, userID string, r resultscache.Request) string {
lr := r.(*LabelRequest)
split := metadataSplitIntervalForTimeRange(i.Limits, []string{userID}, time.Now().UTC(), r.GetStart().UTC())
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
var currentInterval int64
if denominator := int64(split / time.Millisecond); denominator > 0 {
currentInterval = lr.GetStart().UnixMilli() / denominator
}
if i.transformer != nil {
userID = i.transformer(ctx, userID)
}
if lr.GetValues() {
return fmt.Sprintf("labelvalues:%s:%s:%s:%d:%d", userID, lr.GetName(), lr.GetQuery(), currentInterval, split)
}
return fmt.Sprintf("labels:%s:%d:%d", userID, currentInterval, split)
}
type labelsExtractor struct{}
// Extract extracts the labels response for the specific time range.
// It is a no-op since it is not possible to partition the labels data by time range as it is just a slice of strings.
func (p labelsExtractor) Extract(_, _ int64, res resultscache.Response, _, _ int64) resultscache.Response {
return res
}
func (p labelsExtractor) ResponseWithoutHeaders(resp queryrangebase.Response) queryrangebase.Response {
labelsResp := resp.(*LokiLabelNamesResponse)
return &LokiLabelNamesResponse{
Status: labelsResp.Status,
Data: labelsResp.Data,
Version: labelsResp.Version,
Statistics: labelsResp.Statistics,
}
}
type LabelsCacheConfig struct {
queryrangebase.ResultsCacheConfig `yaml:",inline"`
}
// RegisterFlags registers flags.
func (cfg *LabelsCacheConfig) RegisterFlags(f *flag.FlagSet) {
cfg.RegisterFlagsWithPrefix(f, "frontend.label-results-cache.")
}
func (cfg *LabelsCacheConfig) Validate() error {
return cfg.ResultsCacheConfig.Validate()
}
func NewLabelsCacheMiddleware(
logger log.Logger,
limits Limits,
merger queryrangebase.Merger,
c cache.Cache,
cacheGenNumberLoader queryrangebase.CacheGenNumberLoader,
shouldCache queryrangebase.ShouldCacheFn,
parallelismForReq queryrangebase.ParallelismForReqFn,
retentionEnabled bool,
transformer UserIDTransformer,
metrics *queryrangebase.ResultsCacheMetrics,
) (queryrangebase.Middleware, error) {
return queryrangebase.NewResultsCacheMiddleware(
logger,
c,
cacheKeyLabels{limits, transformer},
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
limits,
merger,
labelsExtractor{},
cacheGenNumberLoader,
feat(metadata cache): adds max_metadata_cache_freshness (#11682) **What this PR does / why we need it**: Adds `max_metadata_cache_freshness` to limit the metadata requests that get cached. When configured, only metadata requests with end time before `now - max_metadata_cache_freshness` are cacheable. _reason for setting the default to 24h?_ metric results cache can [extract samples for the desired time range from an extent](https://github.com/grafana/loki/blob/b6e64e1ef1fb2a2155661c815d0198e147579c8e/pkg/querier/queryrange/queryrangebase/results_cache.go#L78) since the samples are associated with a timestamp. But the same is not true for metadata caching, it is not possible to extract a subset of labels/series from a cached extent. As a result, we could return inaccurate results, more that what was requested. for ex: returning results from an entire 1h extent for a 5m query Setting `max_metadata_cache_freshness` to 24h should help us avoid caching recent data. For anything older, we would report cached metadata results at a granularity controlled by `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15)
1 year ago
func(ctx context.Context, r queryrangebase.Request) bool {
return shouldCacheMetadataReq(ctx, logger, shouldCache, r, limits)
},
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
parallelismForReq,
retentionEnabled,
true,
feat(caching): Support caching `/series` and `/labels` query results (#11539) **What this PR does / why we need it**: Add support for caching metadata queries (both series and labels). caching happens after splitting similar to other types of queries. This pr adds the following configs to enable them. ``` cache_series_results: true|false (default false) cache_label_results: true|false (default false) ``` And the cache backend for them can be configured using `series_results_cache` and `label_results_cache` blocks under the `query_range` section. Currently the split interval for metadata queries is fixed and defaults to 24h, this pr makes it configurable by introducing `split_metadata_queries_by_interval` **Which issue(s) this PR fixes**: Fixes #<issue number> **Special notes for your reviewer**: **Checklist** - [x] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**) - [x] Documentation added - [x] Tests updated - [ ] `CHANGELOG.md` updated - [ ] If the change is worth mentioning in the release notes, add `add-to-release-notes` label - [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md` - [ ] For Helm chart changes bump the Helm chart version in `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. [Example PR](https://github.com/grafana/loki/commit/d10549e3ece02120974929894ee333d07755d213) - [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](https://github.com/grafana/loki/pull/10840/commits/0d4416a4b03739583349934b96f272fb4f685d15) --------- Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
1 year ago
metrics,
)
}