Improve metric queries by computing samples at the edges. (#2293)

* First pass breaking the code apart.

Wondering how we're going to achieve fast mutation of labels.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Work in progress.

I realize I need a hash for deduping lines.
Going to benchmark some.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Tested some hash and decided which one to use.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Wip

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Starting working on ingester.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Trying to find a better hash function.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* More hash testing; we have a winner: xxhash it is.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Settle on xxhash

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Better params interfacing.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add interface for queryparams for things that exist in both types of params.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add storage sample iterator implementations.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixing tests and verifying we don't get collisions for the hashing method.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixing ingesters tests and refactoring utility function/tests.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixing and testing that stats are still well computed.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixing more tests.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* More engine tests finished.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixes sharding evaluator.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixes more engine tests.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fix error tests in the engine.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Finish fixing all tests.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fixes a bug where extractor was not passed in correctly.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add notes about upgrade.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Renamed and fix a bug.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Add memchunk tests and starting test for sampleIterator.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Test heap sample iterator.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* working on test.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Finishing testing all new iterators.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Making sure all store functions are tested.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Benchmark and verify everything is working well.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Make the linter happy.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* use xxhash v2.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fix a flaky test because of map.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* go.mod.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

Co-authored-by: Edward Welch <edward.welch@grafana.com>
pull/2318/head
Cyril Tovena 6 years ago committed by GitHub
parent 0b5996021f
commit 0be64fcb34
  1. docs/operations/upgrade.md (19 lines changed)
  2. go.mod (3 lines changed)
  3. go.sum (4 lines changed)
  4. pkg/chunkenc/dumb_chunk.go (4 lines changed)
  5. pkg/chunkenc/hash_test.go (73 lines changed)
  6. pkg/chunkenc/interface.go (3 lines changed)
  7. pkg/chunkenc/memchunk.go (136 lines changed)
  8. pkg/chunkenc/memchunk_test.go (28 lines changed)
  9. pkg/chunkenc/testdata/testdata.go (925 lines changed)
  10. pkg/ingester/flush_test.go (6 lines changed)
  11. pkg/ingester/ingester.go (81 lines changed)
  12. pkg/ingester/ingester_test.go (95 lines changed)
  13. pkg/ingester/instance.go (65 lines changed)
  14. pkg/ingester/stream.go (12 lines changed)
  15. pkg/iter/entry_iterator.go (11 lines changed)
  16. pkg/iter/entry_iterator_test.go (22 lines changed)
  17. pkg/iter/sample_iterator.go (512 lines changed)
  18. pkg/iter/sample_iterator_test.go (195 lines changed)
  19. pkg/logcli/query/query.go (22 lines changed)
  20. pkg/logproto/logproto.pb.go (1588 lines changed)
  21. pkg/logproto/logproto.proto (22 lines changed)
  22. pkg/logproto/types.go (6 lines changed)
  23. pkg/logql/ast.go (52 lines changed)
  24. pkg/logql/ast_test.go (8 lines changed)
  25. pkg/logql/engine_test.go (1038 lines changed)
  26. pkg/logql/evaluator.go (28 lines changed)
  27. pkg/logql/functions.go (6 lines changed)
  28. pkg/logql/parser.go (13 lines changed)
  29. pkg/logql/range_vector.go (24 lines changed)
  30. pkg/logql/range_vector_test.go (47 lines changed)
  31. pkg/logql/series_extractor.go (94 lines changed)
  32. pkg/logql/series_extractor_test.go (159 lines changed)
  33. pkg/logql/sharding.go (21 lines changed)
  34. pkg/logql/shardmapper_test.go (10 lines changed)
  35. pkg/logql/stats/grpc_test.go (20 lines changed)
  36. pkg/logql/test_utils.go (87 lines changed)
  37. pkg/querier/querier.go (109 lines changed)
  38. pkg/querier/querier_mock_test.go (13 lines changed)
  39. pkg/querier/querier_test.go (18 lines changed)
  40. pkg/storage/batch.go (361 lines changed)
  41. pkg/storage/batch_test.go (301 lines changed)
  42. pkg/storage/cache.go (87 lines changed)
  43. pkg/storage/cache_test.go (77 lines changed)
  44. pkg/storage/lazy_chunk.go (59 lines changed)
  45. pkg/storage/lazy_chunk_test.go (11 lines changed)
  46. pkg/storage/store.go (44 lines changed)
  47. pkg/storage/store_test.go (285 lines changed)
  48. pkg/storage/util_test.go (35 lines changed)
  49. vendor/github.com/segmentio/fasthash/fnv1a/hash.go (80 lines changed)
  50. vendor/github.com/segmentio/fasthash/fnv1a/hash32.go (85 lines changed)
  51. vendor/golang.org/x/sys/unix/mkerrors.sh (2 lines changed)
  52. vendor/golang.org/x/sys/unix/syscall_linux.go (16 lines changed)
  53. vendor/golang.org/x/sys/unix/zerrors_linux_386.go (1 line changed)
  54. vendor/golang.org/x/sys/unix/zerrors_linux_amd64.go (1 line changed)
  55. vendor/golang.org/x/sys/unix/zerrors_linux_arm.go (1 line changed)
  56. vendor/golang.org/x/sys/unix/zerrors_linux_arm64.go (1 line changed)
  57. vendor/golang.org/x/sys/unix/zerrors_linux_mips.go (1 line changed)
  58. vendor/golang.org/x/sys/unix/zerrors_linux_mips64.go (1 line changed)
  59. vendor/golang.org/x/sys/unix/zerrors_linux_mips64le.go (1 line changed)
  60. vendor/golang.org/x/sys/unix/zerrors_linux_mipsle.go (1 line changed)
  61. vendor/golang.org/x/sys/unix/zerrors_linux_ppc64.go (1 line changed)
  62. vendor/golang.org/x/sys/unix/zerrors_linux_ppc64le.go (1 line changed)
  63. vendor/golang.org/x/sys/unix/zerrors_linux_riscv64.go (1 line changed)
  64. vendor/golang.org/x/sys/unix/zerrors_linux_s390x.go (1 line changed)
  65. vendor/golang.org/x/sys/unix/zerrors_linux_sparc64.go (1 line changed)
  66. vendor/golang.org/x/sys/unix/ztypes_freebsd_arm.go (12 lines changed)
  67. vendor/modules.txt (7 lines changed)

@ -6,6 +6,11 @@ Unfortunately Loki is software and software is hard and sometimes things are not
On this page we will document any upgrade issues/gotchas/considerations we are aware of.
## 1.6.0
A new ingester gRPC API has been added to speed up metric queries. To roll out without query errors, make sure you upgrade all ingesters first.
Once this is done, you can proceed with the rest of the deployment; this ensures that queriers won't look for an API that is not yet available.
## 1.5.0
Note: The required upgrade path outlined for version 1.4.0 below is still true for moving to 1.5.0 from any release older than 1.4.0 (e.g. 1.3.0->1.5.0 needs to also look at the 1.4.0 upgrade requirements).
@ -102,8 +107,8 @@ docker run -d --name=loki --mount source=loki-data,target=/loki -p 3100:3100 gra
Notice the change in the `target=/loki` for 1.5.0 to the new data directory location specified in the [included Loki config file](../../cmd/loki/loki-docker-config.yaml).
The intermediate step of using an ubuntu image to change the ownership of the Loki files to the new user might not be necessary if you can easily access these files to run the `chown` command directly.
That is if you have access to `/var/lib/docker/volumes` or if you mounted to a different local filesystem directory, you can change the ownership directly without using a container.
The intermediate step of using an ubuntu image to change the ownership of the Loki files to the new user might not be necessary if you can easily access these files to run the `chown` command directly.
That is if you have access to `/var/lib/docker/volumes` or if you mounted to a different local filesystem directory, you can change the ownership directly without using a container.
### Loki Duration Configs
@ -146,7 +151,7 @@ The new values are:
```yaml
min_period:
max_period:
max_retries:
max_retries:
```
## 1.4.0
@ -157,9 +162,9 @@ One such config change which will affect Loki users:
In the [cache_config](../configuration/README.md#cache_config):
`defaul_validity` has changed to `default_validity`
Also in the unlikely case you were configuring your schema via arguments and not a config file, this is no longer supported. This is not something we had ever provided as an option via docs and is unlikely anyone is doing, but worth mentioning.
`defaul_validity` has changed to `default_validity`
Also in the unlikely case you were configuring your schema via arguments and not a config file, this is no longer supported. This is not something we had ever provided as an option via docs and is unlikely anyone is doing, but worth mentioning.
The other config changes should not be relevant to Loki.
@ -184,7 +189,7 @@ There are two options for upgrade if you are not on version 1.3.0 and are using
* Upgrade first to v1.3.0 **BEFORE** upgrading to v1.4.0
OR
OR
**Note:** If you are running a single binary you only need to add this flag to your single binary command.

@ -6,6 +6,7 @@ require (
github.com/blang/semver v3.5.1+incompatible // indirect
github.com/bmatcuk/doublestar v1.2.2
github.com/c2h5oh/datasize v0.0.0-20200112174442-28bbd4740fee
github.com/cespare/xxhash/v2 v2.1.1
github.com/containerd/fifo v0.0.0-20190226154929-a9fb20d87448 // indirect
github.com/coreos/go-systemd v0.0.0-20191104093116-d3cd4ed1dbcf
github.com/cortexproject/cortex v1.2.1-0.20200702073552-0ea5a8b50b19
@ -44,6 +45,7 @@ require (
github.com/prometheus/client_model v0.2.0
github.com/prometheus/common v0.10.0
github.com/prometheus/prometheus v1.8.2-0.20200626180636-d17d88935c8d
github.com/segmentio/fasthash v1.0.2
github.com/shurcooL/httpfs v0.0.0-20190707220628-8d4bc4ba7749
github.com/shurcooL/vfsgen v0.0.0-20181202132449-6a9ea43bcacd
github.com/stretchr/testify v1.5.1
@ -53,6 +55,7 @@ require (
github.com/weaveworks/common v0.0.0-20200512154658-384f10054ec5
go.etcd.io/bbolt v1.3.5-0.20200615073812-232d8fc87f50
golang.org/x/net v0.0.0-20200602114024-627f9648deb9
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae // indirect
google.golang.org/grpc v1.29.1
gopkg.in/alecthomas/kingpin.v2 v2.2.6
gopkg.in/fsnotify.v1 v1.4.7

@ -1074,6 +1074,8 @@ github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529 h1:nn5Wsu0esKSJiIVhscUt
github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529/go.mod h1:DxrIzT+xaE7yg65j358z/aeFdxmN0P9QXhEzd20vsDc=
github.com/segmentio/fasthash v0.0.0-20180216231524-a72b379d632e h1:uO75wNGioszjmIzcY/tvdDYKRLVvzggtAmmJkn9j4GQ=
github.com/segmentio/fasthash v0.0.0-20180216231524-a72b379d632e/go.mod h1:tm/wZFQ8e24NYaBGIlnO2WGCAi67re4HHuOm0sftE/M=
github.com/segmentio/fasthash v1.0.2 h1:86fGDl2hB+iSHYlccB/FP9qRGvLNuH/fhEEFn6gnQUs=
github.com/segmentio/fasthash v1.0.2/go.mod h1:waKX8l2N8yckOgmSsXJi7x1ZfdKZ4x7KRMzBtS3oedY=
github.com/segmentio/kafka-go v0.1.0/go.mod h1:X6itGqS9L4jDletMsxZ7Dz+JFWxM6JHfPOCvTvk+EJo=
github.com/segmentio/kafka-go v0.2.0/go.mod h1:X6itGqS9L4jDletMsxZ7Dz+JFWxM6JHfPOCvTvk+EJo=
github.com/sercand/kuberesolver v2.1.0+incompatible h1:iJ1oCzPQ/aacsbCWLfJW1hPKkHMvCEgNSA9kvWcb9MY=
@ -1429,6 +1431,8 @@ golang.org/x/sys v0.0.0-20200420163511-1957bb5e6d1f/go.mod h1:h1NjWce9XRLGQEsW7w
golang.org/x/sys v0.0.0-20200602225109-6fdc65e7d980/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200615200032-f1bc736245b1 h1:ogLJMz+qpzav7lGMh10LMvAkM/fAoGlaiiHYiFYdm80=
golang.org/x/sys v0.0.0-20200615200032-f1bc736245b1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae h1:Ih9Yo4hSPImZOpfGuA4bR/ORKTAbhZo2AbWNRCnevdo=
golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.0.0-20160726164857-2910a502d2bf/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=

@ -93,6 +93,10 @@ func (c *dumbChunk) Iterator(_ context.Context, from, through time.Time, directi
}, nil
}
func (c *dumbChunk) SampleIterator(_ context.Context, from, through time.Time, _ logql.LineFilter, _ logql.SampleExtractor) iter.SampleIterator {
return nil
}
func (c *dumbChunk) Bytes() ([]byte, error) {
return nil, nil
}

@ -0,0 +1,73 @@
package chunkenc
import (
"hash/fnv"
"hash/maphash"
"testing"
"github.com/cespare/xxhash/v2"
"github.com/segmentio/fasthash/fnv1a"
"github.com/stretchr/testify/require"
"github.com/grafana/loki/pkg/chunkenc/testdata"
)
var res uint64
func Benchmark_fnv64a(b *testing.B) {
for n := 0; n < b.N; n++ {
for i := 0; i < len(testdata.LogsBytes); i++ {
h := fnv.New64a()
_, _ = h.Write(testdata.LogsBytes[i])
res = h.Sum64()
}
}
}
func Benchmark_fnv64a_third_party(b *testing.B) {
for n := 0; n < b.N; n++ {
for i := 0; i < len(testdata.LogsBytes); i++ {
res = fnv1a.HashBytes64(testdata.LogsBytes[i])
}
}
}
func Benchmark_xxhash(b *testing.B) {
for n := 0; n < b.N; n++ {
for i := 0; i < len(testdata.LogsBytes); i++ {
res = xxhash.Sum64(testdata.LogsBytes[i])
}
}
}
func Benchmark_hashmap(b *testing.B) {
// hash/maphash was discarded: its seeds are random, so the same entry hashes to a different value in each binary.
var h maphash.Hash
for n := 0; n < b.N; n++ {
for i := 0; i < len(testdata.LogsBytes); i++ {
h.SetSeed(maphash.MakeSeed())
_, _ = h.Write(testdata.LogsBytes[i])
res = h.Sum64()
}
}
}
func Test_xxhash_integrity(t *testing.T) {
data := []uint64{}
for i := 0; i < len(testdata.LogsBytes); i++ {
data = append(data, xxhash.Sum64(testdata.LogsBytes[i]))
}
for i := 0; i < len(testdata.LogsBytes); i++ {
require.Equal(t, data[i], xxhash.Sum64(testdata.LogsBytes[i]))
}
unique := map[uint64]struct{}{}
for i := 0; i < len(testdata.LogsBytes); i++ {
_, ok := unique[xxhash.Sum64(testdata.LogsBytes[i])]
require.False(t, ok, string(testdata.LogsBytes[i])) // all lines have been made unique
unique[xxhash.Sum64(testdata.LogsBytes[i])] = struct{}{}
}
}
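The commit message says a hash is needed for deduping lines. The following is a minimal sketch of how a per-line hash could serve that purpose, for example when the same line is returned by more than one source. It is not Loki's implementation: the `sample` type, `hashLine`, and `dedupe` are hypothetical names, and FNV-1a from the standard library stands in for xxhash so the sketch runs without third-party dependencies.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// sample is a hypothetical stand-in for logproto.Sample.
type sample struct {
	Timestamp int64
	Value     float64
	Hash      uint64
}

// hashLine hashes a log line. The commit settles on xxhash; FNV-1a is used
// here only to keep the sketch dependency-free.
func hashLine(line []byte) uint64 {
	h := fnv.New64a()
	_, _ = h.Write(line)
	return h.Sum64()
}

// dedupe keeps the first sample seen for each (timestamp, hash) pair.
func dedupe(in []sample) []sample {
	type key struct {
		ts   int64
		hash uint64
	}
	seen := make(map[key]struct{}, len(in))
	out := make([]sample, 0, len(in))
	for _, s := range in {
		k := key{ts: s.Timestamp, hash: s.Hash}
		if _, ok := seen[k]; ok {
			continue // same line at the same timestamp: drop the duplicate
		}
		seen[k] = struct{}{}
		out = append(out, s)
	}
	return out
}

func main() {
	a := []byte("level=info msg=a")
	b := []byte("level=info msg=b")
	in := []sample{
		{Timestamp: 1, Value: 1, Hash: hashLine(a)},
		{Timestamp: 1, Value: 1, Hash: hashLine(a)}, // duplicate of the first
		{Timestamp: 1, Value: 1, Hash: hashLine(b)}, // different line, same timestamp
	}
	fmt.Println(len(dedupe(in)))
}
```

Keying on the timestamp as well as the hash means two different lines logged at the same instant are both kept, while an accidental hash collision only matters if it also shares a timestamp.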

@ -98,6 +98,7 @@ type Chunk interface {
SpaceFor(*logproto.Entry) bool
Append(*logproto.Entry) error
Iterator(ctx context.Context, from, through time.Time, direction logproto.Direction, filter logql.LineFilter) (iter.EntryIterator, error)
SampleIterator(ctx context.Context, from, through time.Time, filter logql.LineFilter, extractor logql.SampleExtractor) iter.SampleIterator
// Returns the list of blocks in the chunks.
Blocks(mintT, maxtT time.Time) []Block
Size() int
@ -121,4 +122,6 @@ type Block interface {
Entries() int
// Iterator returns an entry iterator for the block.
Iterator(context.Context, logql.LineFilter) iter.EntryIterator
// SampleIterator returns a sample iterator for the block.
SampleIterator(context.Context, logql.LineFilter, logql.SampleExtractor) iter.SampleIterator
}
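To illustrate how a caller might consume the new `SampleIterator` methods, here is a small sketch with a slice-backed iterator. The `sampleIterator` interface below only mirrors the methods that the tests in this diff call (`Next`, `Sample`, `Error`, `Close`); it is an assumption for illustration, not the real `iter.SampleIterator` definition, and `sliceIterator` and `sum` are hypothetical.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for logproto.Sample.
type sample struct {
	Timestamp int64
	Value     float64
}

// sampleIterator mirrors the methods exercised by the tests in this diff.
// It is an illustration, not the real iter.SampleIterator interface.
type sampleIterator interface {
	Next() bool
	Sample() sample
	Error() error
	Close() error
}

// sliceIterator walks an in-memory slice of samples.
type sliceIterator struct {
	samples []sample
	idx     int
}

func newSliceIterator(s []sample) *sliceIterator {
	return &sliceIterator{samples: s, idx: -1}
}

func (it *sliceIterator) Next() bool {
	if it.idx+1 >= len(it.samples) {
		return false
	}
	it.idx++
	return true
}

// Sample must only be called after Next has returned true.
func (it *sliceIterator) Sample() sample { return it.samples[it.idx] }
func (it *sliceIterator) Error() error   { return nil }
func (it *sliceIterator) Close() error   { return nil }

// sum drains the iterator and adds up the sample values, then closes it.
func sum(it sampleIterator) (float64, error) {
	var total float64
	for it.Next() {
		total += it.Sample().Value
	}
	if err := it.Error(); err != nil {
		_ = it.Close()
		return 0, err
	}
	return total, it.Close()
}

func main() {
	it := newSliceIterator([]sample{{1, 1}, {2, 1}, {3, 1}})
	total, err := sum(it)
	fmt.Println(total, err)
}
```

When every sample has the value 1, as with the count extractor used in the chunk tests, the sum is simply the number of matching lines.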

@ -11,6 +11,7 @@ import (
"io"
"time"
"github.com/cespare/xxhash/v2"
"github.com/cortexproject/cortex/pkg/util"
"github.com/go-kit/kit/log/level"
"github.com/pkg/errors"
@ -502,6 +503,29 @@ func (c *MemChunk) Iterator(ctx context.Context, mintT, maxtT time.Time, directi
return iter.NewEntryReversedIter(iterForward)
}
// SampleIterator implements Chunk.
func (c *MemChunk) SampleIterator(ctx context.Context, mintT, maxtT time.Time, filter logql.LineFilter, extractor logql.SampleExtractor) iter.SampleIterator {
mint, maxt := mintT.UnixNano(), maxtT.UnixNano()
its := make([]iter.SampleIterator, 0, len(c.blocks)+1)
for _, b := range c.blocks {
if maxt < b.mint || b.maxt < mint {
continue
}
its = append(its, b.SampleIterator(ctx, filter, extractor))
}
if !c.head.isEmpty() {
its = append(its, c.head.sampleIterator(ctx, mint, maxt, filter, extractor))
}
return iter.NewTimeRangedSampleIterator(
iter.NewNonOverlappingSampleIterator(its, ""),
mint,
maxt,
)
}
// Blocks implements Chunk
func (c *MemChunk) Blocks(mintT, maxtT time.Time) []Block {
mint, maxt := mintT.UnixNano(), maxtT.UnixNano()
@ -519,7 +543,14 @@ func (b block) Iterator(ctx context.Context, filter logql.LineFilter) iter.Entry
if len(b.b) == 0 {
return emptyIterator
}
return newBufferedIterator(ctx, b.readers, b.b, filter)
return newEntryIterator(ctx, b.readers, b.b, filter)
}
func (b block) SampleIterator(ctx context.Context, filter logql.LineFilter, extractor logql.SampleExtractor) iter.SampleIterator {
if len(b.b) == 0 {
return iter.NoopIterator
}
return newSampleIterator(ctx, b.readers, b.b, filter, extractor)
}
func (b block) Offset() int {
@ -566,6 +597,34 @@ func (hb *headBlock) iterator(ctx context.Context, mint, maxt int64, filter logq
}
}
func (hb *headBlock) sampleIterator(ctx context.Context, mint, maxt int64, filter logql.LineFilter, extractor logql.SampleExtractor) iter.SampleIterator {
if hb.isEmpty() || (maxt < hb.mint || hb.maxt < mint) {
return iter.NoopIterator
}
chunkStats := stats.GetChunkData(ctx)
chunkStats.HeadChunkLines += int64(len(hb.entries))
samples := make([]logproto.Sample, 0, len(hb.entries))
for _, e := range hb.entries {
chunkStats.HeadChunkBytes += int64(len(e.s))
if filter == nil || filter.Filter([]byte(e.s)) {
if value, ok := extractor.Extract([]byte(e.s)); ok {
samples = append(samples, logproto.Sample{
Timestamp: e.t,
Value: value,
Hash: xxhash.Sum64([]byte(e.s)),
})
}
}
}
if len(samples) == 0 {
return iter.NoopIterator
}
return iter.NewSeriesIterator(logproto.Series{Samples: samples})
}
var emptyIterator = &listIterator{}
type listIterator struct {
@ -604,12 +663,13 @@ type bufferedIterator struct {
reader io.Reader
pool ReaderPool
cur logproto.Entry
err error
buf []byte // The buffer for a single entry.
decBuf []byte // The buffer for decoding the lengths.
decBuf []byte // The buffer for decoding the lengths.
buf []byte // The buffer for a single entry.
currLine []byte // the current line, this is the same as the buffer but sliced to the line size.
currTs int64
consumed bool
closed bool
@ -627,6 +687,7 @@ func newBufferedIterator(ctx context.Context, pool ReaderPool, b []byte, filter
pool: pool,
filter: filter,
decBuf: make([]byte, binary.MaxVarintLen64),
consumed: true,
}
}
@ -649,8 +710,9 @@ func (si *bufferedIterator) Next() bool {
if si.filter != nil && !si.filter.Filter(line) {
continue
}
si.cur.Line = string(line)
si.cur.Timestamp = time.Unix(0, ts)
si.currTs = ts
si.currLine = line
si.consumed = false
return true
}
}
@ -690,7 +752,6 @@ func (si *bufferedIterator) moveNext() (int64, []byte, bool) {
return 0, nil, false
}
}
// Then process reading the line.
n, err := si.bufReader.Read(si.buf[:lineSize])
if err != nil && err != io.EOF {
@ -708,10 +769,6 @@ func (si *bufferedIterator) moveNext() (int64, []byte, bool) {
return ts, si.buf[:lineSize], true
}
func (si *bufferedIterator) Entry() logproto.Entry {
return si.cur
}
func (si *bufferedIterator) Error() error { return si.err }
func (si *bufferedIterator) Close() error {
@ -741,3 +798,58 @@ func (si *bufferedIterator) close() {
}
func (si *bufferedIterator) Labels() string { return "" }
func newEntryIterator(ctx context.Context, pool ReaderPool, b []byte, filter logql.LineFilter) iter.EntryIterator {
return &entryBufferedIterator{
bufferedIterator: newBufferedIterator(ctx, pool, b, filter),
}
}
type entryBufferedIterator struct {
*bufferedIterator
cur logproto.Entry
}
func (e *entryBufferedIterator) Entry() logproto.Entry {
if !e.consumed {
e.cur.Timestamp = time.Unix(0, e.currTs)
e.cur.Line = string(e.currLine)
e.consumed = true
}
return e.cur
}
func newSampleIterator(ctx context.Context, pool ReaderPool, b []byte, filter logql.LineFilter, extractor logql.SampleExtractor) iter.SampleIterator {
it := &sampleBufferedIterator{
bufferedIterator: newBufferedIterator(ctx, pool, b, filter),
extractor: extractor,
}
return it
}
type sampleBufferedIterator struct {
*bufferedIterator
extractor logql.SampleExtractor
cur logproto.Sample
currValue float64
}
func (e *sampleBufferedIterator) Next() bool {
var ok bool
for e.bufferedIterator.Next() {
if e.currValue, ok = e.extractor.Extract(e.currLine); ok {
return true
}
}
return false
}
func (e *sampleBufferedIterator) Sample() logproto.Sample {
if !e.consumed {
e.cur.Timestamp = e.currTs
e.cur.Hash = xxhash.Sum64(e.currLine)
e.cur.Value = e.currValue
e.consumed = true
}
return e.cur
}
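The `consumed` flag in the hunk above defers building an entry or sample until the caller asks for it, so positions that are skipped never pay for the string conversion or the hash. A minimal sketch of that pattern follows; the `entry`, `sample`, and `lazyIterator` types are hypothetical, FNV-1a replaces xxhash to avoid a dependency, and the `hashed` counter exists only to make the effect observable.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// entry is a hypothetical raw record: a timestamp and the bytes of one line.
type entry struct {
	ts   int64
	line []byte
}

// sample is a hypothetical stand-in for logproto.Sample.
type sample struct {
	Timestamp int64
	Hash      uint64
}

// lazyIterator advances cheaply in Next and only builds a sample, including
// its hash, when Sample is called for the current position.
type lazyIterator struct {
	entries  []entry
	pos      int
	curr     entry
	consumed bool
	cur      sample
	hashed   int // number of hashes computed, tracked for illustration only
}

func newLazyIterator(entries []entry) *lazyIterator {
	return &lazyIterator{entries: entries, consumed: true}
}

func (it *lazyIterator) Next() bool {
	if it.pos >= len(it.entries) {
		return false
	}
	it.curr = it.entries[it.pos]
	it.pos++
	it.consumed = false // the current position has not been materialized yet
	return true
}

func (it *lazyIterator) Sample() sample {
	if !it.consumed {
		h := fnv.New64a() // FNV-1a stands in for xxhash here
		_, _ = h.Write(it.curr.line)
		it.cur = sample{Timestamp: it.curr.ts, Hash: h.Sum64()}
		it.hashed++
		it.consumed = true
	}
	return it.cur
}

func main() {
	it := newLazyIterator([]entry{{1, []byte("a")}, {2, []byte("b")}, {3, []byte("c")}})
	it.Next()
	it.Next() // the first entry is skipped without being materialized
	_ = it.Sample()
	_ = it.Sample() // cached: no second hash is computed
	fmt.Println(it.hashed)
}
```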

@ -112,6 +112,21 @@ func TestBlock(t *testing.T) {
}
require.NoError(t, it.Error())
require.NoError(t, it.Close())
require.Equal(t, len(cases), idx)
sampleIt := chk.SampleIterator(context.Background(), time.Unix(0, 0), time.Unix(0, math.MaxInt64), nil, logql.ExtractCount)
idx = 0
for sampleIt.Next() {
s := sampleIt.Sample()
require.Equal(t, cases[idx].ts, s.Timestamp)
require.Equal(t, 1., s.Value)
require.NotEmpty(t, s.Hash)
idx++
}
require.NoError(t, sampleIt.Error())
require.NoError(t, sampleIt.Close())
require.Equal(t, len(cases), idx)
t.Run("bounded-iteration", func(t *testing.T) {
@ -225,7 +240,7 @@ func TestSerialization(t *testing.T) {
t.Run(enc.String(), func(t *testing.T) {
chk := NewMemChunk(enc, testBlockSize, testTargetSize)
numSamples := 500000
numSamples := 50000
for i := 0; i < numSamples; i++ {
require.NoError(t, chk.Append(logprotoEntry(int64(i), string(i))))
@ -246,9 +261,18 @@ func TestSerialization(t *testing.T) {
require.Equal(t, int64(i), e.Timestamp.UnixNano())
require.Equal(t, string(i), e.Line)
}
require.NoError(t, it.Error())
sampleIt := bc.SampleIterator(context.Background(), time.Unix(0, 0), time.Unix(0, math.MaxInt64), nil, logql.ExtractCount)
for i := 0; i < numSamples; i++ {
require.True(t, sampleIt.Next(), i)
s := sampleIt.Sample()
require.Equal(t, int64(i), s.Timestamp)
require.Equal(t, 1., s.Value)
}
require.NoError(t, sampleIt.Error())
byt2, err := chk.Bytes()
require.NoError(t, err)

File diff suppressed because it is too large.

@ -229,7 +229,11 @@ func (s *testStore) IsLocal() bool {
return false
}
func (s *testStore) LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error) {
func (s *testStore) SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error) {
return nil, nil
}
func (s *testStore) SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error) {
return nil, nil
}

@ -122,7 +122,8 @@ type Ingester struct {
// ChunkStore is the interface we need to store chunks.
type ChunkStore interface {
Put(ctx context.Context, chunks []chunk.Chunk) error
LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error)
SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error)
SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error)
}
// New makes a new Ingester.
@ -285,13 +286,21 @@ func (i *Ingester) Query(req *logproto.QueryRequest, queryServer logproto.Querie
}
instance := i.getOrCreateInstance(instanceID)
itrs, err := instance.Query(ctx, req)
itrs, err := instance.Query(ctx, logql.SelectLogParams{QueryRequest: req})
if err != nil {
return err
}
if storeReq := buildStoreRequest(i.cfg, req); storeReq != nil {
storeItr, err := i.store.LazyQuery(ctx, logql.SelectParams{QueryRequest: storeReq})
if start, end, ok := buildStoreRequest(i.cfg, req.Start, req.End, time.Now()); ok {
storeReq := logql.SelectLogParams{QueryRequest: &logproto.QueryRequest{
Selector: req.Selector,
Direction: req.Direction,
Start: start,
End: end,
Limit: req.Limit,
Shards: req.Shards,
}}
storeItr, err := i.store.SelectLogs(ctx, storeReq)
if err != nil {
return err
}
@ -306,6 +315,45 @@ func (i *Ingester) Query(req *logproto.QueryRequest, queryServer logproto.Querie
return sendBatches(queryServer.Context(), heapItr, queryServer, req.Limit)
}
// QuerySample the ingesters for series from logs matching a set of matchers.
func (i *Ingester) QuerySample(req *logproto.SampleQueryRequest, queryServer logproto.Querier_QuerySampleServer) error {
// initialize stats collection for ingester queries and set grpc trailer with stats.
ctx := stats.NewContext(queryServer.Context())
defer stats.SendAsTrailer(ctx, queryServer)
instanceID, err := user.ExtractOrgID(ctx)
if err != nil {
return err
}
instance := i.getOrCreateInstance(instanceID)
itrs, err := instance.QuerySample(ctx, logql.SelectSampleParams{SampleQueryRequest: req})
if err != nil {
return err
}
if start, end, ok := buildStoreRequest(i.cfg, req.Start, req.End, time.Now()); ok {
storeReq := logql.SelectSampleParams{SampleQueryRequest: &logproto.SampleQueryRequest{
Start: start,
End: end,
Selector: req.Selector,
Shards: req.Shards,
}}
storeItr, err := i.store.SelectSamples(ctx, storeReq)
if err != nil {
return err
}
itrs = append(itrs, storeItr)
}
heapItr := iter.NewHeapSampleIterator(ctx, itrs)
defer helpers.LogErrorWithContext(ctx, "closing iterator", heapItr.Close)
return sendSampleBatches(queryServer.Context(), heapItr, queryServer)
}
// Label returns the set of labels for the stream this ingester knows about.
func (i *Ingester) Label(ctx context.Context, req *logproto.LabelRequest) (*logproto.LabelResponse, error) {
instanceID, err := user.ExtractOrgID(ctx)
@ -336,7 +384,7 @@ func (i *Ingester) Label(ctx context.Context, req *logproto.LabelRequest) (*logp
return nil, err
}
// Adjust the start time based on QueryStoreMaxLookBackPeriod.
start := adjustQueryStartTime(i.cfg, *req.Start)
start := adjustQueryStartTime(i.cfg, *req.Start, time.Now())
if start.After(*req.End) {
// The request is older than we are allowed to query the store, just return what we have.
return resp, nil
@ -454,30 +502,23 @@ func (i *Ingester) TailersCount(ctx context.Context, in *logproto.TailersCountRe
// buildStoreRequest returns a store request from an ingester request, returns nil if QueryStore is set to false in configuration.
// The request may be truncated due to QueryStoreMaxLookBackPeriod which limits the range of request to make sure
// we only query enough to not miss any data and not add too many duplicates by covering the whole time range in query.
func buildStoreRequest(cfg Config, req *logproto.QueryRequest) *logproto.QueryRequest {
func buildStoreRequest(cfg Config, start, end, now time.Time) (time.Time, time.Time, bool) {
if !cfg.QueryStore {
return nil
return time.Time{}, time.Time{}, false
}
start := req.Start
end := req.End
start = adjustQueryStartTime(cfg, start)
start = adjustQueryStartTime(cfg, start, now)
if start.After(end) {
return nil
return time.Time{}, time.Time{}, false
}
newRequest := *req
newRequest.Start = start
newRequest.End = end
return &newRequest
return start, end, true
}
func adjustQueryStartTime(cfg Config, start time.Time) time.Time {
func adjustQueryStartTime(cfg Config, start, now time.Time) time.Time {
if cfg.QueryStoreMaxLookBackPeriod > 0 {
oldestStartTime := time.Now().Add(-cfg.QueryStoreMaxLookBackPeriod)
oldestStartTime := now.Add(-cfg.QueryStoreMaxLookBackPeriod)
if oldestStartTime.After(start) {
start = oldestStartTime
return oldestStartTime
}
}
return start

@ -263,7 +263,11 @@ func (s *mockStore) Put(ctx context.Context, chunks []chunk.Chunk) error {
return nil
}
func (s *mockStore) LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error) {
func (s *mockStore) SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error) {
return nil, nil
}
func (s *mockStore) SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error) {
return nil, nil
}
@ -291,75 +295,64 @@ func defaultLimitsTestConfig() validation.Limits {
}
func TestIngester_buildStoreRequest(t *testing.T) {
ingesterQueryRequest := logproto.QueryRequest{
Selector: `{foo="bar"}`,
Limit: 100,
}
now := time.Now()
for _, tc := range []struct {
name string
queryStore bool
maxLookBackPeriod time.Duration
ingesterQueryRequest *logproto.QueryRequest
expectedStoreQueryRequest *logproto.QueryRequest
name string
queryStore bool
maxLookBackPeriod time.Duration
start, end time.Time
expectedStart, expectedEnd time.Time
shouldQuery bool
}{
{
name: "do not query store",
queryStore: false,
ingesterQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-time.Minute), now),
expectedStoreQueryRequest: nil,
name: "do not query store",
queryStore: false,
start: now.Add(-time.Minute),
end: now,
shouldQuery: false,
},
{
name: "query store with max look back covering whole request duration",
queryStore: true,
maxLookBackPeriod: time.Hour,
ingesterQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-10*time.Minute), now),
expectedStoreQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-10*time.Minute), now),
name: "query store with max look back covering whole request duration",
queryStore: true,
maxLookBackPeriod: time.Hour,
start: now.Add(-10 * time.Minute),
end: now,
expectedStart: now.Add(-10 * time.Minute),
expectedEnd: now,
shouldQuery: true,
},
{
name: "query store with max look back covering partial request duration",
queryStore: true,
maxLookBackPeriod: time.Hour,
ingesterQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-2*time.Hour), now),
expectedStoreQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-time.Hour), now),
name: "query store with max look back covering partial request duration",
queryStore: true,
maxLookBackPeriod: time.Hour,
start: now.Add(-2 * time.Hour),
end: now,
expectedStart: now.Add(-time.Hour),
expectedEnd: now,
shouldQuery: true,
},
{
name: "query store with max look back not covering request duration at all",
queryStore: true,
maxLookBackPeriod: time.Hour,
ingesterQueryRequest: recreateRequestWithTime(ingesterQueryRequest, now.Add(-4*time.Hour), now.Add(-2*time.Hour)),
expectedStoreQueryRequest: nil,
name: "query store with max look back not covering request duration at all",
queryStore: true,
maxLookBackPeriod: time.Hour,
start: now.Add(-4 * time.Hour),
end: now.Add(-2 * time.Hour),
shouldQuery: false,
},
} {
t.Run(tc.name, func(t *testing.T) {
ingesterConfig := defaultIngesterTestConfig(t)
ingesterConfig.QueryStore = tc.queryStore
ingesterConfig.QueryStoreMaxLookBackPeriod = tc.maxLookBackPeriod
storeRequest := buildStoreRequest(ingesterConfig, tc.ingesterQueryRequest)
if tc.expectedStoreQueryRequest == nil {
require.Nil(t, storeRequest)
return
}
// because start time of store could be changed and built based on time when function is called we can't predict expected start time.
// So allowing upto 1s difference between expected and actual start time of store query request.
require.Equal(t, tc.expectedStoreQueryRequest.Selector, storeRequest.Selector)
require.Equal(t, tc.expectedStoreQueryRequest.Limit, storeRequest.Limit)
require.Equal(t, tc.expectedStoreQueryRequest.End, storeRequest.End)
start, end, ok := buildStoreRequest(ingesterConfig, tc.start, tc.end, now)
if storeRequest.Start.Sub(tc.expectedStoreQueryRequest.Start) > time.Second {
t.Fatalf("expected upto 1s difference in expected and actual store request end time but got %d", storeRequest.End.Sub(tc.expectedStoreQueryRequest.End))
if !tc.shouldQuery {
require.False(t, ok)
return
}
require.Equal(t, tc.expectedEnd, end, "end")
require.Equal(t, tc.expectedStart, start, "start")
})
}
}
func recreateRequestWithTime(req logproto.QueryRequest, start, end time.Time) *logproto.QueryRequest {
newReq := req
newReq.Start = start
newReq.End = end
return &newReq
}

@ -28,7 +28,10 @@ import (
"github.com/grafana/loki/pkg/util/validation"
)
const queryBatchSize = 128
const (
queryBatchSize = 128
queryBatchSampleSize = 512
)
// Errors returned on Query.
var (
@ -192,8 +195,8 @@ func (i *instance) getLabelsFromFingerprint(fp model.Fingerprint) labels.Labels
return s.labels
}
func (i *instance) Query(ctx context.Context, req *logproto.QueryRequest) ([]iter.EntryIterator, error) {
expr, err := (logql.SelectParams{QueryRequest: req}).LogSelector()
func (i *instance) Query(ctx context.Context, req logql.SelectLogParams) ([]iter.EntryIterator, error) {
expr, err := req.LogSelector()
if err != nil {
return nil, err
}
@ -223,6 +226,40 @@ func (i *instance) Query(ctx context.Context, req *logproto.QueryRequest) ([]ite
return iters, nil
}
func (i *instance) QuerySample(ctx context.Context, req logql.SelectSampleParams) ([]iter.SampleIterator, error) {
expr, err := req.Expr()
if err != nil {
return nil, err
}
filter, err := expr.Selector().Filter()
if err != nil {
return nil, err
}
extractor, err := expr.Extractor()
if err != nil {
return nil, err
}
ingStats := stats.GetIngesterData(ctx)
var iters []iter.SampleIterator
err = i.forMatchingStreams(
expr.Selector().Matchers(),
func(stream *stream) error {
ingStats.TotalChunksMatched += int64(len(stream.chunks))
iter, err := stream.SampleIterator(ctx, req.Start, req.End, filter, extractor)
if err != nil {
return err
}
iters = append(iters, iter)
return nil
},
)
if err != nil {
return nil, err
}
return iters, nil
}
func (i *instance) Label(_ context.Context, req *logproto.LabelRequest) (*logproto.LabelResponse, error) {
var labels []string
if req.Values {
@ -466,6 +503,26 @@ func sendBatches(ctx context.Context, i iter.EntryIterator, queryServer logproto
return nil
}
func sendSampleBatches(ctx context.Context, it iter.SampleIterator, queryServer logproto.Querier_QuerySampleServer) error {
ingStats := stats.GetIngesterData(ctx)
for !isDone(ctx) {
batch, size, err := iter.ReadSampleBatch(it, queryBatchSampleSize)
if err != nil {
return err
}
if len(batch.Series) == 0 {
return nil
}
if err := queryServer.Send(batch); err != nil {
return err
}
ingStats.TotalLinesSent += int64(size)
ingStats.TotalBatches++
}
return nil
}
func shouldConsiderStream(stream *stream, req *logproto.SeriesRequest) bool {
firstchunkFrom, _ := stream.chunks[0].chunk.Bounds()
_, lastChunkTo := stream.chunks[len(stream.chunks)-1].chunk.Bounds()
@ -474,4 +531,4 @@ func shouldConsiderStream(stream *stream, req *logproto.SeriesRequest) bool {
return true
}
return false
}
}

@ -277,6 +277,18 @@ func (s *stream) Iterator(ctx context.Context, from, through time.Time, directio
return iter.NewNonOverlappingIterator(iterators, s.labelsString), nil
}
// SampleIterator returns a SampleIterator over the samples extracted from the stream's chunks.
func (s *stream) SampleIterator(ctx context.Context, from, through time.Time, filter logql.LineFilter, extractor logql.SampleExtractor) (iter.SampleIterator, error) {
iterators := make([]iter.SampleIterator, 0, len(s.chunks))
for _, c := range s.chunks {
if itr := c.chunk.SampleIterator(ctx, from, through, filter, extractor); itr != nil {
iterators = append(iterators, itr)
}
}
return iter.NewNonOverlappingSampleIterator(iterators, s.labelsString), nil
}
func (s *stream) addTailer(t *tailer) {
s.tailerMtx.Lock()
defer s.tailerMtx.Unlock()

@ -26,11 +26,12 @@ type noOpIterator struct{}
var NoopIterator = noOpIterator{}
func (noOpIterator) Next() bool { return false }
func (noOpIterator) Error() error { return nil }
func (noOpIterator) Labels() string { return "" }
func (noOpIterator) Entry() logproto.Entry { return logproto.Entry{} }
func (noOpIterator) Close() error { return nil }
func (noOpIterator) Next() bool { return false }
func (noOpIterator) Error() error { return nil }
func (noOpIterator) Labels() string { return "" }
func (noOpIterator) Entry() logproto.Entry { return logproto.Entry{} }
func (noOpIterator) Sample() logproto.Sample { return logproto.Sample{} }
func (noOpIterator) Close() error { return nil }
// streamIterator iterates over entries in a stream.
type streamIterator struct {

@ -567,7 +567,7 @@ func Test_timeRangedIterator_Next(t *testing.T) {
}
for _, tt := range tests {
t.Run(fmt.Sprintf("mint:%d maxt:%d", tt.mint.UnixNano(), tt.maxt.UnixNano()), func(t *testing.T) {
i := NewTimeRangedIterator(
it := NewTimeRangedIterator(
NewStreamIterator(
logproto.Stream{Entries: []logproto.Entry{
{Timestamp: time.Unix(0, 1)},
@ -578,9 +578,25 @@ func Test_timeRangedIterator_Next(t *testing.T) {
tt.maxt,
)
for _, b := range tt.expect {
require.Equal(t, b, i.Next())
require.Equal(t, b, it.Next())
}
require.NoError(t, i.Close())
require.NoError(t, it.Close())
})
t.Run(fmt.Sprintf("mint:%d maxt:%d_sample", tt.mint.UnixNano(), tt.maxt.UnixNano()), func(t *testing.T) {
it := NewTimeRangedSampleIterator(
NewSeriesIterator(
logproto.Series{Samples: []logproto.Sample{
sample(1),
sample(2),
sample(3),
}}),
tt.mint.UnixNano(),
tt.maxt.UnixNano(),
)
for _, b := range tt.expect {
require.Equal(t, b, it.Next())
}
require.NoError(t, it.Close())
})
}
}
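As a reading aid for the time-ranged sample iterator exercised by the test above, here is a minimal, self-contained sketch of a half-open `[mint, maxt)` window over sorted timestamps. The function name `filterRange` and the plain `[]int64` input are illustrative assumptions, not part of this change, and the sketch does not reproduce the iterator's special case where a sample exactly at `mint` is accepted before `maxt` is checked.

```go
package main

import "fmt"

// filterRange returns the timestamps accepted by a [mint, maxt) window.
// It assumes timestamps is sorted in ascending order, so it can stop at
// the first value at or beyond maxt. (Hypothetical helper for illustration.)
func filterRange(timestamps []int64, mint, maxt int64) []int64 {
	var out []int64
	for _, ts := range timestamps {
		if ts < mint {
			continue // before the window: skip and keep scanning
		}
		if ts >= maxt {
			break // at or past the exclusive upper bound: stop
		}
		out = append(out, ts)
	}
	return out
}

func main() {
	fmt.Println(filterRange([]int64{1, 2, 3}, 1, 3)) // prints [1 2]
	fmt.Println(filterRange([]int64{1, 2, 3}, 2, 4)) // prints [2 3]
}
```

The lower bound is inclusive and the upper bound exclusive, which matches the comments in `timeRangedSampleIterator.Next`.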

@ -0,0 +1,512 @@
package iter
import (
"container/heap"
"context"
"fmt"
"io"
"github.com/grafana/loki/pkg/helpers"
"github.com/grafana/loki/pkg/logproto"
"github.com/grafana/loki/pkg/logql/stats"
)
// SampleIterator iterates over samples in time-order.
type SampleIterator interface {
Next() bool
// todo(ctovena) we should add `Seek(t int64) bool`
// This way we can skip when ranging over samples.
Sample() logproto.Sample
Labels() string
Error() error
Close() error
}
// PeekingSampleIterator is a sample iterator that can peek at the next sample without advancing the iterator.
type PeekingSampleIterator interface {
SampleIterator
Peek() (string, logproto.Sample, bool)
}
type peekingSampleIterator struct {
iter SampleIterator
cache *sampleWithLabels
next *sampleWithLabels
}
type sampleWithLabels struct {
logproto.Sample
labels string
}
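// NewPeekingSampleIterator wraps iter so that the next sample can be inspected before it is consumed.
func NewPeekingSampleIterator(iter SampleIterator) PeekingSampleIterator {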
// initialize the next entry so we can peek right from the start.
var cache *sampleWithLabels
next := &sampleWithLabels{}
if iter.Next() {
cache = &sampleWithLabels{
Sample: iter.Sample(),
labels: iter.Labels(),
}
next.Sample = cache.Sample
next.labels = cache.labels
}
return &peekingSampleIterator{
iter: iter,
cache: cache,
next: next,
}
}
func (it *peekingSampleIterator) Close() error {
return it.iter.Close()
}
func (it *peekingSampleIterator) Labels() string {
if it.next != nil {
return it.next.labels
}
return ""
}
func (it *peekingSampleIterator) Next() bool {
if it.cache != nil {
it.next.Sample = it.cache.Sample
it.next.labels = it.cache.labels
it.cacheNext()
return true
}
return false
}
// cacheNext caches the next element if it exists.
func (it *peekingSampleIterator) cacheNext() {
if it.iter.Next() {
it.cache.Sample = it.iter.Sample()
it.cache.labels = it.iter.Labels()
return
}
// nothing left: remove the cached sample
it.cache = nil
}
func (it *peekingSampleIterator) Sample() logproto.Sample {
if it.next != nil {
return it.next.Sample
}
return logproto.Sample{}
}
func (it *peekingSampleIterator) Peek() (string, logproto.Sample, bool) {
if it.cache != nil {
return it.cache.labels, it.cache.Sample, true
}
return "", logproto.Sample{}, false
}
func (it *peekingSampleIterator) Error() error {
return it.iter.Error()
}
type sampleIteratorHeap []SampleIterator
func (h sampleIteratorHeap) Len() int { return len(h) }
func (h sampleIteratorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h sampleIteratorHeap) Peek() SampleIterator { return h[0] }
func (h *sampleIteratorHeap) Push(x interface{}) {
*h = append(*h, x.(SampleIterator))
}
func (h *sampleIteratorHeap) Pop() interface{} {
old := *h
n := len(old)
x := old[n-1]
*h = old[0 : n-1]
return x
}
func (h sampleIteratorHeap) Less(i, j int) bool {
s1, s2 := h[i].Sample(), h[j].Sample()
switch {
case s1.Timestamp < s2.Timestamp:
return true
case s1.Timestamp > s2.Timestamp:
return false
default:
return h[i].Labels() < h[j].Labels()
}
}
// heapSampleIterator iterates over a heap of iterators.
type heapSampleIterator struct {
heap *sampleIteratorHeap
is []SampleIterator
prefetched bool
stats *stats.ChunkData
tuples []sampletuple
curr logproto.Sample
currLabels string
errs []error
}
// NewHeapSampleIterator returns a new iterator which uses a heap to merge together
// samples from multiple iterators.
func NewHeapSampleIterator(ctx context.Context, is []SampleIterator) SampleIterator {
return &heapSampleIterator{
stats: stats.GetChunkData(ctx),
is: is,
heap: &sampleIteratorHeap{},
tuples: make([]sampletuple, 0, len(is)),
}
}
// prefetch iterates over all inner iterators to merge together, calls Next() on
// each of them to prefetch the first sample, and pushes those that are not
// empty to the heap.
func (i *heapSampleIterator) prefetch() {
if i.prefetched {
return
}
i.prefetched = true
for _, it := range i.is {
i.requeue(it, false)
}
// We can now clear the list of input iterators to merge, given they have all
// been processed and the non empty ones have been pushed to the heap
i.is = nil
}
// requeue pushes the input ei SampleIterator to the heap, advancing it via an ei.Next()
// call unless the advanced input parameter is true. In this latter case it expects that
// the iterator has already been advanced before calling requeue().
//
// If the iterator has no more samples or an error occurs while advancing it, the iterator
// is closed instead of being pushed to the heap, and any error is captured so that it can
// be retrieved via Error().
func (i *heapSampleIterator) requeue(ei SampleIterator, advanced bool) {
if advanced || ei.Next() {
heap.Push(i.heap, ei)
return
}
if err := ei.Error(); err != nil {
i.errs = append(i.errs, err)
}
helpers.LogError("closing iterator", ei.Close)
}
type sampletuple struct {
logproto.Sample
SampleIterator
}
func (i *heapSampleIterator) Next() bool {
i.prefetch()
if i.heap.Len() == 0 {
return false
}
// Several iterators may expose samples with the same timestamp and labels.
// Pop every iterator at the top of the heap whose current sample shares the
// timestamp and labels of the first one popped, so that samples with the
// same hash can be deduplicated below.
for i.heap.Len() > 0 {
next := i.heap.Peek()
sample := next.Sample()
if len(i.tuples) > 0 && (i.tuples[0].Labels() != next.Labels() || i.tuples[0].Timestamp != sample.Timestamp) {
break
}
heap.Pop(i.heap)
i.tuples = append(i.tuples, sampletuple{
Sample: sample,
SampleIterator: next,
})
}
i.curr = i.tuples[0].Sample
i.currLabels = i.tuples[0].Labels()
t := i.tuples[0]
if len(i.tuples) == 1 {
i.requeue(i.tuples[0].SampleIterator, false)
i.tuples = i.tuples[:0]
return true
}
// Requeue the iterators, advancing them if they were consumed.
for j := range i.tuples {
if i.tuples[j].Hash != i.curr.Hash {
i.requeue(i.tuples[j].SampleIterator, true)
continue
}
// we count as duplicates only if the tuple is not the one (t) used to fill the current entry
if i.tuples[j] != t {
i.stats.TotalDuplicates++
}
i.requeue(i.tuples[j].SampleIterator, false)
}
i.tuples = i.tuples[:0]
return true
}
func (i *heapSampleIterator) Sample() logproto.Sample {
return i.curr
}
func (i *heapSampleIterator) Labels() string {
return i.currLabels
}
func (i *heapSampleIterator) Error() error {
switch len(i.errs) {
case 0:
return nil
case 1:
return i.errs[0]
default:
return fmt.Errorf("Multiple errors: %+v", i.errs)
}
}
func (i *heapSampleIterator) Close() error {
for i.heap.Len() > 0 {
if err := i.heap.Pop().(SampleIterator).Close(); err != nil {
return err
}
}
i.tuples = nil
return nil
}
type sampleQueryClientIterator struct {
client QuerySampleClient
err error
curr SampleIterator
}
// QuerySampleClient is a gRPC stream client exposing only the methods used by the sampleQueryClientIterator.
type QuerySampleClient interface {
Recv() (*logproto.SampleQueryResponse, error)
Context() context.Context
CloseSend() error
}
// NewSampleQueryClientIterator returns an iterator over a QuerySampleClient.
func NewSampleQueryClientIterator(client QuerySampleClient) SampleIterator {
return &sampleQueryClientIterator{
client: client,
}
}
func (i *sampleQueryClientIterator) Next() bool {
for i.curr == nil || !i.curr.Next() {
batch, err := i.client.Recv()
if err == io.EOF {
return false
} else if err != nil {
i.err = err
return false
}
i.curr = NewSampleQueryResponseIterator(i.client.Context(), batch)
}
return true
}
func (i *sampleQueryClientIterator) Sample() logproto.Sample {
return i.curr.Sample()
}
func (i *sampleQueryClientIterator) Labels() string {
return i.curr.Labels()
}
func (i *sampleQueryClientIterator) Error() error {
return i.err
}
func (i *sampleQueryClientIterator) Close() error {
return i.client.CloseSend()
}
// NewSampleQueryResponseIterator returns an iterator over a SampleQueryResponse.
func NewSampleQueryResponseIterator(ctx context.Context, resp *logproto.SampleQueryResponse) SampleIterator {
return NewMultiSeriesIterator(ctx, resp.Series)
}
type seriesIterator struct {
i int
samples []logproto.Sample
labels string
}
// NewMultiSeriesIterator returns an iterator over multiple logproto.Series
func NewMultiSeriesIterator(ctx context.Context, series []logproto.Series) SampleIterator {
is := make([]SampleIterator, 0, len(series))
for i := range series {
is = append(is, NewSeriesIterator(series[i]))
}
return NewHeapSampleIterator(ctx, is)
}
// NewSeriesIterator returns an iterator over the samples in a series.
func NewSeriesIterator(series logproto.Series) SampleIterator {
return &seriesIterator{
i: -1,
samples: series.Samples,
labels: series.Labels,
}
}
func (i *seriesIterator) Next() bool {
i.i++
return i.i < len(i.samples)
}
func (i *seriesIterator) Error() error {
return nil
}
func (i *seriesIterator) Labels() string {
return i.labels
}
func (i *seriesIterator) Sample() logproto.Sample {
return i.samples[i.i]
}
func (i *seriesIterator) Close() error {
return nil
}
type nonOverlappingSampleIterator struct {
labels string
i int
iterators []SampleIterator
curr SampleIterator
}
// NewNonOverlappingSampleIterator gives a chained iterator over a list of iterators.
func NewNonOverlappingSampleIterator(iterators []SampleIterator, labels string) SampleIterator {
return &nonOverlappingSampleIterator{
labels: labels,
iterators: iterators,
}
}
func (i *nonOverlappingSampleIterator) Next() bool {
for i.curr == nil || !i.curr.Next() {
if len(i.iterators) == 0 {
if i.curr != nil {
i.curr.Close()
}
return false
}
if i.curr != nil {
i.curr.Close()
}
i.i++
i.curr, i.iterators = i.iterators[0], i.iterators[1:]
}
return true
}
func (i *nonOverlappingSampleIterator) Sample() logproto.Sample {
return i.curr.Sample()
}
func (i *nonOverlappingSampleIterator) Labels() string {
if i.labels != "" {
return i.labels
}
return i.curr.Labels()
}
func (i *nonOverlappingSampleIterator) Error() error {
if i.curr != nil {
return i.curr.Error()
}
return nil
}
func (i *nonOverlappingSampleIterator) Close() error {
for _, iter := range i.iterators {
iter.Close()
}
i.iterators = nil
return nil
}
type timeRangedSampleIterator struct {
SampleIterator
mint, maxt int64
}
// NewTimeRangedSampleIterator returns an iterator which filters samples by time range.
func NewTimeRangedSampleIterator(it SampleIterator, mint, maxt int64) SampleIterator {
return &timeRangedSampleIterator{
SampleIterator: it,
mint: mint,
maxt: maxt,
}
}
func (i *timeRangedSampleIterator) Next() bool {
ok := i.SampleIterator.Next()
if !ok {
i.SampleIterator.Close()
return ok
}
ts := i.SampleIterator.Sample().Timestamp
for ok && i.mint > ts {
ok = i.SampleIterator.Next()
if !ok {
continue
}
ts = i.SampleIterator.Sample().Timestamp
}
if ok {
if ts == i.mint { // The mint is inclusive
return true
}
if i.maxt < ts || i.maxt == ts { // The maxt is exclusive.
ok = false
}
}
if !ok {
i.SampleIterator.Close()
}
return ok
}
// ReadSampleBatch reads up to size samples off an iterator and groups them by labels.
func ReadSampleBatch(i SampleIterator, size uint32) (*logproto.SampleQueryResponse, uint32, error) {
series := map[string]*logproto.Series{}
respSize := uint32(0)
for ; respSize < size && i.Next(); respSize++ {
labels, sample := i.Labels(), i.Sample()
s, ok := series[labels]
if !ok {
s = &logproto.Series{
Labels: labels,
}
series[labels] = s
}
s.Samples = append(s.Samples, sample)
}
result := logproto.SampleQueryResponse{
Series: make([]logproto.Series, 0, len(series)),
}
for _, s := range series {
result.Series = append(result.Series, *s)
}
return &result, respSize, i.Error()
}

@ -0,0 +1,195 @@
package iter
import (
"context"
"io"
"testing"
"time"
"github.com/stretchr/testify/require"
"github.com/grafana/loki/pkg/logproto"
)
func TestNewPeekingSampleIterator(t *testing.T) {
iter := NewPeekingSampleIterator(NewSeriesIterator(logproto.Series{
Samples: []logproto.Sample{
{
Timestamp: time.Unix(0, 1).UnixNano(),
},
{
Timestamp: time.Unix(0, 2).UnixNano(),
},
{
Timestamp: time.Unix(0, 3).UnixNano(),
},
},
}))
_, peek, ok := iter.Peek()
if peek.Timestamp != 1 {
t.Fatal("wrong peeked time.")
}
if !ok {
t.Fatal("should be ok.")
}
hasNext := iter.Next()
if !hasNext {
t.Fatal("should have next.")
}
if iter.Sample().Timestamp != 1 {
t.Fatal("wrong peeked time.")
}
_, peek, ok = iter.Peek()
if peek.Timestamp != 2 {
t.Fatal("wrong peeked time.")
}
if !ok {
t.Fatal("should be ok.")
}
hasNext = iter.Next()
if !hasNext {
t.Fatal("should have next.")
}
if iter.Sample().Timestamp != 2 {
t.Fatal("wrong peeked time.")
}
_, peek, ok = iter.Peek()
if peek.Timestamp != 3 {
t.Fatal("wrong peeked time.")
}
if !ok {
t.Fatal("should be ok.")
}
hasNext = iter.Next()
if !hasNext {
t.Fatal("should have next.")
}
if iter.Sample().Timestamp != 3 {
t.Fatal("wrong peeked time.")
}
_, _, ok = iter.Peek()
if ok {
t.Fatal("should not be ok.")
}
require.NoError(t, iter.Close())
require.NoError(t, iter.Error())
}
func sample(i int) logproto.Sample {
return logproto.Sample{
Timestamp: int64(i),
Hash: uint64(i),
Value: float64(1),
}
}
var varSeries = logproto.Series{
Labels: `{foo="var"}`,
Samples: []logproto.Sample{
sample(1), sample(2), sample(3),
},
}
var carSeries = logproto.Series{
Labels: `{foo="car"}`,
Samples: []logproto.Sample{
sample(1), sample(2), sample(3),
},
}
func TestNewHeapSampleIterator(t *testing.T) {
it := NewHeapSampleIterator(context.Background(),
[]SampleIterator{
NewSeriesIterator(varSeries),
NewSeriesIterator(carSeries),
NewSeriesIterator(carSeries),
NewSeriesIterator(varSeries),
NewSeriesIterator(carSeries),
NewSeriesIterator(varSeries),
NewSeriesIterator(carSeries),
})
for i := 1; i < 4; i++ {
require.True(t, it.Next(), i)
require.Equal(t, `{foo="car"}`, it.Labels(), i)
require.Equal(t, sample(i), it.Sample(), i)
require.True(t, it.Next(), i)
require.Equal(t, `{foo="var"}`, it.Labels(), i)
require.Equal(t, sample(i), it.Sample(), i)
}
require.False(t, it.Next())
require.NoError(t, it.Error())
require.NoError(t, it.Close())
}
type fakeSampleClient struct {
series [][]logproto.Series
curr int
}
func (f *fakeSampleClient) Recv() (*logproto.SampleQueryResponse, error) {
if f.curr >= len(f.series) {
return nil, io.EOF
}
res := &logproto.SampleQueryResponse{
Series: f.series[f.curr],
}
f.curr++
return res, nil
}
func (fakeSampleClient) Context() context.Context { return context.Background() }
func (fakeSampleClient) CloseSend() error { return nil }
func TestNewSampleQueryClientIterator(t *testing.T) {
it := NewSampleQueryClientIterator(&fakeSampleClient{
series: [][]logproto.Series{
{varSeries},
{carSeries},
},
})
for i := 1; i < 4; i++ {
require.True(t, it.Next(), i)
require.Equal(t, `{foo="var"}`, it.Labels(), i)
require.Equal(t, sample(i), it.Sample(), i)
}
for i := 1; i < 4; i++ {
require.True(t, it.Next(), i)
require.Equal(t, `{foo="car"}`, it.Labels(), i)
require.Equal(t, sample(i), it.Sample(), i)
}
require.False(t, it.Next())
require.NoError(t, it.Error())
require.NoError(t, it.Close())
}
func TestNewNonOverlappingSampleIterator(t *testing.T) {
it := NewNonOverlappingSampleIterator([]SampleIterator{
NewSeriesIterator(varSeries),
NewSeriesIterator(logproto.Series{
Labels: varSeries.Labels,
Samples: []logproto.Sample{sample(4), sample(5)},
}),
}, varSeries.Labels)
for i := 1; i < 6; i++ {
require.True(t, it.Next(), i)
require.Equal(t, `{foo="var"}`, it.Labels(), i)
require.Equal(t, sample(i), it.Sample(), i)
}
require.False(t, it.Next())
require.NoError(t, it.Error())
require.NoError(t, it.Close())
}
func TestReadSampleBatch(t *testing.T) {
res, size, err := ReadSampleBatch(NewSeriesIterator(carSeries), 1)
require.Equal(t, &logproto.SampleQueryResponse{Series: []logproto.Series{{Labels: carSeries.Labels, Samples: []logproto.Sample{sample(1)}}}}, res)
require.Equal(t, uint32(1), size)
require.NoError(t, err)
res, size, err = ReadSampleBatch(NewMultiSeriesIterator(context.Background(), []logproto.Series{carSeries, varSeries}), 100)
require.ElementsMatch(t, []logproto.Series{carSeries, varSeries}, res.Series)
require.Equal(t, uint32(6), size)
require.NoError(t, err)
}
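The second assertion above uses `require.ElementsMatch` rather than `require.Equal`, presumably because `ReadSampleBatch` collects series in a map and Go map iteration order is unspecified. A minimal, self-contained sketch of that grouping step follows; `point`, `group` and `readBatch` are hypothetical names used for illustration only.

```go
package main

import "fmt"

// point is a hypothetical labelled sample.
type point struct {
	Labels    string
	Timestamp int64
}

// group collects the points that share one label set.
type group struct {
	Labels string
	Points []point
}

// readBatch consumes up to size points from in and groups them by labels.
// It returns the groups (in unspecified order), the number of points
// consumed, and the unread remainder of the input.
func readBatch(in []point, size int) ([]group, int, []point) {
	byLabels := map[string]*group{}
	n := 0
	for ; n < size && n < len(in); n++ {
		p := in[n]
		g, ok := byLabels[p.Labels]
		if !ok {
			g = &group{Labels: p.Labels}
			byLabels[p.Labels] = g
		}
		g.Points = append(g.Points, p)
	}
	out := make([]group, 0, len(byLabels))
	for _, g := range byLabels { // map order is not deterministic
		out = append(out, *g)
	}
	return out, n, in[n:]
}

func main() {
	in := []point{{"car", 1}, {"var", 1}, {"car", 2}}
	groups, n, rest := readBatch(in, 2)
	fmt.Println(len(groups), n, len(rest)) // prints 2 2 1
}
```

Callers that need a stable order would have to sort the groups themselves, for example by labels.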

@ -18,7 +18,6 @@ import (
"github.com/weaveworks/common/user"
"github.com/grafana/loki/pkg/cfg"
"github.com/grafana/loki/pkg/iter"
"github.com/grafana/loki/pkg/logcli/client"
"github.com/grafana/loki/pkg/logcli/output"
"github.com/grafana/loki/pkg/loghttp"
@ -117,7 +116,12 @@ func (q *Query) DoLocalQuery(out output.LogOutput, statistics bool, orgID string
return err
}
querier, err := localStore(conf)
limits, err := validation.NewOverrides(conf.LimitsConfig, nil)
if err != nil {
return err
}
querier, err := storage.NewStore(conf.StorageConfig, conf.ChunkStoreConfig, conf.SchemaConfig, limits, prometheus.DefaultRegisterer)
if err != nil {
return err
}
@ -169,20 +173,6 @@ func (q *Query) DoLocalQuery(out output.LogOutput, statistics bool, orgID string
return nil
}
func localStore(conf loki.Config) (logql.Querier, error) {
limits, err := validation.NewOverrides(conf.LimitsConfig, nil)
if err != nil {
return nil, err
}
s, err := storage.NewStore(conf.StorageConfig, conf.ChunkStoreConfig, conf.SchemaConfig, limits, prometheus.DefaultRegisterer)
if err != nil {
return nil, err
}
return logql.QuerierFunc(func(ctx context.Context, params logql.SelectParams) (iter.EntryIterator, error) {
return s.LazyQuery(ctx, params)
}), nil
}
// SetInstant makes the Query an instant type
func (q *Query) SetInstant(time time.Time) {
q.Start = time

File diff suppressed because it is too large

@ -13,6 +13,7 @@ service Pusher {
service Querier {
rpc Query(QueryRequest) returns (stream QueryResponse) {};
rpc QuerySample(SampleQueryRequest) returns (stream SampleQueryResponse) {};
rpc Label(LabelRequest) returns (LabelResponse) {};
rpc Tail(TailRequest) returns (stream TailResponse) {};
rpc Series(SeriesRequest) returns (SeriesResponse) {};
@ -38,7 +39,17 @@ message QueryRequest {
Direction direction = 5;
reserved 6;
repeated string shards = 7 [(gogoproto.jsontag) = "shards,omitempty"];
}
message SampleQueryRequest {
string selector = 1;
google.protobuf.Timestamp start = 2 [(gogoproto.stdtime) = true, (gogoproto.nullable) = false];
google.protobuf.Timestamp end = 3 [(gogoproto.stdtime) = true, (gogoproto.nullable) = false];
repeated string shards = 4 [(gogoproto.jsontag) = "shards,omitempty"];
}
message SampleQueryResponse {
repeated Series series = 1 [(gogoproto.customtype) = "Series", (gogoproto.nullable) = true];
}
enum Direction {
@ -71,6 +82,17 @@ message EntryAdapter {
string line = 2 [(gogoproto.jsontag) = "line"];
}
message Sample {
int64 timestamp = 1 [(gogoproto.jsontag) = "ts"];
double value = 2 [(gogoproto.jsontag) = "value"];
uint64 hash = 3 [(gogoproto.jsontag) = "hash"];
}
message Series {
string labels = 1 [(gogoproto.jsontag) = "labels"];
repeated Sample samples = 2 [(gogoproto.nullable) = false, (gogoproto.jsontag) = "samples"];
}
message TailRequest {
string query = 1;
reserved 2;

@ -1,9 +1,9 @@
package logproto
import (
fmt "fmt"
io "io"
time "time"
"fmt"
"io"
"time"
)
// Stream contains a unique labels set as a string and a set of entries for it.

@ -21,29 +21,49 @@ type Expr interface {
fmt.Stringer
}
type QueryParams interface {
LogSelector() (LogSelectorExpr, error)
GetStart() time.Time
GetEnd() time.Time
GetShards() []string
}
// SelectLogParams specifies parameters passed to data selections.
type SelectParams struct {
type SelectLogParams struct {
*logproto.QueryRequest
}
// LogSelector returns the LogSelectorExpr from the SelectLogParams.
// The `LogSelectorExpr` can then return all matchers and filters to use for that request.
func (s SelectParams) LogSelector() (LogSelectorExpr, error) {
func (s SelectLogParams) LogSelector() (LogSelectorExpr, error) {
return ParseLogSelector(s.Selector)
}
// QuerierFunc implements Querier.
type QuerierFunc func(context.Context, SelectParams) (iter.EntryIterator, error)
type SelectSampleParams struct {
*logproto.SampleQueryRequest
}
// Select implements Querier.
func (q QuerierFunc) Select(ctx context.Context, p SelectParams) (iter.EntryIterator, error) {
return q(ctx, p)
// Expr returns the SampleExpr from the SelectSampleParams.
// The `SampleExpr` can then return the selector and sample extractor to use for that request.
func (s SelectSampleParams) Expr() (SampleExpr, error) {
return ParseSampleExpr(s.Selector)
}
// LogSelector returns the LogSelectorExpr from the SelectSampleParams.
// The `LogSelectorExpr` can then return all matchers and filters to use for that request.
func (s SelectSampleParams) LogSelector() (LogSelectorExpr, error) {
expr, err := ParseSampleExpr(s.Selector)
if err != nil {
return nil, err
}
return expr.Selector(), nil
}
// Querier allows a LogQL expression to fetch an EntryIterator for a
// set of matchers and filters
type Querier interface {
Select(context.Context, SelectParams) (iter.EntryIterator, error)
SelectLogs(context.Context, SelectLogParams) (iter.EntryIterator, error)
SelectSamples(context.Context, SelectSampleParams) (iter.SampleIterator, error)
}
// LogSelectorExpr is a LogQL expression filtering and returning logs.
@ -162,9 +182,7 @@ type logRange struct {
// impls Stringer
func (r logRange) String() string {
var sb strings.Builder
sb.WriteString("(")
sb.WriteString(r.left.String())
sb.WriteString(")")
sb.WriteString(fmt.Sprintf("[%v]", model.Duration(r.interval)))
return sb.String()
}
@ -248,6 +266,7 @@ func IsLogicalBinOp(op string) bool {
type SampleExpr interface {
// Selector is the LogQL selector to apply when retrieving logs.
Selector() LogSelectorExpr
Extractor() (SampleExtractor, error)
// Operations returns the list of operations used in this SampleExpr
Operations() []string
Expr
@ -345,6 +364,10 @@ func (e *vectorAggregationExpr) Selector() LogSelectorExpr {
return e.left.Selector()
}
func (e *vectorAggregationExpr) Extractor() (SampleExtractor, error) {
return e.left.Extractor()
}
// impl Expr
func (e *vectorAggregationExpr) logQLExpr() {}
@ -482,10 +505,11 @@ func (e *literalExpr) String() string {
// literalExpr impls SampleExpr & LogSelectorExpr mainly to reduce the need for more complicated typings
// to facilitate sum types. We'll be type switching when evaluating them anyways
// and they will only be present in binary operation legs.
func (e *literalExpr) Selector() LogSelectorExpr { return e }
func (e *literalExpr) Operations() []string { return nil }
func (e *literalExpr) Filter() (LineFilter, error) { return nil, nil }
func (e *literalExpr) Matchers() []*labels.Matcher { return nil }
func (e *literalExpr) Selector() LogSelectorExpr { return e }
func (e *literalExpr) Operations() []string { return nil }
func (e *literalExpr) Filter() (LineFilter, error) { return nil, nil }
func (e *literalExpr) Matchers() []*labels.Matcher { return nil }
func (e *literalExpr) Extractor() (SampleExtractor, error) { return nil, nil }
// helper used to impl Stringer for vector and range aggregations
// nolint:interfacer

@ -204,16 +204,16 @@ func TestStringer(t *testing.T) {
},
{
in: `1 > bool 1 > count_over_time({foo="bar"}[1m])`,
out: `0.000000 > count_over_time(({foo="bar"})[1m])`,
out: `0.000000 > count_over_time({foo="bar"}[1m])`,
},
{
in: `1 > bool 1 > bool count_over_time({foo="bar"}[1m])`,
out: `0.000000 > bool count_over_time(({foo="bar"})[1m])`,
out: `0.000000 > bool count_over_time({foo="bar"}[1m])`,
},
{
in: `0.000000 > count_over_time(({foo="bar"})[1m])`,
out: `0.000000 > count_over_time(({foo="bar"})[1m])`,
in: `0.000000 > count_over_time({foo="bar"}[1m])`,
out: `0.000000 > count_over_time({foo="bar"}[1m])`,
},
} {
t.Run(tc.in, func(t *testing.T) {

File diff suppressed because it is too large

@ -129,7 +129,7 @@ func NewDefaultEvaluator(querier Querier, maxLookBackPeriod time.Duration) *Defa
}
func (ev *DefaultEvaluator) Iterator(ctx context.Context, expr LogSelectorExpr, q Params) (iter.EntryIterator, error) {
params := SelectParams{
params := SelectLogParams{
QueryRequest: &logproto.QueryRequest{
Start: q.Start(),
End: q.End(),
@ -144,7 +144,7 @@ func (ev *DefaultEvaluator) Iterator(ctx context.Context, expr LogSelectorExpr,
params.Start = params.Start.Add(-ev.maxLookBackPeriod)
}
return ev.querier.Select(ctx, params)
return ev.querier.SelectLogs(ctx, params)
}
@ -158,20 +158,18 @@ func (ev *DefaultEvaluator) StepEvaluator(
case *vectorAggregationExpr:
return vectorAggEvaluator(ctx, nextEv, e, q)
case *rangeAggregationExpr:
entryIter, err := ev.querier.Select(ctx, SelectParams{
&logproto.QueryRequest{
Start: q.Start().Add(-e.left.interval),
End: q.End(),
Limit: 0,
Direction: logproto.FORWARD,
Selector: expr.Selector().String(),
Shards: q.Shards(),
it, err := ev.querier.SelectSamples(ctx, SelectSampleParams{
&logproto.SampleQueryRequest{
Start: q.Start().Add(-e.left.interval),
End: q.End(),
Selector: expr.String(),
Shards: q.Shards(),
},
})
if err != nil {
return nil, err
}
return rangeAggEvaluator(entryIter, e, q)
return rangeAggEvaluator(iter.NewPeekingSampleIterator(it), e, q)
case *binOpExpr:
return binOpStepEvaluator(ctx, nextEv, e, q)
default:
@ -377,7 +375,7 @@ func vectorAggEvaluator(
}
func rangeAggEvaluator(
entryIter iter.EntryIterator,
it iter.PeekingSampleIterator,
expr *rangeAggregationExpr,
q Params,
) (StepEvaluator, error) {
@ -385,13 +383,9 @@ func rangeAggEvaluator(
if err != nil {
return nil, err
}
extractor, err := expr.extractor()
if err != nil {
return nil, err
}
return rangeVectorEvaluator{
iter: newRangeVectorIterator(
newSeriesIterator(entryIter, extractor),
it,
expr.left.interval.Nanoseconds(),
q.Step().Nanoseconds(),
q.Start().UnixNano(), q.End().UnixNano(),

@ -9,12 +9,12 @@ import (
const unsupportedErr = "unsupported range vector aggregation operation: %s"
func (r rangeAggregationExpr) extractor() (SampleExtractor, error) {
func (r rangeAggregationExpr) Extractor() (SampleExtractor, error) {
switch r.operation {
case OpRangeTypeRate, OpRangeTypeCount:
return extractCount, nil
return ExtractCount, nil
case OpRangeTypeBytes, OpRangeTypeBytesRate:
return extractBytes, nil
return ExtractBytes, nil
default:
return nil, fmt.Errorf(unsupportedErr, r.operation)
}

@ -59,6 +59,19 @@ func ParseMatchers(input string) ([]*labels.Matcher, error) {
return matcherExpr.matchers, nil
}
// ParseSampleExpr parses a string and returns the sampleExpr
func ParseSampleExpr(input string) (SampleExpr, error) {
expr, err := ParseExpr(input)
if err != nil {
return nil, err
}
sampleExpr, ok := expr.(SampleExpr)
if !ok {
return nil, errors.New("only sample expression supported")
}
return sampleExpr, nil
}
// ParseLogSelector parses a log selector expression `{app="foo"} |= "filter"`
func ParseLogSelector(input string) (LogSelectorExpr, error) {
expr, err := ParseExpr(input)

@ -6,6 +6,8 @@ import (
"github.com/prometheus/prometheus/pkg/labels"
"github.com/prometheus/prometheus/promql"
"github.com/prometheus/prometheus/promql/parser"
"github.com/grafana/loki/pkg/iter"
)
// RangeVectorAggregator aggregates samples for a given range of samples.
@ -23,7 +25,7 @@ type RangeVectorIterator interface {
}
type rangeVectorIterator struct {
iter SeriesIterator
iter iter.PeekingSampleIterator
selRange, step, end, current int64
window map[string]*promql.Series
metrics map[string]labels.Labels
@ -31,7 +33,7 @@ type rangeVectorIterator struct {
}
func newRangeVectorIterator(
it SeriesIterator,
it iter.PeekingSampleIterator,
selRange, step, start, end int64) *rangeVectorIterator {
// forces at least one step.
if step == 0 {
@ -97,37 +99,37 @@ func (r *rangeVectorIterator) popBack(newStart int64) {
// load the next sample range window.
func (r *rangeVectorIterator) load(start, end int64) {
for sample, hasNext := r.iter.Peek(); hasNext; sample, hasNext = r.iter.Peek() {
if sample.TimestampNano > end {
for lbs, sample, hasNext := r.iter.Peek(); hasNext; lbs, sample, hasNext = r.iter.Peek() {
if sample.Timestamp > end {
// not consuming the iterator as this belongs to another range.
return
}
// the lower bound of the range is not inclusive
if sample.TimestampNano <= start {
if sample.Timestamp <= start {
_ = r.iter.Next()
continue
}
// adds the sample.
var series *promql.Series
var ok bool
series, ok = r.window[sample.Labels]
series, ok = r.window[lbs]
if !ok {
var metric labels.Labels
if metric, ok = r.metrics[sample.Labels]; !ok {
if metric, ok = r.metrics[lbs]; !ok {
var err error
metric, err = parser.ParseMetric(sample.Labels)
metric, err = parser.ParseMetric(lbs)
if err != nil {
continue
}
r.metrics[sample.Labels] = metric
r.metrics[lbs] = metric
}
series = getSeries()
series.Metric = metric
r.window[sample.Labels] = series
r.window[lbs] = series
}
p := promql.Point{
T: sample.TimestampNano,
T: sample.Timestamp,
V: sample.Value,
}
series.Points = append(series.Points, p)
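
The window semantics of `load` above (exclusive lower bound, inclusive upper bound, and no consumption of samples past `end`) can be sketched in isolation. This is a minimal illustration, not the code from this change: the `sample` type and the `loadWindow` helper are hypothetical stand-ins, and a time-sorted slice replaces the peeking iterator.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for logproto.Sample with only the
// fields this sketch needs.
type sample struct {
	Timestamp int64
	Value     float64
}

// loadWindow consumes samples from the front of a time-sorted slice and
// returns those in the range (start, end], plus the unconsumed remainder.
func loadWindow(samples []sample, start, end int64) (window, rest []sample) {
	for len(samples) > 0 {
		s := samples[0] // peek without consuming
		if s.Timestamp > end {
			// Belongs to a later range: leave it unconsumed.
			break
		}
		samples = samples[1:] // consume
		if s.Timestamp <= start {
			// The lower bound of the range is not inclusive.
			continue
		}
		window = append(window, s)
	}
	return window, samples
}

func main() {
	in := []sample{{2, 1}, {5, 1}, {6, 1}, {10, 1}, {11, 1}}
	window, rest := loadWindow(in, 2, 10)
	fmt.Println(len(window), len(rest)) // prints "3 1"
}
```

The sample at timestamp 2 is dropped because it sits on the exclusive lower bound, and the sample at 11 stays available for the next step.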

@ -14,41 +14,38 @@ import (
"github.com/grafana/loki/pkg/logproto"
)
var entries = []logproto.Entry{
{Timestamp: time.Unix(2, 0)},
{Timestamp: time.Unix(5, 0)},
{Timestamp: time.Unix(6, 0)},
{Timestamp: time.Unix(10, 0)},
{Timestamp: time.Unix(10, 1)},
{Timestamp: time.Unix(11, 0)},
{Timestamp: time.Unix(35, 0)},
{Timestamp: time.Unix(35, 1)},
{Timestamp: time.Unix(40, 0)},
{Timestamp: time.Unix(100, 0)},
{Timestamp: time.Unix(100, 1)},
var samples = []logproto.Sample{
{Timestamp: time.Unix(2, 0).UnixNano(), Hash: 1, Value: 1.},
{Timestamp: time.Unix(5, 0).UnixNano(), Hash: 2, Value: 1.},
{Timestamp: time.Unix(6, 0).UnixNano(), Hash: 3, Value: 1.},
{Timestamp: time.Unix(10, 0).UnixNano(), Hash: 4, Value: 1.},
{Timestamp: time.Unix(10, 1).UnixNano(), Hash: 5, Value: 1.},
{Timestamp: time.Unix(11, 0).UnixNano(), Hash: 6, Value: 1.},
{Timestamp: time.Unix(35, 0).UnixNano(), Hash: 7, Value: 1.},
{Timestamp: time.Unix(35, 1).UnixNano(), Hash: 8, Value: 1.},
{Timestamp: time.Unix(40, 0).UnixNano(), Hash: 9, Value: 1.},
{Timestamp: time.Unix(100, 0).UnixNano(), Hash: 10, Value: 1.},
{Timestamp: time.Unix(100, 1).UnixNano(), Hash: 11, Value: 1.},
}
var labelFoo, _ = parser.ParseMetric("{app=\"foo\"}")
var labelBar, _ = parser.ParseMetric("{app=\"bar\"}")
func newEntryIterator() iter.EntryIterator {
return iter.NewHeapIterator(context.Background(), []iter.EntryIterator{
iter.NewStreamIterator(logproto.Stream{
func newSampleIterator() iter.SampleIterator {
return iter.NewHeapSampleIterator(context.Background(), []iter.SampleIterator{
iter.NewSeriesIterator(logproto.Series{
Labels: labelFoo.String(),
Entries: entries,
Samples: samples,
}),
iter.NewStreamIterator(logproto.Stream{
iter.NewSeriesIterator(logproto.Series{
Labels: labelBar.String(),
Entries: entries,
Samples: samples,
}),
}, logproto.FORWARD)
})
}
func newfakeSeriesIterator() SeriesIterator {
return &seriesIterator{
iter: iter.NewPeekingIterator(newEntryIterator()),
sampler: extractCount,
}
func newfakePeekingSampleIterator() iter.PeekingSampleIterator {
return iter.NewPeekingSampleIterator(newSampleIterator())
}
func newPoint(t time.Time, v float64) promql.Point {
@ -151,7 +148,7 @@ func Test_RangeVectorIterator(t *testing.T) {
t.Run(
fmt.Sprintf("logs[%s] - step: %s", time.Duration(tt.selRange), time.Duration(tt.step)),
func(t *testing.T) {
it := newRangeVectorIterator(newfakeSeriesIterator(), tt.selRange,
it := newRangeVectorIterator(newfakePeekingSampleIterator(), tt.selRange,
tt.step, tt.start.UnixNano(), tt.end.UnixNano())
i := 0

@ -1,104 +1,24 @@
package logql
import (
"github.com/grafana/loki/pkg/iter"
"github.com/grafana/loki/pkg/logproto"
)
var (
extractBytes = bytesSampleExtractor{}
extractCount = countSampleExtractor{}
ExtractBytes = bytesSampleExtractor{}
ExtractCount = countSampleExtractor{}
)
// SeriesIterator is an iterator that iterate over a stream of logs and returns sample.
type SeriesIterator interface {
Close() error
Next() bool
Peek() (Sample, bool)
Error() error
}
// Sample is a series sample
type Sample struct {
Labels string
Value float64
TimestampNano int64
}
type seriesIterator struct {
iter iter.PeekingEntryIterator
sampler SampleExtractor
updated bool
cur Sample
}
func newSeriesIterator(it iter.EntryIterator, sampler SampleExtractor) SeriesIterator {
return &seriesIterator{
iter: iter.NewPeekingIterator(it),
sampler: sampler,
}
}
func (e *seriesIterator) Close() error {
return e.iter.Close()
}
func (e *seriesIterator) Next() bool {
e.updated = false
return e.iter.Next()
}
func (e *seriesIterator) Peek() (Sample, bool) {
if e.updated {
return e.cur, true
}
for {
lbs, entry, ok := e.iter.Peek()
if !ok {
return Sample{}, false
}
// transform
e.cur, ok = e.sampler.From(lbs, entry)
if ok {
break
}
if !e.iter.Next() {
return Sample{}, false
}
}
e.updated = true
return e.cur, true
}
func (e *seriesIterator) Error() error {
return e.iter.Error()
}
// SampleExtractor transforms a log entry into a sample.
// In case of failure the second return value will be false.
type SampleExtractor interface {
From(labels string, e logproto.Entry) (Sample, bool)
Extract(line []byte) (float64, bool)
}
type countSampleExtractor struct{}
func (countSampleExtractor) From(lbs string, entry logproto.Entry) (Sample, bool) {
return Sample{
Labels: lbs,
TimestampNano: entry.Timestamp.UnixNano(),
Value: 1.,
}, true
func (countSampleExtractor) Extract(line []byte) (float64, bool) {
return 1., true
}
type bytesSampleExtractor struct{}
func (bytesSampleExtractor) From(lbs string, entry logproto.Entry) (Sample, bool) {
return Sample{
Labels: lbs,
TimestampNano: entry.Timestamp.UnixNano(),
Value: float64(len(entry.Line)),
}, true
func (bytesSampleExtractor) Extract(line []byte) (float64, bool) {
return float64(len(line)), true
}
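
The reshaped `SampleExtractor` interface maps a raw log line to a value instead of building a whole `Sample`. A minimal, self-contained sketch of that shape follows; the type names `countExtractor` and `bytesExtractor` and the `sumValues` helper are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// SampleExtractor mirrors the interface shape introduced in the diff:
// it turns a log line into a value, returning false on failure.
type SampleExtractor interface {
	Extract(line []byte) (float64, bool)
}

// countExtractor counts every line as 1.
type countExtractor struct{}

func (countExtractor) Extract(line []byte) (float64, bool) { return 1., true }

// bytesExtractor uses the line length in bytes as the value.
type bytesExtractor struct{}

func (bytesExtractor) Extract(line []byte) (float64, bool) {
	return float64(len(line)), true
}

// sumValues is a hypothetical helper that adds up the extracted values,
// skipping lines the extractor rejects.
func sumValues(ex SampleExtractor, lines []string) float64 {
	var total float64
	for _, l := range lines {
		if v, ok := ex.Extract([]byte(l)); ok {
			total += v
		}
	}
	return total
}

func main() {
	lines := []string{"foo", "barr"}
	fmt.Println(sumValues(countExtractor{}, lines)) // prints "2"
	fmt.Println(sumValues(bytesExtractor{}, lines)) // prints "7"
}
```

Because the extractor no longer needs labels or timestamps, it can presumably be applied wherever the raw line is available.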

@ -1,159 +0,0 @@
package logql
import (
"context"
"testing"
"time"
"github.com/stretchr/testify/require"
"github.com/grafana/loki/pkg/iter"
"github.com/grafana/loki/pkg/logproto"
)
func Test_seriesIterator_Peek(t *testing.T) {
type expectation struct {
ok bool
sample Sample
}
for _, test := range []struct {
name string
it SeriesIterator
expectations []expectation
}{
{
"count",
newSeriesIterator(iter.NewStreamIterator(newStream(5, identity, `{app="foo"}`)), extractCount),
[]expectation{
{true, Sample{Labels: `{app="foo"}`, TimestampNano: 0, Value: 1}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 1}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(2, 0).UnixNano(), Value: 1}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(3, 0).UnixNano(), Value: 1}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(4, 0).UnixNano(), Value: 1}},
{false, Sample{}},
},
},
{
"bytes empty",
newSeriesIterator(
iter.NewStreamIterator(
newStream(
3,
func(i int64) logproto.Entry {
return logproto.Entry{
Timestamp: time.Unix(i, 0),
}
},
`{app="foo"}`,
),
),
extractBytes,
),
[]expectation{
{true, Sample{Labels: `{app="foo"}`, TimestampNano: 0, Value: 0}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 0}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(2, 0).UnixNano(), Value: 0}},
{false, Sample{}},
},
},
{
"bytes",
newSeriesIterator(
iter.NewStreamIterator(
newStream(
3,
func(i int64) logproto.Entry {
return logproto.Entry{
Timestamp: time.Unix(i, 0),
Line: "foo",
}
},
`{app="foo"}`,
),
),
extractBytes,
),
[]expectation{
{true, Sample{Labels: `{app="foo"}`, TimestampNano: 0, Value: 3}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 3}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(2, 0).UnixNano(), Value: 3}},
{false, Sample{}},
},
},
{
"bytes backward",
newSeriesIterator(
iter.NewStreamsIterator(context.Background(),
[]logproto.Stream{
newStream(
3,
func(i int64) logproto.Entry {
return logproto.Entry{
Timestamp: time.Unix(i, 0),
Line: "foo",
}
},
`{app="foo"}`,
),
newStream(
3,
func(i int64) logproto.Entry {
return logproto.Entry{
Timestamp: time.Unix(i, 0),
Line: "barr",
}
},
`{app="barr"}`,
),
},
logproto.BACKWARD,
),
extractBytes,
),
[]expectation{
{true, Sample{Labels: `{app="barr"}`, TimestampNano: 0, Value: 4}},
{true, Sample{Labels: `{app="barr"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 4}},
{true, Sample{Labels: `{app="barr"}`, TimestampNano: time.Unix(2, 0).UnixNano(), Value: 4}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: 0, Value: 3}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 3}},
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(2, 0).UnixNano(), Value: 3}},
{false, Sample{}},
},
},
{
"skip first",
newSeriesIterator(iter.NewStreamIterator(newStream(2, identity, `{app="foo"}`)), fakeSampler{}),
[]expectation{
{true, Sample{Labels: `{app="foo"}`, TimestampNano: time.Unix(1, 0).UnixNano(), Value: 10}},
{false, Sample{}},
},
},
} {
t.Run(test.name, func(t *testing.T) {
for _, e := range test.expectations {
sample, ok := test.it.Peek()
require.Equal(t, e.ok, ok)
if !e.ok {
continue
}
require.Equal(t, e.sample, sample)
test.it.Next()
}
require.NoError(t, test.it.Close())
})
}
}
// fakeSampler is a Sampler that returns no value for 0 timestamp otherwise always 10
type fakeSampler struct{}
func (fakeSampler) From(lbs string, entry logproto.Entry) (Sample, bool) {
if entry.Timestamp.UnixNano() == 0 {
return Sample{}, false
}
return Sample{
Labels: lbs,
TimestampNano: entry.Timestamp.UnixNano(),
Value: 10,
}, true
}

@ -166,7 +166,7 @@ type Downstreamer interface {
// DownstreamEvaluator is an evaluator which handles shard aware AST nodes
type DownstreamEvaluator struct {
Downstreamer
defaultEvaluator *DefaultEvaluator
defaultEvaluator Evaluator
}
// Downstream runs queries and collects stats from the embedded Downstreamer
@ -186,16 +186,19 @@ func (ev DownstreamEvaluator) Downstream(ctx context.Context, queries []Downstre
}
type errorQuerier struct{}
func (errorQuerier) SelectLogs(ctx context.Context, p SelectLogParams) (iter.EntryIterator, error) {
return nil, errors.New("Unimplemented")
}
func (errorQuerier) SelectSamples(ctx context.Context, p SelectSampleParams) (iter.SampleIterator, error) {
return nil, errors.New("Unimplemented")
}
func NewDownstreamEvaluator(downstreamer Downstreamer) *DownstreamEvaluator {
return &DownstreamEvaluator{
Downstreamer: downstreamer,
defaultEvaluator: NewDefaultEvaluator(
QuerierFunc(func(_ context.Context, p SelectParams) (iter.EntryIterator, error) {
// TODO(owen-d): add metric here, this should never happen.
return nil, errors.New("Unimplemented")
}),
0,
),
Downstreamer: downstreamer,
defaultEvaluator: NewDefaultEvaluator(&errorQuerier{}, 0),
}
}

@ -132,19 +132,19 @@ func TestMappingStrings(t *testing.T) {
},
{
in: `sum(rate({foo="bar"}[1m]))`,
out: `sum(downstream<sum(rate(({foo="bar"})[1m])), shard=0_of_2> ++ downstream<sum(rate(({foo="bar"})[1m])), shard=1_of_2>)`,
out: `sum(downstream<sum(rate({foo="bar"}[1m])), shard=0_of_2> ++ downstream<sum(rate({foo="bar"}[1m])), shard=1_of_2>)`,
},
{
in: `max(count(rate({foo="bar"}[5m]))) / 2`,
out: `max(sum(downstream<count(rate(({foo="bar"})[5m])), shard=0_of_2> ++ downstream<count(rate(({foo="bar"})[5m])), shard=1_of_2>)) / 2.000000`,
out: `max(sum(downstream<count(rate({foo="bar"}[5m])), shard=0_of_2> ++ downstream<count(rate({foo="bar"}[5m])), shard=1_of_2>)) / 2.000000`,
},
{
in: `topk(3, rate({foo="bar"}[5m]))`,
out: `topk(3,downstream<rate(({foo="bar"})[5m]), shard=0_of_2> ++ downstream<rate(({foo="bar"})[5m]), shard=1_of_2>)`,
out: `topk(3,downstream<rate({foo="bar"}[5m]), shard=0_of_2> ++ downstream<rate({foo="bar"}[5m]), shard=1_of_2>)`,
},
{
in: `sum(max(rate({foo="bar"}[5m])))`,
out: `sum(max(downstream<rate(({foo="bar"})[5m]), shard=0_of_2> ++ downstream<rate(({foo="bar"})[5m]), shard=1_of_2>))`,
out: `sum(max(downstream<rate({foo="bar"}[5m]), shard=0_of_2> ++ downstream<rate({foo="bar"}[5m]), shard=1_of_2>))`,
},
{
in: `{foo="bar"} |= "id=123"`,
@ -152,7 +152,7 @@ func TestMappingStrings(t *testing.T) {
},
{
in: `sum by (cluster) (rate({foo="bar"} |= "id=123" [5m]))`,
out: `sum by(cluster)(downstream<sum by(cluster)(rate(({foo="bar"}|="id=123")[5m])), shard=0_of_2> ++ downstream<sum by(cluster)(rate(({foo="bar"}|="id=123")[5m])), shard=1_of_2>)`,
out: `sum by(cluster)(downstream<sum by(cluster)(rate({foo="bar"}|="id=123"[5m])), shard=0_of_2> ++ downstream<sum by(cluster)(rate({foo="bar"}|="id=123"[5m])), shard=1_of_2>)`,
},
} {
t.Run(tc.in, func(t *testing.T) {

@ -35,7 +35,7 @@ func TestCollectTrailer(t *testing.T) {
t.Fatalf("Failed to dial bufnet: %v", err)
}
defer conn.Close()
ing := ingesterFn(func(req *logproto.QueryRequest, s logproto.Querier_QueryServer) error {
ing := ingesterFn(func(s grpc.ServerStream) error {
ingCtx := NewContext(s.Context())
defer SendAsTrailer(ingCtx, s)
GetIngesterData(ingCtx).TotalChunksMatched++
@ -60,7 +60,7 @@ func TestCollectTrailer(t *testing.T) {
ctx = NewContext(ctx)
// query the ingester twice.
// query the ingester twice: once for logs, once for samples.
clientStream, err := ingClient.Query(ctx, &logproto.QueryRequest{}, CollectTrailer(ctx))
if err != nil {
t.Fatal(err)
@ -69,15 +69,15 @@ func TestCollectTrailer(t *testing.T) {
if err != nil && err != io.EOF {
t.Fatal(err)
}
clientStream, err = ingClient.Query(ctx, &logproto.QueryRequest{}, CollectTrailer(ctx))
clientSamples, err := ingClient.QuerySample(ctx, &logproto.SampleQueryRequest{}, CollectTrailer(ctx))
if err != nil {
t.Fatal(err)
}
_, err = clientStream.Recv()
_, err = clientSamples.Recv()
if err != nil && err != io.EOF {
t.Fatal(err)
}
err = clientStream.CloseSend()
err = clientSamples.CloseSend()
if err != nil {
t.Fatal(err)
}
@ -94,10 +94,14 @@ func TestCollectTrailer(t *testing.T) {
require.Equal(t, int64(2), res.Ingester.TotalDuplicates)
}
type ingesterFn func(*logproto.QueryRequest, logproto.Querier_QueryServer) error
type ingesterFn func(grpc.ServerStream) error
func (i ingesterFn) Query(_ *logproto.QueryRequest, s logproto.Querier_QueryServer) error {
return i(s)
}
func (i ingesterFn) Query(req *logproto.QueryRequest, s logproto.Querier_QueryServer) error {
return i(req, s)
func (i ingesterFn) QuerySample(_ *logproto.SampleQueryRequest, s logproto.Querier_QuerySampleServer) error {
return i(s)
}
func (ingesterFn) Label(context.Context, *logproto.LabelRequest) (*logproto.LabelResponse, error) {
return nil, nil

@ -6,6 +6,7 @@ import (
"log"
"time"
"github.com/cespare/xxhash/v2"
"github.com/cortexproject/cortex/pkg/querier/astmapper"
"github.com/prometheus/prometheus/pkg/labels"
"github.com/prometheus/prometheus/promql/parser"
@ -27,7 +28,7 @@ type MockQuerier struct {
streams []logproto.Stream
}
func (q MockQuerier) Select(_ context.Context, req SelectParams) (iter.EntryIterator, error) {
func (q MockQuerier) SelectLogs(ctx context.Context, req SelectLogParams) (iter.EntryIterator, error) {
expr, err := req.LogSelector()
if err != nil {
return nil, err
@ -91,12 +92,94 @@ outer:
}
return iter.NewTimeRangedIterator(
iter.NewStreamsIterator(context.Background(), filtered, req.Direction),
iter.NewStreamsIterator(ctx, filtered, req.Direction),
req.Start,
req.End,
), nil
}
func (q MockQuerier) SelectSamples(ctx context.Context, req SelectSampleParams) (iter.SampleIterator, error) {
selector, err := req.LogSelector()
if err != nil {
return nil, err
}
filter, err := selector.Filter()
if err != nil {
return nil, err
}
expr, err := req.Expr()
if err != nil {
return nil, err
}
extractor, err := expr.Extractor()
if err != nil {
return nil, err
}
matchers := selector.Matchers()
var shard *astmapper.ShardAnnotation
if len(req.Shards) > 0 {
shards, err := ParseShards(req.Shards)
if err != nil {
return nil, err
}
shard = &shards[0]
}
var matched []logproto.Stream
outer:
for _, stream := range q.streams {
ls := mustParseLabels(stream.Labels)
// filter by shard if requested
if shard != nil && ls.Hash()%uint64(shard.Of) != uint64(shard.Shard) {
continue
}
for _, matcher := range matchers {
if !matcher.Matches(ls.Get(matcher.Name)) {
continue outer
}
}
matched = append(matched, stream)
}
// apply the LineFilter
filtered := make([]logproto.Series, 0, len(matched))
for _, s := range matched {
var samples []logproto.Sample
for _, entry := range s.Entries {
if filter == nil || filter.Filter([]byte(entry.Line)) {
v, ok := extractor.Extract([]byte(entry.Line))
if !ok {
continue
}
samples = append(samples, logproto.Sample{
Timestamp: entry.Timestamp.UnixNano(),
Value: v,
Hash: xxhash.Sum64([]byte(entry.Line)),
})
}
}
if len(samples) > 0 {
filtered = append(filtered, logproto.Series{
Labels: s.Labels,
Samples: samples,
})
}
}
return iter.NewTimeRangedSampleIterator(
iter.NewMultiSeriesIterator(ctx, filtered),
req.Start.UnixNano(),
req.End.UnixNano(),
), nil
}
type MockDownstreamer struct {
*Engine
}

@ -141,8 +141,8 @@ func (q *Querier) forGivenIngesters(ctx context.Context, replicationSet ring.Rep
}
// Select implements logql.Querier, which selects logs via matchers and regex filters.
func (q *Querier) Select(ctx context.Context, params logql.SelectParams) (iter.EntryIterator, error) {
err := q.validateQueryRequest(ctx, params.QueryRequest)
func (q *Querier) SelectLogs(ctx context.Context, params logql.SelectLogParams) (iter.EntryIterator, error) {
err := q.validateQueryRequest(ctx, params)
if err != nil {
return nil, err
}
@ -151,7 +151,7 @@ func (q *Querier) Select(ctx context.Context, params logql.SelectParams) (iter.E
if q.cfg.IngesterQueryStoreMaxLookback == 0 {
// IngesterQueryStoreMaxLookback is zero, the default state, query the store normally
chunkStoreIter, err = q.store.LazyQuery(ctx, params)
chunkStoreIter, err = q.store.SelectLogs(ctx, params)
if err != nil {
return nil, err
}
@ -165,11 +165,11 @@ func (q *Querier) Select(ctx context.Context, params logql.SelectParams) (iter.E
// Make a copy of the request before modifying
// because the initial request is used below to query ingesters
queryRequestCopy := *params.QueryRequest
newParams := logql.SelectParams{
newParams := logql.SelectLogParams{
QueryRequest: &queryRequestCopy,
}
newParams.End = adjustedEnd
chunkStoreIter, err = q.store.LazyQuery(ctx, newParams)
chunkStoreIter, err = q.store.SelectLogs(ctx, newParams)
if err != nil {
return nil, err
}
@ -182,7 +182,7 @@ func (q *Querier) Select(ctx context.Context, params logql.SelectParams) (iter.E
// skip ingester queries only when QueryIngestersWithin is enabled (not the zero value) and
// the end of the query is earlier than the lookback
if lookback := time.Now().Add(-q.cfg.QueryIngestersWithin); q.cfg.QueryIngestersWithin != 0 && params.GetEnd().Before(lookback) {
if !shouldQueryIngester(q.cfg, params) {
return chunkStoreIter, nil
}
@ -194,7 +194,61 @@ func (q *Querier) Select(ctx context.Context, params logql.SelectParams) (iter.E
return iter.NewHeapIterator(ctx, append(iters, chunkStoreIter), params.Direction), nil
}
func (q *Querier) queryIngesters(ctx context.Context, params logql.SelectParams) ([]iter.EntryIterator, error) {
func (q *Querier) SelectSamples(ctx context.Context, params logql.SelectSampleParams) (iter.SampleIterator, error) {
err := q.validateQueryRequest(ctx, params)
if err != nil {
return nil, err
}
var chunkStoreIter iter.SampleIterator
switch {
case q.cfg.IngesterQueryStoreMaxLookback == 0:
// IngesterQueryStoreMaxLookback is zero, the default state, query the store normally
chunkStoreIter, err = q.store.SelectSamples(ctx, params)
if err != nil {
return nil, err
}
case q.cfg.IngesterQueryStoreMaxLookback > 0:
adjustedEnd := params.End.Add(-q.cfg.IngesterQueryStoreMaxLookback)
if params.Start.After(adjustedEnd) {
chunkStoreIter = iter.NoopIterator
break
}
// Make a copy of the request before modifying
// because the initial request is used below to query ingesters
queryRequestCopy := *params.SampleQueryRequest
newParams := logql.SelectSampleParams{
SampleQueryRequest: &queryRequestCopy,
}
newParams.End = adjustedEnd
chunkStoreIter, err = q.store.SelectSamples(ctx, newParams)
if err != nil {
return nil, err
}
default:
chunkStoreIter = iter.NoopIterator
}
// skip ingester queries only when QueryIngestersWithin is enabled (not the zero value) and
// the end of the query is earlier than the lookback
if !shouldQueryIngester(q.cfg, params) {
return chunkStoreIter, nil
}
iters, err := q.queryIngestersForSample(ctx, params)
if err != nil {
return nil, err
}
return iter.NewHeapSampleIterator(ctx, append(iters, chunkStoreIter)), nil
}
func shouldQueryIngester(cfg Config, params logql.QueryParams) bool {
lookback := time.Now().Add(-cfg.QueryIngestersWithin)
return !(cfg.QueryIngestersWithin != 0 && params.GetEnd().Before(lookback))
}
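
The decision made by `shouldQueryIngester` can be sketched on its own. This is an illustration under assumptions, not the code from this change: `shouldQuery` takes the current time as a parameter so the result is deterministic, and `at` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// at is a hypothetical helper that builds a timestamp from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// shouldQuery reports whether ingesters need to be queried. Ingesters are
// skipped only when the lookback is enabled (non-zero) and the query ends
// before now minus the lookback.
func shouldQuery(within time.Duration, end, now time.Time) bool {
	if within == 0 {
		// Zero value: the check is disabled, always query ingesters.
		return true
	}
	return !end.Before(now.Add(-within))
}

func main() {
	now := at(10000)
	fmt.Println(shouldQuery(0, at(0), now))                            // true
	fmt.Println(shouldQuery(time.Hour, now.Add(-2*time.Hour), now))    // false
	fmt.Println(shouldQuery(time.Hour, now.Add(-30*time.Minute), now)) // true
}
```

Passing `now` explicitly is a choice made for this sketch; the function in the diff calls `time.Now()` directly.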
func (q *Querier) queryIngesters(ctx context.Context, params logql.SelectLogParams) ([]iter.EntryIterator, error) {
clients, err := q.forAllIngesters(ctx, func(client logproto.QuerierClient) (interface{}, error) {
return client.Query(ctx, params.QueryRequest, stats.CollectTrailer(ctx))
})
@ -209,6 +263,21 @@ func (q *Querier) queryIngesters(ctx context.Context, params logql.SelectParams)
return iterators, nil
}
func (q *Querier) queryIngestersForSample(ctx context.Context, params logql.SelectSampleParams) ([]iter.SampleIterator, error) {
clients, err := q.forAllIngesters(ctx, func(client logproto.QuerierClient) (interface{}, error) {
return client.QuerySample(ctx, params.SampleQueryRequest, stats.CollectTrailer(ctx))
})
if err != nil {
return nil, err
}
iterators := make([]iter.SampleIterator, len(clients))
for i := range clients {
iterators[i] = iter.NewSampleQueryClientIterator(clients[i].response.(logproto.Querier_QuerySampleClient))
}
return iterators, nil
}
// Label does the heavy lifting for a Label query.
func (q *Querier) Label(ctx context.Context, req *logproto.LabelRequest) (*logproto.LabelResponse, error) {
// Enforce the query timeout while querying backends
@ -264,7 +333,7 @@ func (q *Querier) Tail(ctx context.Context, req *logproto.TailRequest) (*Tailer,
return nil, err
}
histReq := logql.SelectParams{
histReq := logql.SelectLogParams{
QueryRequest: &logproto.QueryRequest{
Selector: req.Query,
Start: req.Start,
@ -274,7 +343,7 @@ func (q *Querier) Tail(ctx context.Context, req *logproto.TailRequest) (*Tailer,
},
}
err = q.validateQueryRequest(ctx, histReq.QueryRequest)
err = q.validateQueryRequest(ctx, histReq)
if err != nil {
return nil, err
}
@ -297,7 +366,7 @@ func (q *Querier) Tail(ctx context.Context, req *logproto.TailRequest) (*Tailer,
tailClients[clients[i].addr] = clients[i].response.(logproto.Querier_TailClient)
}
histIterators, err := q.Select(queryCtx, histReq)
histIterators, err := q.SelectLogs(queryCtx, histReq)
if err != nil {
return nil, err
}
@ -376,7 +445,7 @@ func (q *Querier) Series(ctx context.Context, req *logproto.SeriesRequest) (*log
return nil, err
}
if err = q.validateQueryTimeRange(userID, &req.Start, &req.End); err != nil {
if err = q.validateQueryTimeRange(userID, req.Start, req.End); err != nil {
return nil, err
}
@ -483,7 +552,7 @@ func (q *Querier) seriesForMatchers(
// seriesForMatcher fetches series from the store for a given matcher
func (q *Querier) seriesForMatcher(ctx context.Context, from, through time.Time, matcher string) ([]logproto.SeriesIdentifier, error) {
ids, err := q.store.GetSeries(ctx, logql.SelectParams{
ids, err := q.store.GetSeries(ctx, logql.SelectLogParams{
QueryRequest: &logproto.QueryRequest{
Selector: matcher,
Limit: 1,
@ -498,13 +567,13 @@ func (q *Querier) seriesForMatcher(ctx context.Context, from, through time.Time,
return ids, nil
}
func (q *Querier) validateQueryRequest(ctx context.Context, req *logproto.QueryRequest) error {
func (q *Querier) validateQueryRequest(ctx context.Context, req logql.QueryParams) error {
userID, err := user.ExtractOrgID(ctx)
if err != nil {
return err
}
selector, err := logql.ParseLogSelector(req.Selector)
selector, err := req.LogSelector()
if err != nil {
return err
}
@ -516,17 +585,17 @@ func (q *Querier) validateQueryRequest(ctx context.Context, req *logproto.QueryR
"max streams matchers per query exceeded, matchers-count > limit (%d > %d)", len(matchers), maxStreamMatchersPerQuery)
}
return q.validateQueryTimeRange(userID, &req.Start, &req.End)
return q.validateQueryTimeRange(userID, req.GetStart(), req.GetEnd())
}
func (q *Querier) validateQueryTimeRange(userID string, from *time.Time, through *time.Time) error {
if (*through).Before(*from) {
return httpgrpc.Errorf(http.StatusBadRequest, "invalid query, through < from (%s < %s)", *through, *from)
func (q *Querier) validateQueryTimeRange(userID string, from time.Time, through time.Time) error {
if (through).Before(from) {
return httpgrpc.Errorf(http.StatusBadRequest, "invalid query, through < from (%s < %s)", through, from)
}
maxQueryLength := q.limits.MaxQueryLength(userID)
if maxQueryLength > 0 && (*through).Sub(*from) > maxQueryLength {
return httpgrpc.Errorf(http.StatusBadRequest, cortex_validation.ErrQueryTooLong, (*through).Sub(*from), maxQueryLength)
if maxQueryLength > 0 && (through).Sub(from) > maxQueryLength {
return httpgrpc.Errorf(http.StatusBadRequest, cortex_validation.ErrQueryTooLong, (through).Sub(from), maxQueryLength)
}
return nil

@ -207,7 +207,7 @@ func newStoreMock() *storeMock {
return &storeMock{}
}
func (s *storeMock) LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error) {
func (s *storeMock) SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error) {
args := s.Called(ctx, req)
res := args.Get(0)
if res == nil {
@ -216,6 +216,15 @@ func (s *storeMock) LazyQuery(ctx context.Context, req logql.SelectParams) (iter
return res.(iter.EntryIterator), args.Error(1)
}
func (s *storeMock) SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error) {
args := s.Called(ctx, req)
res := args.Get(0)
if res == nil {
return iter.SampleIterator(nil), args.Error(1)
}
return res.(iter.SampleIterator), args.Error(1)
}
func (s *storeMock) Get(ctx context.Context, userID string, from, through model.Time, matchers ...*labels.Matcher) ([]chunk.Chunk, error) {
args := s.Called(ctx, userID, from, through, matchers)
return args.Get(0).([]chunk.Chunk), args.Error(1)
@ -252,7 +261,7 @@ func (s *storeMock) DeleteSeriesIDs(ctx context.Context, from, through model.Tim
panic("don't call me please")
}
func (s *storeMock) GetSeries(ctx context.Context, req logql.SelectParams) ([]logproto.SeriesIdentifier, error) {
func (s *storeMock) GetSeries(ctx context.Context, req logql.SelectLogParams) ([]logproto.SeriesIdentifier, error) {
args := s.Called(ctx, req)
res := args.Get(0)
if res == nil {

@ -85,7 +85,7 @@ func TestQuerier_Tail_QueryTimeoutConfigFlag(t *testing.T) {
}
store := newStoreMock()
store.On("LazyQuery", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
store.On("SelectLogs", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
queryClient := newQueryClientMock()
queryClient.On("Recv").Return(mockQueryResponse([]logproto.Stream{mockStream(1, 2)}), nil)
@ -124,7 +124,7 @@ func TestQuerier_Tail_QueryTimeoutConfigFlag(t *testing.T) {
_, ok = calls[0].Arguments.Get(0).(context.Context).Deadline()
assert.False(t, ok)
calls = store.GetMockedCallsByMethod("LazyQuery")
calls = store.GetMockedCallsByMethod("SelectLogs")
assert.Equal(t, 1, len(calls))
deadline, ok = calls[0].Arguments.Get(0).(context.Context).Deadline()
assert.True(t, ok)
@ -261,7 +261,7 @@ func TestQuerier_validateQueryRequest(t *testing.T) {
}
store := newStoreMock()
store.On("LazyQuery", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
store.On("SelectLogs", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
queryClient := newQueryClientMock()
queryClient.On("Recv").Return(mockQueryResponse([]logproto.Stream{mockStream(1, 2)}), nil)
@ -286,15 +286,15 @@ func TestQuerier_validateQueryRequest(t *testing.T) {
ctx := user.InjectOrgID(context.Background(), "test")
_, err = q.Select(ctx, logql.SelectParams{QueryRequest: &request})
_, err = q.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: &request})
require.Equal(t, httpgrpc.Errorf(http.StatusBadRequest, "max streams matchers per query exceeded, matchers-count > limit (2 > 1)"), err)
request.Selector = "{type=\"test\"}"
_, err = q.Select(ctx, logql.SelectParams{QueryRequest: &request})
_, err = q.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: &request})
require.NoError(t, err)
request.Start = request.End.Add(-3 * time.Minute)
_, err = q.Select(ctx, logql.SelectParams{QueryRequest: &request})
_, err = q.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: &request})
require.Equal(t, httpgrpc.Errorf(http.StatusBadRequest, "invalid query, length > limit (3m0s > 2m0s)"), err)
}
@ -493,7 +493,7 @@ func TestQuerier_IngesterMaxQueryLookback(t *testing.T) {
}
store := newStoreMock()
store.On("LazyQuery", mock.Anything, mock.Anything).Return(mockStreamIterator(0, 1), nil)
store.On("SelectLogs", mock.Anything, mock.Anything).Return(mockStreamIterator(0, 1), nil)
conf := mockQuerierConfig()
conf.QueryIngestersWithin = tc.lookback
@ -507,7 +507,7 @@ func TestQuerier_IngesterMaxQueryLookback(t *testing.T) {
ctx := user.InjectOrgID(context.Background(), "test")
res, err := q.Select(ctx, logql.SelectParams{QueryRequest: &req})
res, err := q.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: &req})
require.Nil(t, err)
// since streams are loaded lazily, force iterators to exhaust
@ -570,7 +570,7 @@ func TestQuerier_concurrentTailLimits(t *testing.T) {
// For this test's purpose, whenever a new ingester client needs to
// be created, the factory will always return the same mock instance
store := newStoreMock()
store.On("LazyQuery", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
store.On("SelectLogs", mock.Anything, mock.Anything).Return(mockStreamIterator(1, 2), nil)
queryClient := newQueryClientMock()
queryClient.On("Recv").Return(mockQueryResponse([]logproto.Stream{mockStream(1, 2)}), nil)

@ -22,6 +22,15 @@ import (
"github.com/grafana/loki/pkg/logql/stats"
)
type genericIterator interface {
Next() bool
Labels() string
Error() error
Close() error
}
type chunksIteratorFactory func(chunks []*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (genericIterator, error)
// batchChunkIterator is an EntryIterator that iterates through chunks by batch of `batchSize`.
// Since chunks can overlap across batches, on each iteration the iterator keeps all chunks overlapping
// with the next chunk from the next batch and adds them to the next iteration. In this case the boundaries of the batch
@ -30,79 +39,75 @@ type batchChunkIterator struct {
chunks lazyChunks
batchSize int
err error
curr iter.EntryIterator
curr genericIterator
lastOverlapping []*LazyChunk
labels map[model.Fingerprint]string
iterFactory chunksIteratorFactory
ctx context.Context
cancel context.CancelFunc
matchers []*labels.Matcher
filter logql.LineFilter
req *logproto.QueryRequest
next chan *struct {
iter iter.EntryIterator
cancel context.CancelFunc
start, end time.Time
direction logproto.Direction
next chan *struct {
iter genericIterator
err error
}
}
// newBatchChunkIterator creates a new batch iterator with the given batchSize.
func newBatchChunkIterator(ctx context.Context, chunks []*LazyChunk, batchSize int, matchers []*labels.Matcher, filter logql.LineFilter, req *logproto.QueryRequest) *batchChunkIterator {
// __name__ is not something we filter by because it's a constant in loki
// and only used for upstream compatibility; therefore remove it.
// The same applies to the sharding label which is injected by the cortex storage code.
for _, omit := range []string{labels.MetricName, astmapper.ShardLabel} {
for i := range matchers {
if matchers[i].Name == omit {
matchers = append(matchers[:i], matchers[i+1:]...)
break
}
}
}
func newBatchChunkIterator(
ctx context.Context,
chunks []*LazyChunk,
batchSize int,
direction logproto.Direction,
start, end time.Time,
iterFactory chunksIteratorFactory,
) *batchChunkIterator {
ctx, cancel := context.WithCancel(ctx)
res := &batchChunkIterator{
batchSize: batchSize,
matchers: matchers,
filter: filter,
req: req,
ctx: ctx,
cancel: cancel,
chunks: lazyChunks{direction: req.Direction, chunks: chunks},
labels: map[model.Fingerprint]string{},
start: start,
end: end,
direction: direction,
cancel: cancel,
iterFactory: iterFactory,
chunks: lazyChunks{direction: direction, chunks: chunks},
next: make(chan *struct {
iter iter.EntryIterator
iter genericIterator
err error
}),
}
sort.Sort(res.chunks)
go func() {
for {
if res.chunks.Len() == 0 {
close(res.next)
go res.loop(ctx)
return res
}
func (it *batchChunkIterator) loop(ctx context.Context) {
for {
if it.chunks.Len() == 0 {
close(it.next)
return
}
next, err := it.nextBatch()
select {
case <-ctx.Done():
close(it.next)
// next can be nil if we are waiting to return that the nextBatch was empty and the context is closed
// or if another error occurred reading nextBatch
if next == nil {
return
}
next, err := res.nextBatch()
select {
case <-ctx.Done():
close(res.next)
// next can be nil if we are waiting to return that the nextBatch was empty and the context is closed
// or if another error occurred reading nextBatch
if next == nil {
return
}
err = next.Close()
if err != nil {
level.Error(util.WithContext(ctx, util.Logger)).Log("msg", "Failed to close the pre-fetched iterator when pre-fetching was canceled", "err", err)
}
return
case res.next <- &struct {
iter iter.EntryIterator
err error
}{next, err}:
err = next.Close()
if err != nil {
level.Error(util.WithContext(ctx, util.Logger)).Log("msg", "Failed to close the pre-fetched iterator when pre-fetching was canceled", "err", err)
}
return
case it.next <- &struct {
iter genericIterator
err error
}{next, err}:
}
}()
return res
}
}
func (it *batchChunkIterator) Next() bool {
@ -128,10 +133,10 @@ func (it *batchChunkIterator) Next() bool {
}
}
func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
func (it *batchChunkIterator) nextBatch() (genericIterator, error) {
// the first chunk of the batch
headChunk := it.chunks.Peek()
from, through := it.req.Start, it.req.End
from, through := it.start, it.end
batch := make([]*LazyChunk, 0, it.batchSize+len(it.lastOverlapping))
var nextChunk *LazyChunk
@ -139,11 +144,11 @@ func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
// pop the next batch of chunks and append/prepend previous overlapping chunks
// so we can merge/de-dupe overlapping entries.
if it.req.Direction == logproto.FORWARD {
if it.direction == logproto.FORWARD {
batch = append(batch, it.lastOverlapping...)
}
batch = append(batch, it.chunks.pop(it.batchSize)...)
if it.req.Direction == logproto.BACKWARD {
if it.direction == logproto.BACKWARD {
batch = append(batch, it.lastOverlapping...)
}
@ -151,14 +156,14 @@ func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
nextChunk = it.chunks.Peek()
// we max out our iterator boundaries to the next chunks in the queue
// so that overlapping chunks are together
if it.req.Direction == logproto.BACKWARD {
if it.direction == logproto.BACKWARD {
from = time.Unix(0, nextChunk.Chunk.Through.UnixNano())
// we have to reverse the inclusivity of the chunk iterator from
// [from, through) to (from, through] for backward queries, except when
// the batch's `from` is equal to the query's Start. This can be achieved
// by shifting `from` by one nanosecond.
if !from.Equal(it.req.Start) {
if !from.Equal(it.start) {
from = from.Add(time.Nanosecond)
}
} else {
@ -184,18 +189,18 @@ func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
}
if it.req.Direction == logproto.BACKWARD {
if it.direction == logproto.BACKWARD {
through = time.Unix(0, headChunk.Chunk.Through.UnixNano())
if through.After(it.req.End) {
through = it.req.End
if through.After(it.end) {
through = it.end
}
// we have to reverse the inclusivity of the chunk iterator from
// [from, through) to (from, through] for backward queries, except when
// the batch's `through` is equal to the query's End. This can be achieved
// by shifting `through` by one nanosecond.
if !through.Equal(it.req.End) {
if !through.Equal(it.end) {
through = through.Add(time.Nanosecond)
}
} else {
@ -203,8 +208,8 @@ func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
// when clipping the from it should never be before the start or equal to the end.
// Doing so would include entries not requested.
if from.Before(it.req.Start) || from.Equal(it.req.End) {
from = it.req.Start
if from.Before(it.start) || from.Equal(it.end) {
from = it.start
}
}
@ -218,17 +223,13 @@ func (it *batchChunkIterator) nextBatch() (iter.EntryIterator, error) {
if it.chunks.Len() > 0 {
it.lastOverlapping = it.lastOverlapping[:0]
for _, c := range batch {
if c.IsOverlapping(nextChunk, it.req.Direction) {
if c.IsOverlapping(nextChunk, it.direction) {
it.lastOverlapping = append(it.lastOverlapping, c)
}
}
}
// create the new chunks iterator from the current batch.
return it.newChunksIterator(batch, from, through, nextChunk)
}
func (it *batchChunkIterator) Entry() logproto.Entry {
return it.curr.Entry()
return it.iterFactory(batch, from, through, nextChunk)
}
func (it *batchChunkIterator) Labels() string {
@ -253,29 +254,59 @@ func (it *batchChunkIterator) Close() error {
return nil
}
// newChunksIterator creates an iterator over a set of lazychunks.
func (it *batchChunkIterator) newChunksIterator(chunks []*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (iter.EntryIterator, error) {
chksBySeries := partitionBySeriesChunks(chunks)
type labelCache map[model.Fingerprint]string
// Make sure the initial chunks are loaded. This is not one chunk
// per series, but rather a chunk per non-overlapping iterator.
if err := loadFirstChunks(it.ctx, chksBySeries); err != nil {
return nil, err
// computeLabels computes the string representation of the labels, using a map to cache the result per fingerprint.
func (l labelCache) computeLabels(c *LazyChunk) string {
if lbs, ok := l[c.Chunk.Fingerprint]; ok {
return lbs
}
lbs := dropLabels(c.Chunk.Metric, labels.MetricName).String()
l[c.Chunk.Fingerprint] = lbs
return lbs
}
// Now that we have the first chunk for each series loaded,
// we can proceed to filter the series that don't match.
chksBySeries = filterSeriesByMatchers(chksBySeries, it.matchers)
type logBatchIterator struct {
*batchChunkIterator
var allChunks []*LazyChunk
for _, series := range chksBySeries {
for _, chunks := range series {
allChunks = append(allChunks, chunks...)
}
ctx context.Context
matchers []*labels.Matcher
filter logql.LineFilter
labels labelCache
}
func newLogBatchIterator(
ctx context.Context,
chunks []*LazyChunk,
batchSize int,
matchers []*labels.Matcher,
filter logql.LineFilter,
direction logproto.Direction,
start, end time.Time,
) (iter.EntryIterator, error) {
// __name__ is not something we filter by because it's a constant in loki
// and only used for upstream compatibility; therefore remove it.
// The same applies to the sharding label which is injected by the cortex storage code.
matchers = removeMatchersByName(matchers, labels.MetricName, astmapper.ShardLabel)
logbatch := &logBatchIterator{
labels: map[model.Fingerprint]string{},
matchers: matchers,
filter: filter,
ctx: ctx,
}
batch := newBatchChunkIterator(ctx, chunks, batchSize, direction, start, end, logbatch.newChunksIterator)
logbatch.batchChunkIterator = batch
return logbatch, nil
}
// Finally we load all chunks not already loaded
if err := fetchLazyChunks(it.ctx, allChunks); err != nil {
func (it *logBatchIterator) Entry() logproto.Entry {
return it.curr.(iter.EntryIterator).Entry()
}
// newChunksIterator creates an iterator over a set of lazychunks.
func (it *logBatchIterator) newChunksIterator(chunks []*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (genericIterator, error) {
chksBySeries, err := fetchChunkBySeries(it.ctx, chunks, it.matchers)
if err != nil {
return nil, err
}
@ -284,10 +315,10 @@ func (it *batchChunkIterator) newChunksIterator(chunks []*LazyChunk, from, throu
return nil, err
}
return iter.NewHeapIterator(it.ctx, iters, it.req.Direction), nil
return iter.NewHeapIterator(it.ctx, iters, it.direction), nil
}
func (it *batchChunkIterator) buildIterators(chks map[model.Fingerprint][][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) ([]iter.EntryIterator, error) {
func (it *logBatchIterator) buildIterators(chks map[model.Fingerprint][][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) ([]iter.EntryIterator, error) {
result := make([]iter.EntryIterator, 0, len(chks))
for _, chunks := range chks {
iterator, err := it.buildHeapIterator(chunks, from, through, nextChunk)
@ -300,34 +331,24 @@ func (it *batchChunkIterator) buildIterators(chks map[model.Fingerprint][][]*Laz
return result, nil
}
// computeLabels compute the labels string representation, uses a map to cache result per fingerprint.
func (it *batchChunkIterator) computeLabels(c *LazyChunk) string {
if lbs, ok := it.labels[c.Chunk.Fingerprint]; ok {
return lbs
}
lbs := dropLabels(c.Chunk.Metric, labels.MetricName).String()
it.labels[c.Chunk.Fingerprint] = lbs
return lbs
}
func (it *batchChunkIterator) buildHeapIterator(chks [][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (iter.EntryIterator, error) {
func (it *logBatchIterator) buildHeapIterator(chks [][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (iter.EntryIterator, error) {
result := make([]iter.EntryIterator, 0, len(chks))
// __name__ is only used for upstream compatibility and is hardcoded within loki. Strip it from the return label set.
labels := it.computeLabels(chks[0][0])
labels := it.labels.computeLabels(chks[0][0])
for i := range chks {
iterators := make([]iter.EntryIterator, 0, len(chks[i]))
for j := range chks[i] {
if !chks[i][j].IsValid {
continue
}
iterator, err := chks[i][j].Iterator(it.ctx, from, through, it.req.Direction, it.filter, nextChunk)
iterator, err := chks[i][j].Iterator(it.ctx, from, through, it.direction, it.filter, nextChunk)
if err != nil {
return nil, err
}
iterators = append(iterators, iterator)
}
if it.req.Direction == logproto.BACKWARD {
if it.direction == logproto.BACKWARD {
for i, j := 0, len(iterators)-1; i < j; i, j = i+1, j-1 {
iterators[i], iterators[j] = iterators[j], iterators[i]
}
@ -335,7 +356,137 @@ func (it *batchChunkIterator) buildHeapIterator(chks [][]*LazyChunk, from, throu
result = append(result, iter.NewNonOverlappingIterator(iterators, labels))
}
return iter.NewHeapIterator(it.ctx, result, it.req.Direction), nil
return iter.NewHeapIterator(it.ctx, result, it.direction), nil
}
type sampleBatchIterator struct {
*batchChunkIterator
ctx context.Context
matchers []*labels.Matcher
filter logql.LineFilter
extractor logql.SampleExtractor
labels labelCache
}
func newSampleBatchIterator(
ctx context.Context,
chunks []*LazyChunk,
batchSize int,
matchers []*labels.Matcher,
filter logql.LineFilter,
extractor logql.SampleExtractor,
start, end time.Time,
) (iter.SampleIterator, error) {
// __name__ is not something we filter by because it's a constant in loki
// and only used for upstream compatibility; therefore remove it.
// The same applies to the sharding label which is injected by the cortex storage code.
matchers = removeMatchersByName(matchers, labels.MetricName, astmapper.ShardLabel)
samplebatch := &sampleBatchIterator{
labels: map[model.Fingerprint]string{},
matchers: matchers,
filter: filter,
extractor: extractor,
ctx: ctx,
}
batch := newBatchChunkIterator(ctx, chunks, batchSize, logproto.FORWARD, start, end, samplebatch.newChunksIterator)
samplebatch.batchChunkIterator = batch
return samplebatch, nil
}
func (it *sampleBatchIterator) Sample() logproto.Sample {
return it.curr.(iter.SampleIterator).Sample()
}
// newChunksIterator creates an iterator over a set of lazychunks.
func (it *sampleBatchIterator) newChunksIterator(chunks []*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (genericIterator, error) {
chksBySeries, err := fetchChunkBySeries(it.ctx, chunks, it.matchers)
if err != nil {
return nil, err
}
iters, err := it.buildIterators(chksBySeries, from, through, nextChunk)
if err != nil {
return nil, err
}
return iter.NewHeapSampleIterator(it.ctx, iters), nil
}
func (it *sampleBatchIterator) buildIterators(chks map[model.Fingerprint][][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) ([]iter.SampleIterator, error) {
result := make([]iter.SampleIterator, 0, len(chks))
for _, chunks := range chks {
iterator, err := it.buildHeapIterator(chunks, from, through, nextChunk)
if err != nil {
return nil, err
}
result = append(result, iterator)
}
return result, nil
}
func (it *sampleBatchIterator) buildHeapIterator(chks [][]*LazyChunk, from, through time.Time, nextChunk *LazyChunk) (iter.SampleIterator, error) {
result := make([]iter.SampleIterator, 0, len(chks))
// __name__ is only used for upstream compatibility and is hardcoded within loki. Strip it from the return label set.
labels := it.labels.computeLabels(chks[0][0])
for i := range chks {
iterators := make([]iter.SampleIterator, 0, len(chks[i]))
for j := range chks[i] {
if !chks[i][j].IsValid {
continue
}
iterator, err := chks[i][j].SampleIterator(it.ctx, from, through, it.filter, it.extractor, nextChunk)
if err != nil {
return nil, err
}
iterators = append(iterators, iterator)
}
result = append(result, iter.NewNonOverlappingSampleIterator(iterators, labels))
}
return iter.NewHeapSampleIterator(it.ctx, result), nil
}
func removeMatchersByName(matchers []*labels.Matcher, names ...string) []*labels.Matcher {
for _, omit := range names {
for i := range matchers {
if matchers[i].Name == omit {
matchers = append(matchers[:i], matchers[i+1:]...)
break
}
}
}
return matchers
}
func fetchChunkBySeries(ctx context.Context, chunks []*LazyChunk, matchers []*labels.Matcher) (map[model.Fingerprint][][]*LazyChunk, error) {
chksBySeries := partitionBySeriesChunks(chunks)
// Make sure the initial chunks are loaded. This is not one chunk
// per series, but rather a chunk per non-overlapping iterator.
if err := loadFirstChunks(ctx, chksBySeries); err != nil {
return nil, err
}
// Now that we have the first chunk for each series loaded,
// we can proceed to filter the series that don't match.
chksBySeries = filterSeriesByMatchers(chksBySeries, matchers)
var allChunks []*LazyChunk
for _, series := range chksBySeries {
for _, chunks := range series {
allChunks = append(allChunks, chunks...)
}
}
// Finally we load all chunks not already loaded
if err := fetchLazyChunks(ctx, allChunks); err != nil {
return nil, err
}
return chksBySeries, nil
}
func filterSeriesByMatchers(chks map[model.Fingerprint][][]*LazyChunk, matchers []*labels.Matcher) map[model.Fingerprint][][]*LazyChunk {

@ -6,6 +6,7 @@ import (
"testing"
"time"
"github.com/cespare/xxhash/v2"
"github.com/cortexproject/cortex/pkg/chunk"
"github.com/pkg/errors"
"github.com/prometheus/common/model"
@ -21,7 +22,7 @@ import (
"github.com/grafana/loki/pkg/logql/stats"
)
func Test_newBatchChunkIterator(t *testing.T) {
func Test_newLogBatchChunkIterator(t *testing.T) {
tests := map[string]struct {
chunks []*LazyChunk
@ -552,7 +553,8 @@ func Test_newBatchChunkIterator(t *testing.T) {
for name, tt := range tests {
tt := tt
t.Run(name, func(t *testing.T) {
it := newBatchChunkIterator(context.Background(), tt.chunks, tt.batchSize, newMatchers(tt.matchers), nil, newQuery("", tt.start, tt.end, tt.direction, nil))
it, err := newLogBatchIterator(context.Background(), tt.chunks, tt.batchSize, newMatchers(tt.matchers), nil, tt.direction, tt.start, tt.end)
require.NoError(t, err)
streams, _, err := iter.ReadBatch(it, 1000)
_ = it.Close()
if err != nil {
@ -565,6 +567,291 @@ func Test_newBatchChunkIterator(t *testing.T) {
}
}
func Test_newSampleBatchChunkIterator(t *testing.T) {
tests := map[string]struct {
chunks []*LazyChunk
expected []logproto.Series
matchers string
start, end time.Time
batchSize int
}{
"forward with overlap": {
[]*LazyChunk{
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from,
Line: "1",
},
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
{
Timestamp: from.Add(4 * time.Millisecond),
Line: "5",
},
},
}),
},
[]logproto.Series{
{
Labels: fooLabels,
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
{
Timestamp: from.Add(3 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("4"),
Value: 1.,
},
},
},
},
fooLabelsWithName,
from, from.Add(4 * time.Millisecond),
2,
},
"forward with overlapping non-continuous entries": {
[]*LazyChunk{
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from,
Line: "1",
},
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
},
[]logproto.Series{
{
Labels: fooLabels,
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
},
},
},
fooLabelsWithName,
from, from.Add(3 * time.Millisecond),
2,
},
"forward without overlap": {
[]*LazyChunk{
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from,
Line: "1",
},
{
Timestamp: from.Add(time.Millisecond),
Line: "2",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(2 * time.Millisecond),
Line: "3",
},
},
}),
newLazyChunk(logproto.Stream{
Labels: fooLabelsWithName,
Entries: []logproto.Entry{
{
Timestamp: from.Add(3 * time.Millisecond),
Line: "4",
},
},
}),
},
[]logproto.Series{
{
Labels: fooLabels,
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
},
},
},
fooLabelsWithName,
from, from.Add(3 * time.Millisecond),
2,
},
}
for name, tt := range tests {
tt := tt
t.Run(name, func(t *testing.T) {
it, err := newSampleBatchIterator(context.Background(), tt.chunks, tt.batchSize, newMatchers(tt.matchers), nil, logql.ExtractCount, tt.start, tt.end)
require.NoError(t, err)
series, _, err := iter.ReadSampleBatch(it, 1000)
_ = it.Close()
if err != nil {
t.Fatalf("error reading batch %s", err)
}
assertSeries(t, tt.expected, series.Series)
})
}
}
func TestPartitionOverlappingchunks(t *testing.T) {
var (
oneThroughFour = newLazyChunk(logproto.Stream{
@ -754,9 +1041,11 @@ func TestBuildHeapIterator(t *testing.T) {
} {
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
ctx = user.InjectOrgID(context.Background(), "test-user")
b := &batchChunkIterator{
b := &logBatchIterator{
batchChunkIterator: &batchChunkIterator{
direction: logproto.FORWARD,
},
ctx: ctx,
req: &logproto.QueryRequest{Direction: logproto.FORWARD},
labels: map[model.Fingerprint]string{},
}
it, err := b.buildHeapIterator(tc.input, from, from.Add(6*time.Millisecond), nil)
@ -764,7 +1053,7 @@ func TestBuildHeapIterator(t *testing.T) {
t.Errorf("buildHeapIterator error = %v", err)
return
}
req := newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil)
req := newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), nil)
streams, _, err := iter.ReadBatch(it, req.Limit)
_ = it.Close()
if err != nil {
@ -864,7 +1153,7 @@ func Benchmark_store_OverlappingChunks(b *testing.B) {
ctx := user.InjectOrgID(stats.NewContext(context.Background()), "fake")
start := time.Now()
for i := 0; i < b.N; i++ {
it, err := st.LazyQuery(ctx, logql.SelectParams{QueryRequest: &logproto.QueryRequest{
it, err := st.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: &logproto.QueryRequest{
Selector: `{foo="bar"}`,
Direction: logproto.BACKWARD,
Limit: 0,

@ -92,3 +92,90 @@ func (it *cachedIterator) Close() error {
it.reset()
return it.closeErr
}
// cachedSampleIterator is an iterator that caches iteration to be replayed later on.
type cachedSampleIterator struct {
cache []logproto.Sample
base iter.SampleIterator
labels string
curr int
closeErr error
iterErr error
}
// newCachedSampleIterator creates an iterator that caches the iteration result and can be iterated again
// after closing it without re-using the underlying iterator `it`.
// The cached iterator should be used for samples that belong to the same stream only.
func newCachedSampleIterator(it iter.SampleIterator, cap int) *cachedSampleIterator {
c := &cachedSampleIterator{
base: it,
cache: make([]logproto.Sample, 0, cap),
curr: -1,
}
c.load()
return c
}
func (it *cachedSampleIterator) reset() {
it.curr = -1
}
func (it *cachedSampleIterator) load() {
if it.base != nil {
defer func() {
it.closeErr = it.base.Close()
it.iterErr = it.base.Error()
it.base = nil
it.reset()
}()
// set labels using the first entry
if !it.base.Next() {
return
}
it.labels = it.base.Labels()
// add all entries until the base iterator is exhausted
for {
it.cache = append(it.cache, it.base.Sample())
if !it.base.Next() {
break
}
}
}
}
func (it *cachedSampleIterator) Next() bool {
if len(it.cache) == 0 {
it.cache = nil
return false
}
if it.curr+1 >= len(it.cache) {
return false
}
it.curr++
return it.curr < len(it.cache)
}
func (it *cachedSampleIterator) Sample() logproto.Sample {
if len(it.cache) == 0 {
return logproto.Sample{}
}
if it.curr < 0 {
return it.cache[0]
}
return it.cache[it.curr]
}
func (it *cachedSampleIterator) Labels() string {
return it.labels
}
func (it *cachedSampleIterator) Error() error { return it.iterErr }
func (it *cachedSampleIterator) Close() error {
it.reset()
return it.closeErr
}

@ -77,10 +77,77 @@ func Test_ErrorCachedIterator(t *testing.T) {
require.Equal(t, errors.New("close"), c.Close())
}
func Test_CachedSampleIterator(t *testing.T) {
series := logproto.Series{
Labels: `{foo="bar"}`,
Samples: []logproto.Sample{
{Timestamp: time.Unix(0, 1).UnixNano(), Hash: 1, Value: 1.},
{Timestamp: time.Unix(0, 2).UnixNano(), Hash: 2, Value: 2.},
{Timestamp: time.Unix(0, 3).UnixNano(), Hash: 3, Value: 3.},
},
}
c := newCachedSampleIterator(iter.NewSeriesIterator(series), 3)
assert := func() {
// we should not crash when Sample is called without Next, although that's not expected usage.
require.Equal(t, series.Labels, c.Labels())
require.Equal(t, series.Samples[0], c.Sample())
require.Equal(t, true, c.Next())
require.Equal(t, series.Samples[0], c.Sample())
require.Equal(t, true, c.Next())
require.Equal(t, series.Samples[1], c.Sample())
require.Equal(t, true, c.Next())
require.Equal(t, series.Samples[2], c.Sample())
require.Equal(t, false, c.Next())
require.Equal(t, nil, c.Error())
require.Equal(t, series.Samples[2], c.Sample())
require.Equal(t, false, c.Next())
}
assert()
// Closing the iterator resets it to the beginning.
require.Equal(t, nil, c.Close())
assert()
}
func Test_EmptyCachedSampleIterator(t *testing.T) {
c := newCachedSampleIterator(iter.NoopIterator, 0)
require.Equal(t, "", c.Labels())
require.Equal(t, logproto.Sample{}, c.Sample())
require.Equal(t, false, c.Next())
require.Equal(t, "", c.Labels())
require.Equal(t, logproto.Sample{}, c.Sample())
require.Equal(t, nil, c.Close())
require.Equal(t, "", c.Labels())
require.Equal(t, logproto.Sample{}, c.Sample())
require.Equal(t, false, c.Next())
require.Equal(t, "", c.Labels())
require.Equal(t, logproto.Sample{}, c.Sample())
}
func Test_ErrorCachedSampleIterator(t *testing.T) {
c := newCachedSampleIterator(&errorIter{}, 0)
require.Equal(t, false, c.Next())
require.Equal(t, "", c.Labels())
require.Equal(t, logproto.Sample{}, c.Sample())
require.Equal(t, errors.New("error"), c.Error())
require.Equal(t, errors.New("close"), c.Close())
}
type errorIter struct{}
func (errorIter) Next() bool { return false }
func (errorIter) Error() error { return errors.New("error") }
func (errorIter) Labels() string { return "" }
func (errorIter) Entry() logproto.Entry { return logproto.Entry{} }
func (errorIter) Close() error { return errors.New("close") }
func (errorIter) Next() bool { return false }
func (errorIter) Error() error { return errors.New("error") }
func (errorIter) Labels() string { return "" }
func (errorIter) Entry() logproto.Entry { return logproto.Entry{} }
func (errorIter) Sample() logproto.Sample { return logproto.Sample{} }
func (errorIter) Close() error { return errors.New("close") }

@ -21,7 +21,8 @@ type LazyChunk struct {
// cache of overlapping block.
// We use the offset of the block as key since it's unique per chunk.
overlappingBlocks map[int]*cachedIterator
overlappingBlocks map[int]*cachedIterator
overlappingSampleBlocks map[int]*cachedSampleIterator
}
// Iterator returns an entry iterator.
@ -86,6 +87,62 @@ func (c *LazyChunk) Iterator(
return iter.NewEntryReversedIter(iterForward)
}
// SampleIterator returns a sample iterator.
// If nextChunk is passed, the returned iterator caches the samples of blocks that overlap with it.
// This way, when we re-use them for ordering across batches, we don't decompress the data again.
func (c *LazyChunk) SampleIterator(
ctx context.Context,
from, through time.Time,
filter logql.LineFilter,
extractor logql.SampleExtractor,
nextChunk *LazyChunk,
) (iter.SampleIterator, error) {
// If the chunk is not already loaded, then error out.
if c.Chunk.Data == nil {
return nil, errors.New("chunk is not loaded")
}
lokiChunk := c.Chunk.Data.(*chunkenc.Facade).LokiChunk()
blocks := lokiChunk.Blocks(from, through)
if len(blocks) == 0 {
return iter.NoopIterator, nil
}
its := make([]iter.SampleIterator, 0, len(blocks))
for _, b := range blocks {
// if we have already processed and cached the block, let's use it.
if cache, ok := c.overlappingSampleBlocks[b.Offset()]; ok {
clone := *cache
clone.reset()
its = append(its, &clone)
continue
}
// if the block overlaps with the next chunk's boundaries, cache it.
if nextChunk != nil && IsBlockOverlapping(b, nextChunk, logproto.FORWARD) {
it := newCachedSampleIterator(b.SampleIterator(ctx, filter, extractor), b.Entries())
its = append(its, it)
if c.overlappingSampleBlocks == nil {
c.overlappingSampleBlocks = make(map[int]*cachedSampleIterator)
}
c.overlappingSampleBlocks[b.Offset()] = it
continue
}
if nextChunk != nil {
delete(c.overlappingSampleBlocks, b.Offset())
}
// blocks that do not overlap with the next chunk are not cached.
its = append(its, b.SampleIterator(ctx, filter, extractor))
}
// build the final iterator bound to the requested time range.
return iter.NewTimeRangedSampleIterator(
iter.NewNonOverlappingSampleIterator(its, ""),
from.UnixNano(),
through.UnixNano(),
), nil
}
func IsBlockOverlapping(b chunkenc.Block, with *LazyChunk, direction logproto.Direction) bool {
if direction == logproto.BACKWARD {
through := int64(with.Chunk.Through) * int64(time.Millisecond)

@ -99,11 +99,12 @@ type fakeBlock struct {
mint, maxt int64
}
func (fakeBlock) Entries() int { return 0 }
func (fakeBlock) Offset() int { return 0 }
func (f fakeBlock) MinTime() int64 { return f.mint }
func (f fakeBlock) MaxTime() int64 { return f.maxt }
func (fakeBlock) Iterator(context.Context, logql.LineFilter) iter.EntryIterator {
func (fakeBlock) Entries() int { return 0 }
func (fakeBlock) Offset() int { return 0 }
func (f fakeBlock) MinTime() int64 { return f.mint }
func (f fakeBlock) MaxTime() int64 { return f.maxt }
func (fakeBlock) Iterator(context.Context, logql.LineFilter) iter.EntryIterator { return nil }
func (fakeBlock) SampleIterator(context.Context, logql.LineFilter, logql.SampleExtractor) iter.SampleIterator {
return nil
}

@ -39,8 +39,9 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
// Store is the Loki chunk store to retrieve and save chunks.
type Store interface {
chunk.Store
LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error)
GetSeries(ctx context.Context, req logql.SelectParams) ([]logproto.SeriesIdentifier, error)
SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error)
SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error)
GetSeries(ctx context.Context, req logql.SelectLogParams) ([]logproto.SeriesIdentifier, error)
}
type store struct {
@ -72,7 +73,7 @@ func NewTableClient(name string, cfg Config) (chunk.TableClient, error) {
// decodeReq sanitizes an incoming request, rounds bounds, appends the __name__ matcher,
// and adds the "__cortex_shard__" label if this is a sharded query.
func decodeReq(req logql.SelectParams) ([]*labels.Matcher, logql.LineFilter, model.Time, model.Time, error) {
func decodeReq(req logql.QueryParams) ([]*labels.Matcher, logql.LineFilter, model.Time, model.Time, error) {
expr, err := req.LogSelector()
if err != nil {
return nil, nil, 0, 0, err
@ -113,7 +114,7 @@ func decodeReq(req logql.SelectParams) ([]*labels.Matcher, logql.LineFilter, mod
}
}
from, through := util.RoundToMilliseconds(req.Start, req.End)
from, through := util.RoundToMilliseconds(req.GetStart(), req.GetEnd())
return matchers, filter, from, through, nil
}
@ -147,7 +148,7 @@ func (s *store) lazyChunks(ctx context.Context, matchers []*labels.Matcher, from
return lazyChunks, nil
}
func (s *store) GetSeries(ctx context.Context, req logql.SelectParams) ([]logproto.SeriesIdentifier, error) {
func (s *store) GetSeries(ctx context.Context, req logql.SelectLogParams) ([]logproto.SeriesIdentifier, error) {
var from, through model.Time
var matchers []*labels.Matcher
@ -227,9 +228,9 @@ func (s *store) GetSeries(ctx context.Context, req logql.SelectParams) ([]logpro
}
// LazyQuery returns an iterator that will query the store for more chunks while iterating instead of fetching all chunks upfront
// SelectLogs returns an iterator that will query the store for more chunks while iterating instead of fetching all chunks upfront
// for that request.
func (s *store) LazyQuery(ctx context.Context, req logql.SelectParams) (iter.EntryIterator, error) {
func (s *store) SelectLogs(ctx context.Context, req logql.SelectLogParams) (iter.EntryIterator, error) {
matchers, filter, from, through, err := decodeReq(req)
if err != nil {
return nil, err
@ -244,10 +245,37 @@ func (s *store) LazyQuery(ctx context.Context, req logql.SelectParams) (iter.Ent
return iter.NoopIterator, nil
}
return newBatchChunkIterator(ctx, lazyChunks, s.cfg.MaxChunkBatchSize, matchers, filter, req.QueryRequest), nil
return newLogBatchIterator(ctx, lazyChunks, s.cfg.MaxChunkBatchSize, matchers, filter, req.Direction, req.Start, req.End)
}
func (s *store) SelectSamples(ctx context.Context, req logql.SelectSampleParams) (iter.SampleIterator, error) {
matchers, filter, from, through, err := decodeReq(req)
if err != nil {
return nil, err
}
expr, err := req.Expr()
if err != nil {
return nil, err
}
extractor, err := expr.Extractor()
if err != nil {
return nil, err
}
lazyChunks, err := s.lazyChunks(ctx, matchers, from, through)
if err != nil {
return nil, err
}
if len(lazyChunks) == 0 {
return iter.NoopIterator, nil
}
return newSampleBatchIterator(ctx, lazyChunks, s.cfg.MaxChunkBatchSize, matchers, filter, extractor, req.Start, req.End)
}
func filterChunksByTime(from, through model.Time, chunks []chunk.Chunk) []chunk.Chunk {
filtered := make([]chunk.Chunk, 0, len(chunks))
for _, chunk := range chunks {

@ -12,6 +12,7 @@ import (
"testing"
"time"
"github.com/cespare/xxhash/v2"
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/pkg/labels"
"github.com/stretchr/testify/require"
@ -39,9 +40,9 @@ var (
)
//go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out
func Benchmark_store_LazyQueryRegexBackward(b *testing.B) {
func Benchmark_store_SelectLogsRegexBackward(b *testing.B) {
benchmarkStoreQuery(b, &logproto.QueryRequest{
Selector: `{foo="bar"} |= "fuzz"`,
Selector: `{foo="bar"} |~ "fuzz"`,
Limit: 1000,
Start: time.Unix(0, start.UnixNano()),
End: time.Unix(0, (24*time.Hour.Nanoseconds())+start.UnixNano()),
@ -49,7 +50,7 @@ func Benchmark_store_LazyQueryRegexBackward(b *testing.B) {
})
}
func Benchmark_store_LazyQueryLogQLBackward(b *testing.B) {
func Benchmark_store_SelectLogsLogQLBackward(b *testing.B) {
benchmarkStoreQuery(b, &logproto.QueryRequest{
Selector: `{foo="bar"} |= "test" != "toto" |= "fuzz"`,
Limit: 1000,
@ -59,9 +60,9 @@ func Benchmark_store_LazyQueryLogQLBackward(b *testing.B) {
})
}
func Benchmark_store_LazyQueryRegexForward(b *testing.B) {
func Benchmark_store_SelectLogsRegexForward(b *testing.B) {
benchmarkStoreQuery(b, &logproto.QueryRequest{
Selector: `{foo="bar"} |= "fuzz"`,
Selector: `{foo="bar"} |~ "fuzz"`,
Limit: 1000,
Start: time.Unix(0, start.UnixNano()),
End: time.Unix(0, (24*time.Hour.Nanoseconds())+start.UnixNano()),
@ -69,7 +70,7 @@ func Benchmark_store_LazyQueryRegexForward(b *testing.B) {
})
}
func Benchmark_store_LazyQueryForward(b *testing.B) {
func Benchmark_store_SelectLogsForward(b *testing.B) {
benchmarkStoreQuery(b, &logproto.QueryRequest{
Selector: `{foo="bar"}`,
Limit: 1000,
@ -79,7 +80,7 @@ func Benchmark_store_LazyQueryForward(b *testing.B) {
})
}
func Benchmark_store_LazyQueryBackward(b *testing.B) {
func Benchmark_store_SelectLogsBackward(b *testing.B) {
benchmarkStoreQuery(b, &logproto.QueryRequest{
Selector: `{foo="bar"}`,
Limit: 1000,
@ -89,6 +90,37 @@ func Benchmark_store_LazyQueryBackward(b *testing.B) {
})
}
// rm -Rf /tmp/benchmark/chunks/ /tmp/benchmark/index
// go run -mod=vendor ./pkg/storage/hack/main.go
// go test -benchmem -run=^$ -mod=vendor ./pkg/storage -bench=Benchmark_store_SelectSample -memprofile memprofile.out -cpuprofile cpuprofile.out
func Benchmark_store_SelectSample(b *testing.B) {
var sampleRes []logproto.Sample
for _, test := range []string{
`count_over_time({foo="bar"}[5m])`,
`rate({foo="bar"}[5m])`,
`bytes_rate({foo="bar"}[5m])`,
`bytes_over_time({foo="bar"}[5m])`,
} {
b.Run(test, func(b *testing.B) {
for i := 0; i < b.N; i++ {
iter, err := chunkStore.SelectSamples(ctx, logql.SelectSampleParams{
SampleQueryRequest: newSampleQuery(test, time.Unix(0, start.UnixNano()), time.Unix(0, (24*time.Hour.Nanoseconds())+start.UnixNano())),
})
if err != nil {
b.Fatal(err)
}
for iter.Next() {
sampleRes = append(sampleRes, iter.Sample())
}
iter.Close()
}
})
}
log.Print("sample processed ", len(sampleRes))
}
func benchmarkStoreQuery(b *testing.B, query *logproto.QueryRequest) {
b.ReportAllocs()
// Force the GC to run 10x more often; this helps distinguish fast allocation from a leak.
@ -111,7 +143,7 @@ func benchmarkStoreQuery(b *testing.B, query *logproto.QueryRequest) {
}
}()
for i := 0; i < b.N; i++ {
iter, err := chunkStore.LazyQuery(ctx, logql.SelectParams{QueryRequest: query})
iter, err := chunkStore.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: query})
if err != nil {
b.Fatal(err)
}
@ -180,7 +212,7 @@ func getLocalStore() Store {
return store
}
func Test_store_LazyQuery(t *testing.T) {
func Test_store_SelectLogs(t *testing.T) {
tests := []struct {
name string
@ -189,7 +221,7 @@ func Test_store_LazyQuery(t *testing.T) {
}{
{
"all",
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.Stream{
{
Labels: "{foo=\"bar\"}",
@ -257,7 +289,7 @@ func Test_store_LazyQuery(t *testing.T) {
},
{
"filter regex",
newQuery("{foo=~\"ba.*\"} |~ \"1|2|3\" !~ \"2|3\"", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"} |~ \"1|2|3\" !~ \"2|3\"", from, from.Add(6*time.Millisecond), nil),
[]logproto.Stream{
{
Labels: "{foo=\"bar\"}",
@ -281,7 +313,7 @@ func Test_store_LazyQuery(t *testing.T) {
},
{
"filter matcher",
newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.Stream{
{
Labels: "{foo=\"bar\"}",
@ -318,7 +350,7 @@ func Test_store_LazyQuery(t *testing.T) {
},
{
"filter time",
newQuery("{foo=~\"ba.*\"}", from, from.Add(time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"}", from, from.Add(time.Millisecond), nil),
[]logproto.Stream{
{
Labels: "{foo=\"bar\"}",
@ -351,7 +383,7 @@ func Test_store_LazyQuery(t *testing.T) {
}
ctx = user.InjectOrgID(context.Background(), "test-user")
it, err := s.LazyQuery(ctx, logql.SelectParams{QueryRequest: tt.req})
it, err := s.SelectLogs(ctx, logql.SelectLogParams{QueryRequest: tt.req})
if err != nil {
t.Errorf("store.SelectLogs() error = %v", err)
return
@ -367,6 +399,215 @@ func Test_store_LazyQuery(t *testing.T) {
}
}
func Test_store_SelectSample(t *testing.T) {
tests := []struct {
name string
req *logproto.SampleQueryRequest
expected []logproto.Series
}{
{
"all",
newSampleQuery("count_over_time({foo=~\"ba.*\"}[5m])", from, from.Add(6*time.Millisecond)),
[]logproto.Series{
{
Labels: "{foo=\"bar\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
{
Timestamp: from.Add(3 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("4"),
Value: 1.,
},
{
Timestamp: from.Add(4 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("5"),
Value: 1.,
},
{
Timestamp: from.Add(5 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("6"),
Value: 1.,
},
},
},
{
Labels: "{foo=\"bazz\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
{
Timestamp: from.Add(3 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("4"),
Value: 1.,
},
{
Timestamp: from.Add(4 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("5"),
Value: 1.,
},
{
Timestamp: from.Add(5 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("6"),
Value: 1.,
},
},
},
},
},
{
"filter regex",
newSampleQuery("rate({foo=~\"ba.*\"} |~ \"1|2|3\" !~ \"2|3\"[1m])", from, from.Add(6*time.Millisecond)),
[]logproto.Series{
{
Labels: "{foo=\"bar\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
},
},
{
Labels: "{foo=\"bazz\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
},
},
},
},
{
"filter matcher",
newSampleQuery("count_over_time({foo=\"bar\"}[10m])", from, from.Add(6*time.Millisecond)),
[]logproto.Series{
{
Labels: "{foo=\"bar\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
{
Timestamp: from.Add(time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("2"),
Value: 1.,
},
{
Timestamp: from.Add(2 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("3"),
Value: 1.,
},
{
Timestamp: from.Add(3 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("4"),
Value: 1.,
},
{
Timestamp: from.Add(4 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("5"),
Value: 1.,
},
{
Timestamp: from.Add(5 * time.Millisecond).UnixNano(),
Hash: xxhash.Sum64String("6"),
Value: 1.,
},
},
},
},
},
{
"filter time",
newSampleQuery("count_over_time({foo=~\"ba.*\"}[1s])", from, from.Add(time.Millisecond)),
[]logproto.Series{
{
Labels: "{foo=\"bar\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
},
},
{
Labels: "{foo=\"bazz\"}",
Samples: []logproto.Sample{
{
Timestamp: from.UnixNano(),
Hash: xxhash.Sum64String("1"),
Value: 1.,
},
},
},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
s := &store{
Store: storeFixture,
cfg: Config{
MaxChunkBatchSize: 10,
},
}
ctx = user.InjectOrgID(context.Background(), "test-user")
it, err := s.SelectSamples(ctx, logql.SelectSampleParams{SampleQueryRequest: tt.req})
if err != nil {
t.Errorf("store.SelectSamples() error = %v", err)
return
}
series, _, err := iter.ReadSampleBatch(it, uint32(100000))
_ = it.Close()
if err != nil {
t.Fatalf("error reading batch %s", err)
}
assertSeries(t, tt.expected, series.Series)
})
}
}
func Test_store_GetSeries(t *testing.T) {
tests := []struct {
@ -377,7 +618,7 @@ func Test_store_GetSeries(t *testing.T) {
}{
{
"all",
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.SeriesIdentifier{
{Labels: mustParseLabels("{foo=\"bar\"}")},
{Labels: mustParseLabels("{foo=\"bazz\"}")},
@ -386,7 +627,7 @@ func Test_store_GetSeries(t *testing.T) {
},
{
"all-single-batch",
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.SeriesIdentifier{
{Labels: mustParseLabels("{foo=\"bar\"}")},
{Labels: mustParseLabels("{foo=\"bazz\"}")},
@ -395,7 +636,7 @@ func Test_store_GetSeries(t *testing.T) {
},
{
"regexp filter (post chunk fetching)",
newQuery("{foo=~\"bar.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"bar.*\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.SeriesIdentifier{
{Labels: mustParseLabels("{foo=\"bar\"}")},
},
@ -403,7 +644,7 @@ func Test_store_GetSeries(t *testing.T) {
},
{
"filter matcher",
newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=\"bar\"}", from, from.Add(6*time.Millisecond), nil),
[]logproto.SeriesIdentifier{
{Labels: mustParseLabels("{foo=\"bar\"}")},
},
@ -419,7 +660,7 @@ func Test_store_GetSeries(t *testing.T) {
},
}
ctx = user.InjectOrgID(context.Background(), "test-user")
out, err := s.GetSeries(ctx, logql.SelectParams{QueryRequest: tt.req})
out, err := s.GetSeries(ctx, logql.SelectLogParams{QueryRequest: tt.req})
if err != nil {
t.Errorf("store.GetSeries() error = %v", err)
return
@ -437,7 +678,7 @@ func Test_store_decodeReq_Matchers(t *testing.T) {
}{
{
"unsharded",
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD, nil),
newQuery("{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), nil),
[]*labels.Matcher{
labels.MustNewMatcher(labels.MatchRegexp, "foo", "ba.*"),
labels.MustNewMatcher(labels.MatchEqual, labels.MetricName, "logs"),
@ -446,7 +687,7 @@ func Test_store_decodeReq_Matchers(t *testing.T) {
{
"unsharded",
newQuery(
"{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond), logproto.FORWARD,
"{foo=~\"ba.*\"}", from, from.Add(6*time.Millisecond),
[]astmapper.ShardAnnotation{
{Shard: 1, Of: 2},
},
@ -464,7 +705,7 @@ func Test_store_decodeReq_Matchers(t *testing.T) {
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ms, _, _, _, err := decodeReq(logql.SelectParams{QueryRequest: tt.req})
ms, _, _, _, err := decodeReq(logql.SelectLogParams{QueryRequest: tt.req})
if err != nil {
t.Errorf("decodeReq() error = %v", err)
return

@ -47,6 +47,28 @@ func assertStream(t *testing.T, expected, actual []logproto.Stream) {
}
}
func assertSeries(t *testing.T, expected, actual []logproto.Series) {
if len(expected) != len(actual) {
t.Fatalf("error series lengths are different expected %d actual %d\n%s", len(expected), len(actual), spew.Sdump(expected, actual))
return
}
sort.Slice(expected, func(i int, j int) bool { return expected[i].Labels < expected[j].Labels })
sort.Slice(actual, func(i int, j int) bool { return actual[i].Labels < actual[j].Labels })
for i := range expected {
assert.Equal(t, expected[i].Labels, actual[i].Labels)
if len(expected[i].Samples) != len(actual[i].Samples) {
t.Fatalf("error samples lengths are different expected %d actual %d\n%s", len(expected[i].Samples), len(actual[i].Samples), spew.Sdump(expected[i].Samples, actual[i].Samples))
return
}
for j := range expected[i].Samples {
assert.Equal(t, expected[i].Samples[j].Timestamp, actual[i].Samples[j].Timestamp)
assert.Equal(t, expected[i].Samples[j].Value, actual[i].Samples[j].Value)
assert.Equal(t, expected[i].Samples[j].Hash, actual[i].Samples[j].Hash)
}
}
}
func newLazyChunk(stream logproto.Stream) *LazyChunk {
return &LazyChunk{
Fetcher: nil,
@ -102,13 +124,13 @@ func newMatchers(matchers string) []*labels.Matcher {
return res
}
func newQuery(query string, start, end time.Time, direction logproto.Direction, shards []astmapper.ShardAnnotation) *logproto.QueryRequest {
func newQuery(query string, start, end time.Time, shards []astmapper.ShardAnnotation) *logproto.QueryRequest {
req := &logproto.QueryRequest{
Selector: query,
Start: start,
Limit: 1000,
End: end,
Direction: direction,
Direction: logproto.FORWARD,
}
for _, shard := range shards {
req.Shards = append(req.Shards, shard.String())
@ -116,6 +138,15 @@ func newQuery(query string, start, end time.Time, direction logproto.Direction,
return req
}
func newSampleQuery(query string, start, end time.Time) *logproto.SampleQueryRequest {
req := &logproto.SampleQueryRequest{
Selector: query,
Start: start,
End: end,
}
return req
}
type mockChunkStore struct {
chunks []chunk.Chunk
client *mockChunkStoreClient

@ -14,6 +14,11 @@ func HashString64(s string) uint64 {
return AddString64(Init64, s)
}
// HashBytes64 returns the hash of b.
func HashBytes64(b []byte) uint64 {
return AddBytes64(Init64, b)
}
// HashUint64 returns the hash of u.
func HashUint64(u uint64) uint64 {
return AddUint64(Init64, u)
@ -34,24 +39,69 @@ func AddString64(h uint64, s string) uint64 {
- BenchmarkHash64/hash_function-4 50000000 38.6 ns/op 932.35 MB/s 0 B/op 0 allocs/op
*/
for len(s) >= 8 {
h = (h ^ uint64(s[0])) * prime64
h = (h ^ uint64(s[1])) * prime64
h = (h ^ uint64(s[2])) * prime64
h = (h ^ uint64(s[3])) * prime64
h = (h ^ uint64(s[4])) * prime64
h = (h ^ uint64(s[5])) * prime64
h = (h ^ uint64(s[6])) * prime64
h = (h ^ uint64(s[7])) * prime64
s = s[8:]
}
if len(s) >= 4 {
h = (h ^ uint64(s[0])) * prime64
h = (h ^ uint64(s[1])) * prime64
h = (h ^ uint64(s[2])) * prime64
h = (h ^ uint64(s[3])) * prime64
s = s[4:]
}
if len(s) >= 2 {
h = (h ^ uint64(s[0])) * prime64
h = (h ^ uint64(s[1])) * prime64
s = s[2:]
}
if len(s) > 0 {
h = (h ^ uint64(s[0])) * prime64
}
return h
}
// AddBytes64 adds the hash of b to the precomputed hash value h.
func AddBytes64(h uint64, b []byte) uint64 {
for len(b) >= 8 {
h = (h ^ uint64(b[0])) * prime64
h = (h ^ uint64(b[1])) * prime64
h = (h ^ uint64(b[2])) * prime64
h = (h ^ uint64(b[3])) * prime64
h = (h ^ uint64(b[4])) * prime64
h = (h ^ uint64(b[5])) * prime64
h = (h ^ uint64(b[6])) * prime64
h = (h ^ uint64(b[7])) * prime64
b = b[8:]
}
if len(b) >= 4 {
h = (h ^ uint64(b[0])) * prime64
h = (h ^ uint64(b[1])) * prime64
h = (h ^ uint64(b[2])) * prime64
h = (h ^ uint64(b[3])) * prime64
b = b[4:]
}
i := 0
n := (len(s) / 8) * 8
for i != n {
h = (h ^ uint64(s[i])) * prime64
h = (h ^ uint64(s[i+1])) * prime64
h = (h ^ uint64(s[i+2])) * prime64
h = (h ^ uint64(s[i+3])) * prime64
h = (h ^ uint64(s[i+4])) * prime64
h = (h ^ uint64(s[i+5])) * prime64
h = (h ^ uint64(s[i+6])) * prime64
h = (h ^ uint64(s[i+7])) * prime64
i += 8
if len(b) >= 2 {
h = (h ^ uint64(b[0])) * prime64
h = (h ^ uint64(b[1])) * prime64
b = b[2:]
}
for _, c := range s[i:] {
h = (h ^ uint64(c)) * prime64
if len(b) > 0 {
h = (h ^ uint64(b[0])) * prime64
}
return h

@ -14,6 +14,11 @@ func HashString32(s string) uint32 {
return AddString32(Init32, s)
}
// HashBytes32 returns the hash of b.
func HashBytes32(b []byte) uint32 {
return AddBytes32(Init32, b)
}
// HashUint32 returns the hash of u.
func HashUint32(u uint32) uint32 {
return AddUint32(Init32, u)
@ -21,23 +26,69 @@ func HashUint32(u uint32) uint32 {
// AddString32 adds the hash of s to the precomputed hash value h.
func AddString32(h uint32, s string) uint32 {
i := 0
n := (len(s) / 8) * 8
for i != n {
h = (h ^ uint32(s[i])) * prime32
h = (h ^ uint32(s[i+1])) * prime32
h = (h ^ uint32(s[i+2])) * prime32
h = (h ^ uint32(s[i+3])) * prime32
h = (h ^ uint32(s[i+4])) * prime32
h = (h ^ uint32(s[i+5])) * prime32
h = (h ^ uint32(s[i+6])) * prime32
h = (h ^ uint32(s[i+7])) * prime32
i += 8
}
for _, c := range s[i:] {
h = (h ^ uint32(c)) * prime32
for len(s) >= 8 {
h = (h ^ uint32(s[0])) * prime32
h = (h ^ uint32(s[1])) * prime32
h = (h ^ uint32(s[2])) * prime32
h = (h ^ uint32(s[3])) * prime32
h = (h ^ uint32(s[4])) * prime32
h = (h ^ uint32(s[5])) * prime32
h = (h ^ uint32(s[6])) * prime32
h = (h ^ uint32(s[7])) * prime32
s = s[8:]
}
if len(s) >= 4 {
h = (h ^ uint32(s[0])) * prime32
h = (h ^ uint32(s[1])) * prime32
h = (h ^ uint32(s[2])) * prime32
h = (h ^ uint32(s[3])) * prime32
s = s[4:]
}
if len(s) >= 2 {
h = (h ^ uint32(s[0])) * prime32
h = (h ^ uint32(s[1])) * prime32
s = s[2:]
}
if len(s) > 0 {
h = (h ^ uint32(s[0])) * prime32
}
return h
}
// AddBytes32 adds the hash of b to the precomputed hash value h.
func AddBytes32(h uint32, b []byte) uint32 {
for len(b) >= 8 {
h = (h ^ uint32(b[0])) * prime32
h = (h ^ uint32(b[1])) * prime32
h = (h ^ uint32(b[2])) * prime32
h = (h ^ uint32(b[3])) * prime32
h = (h ^ uint32(b[4])) * prime32
h = (h ^ uint32(b[5])) * prime32
h = (h ^ uint32(b[6])) * prime32
h = (h ^ uint32(b[7])) * prime32
b = b[8:]
}
if len(b) >= 4 {
h = (h ^ uint32(b[0])) * prime32
h = (h ^ uint32(b[1])) * prime32
h = (h ^ uint32(b[2])) * prime32
h = (h ^ uint32(b[3])) * prime32
b = b[4:]
}
if len(b) >= 2 {
h = (h ^ uint32(b[0])) * prime32
h = (h ^ uint32(b[1])) * prime32
b = b[2:]
}
if len(b) > 0 {
h = (h ^ uint32(b[0])) * prime32
}
return h

@ -509,7 +509,7 @@ ccflags="$@"
$2 ~ /^CAP_/ ||
$2 ~ /^ALG_/ ||
$2 ~ /^FS_(POLICY_FLAGS|KEY_DESC|ENCRYPTION_MODE|[A-Z0-9_]+_KEY_SIZE)/ ||
$2 ~ /^FS_IOC_.*(ENCRYPTION|VERITY|GETFLAGS)/ ||
$2 ~ /^FS_IOC_.*(ENCRYPTION|VERITY|[GS]ETFLAGS)/ ||
$2 ~ /^FS_VERITY_/ ||
$2 ~ /^FSCRYPT_/ ||
$2 ~ /^GRND_/ ||

@ -1950,6 +1950,20 @@ func Vmsplice(fd int, iovs []Iovec, flags int) (int, error) {
return int(n), nil
}
func isGroupMember(gid int) bool {
groups, err := Getgroups()
if err != nil {
return false
}
for _, g := range groups {
if g == gid {
return true
}
}
return false
}
//sys faccessat(dirfd int, path string, mode uint32) (err error)
func Faccessat(dirfd int, path string, mode uint32, flags int) (err error) {
@ -2007,7 +2021,7 @@ func Faccessat(dirfd int, path string, mode uint32, flags int) (err error) {
gid = Getgid()
}
if uint32(gid) == st.Gid {
if uint32(gid) == st.Gid || isGroupMember(gid) {
fmode = (st.Mode >> 3) & 7
} else {
fmode = st.Mode & 7

@ -78,6 +78,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40046602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0xc
F_GETLK64 = 0xc

@ -78,6 +78,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0x5
F_GETLK64 = 0x5

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40046602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0xc
F_GETLK64 = 0xc

@ -80,6 +80,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0x5
F_GETLK64 = 0x5

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80046602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0x21
F_GETLK64 = 0x21

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0xe
F_GETLK64 = 0xe

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0xe
F_GETLK64 = 0xe

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80046602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0x21
F_GETLK64 = 0x21

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0x5
F_GETLK64 = 0xc

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0x5
F_GETLK64 = 0xc

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0x5
F_GETLK64 = 0x5

@ -77,6 +77,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x8010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x400c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x40106614
FS_IOC_SETFLAGS = 0x40086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x800c6613
F_GETLK = 0x5
F_GETLK64 = 0x5

@ -81,6 +81,7 @@ const (
FS_IOC_GET_ENCRYPTION_NONCE = 0x4010661b
FS_IOC_GET_ENCRYPTION_POLICY = 0x800c6615
FS_IOC_GET_ENCRYPTION_PWSALT = 0x80106614
FS_IOC_SETFLAGS = 0x80086602
FS_IOC_SET_ENCRYPTION_POLICY = 0x400c6613
F_GETLK = 0x7
F_GETLK64 = 0x7
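
The four distinct FS_IOC_SETFLAGS values in the hunks above follow from how Linux encodes ioctl request numbers. The sketch below reproduces them, assuming the kernel defines the request as `_IOW('f', 2, long)` and that the architectures split into two bit layouts (write flag 1 at bit 30, or write flag 4 at bit 29). The `iocLayout` type and the `iow` helper are hypothetical names for illustration, not part of golang.org/x/sys.

```go
package main

import "fmt"

// iocLayout captures the per-architecture part of the ioctl encoding.
type iocLayout struct {
	dirShift uint   // bit position of the direction field
	write    uint32 // value of the "write" direction flag
}

const (
	typeShift = 8  // the ioctl type ("magic") byte starts at bit 8
	sizeShift = 16 // the argument size starts at bit 16
)

var (
	// Assumed layout for x86, arm, arm64, riscv64, s390x.
	generic = iocLayout{dirShift: 30, write: 1}
	// Assumed layout for mips, ppc and sparc variants.
	alt = iocLayout{dirShift: 29, write: 4}
)

// iow builds a write-direction ioctl number from its fields.
func iow(l iocLayout, typ byte, nr, size uint32) uint32 {
	return l.write<<l.dirShift | size<<sizeShift | uint32(typ)<<typeShift | nr
}

func main() {
	// size is sizeof(long): 4 on 32-bit targets, 8 on 64-bit targets.
	fmt.Printf("%#x\n", iow(generic, 'f', 2, 4)) // 0x40046602
	fmt.Printf("%#x\n", iow(generic, 'f', 2, 8)) // 0x40086602
	fmt.Printf("%#x\n", iow(alt, 'f', 2, 4))     // 0x80046602
	fmt.Printf("%#x\n", iow(alt, 'f', 2, 8))     // 0x80086602
}
```

This is why the constant differs only in the top nibble and the size byte across the generated per-architecture files.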

@ -125,9 +125,9 @@ type Statfs_t struct {
Owner uint32
Fsid Fsid
Charspare [80]int8
Fstypename [16]int8
Mntfromname [1024]int8
Mntonname [1024]int8
Fstypename [16]byte
Mntfromname [1024]byte
Mntonname [1024]byte
}
type statfs_freebsd11_t struct {
@ -150,9 +150,9 @@ type statfs_freebsd11_t struct {
Owner uint32
Fsid Fsid
Charspare [80]int8
Fstypename [16]int8
Mntfromname [88]int8
Mntonname [88]int8
Fstypename [16]byte
Mntfromname [88]byte
Mntonname [88]byte
}
type Flock_t struct {

@ -132,6 +132,7 @@ github.com/cenkalti/backoff
# github.com/cespare/xxhash v1.1.0
github.com/cespare/xxhash
# github.com/cespare/xxhash/v2 v2.1.1
## explicit
github.com/cespare/xxhash/v2
# github.com/containerd/containerd v1.3.4
github.com/containerd/containerd/errdefs
@ -753,7 +754,8 @@ github.com/samuel/go-zookeeper/zk
github.com/satori/go.uuid
# github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529
github.com/sean-/seed
# github.com/segmentio/fasthash v0.0.0-20180216231524-a72b379d632e
# github.com/segmentio/fasthash v1.0.2
## explicit
github.com/segmentio/fasthash/fnv1a
# github.com/sercand/kuberesolver v2.4.0+incompatible
github.com/sercand/kuberesolver
@ -1028,7 +1030,8 @@ golang.org/x/oauth2/jwt
# golang.org/x/sync v0.0.0-20200317015054-43a5402ce75a
golang.org/x/sync/errgroup
golang.org/x/sync/semaphore
# golang.org/x/sys v0.0.0-20200615200032-f1bc736245b1
# golang.org/x/sys v0.0.0-20200625212154-ddb9806d33ae
## explicit
golang.org/x/sys/cpu
golang.org/x/sys/internal/unsafeheader
golang.org/x/sys/unix
