Add better observability to queryReadiness (#5946)

* Add two metrics to the IndexGateway.

- Add a new `query_readiness_duration_seconds` metric that reports the
  query readiness duration of a tablemanager/index gateway instance. We
  should use it later to report performance against the ring mode.
- Add a new `usersToBeQueryReadyForTotal` metric that reports the number of
  users involved in the query readiness operation. We should use it
  later to correlate the number of users with the query readiness duration.

* Remove `usersToBeQueryReadyForTotal`.

- For now it always reports all users, so it isn't very helpful as it
  stands.

* Rename metric help text to not mislead people.

* Log queryReadiness duration.

* Fix where log message and duration are triggered.

* Join users list in a single string.

- This is necessary since go-kit doesn't support array types.

* Tweak queryReadiness log messages.

- As suggested by Ed on
  https://github.com/grafana/loki/pull/5972#discussion_r859734129 and
  https://github.com/grafana/loki/pull/5972#discussion_r859736072

* Remove queryReadinessDuration metric.

- It is redundant with a recently added log line.

* noop
pull/6048/head
Dylan Guedes 3 years ago committed by GitHub
parent fc8c4f0592
commit cd02d6a478
  1. 9
      pkg/storage/stores/shipper/downloads/table_manager.go

@@ -7,6 +7,7 @@ import (
	"path/filepath"
	"regexp"
	"strconv"
	"strings"
	"sync"
	"time"
@@ -241,6 +242,11 @@ func (tm *TableManager) cleanupCache() error {
// ensureQueryReadiness compares tables required for being query ready with the tables we already have and downloads the missing ones.
func (tm *TableManager) ensureQueryReadiness(ctx context.Context) error {
	start := time.Now()
	defer func() {
		level.Info(util_log.Logger).Log("msg", "query readiness setup completed", "duration", time.Since(start))
	}()
	activeTableNumber := getActiveTableNumber()
	// find the largest query readiness number
@@ -309,9 +315,12 @@ func (tm *TableManager) ensureQueryReadiness(ctx context.Context) error {
			return err
		}
		perTableStart := time.Now()
		if err := table.EnsureQueryReadiness(ctx, usersToBeQueryReadyFor); err != nil {
			return err
		}
		joinedUsers := strings.Join(usersToBeQueryReadyFor, ",")
		level.Info(util_log.Logger).Log("msg", "index pre-download for query readiness completed", "users", joinedUsers, "duration", time.Since(perTableStart), "table", tableName)
	}
	return nil
