* promtool: backfill: allow configuring block duration
When backfilling large amounts of data across long periods of time, it
may in certain circumstances be useful to use a longer block duration to
increase the efficiency and speed of the backfilling process. This patch
adds a flag --block-duration-power to allow a user to choose the power N
where the block duration is 2^(N+1)h.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
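A minimal sketch of the power-to-duration mapping described above (for illustration only; the helper name is hypothetical, and the flag was later superseded by `--max-block-duration`, as the code and docs below show):

```go
package main

import (
	"fmt"
	"time"
)

// blockDurationFromPower illustrates the flag's semantics as described in the
// commit message: for a chosen power N, the block duration is 2^(N+1) hours,
// so N=0 keeps the default 2h block.
func blockDurationFromPower(n uint) time.Duration {
	return time.Duration(1<<(n+1)) * time.Hour
}

func main() {
	for n := uint(0); n <= 3; n++ {
		fmt.Printf("N=%d -> %s\n", n, blockDurationFromPower(n))
	}
}
```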
* promtool: use sub-tests in backfill testing
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
* backfill: add messages to tests for clarity
When someone new breaks a test, seeing "expected: false, got: true" is
really not useful. A nice message helps here.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
* backfill: test long block durations
A test that uses a long block duration to write bigger blocks is added.
The check to make sure all blocks are the default duration is removed.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
// TSDB block time ranges are half-open [MinTime, MaxTime), so the last sample
// lives at MaxTime()-1; both ends must fall into the same aligned window.
require.Equal(t, block.MinTime()/expectedBlockDuration, (block.MaxTime()-1)/expectedBlockDuration, "block %d contains data outside of one aligned block duration", i)
importCmd := tsdbCmd.Command("create-blocks-from", "[Experimental] Import samples from input and produce TSDB blocks. Please refer to the storage docs for more details.")
importHumanReadable := importCmd.Flag("human-readable", "Print human readable values.").Short('r').Bool()
importQuiet := importCmd.Flag("quiet", "Do not print created blocks.").Short('q').Bool()
maxBlockDuration := importCmd.Flag("max-block-duration", "Maximum duration created blocks may span. Anything less than 2h is ignored.").Hidden().PlaceHolder("<duration>").Duration()
openMetricsImportCmd := importCmd.Command("openmetrics", "Import samples from OpenMetrics input and produce TSDB blocks. Please refer to the storage docs for more details.")
// TODO(aSquare14): add flag to set default block duration
importFilePath := openMetricsImportCmd.Arg("input file", "OpenMetrics file to read samples from.").Required().String()
importDBPath := openMetricsImportCmd.Arg("output directory", "Output directory for generated blocks.").Default(defaultDBPath).String()
importRulesCmd := importCmd.Command("rules", "Create blocks of data for new recording rules.")
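A hedged sketch of how these parsed flags might then be wired into the OpenMetrics backfill path; the dispatch shape, the `backfillOpenMetrics` name, and its parameter order are assumptions for illustration, not necessarily the real code:

```go
// Assumed dispatch (using gopkg.in/alecthomas/kingpin.v2): after kingpin
// parses the command line, hand the flag values to a backfill entry point.
// backfillOpenMetrics and its signature are illustrative assumptions.
switch kingpin.MustParse(app.Parse(os.Args[1:])) {
case openMetricsImportCmd.FullCommand():
	os.Exit(backfillOpenMetrics(*importFilePath, *importDBPath, *importHumanReadable, *importQuiet, *maxBlockDuration))
}
```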
After the creation of the blocks, move them to the data directory of Prometheus. If there is an overlap with the existing blocks in Prometheus, the flag `--storage.tsdb.allow-overlapping-blocks` needs to be set. Note that any backfilled data is subject to the retention configured for your Prometheus server (by time or size).
#### Longer Block Durations
By default, promtool will use the default block duration (2h) for the blocks; this behavior is the most generally applicable and correct. However, when backfilling data over a long range of time, it may be advantageous to use a larger block duration to backfill faster and prevent additional compactions by TSDB later.
The `--max-block-duration` flag allows the user to configure a maximum duration of blocks. The backfilling tool will pick a suitable block duration no larger than this.
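As a sketch of how such a selection could work (assuming the candidate durations come from TSDB's exponential block ranges and that anything at or below the default 2h simply keeps the default; the function name is hypothetical):

```go
// pickBlockDuration returns the largest candidate block duration (in ms)
// that does not exceed maxBlockDuration. Assumes
// import "github.com/prometheus/prometheus/tsdb".
func pickBlockDuration(maxBlockDuration int64) int64 {
	blockDuration := tsdb.DefaultBlockDuration
	if maxBlockDuration > tsdb.DefaultBlockDuration {
		// Candidate durations grow exponentially from the 2h default.
		ranges := tsdb.ExponentialBlockRanges(tsdb.DefaultBlockDuration, 10, 3)
		idx := len(ranges) - 1 // fall back to the largest range for very large maxima
		for i, v := range ranges {
			if v > maxBlockDuration {
				idx = i - 1
				break
			}
		}
		blockDuration = ranges[idx]
	}
	return blockDuration
}
```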
While larger blocks may improve the performance of backfilling large datasets, drawbacks exist as well. Time-based retention policies must keep the entire block around if even one sample of the (potentially large) block is still within the retention policy. Conversely, size-based retention policies will remove the entire block even if the TSDB only goes over the size limit in a minor way.
Therefore, choosing a larger block duration and backfilling with fewer, larger blocks must be done with care and is not recommended for production instances.