Promtail: Add compressed files support (#6708)

**What this PR does / why we need it**:
Adds the ability to read compressed files to Promtail. It works by:
1. Inferring which compression format to use based on the file extension
2. Decompressing the file with the native Go `compress` packages
3. Iterating over the decompressed lines and sending them to Loki

Its usage is the same as our current file tailing.
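
For reference, here is a minimal, self-contained sketch of that flow (extension inference, stdlib decompression, line iteration). It is illustrative only; the actual implementation is `decompresser.go` in this PR, and error handling plus the Loki push path are reduced to a `fmt.Println`.

```go
// Simplified sketch: pick a decompression reader based on the file
// extension, then iterate over the decompressed lines.
package main

import (
	"bufio"
	"compress/bzip2"
	"compress/gzip"
	"compress/zlib"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// openCompressed infers the format from the extension and returns a reader
// over the decompressed bytes, plus the underlying file so it can be closed.
func openCompressed(path string) (io.Reader, *os.File, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, nil, err
	}
	ext := filepath.Ext(path)
	switch {
	case strings.Contains(ext, "gz"): // .gz, .tar.gz
		r, err := gzip.NewReader(f)
		return r, f, err
	case ext == ".z":
		r, err := zlib.NewReader(f)
		return r, f, err
	case ext == ".bz2":
		return bzip2.NewReader(f), f, nil
	default:
		f.Close()
		return nil, nil, fmt.Errorf("unsupported extension %q", ext)
	}
}

func main() {
	r, f, err := openCompressed(os.Args[1])
	if err != nil {
		panic(err)
	}
	defer f.Close()

	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		// In Promtail, each line would be sent to Loki here.
		fmt.Println(scanner.Text())
	}
}
```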

**Which issue(s) this PR fixes**:
Fixes #5956 

Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Files changed:
1. CHANGELOG.md (4 lines changed)
2. clients/pkg/promtail/targets/file/decompresser.go (299 lines changed)
3. clients/pkg/promtail/targets/file/decompresser_test.go (178 lines changed)
4. clients/pkg/promtail/targets/file/filetarget.go (74 lines changed)
5. clients/pkg/promtail/targets/file/filetarget_test.go (26 lines changed)
6. clients/pkg/promtail/targets/file/reader.go (9 lines changed)
7. clients/pkg/promtail/targets/file/tailer.go (14 lines changed)
8. clients/pkg/promtail/targets/file/test_fixtures/long-access.gz (binary)
9. clients/pkg/promtail/targets/file/test_fixtures/long-access.tar.gz (binary)
10. clients/pkg/promtail/targets/file/test_fixtures/onelinelog.log (1 line changed)
11. clients/pkg/promtail/targets/file/test_fixtures/onelinelog.log.bz2 (binary)
12. clients/pkg/promtail/targets/file/test_fixtures/onelinelog.log.gz (binary)
13. clients/pkg/promtail/targets/file/test_fixtures/onelinelog.tar.gz (binary)
14. clients/pkg/promtail/targets/file/test_fixtures/short-access.log (2000 lines changed)
15. clients/pkg/promtail/targets/file/test_fixtures/short-access.tar.gz (binary)
16. docs/sources/clients/promtail/_index.md (40 lines changed)

CHANGELOG.md
@@ -38,8 +38,9 @@
#### Promtail
##### Enhancements
* [6708](https://github.com/grafana/loki/pull/6708) **DylanGuedes**: Add compressed files support to Promtail.
* [5977](https://github.com/grafana/loki/pull/5977) **juissi-t** lambda-promtail: Add support for Kinesis data stream events
* [6395](https://github.com/grafana/loki/pull/6395) **DylanGuedes**: Add encoding support
* [6828](https://github.com/grafana/loki/pull/6828) **alexandre1984rj** Add the BotScore and BotScoreSrc fields once the Cloudflare API returns those two fields on the list of all available log fields.
* [6656](https://github.com/grafana/loki/pull/6656) **carlospeon**: Allow promtail to add matches to the journal reader
@ -118,6 +119,7 @@ Here is the list with the changes that were produced since the previous release.
* [6102](https://github.com/grafana/loki/pull/6102) **timchenko-a**: Add multi-tenancy support to lambda-promtail.
* [6099](https://github.com/grafana/loki/pull/6099) **cstyan**: Drop lines with malformed JSON in Promtail JSON pipeline stage.
* [5715](https://github.com/grafana/loki/pull/5715) **chaudum**: Allow promtail to push RFC5424 formatted syslog messages
* [6395](https://github.com/grafana/loki/pull/6395) **DylanGuedes**: Add encoding support
##### Fixes
* [6034](https://github.com/grafana/loki/pull/6034) **DylanGuedes**: Promtail: Fix symlink tailing behavior.

clients/pkg/promtail/targets/file/decompresser.go (new file)
@@ -0,0 +1,299 @@
package file
import (
"bufio"
"compress/bzip2"
"compress/gzip"
"compress/zlib"
"fmt"
"io"
"os"
"path/filepath"
"strings"
"sync"
"time"
"unsafe"
"github.com/go-kit/log"
"github.com/go-kit/log/level"
"github.com/pkg/errors"
"github.com/prometheus/common/model"
"go.uber.org/atomic"
"golang.org/x/text/encoding"
"golang.org/x/text/encoding/ianaindex"
"golang.org/x/text/transform"
"github.com/grafana/loki/pkg/logproto"
"github.com/grafana/loki/clients/pkg/promtail/api"
"github.com/grafana/loki/clients/pkg/promtail/positions"
)
func supportedCompressedFormats() map[string]struct{} {
return map[string]struct{}{
".gz": {},
".tar.gz": {},
".z": {},
".bz2": {},
// TODO: add support for .zip extension.
}
}
type decompressor struct {
metrics *Metrics
logger log.Logger
handler api.EntryHandler
positions positions.Positions
path string
posAndSizeMtx sync.Mutex
stopOnce sync.Once
running *atomic.Bool
posquit chan struct{}
posdone chan struct{}
done chan struct{}
decoder *encoding.Decoder
position int64
size int64
}
func newDecompressor(metrics *Metrics, logger log.Logger, handler api.EntryHandler, positions positions.Positions, path string, encodingFormat string) (*decompressor, error) {
logger = log.With(logger, "component", "decompressor")
pos, err := positions.Get(path)
if err != nil {
return nil, errors.Wrap(err, "get positions")
}
var decoder *encoding.Decoder
if encodingFormat != "" {
level.Info(logger).Log("msg", "decompressor will decode messages", "from", encodingFormat, "to", "UTF8")
encoder, err := ianaindex.IANA.Encoding(encodingFormat)
if err != nil {
return nil, errors.Wrap(err, "error doing IANA encoding")
}
decoder = encoder.NewDecoder()
}
decompressor := &decompressor{
metrics: metrics,
logger: logger,
handler: api.AddLabelsMiddleware(model.LabelSet{FilenameLabel: model.LabelValue(path)}).Wrap(handler),
positions: positions,
path: path,
running: atomic.NewBool(false),
posquit: make(chan struct{}),
posdone: make(chan struct{}),
done: make(chan struct{}),
position: pos,
decoder: decoder,
}
go decompressor.readLines()
go decompressor.updatePosition()
metrics.filesActive.Add(1.)
return decompressor, nil
}
// mountReader instantiates a reader ready to be used by the decompressor.
//
// The selected reader implementation is based on the extension of the given file name.
// It'll error if the extension isn't supported.
func mountReader(f *os.File, logger log.Logger) (reader io.Reader, err error) {
ext := filepath.Ext(f.Name())
var decompressLib string
if strings.Contains(ext, "gz") { // .gz, .tar.gz
decompressLib = "compress/gzip"
reader, err = gzip.NewReader(f)
} else if ext == ".z" {
decompressLib = "compress/zlib"
reader, err = zlib.NewReader(f)
} else if ext == ".bz2" {
decompressLib = "bzip2"
reader = bzip2.NewReader(f)
}
// TODO: add support for .zip extension.
level.Debug(logger).Log("msg", fmt.Sprintf("using %q to decompress file %q", decompressLib, f.Name()))
if reader != nil {
return reader, nil
}
if err != nil && err != io.EOF {
return nil, err
}
supportedExtsList := strings.Builder{}
for ext := range supportedCompressedFormats() {
supportedExtsList.WriteString(ext)
}
return nil, fmt.Errorf("file %q has unsupported extension, it has to be one of %q", f.Name(), supportedExtsList.String())
}
func (t *decompressor) updatePosition() {
positionSyncPeriod := t.positions.SyncPeriod()
positionWait := time.NewTicker(positionSyncPeriod)
defer func() {
positionWait.Stop()
level.Info(t.logger).Log("msg", "position timer: exited", "path", t.path)
close(t.posdone)
}()
for {
select {
case <-positionWait.C:
if err := t.MarkPositionAndSize(); err != nil {
level.Error(t.logger).Log("msg", "position timer: error getting position and/or size, stopping decompressor", "path", t.path, "error", err)
return
}
case <-t.posquit:
return
}
}
}
// readLines reads all existing lines of the given compressed file.
//
// It first decompresses the file as a whole using a reader and then iterates
// over its chunks, separated by '\n'.
// During each iteration, the parsed and decoded log line is then sent to the API with the current timestamp.
func (t *decompressor) readLines() {
level.Info(t.logger).Log("msg", "read lines routine: started", "path", t.path)
t.running.Store(true)
defer func() {
t.cleanupMetrics()
level.Info(t.logger).Log("msg", "read lines routine finished", "path", t.path)
close(t.done)
}()
entries := t.handler.Chan()
f, err := os.Open(t.path)
if err != nil {
level.Error(t.logger).Log("msg", "error reading file", "path", t.path, "error", err)
return
}
defer f.Close()
r, err := mountReader(f, t.logger)
if err != nil {
level.Error(t.logger).Log("msg", "error mounting new reader", "err", err)
return
}
level.Info(t.logger).Log("msg", "successfully mounted reader", "path", t.path, "ext", filepath.Ext(t.path))
maxLoglineSize := 4096
buffer := make([]byte, maxLoglineSize)
scanner := bufio.NewScanner(r)
scanner.Buffer(buffer, maxLoglineSize)
for line := 1; ; line++ {
if !scanner.Scan() {
break
}
if scannerErr := scanner.Err(); scannerErr != nil {
if scannerErr != io.EOF {
level.Error(t.logger).Log("msg", "error scanning", "err", scannerErr)
}
break
}
if line <= int(t.position) {
// skip already seen lines.
continue
}
text := scanner.Text()
var finalText string
if t.decoder != nil {
var err error
finalText, err = t.convertToUTF8(text)
if err != nil {
level.Debug(t.logger).Log("msg", "failed to convert encoding", "error", err)
t.metrics.encodingFailures.WithLabelValues(t.path).Inc()
finalText = fmt.Sprintf("the requested encoding conversion for this line failed in Promtail/Grafana Agent: %s", err.Error())
}
} else {
finalText = text
}
t.metrics.readLines.WithLabelValues(t.path).Inc()
entries <- api.Entry{
Labels: model.LabelSet{},
Entry: logproto.Entry{
Timestamp: time.Now(),
Line: finalText,
},
}
t.size = int64(unsafe.Sizeof(finalText))
t.position++
}
}
func (t *decompressor) MarkPositionAndSize() error {
// Lock this update as there are 2 timers calling this routine, the sync in filetarget and the positions sync in this file.
t.posAndSizeMtx.Lock()
defer t.posAndSizeMtx.Unlock()
t.metrics.totalBytes.WithLabelValues(t.path).Set(float64(t.size))
t.metrics.readBytes.WithLabelValues(t.path).Set(float64(t.position))
t.positions.Put(t.path, t.position)
return nil
}
func (t *decompressor) Stop() {
// stop can be called by two separate threads in filetarget, to avoid a panic closing channels more than once
// we wrap the stop in a sync.Once.
t.stopOnce.Do(func() {
// Shut down the position marker thread
close(t.posquit)
<-t.posdone
// Save the current position before shutting down tailer
if err := t.MarkPositionAndSize(); err != nil {
level.Error(t.logger).Log("msg", "error marking file position when stopping decompressor", "path", t.path, "error", err)
}
// Wait for readLines() to consume all the remaining messages and exit when the channel is closed
<-t.done
level.Info(t.logger).Log("msg", "stopped decompressor", "path", t.path)
t.handler.Stop()
})
}
func (t *decompressor) IsRunning() bool {
return t.running.Load()
}
func (t *decompressor) convertToUTF8(text string) (string, error) {
res, _, err := transform.String(t.decoder, text)
if err != nil {
return "", errors.Wrap(err, "error decoding text")
}
return res, nil
}
// cleanupMetrics removes all metrics exported by this tailer
func (t *decompressor) cleanupMetrics() {
// When we stop tailing the file, also un-export metrics related to the file
t.metrics.filesActive.Add(-1.)
t.metrics.readLines.DeleteLabelValues(t.path)
t.metrics.readBytes.DeleteLabelValues(t.path)
t.metrics.totalBytes.DeleteLabelValues(t.path)
}
func (t *decompressor) Path() string {
return t.path
}

clients/pkg/promtail/targets/file/decompresser_test.go (new file)
@@ -0,0 +1,178 @@
package file
import (
"os"
"sync"
"testing"
"time"
"github.com/go-kit/log"
"github.com/grafana/loki/clients/pkg/promtail/api"
"github.com/grafana/loki/clients/pkg/promtail/client/fake"
"github.com/prometheus/client_golang/prometheus"
"github.com/stretchr/testify/require"
"go.uber.org/atomic"
)
type noopClient struct {
noopChan chan api.Entry
wg sync.WaitGroup
once sync.Once
}
func (n *noopClient) Chan() chan<- api.Entry {
return n.noopChan
}
func (n *noopClient) Stop() {
n.once.Do(func() { close(n.noopChan) })
}
func newNoopClient() *noopClient {
c := &noopClient{noopChan: make(chan api.Entry)}
c.wg.Add(1)
go func() {
defer c.wg.Done()
for range c.noopChan {
// noop
}
}()
return c
}
func BenchmarkReadlines(b *testing.B) {
entryHandler := newNoopClient()
scenarios := []struct {
name string
file string
}{
{
name: "2000 lines of log .tar.gz compressed",
file: "test_fixtures/short-access.tar.gz",
},
{
name: "100000 lines of log .gz compressed",
file: "test_fixtures/long-access.gz",
},
}
for _, tc := range scenarios {
b.Run(tc.name, func(b *testing.B) {
decBase := &decompressor{
logger: log.NewNopLogger(),
running: atomic.NewBool(false),
handler: entryHandler,
path: tc.file,
}
for i := 0; i < b.N; i++ {
newDec := decBase
newDec.metrics = NewMetrics(prometheus.NewRegistry())
newDec.done = make(chan struct{})
newDec.readLines()
<-newDec.done
}
})
}
}
func TestGigantiqueGunzipFile(t *testing.T) {
file := "test_fixtures/long-access.gz"
handler := fake.New(func() {})
d := &decompressor{
logger: log.NewNopLogger(),
running: atomic.NewBool(false),
handler: handler,
path: file,
done: make(chan struct{}),
metrics: NewMetrics(prometheus.NewRegistry()),
}
d.readLines()
<-d.done
time.Sleep(time.Millisecond * 200)
entries := handler.Received()
require.Equal(t, 100000, len(entries))
}
// TestOnelineFiles tests the supported formats for log files that only contain 1 line.
//
// Based on our experience, this is the scenario with the most edge cases.
func TestOnelineFiles(t *testing.T) {
fileContent, err := os.ReadFile("test_fixtures/onelinelog.log")
require.NoError(t, err)
t.Run("gunzip file", func(t *testing.T) {
file := "test_fixtures/onelinelog.log.gz"
handler := fake.New(func() {})
d := &decompressor{
logger: log.NewNopLogger(),
running: atomic.NewBool(false),
handler: handler,
path: file,
done: make(chan struct{}),
metrics: NewMetrics(prometheus.NewRegistry()),
}
d.readLines()
<-d.done
time.Sleep(time.Millisecond * 200)
entries := handler.Received()
require.Equal(t, 1, len(entries))
require.Equal(t, string(fileContent), entries[0].Line)
})
t.Run("bzip2 file", func(t *testing.T) {
file := "test_fixtures/onelinelog.log.bz2"
handler := fake.New(func() {})
d := &decompressor{
logger: log.NewNopLogger(),
running: atomic.NewBool(false),
handler: handler,
path: file,
done: make(chan struct{}),
metrics: NewMetrics(prometheus.NewRegistry()),
}
d.readLines()
<-d.done
time.Sleep(time.Millisecond * 200)
entries := handler.Received()
require.Equal(t, 1, len(entries))
require.Equal(t, string(fileContent), entries[0].Line)
})
t.Run("tar.gz file", func(t *testing.T) {
file := "test_fixtures/onelinelog.tar.gz"
handler := fake.New(func() {})
d := &decompressor{
logger: log.NewNopLogger(),
running: atomic.NewBool(false),
handler: handler,
path: file,
done: make(chan struct{}),
metrics: NewMetrics(prometheus.NewRegistry()),
}
d.readLines()
<-d.done
time.Sleep(time.Millisecond * 200)
entries := handler.Received()
require.Equal(t, 1, len(entries))
firstEntry := entries[0]
require.Contains(t, firstEntry.Line, "onelinelog.log") // contains .tar.gz headers
require.Contains(t, firstEntry.Line, `5.202.214.160 - - [26/Jan/2019:19:45:25 +0330] "GET / HTTP/1.1" 200 30975 "https://www.zanbil.ir/" "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0" "-"`)
})
}

clients/pkg/promtail/targets/file/filetarget.go
@@ -73,7 +73,7 @@ type FileTarget struct {
quit chan struct{}
done chan struct{}
tails map[string]*tailer
readers map[string]Reader
targetConfig *Config
@ -106,7 +106,7 @@ func NewFileTarget(
positions: positions,
quit: make(chan struct{}),
done: make(chan struct{}),
tails: map[string]*tailer{},
readers: map[string]Reader{},
targetConfig: targetConfig,
fileEventWatcher: fileEventWatcher,
targetEventHandler: targetEventHandler,
@ -119,7 +119,7 @@ func NewFileTarget(
// Ready if at least one file is being tailed
func (t *FileTarget) Ready() bool {
return len(t.tails) > 0
return len(t.readers) > 0
}
// Stop the target.
@ -147,7 +147,7 @@ func (t *FileTarget) Labels() model.LabelSet {
// Details implements a Target
func (t *FileTarget) Details() interface{} {
files := map[string]int64{}
for fileName := range t.tails {
for fileName := range t.readers {
files[fileName], _ = t.positions.Get(fileName)
}
return files
@ -155,8 +155,8 @@ func (t *FileTarget) Details() interface{} {
func (t *FileTarget) run() {
defer func() {
for _, v := range t.tails {
v.stop()
for _, v := range t.readers {
v.Stop()
}
level.Info(t.logger).Log("msg", "filetarget: watcher closed, tailer stopped, positions saved", "path", t.path)
close(t.done)
@ -268,7 +268,7 @@ func (t *FileTarget) sync() error {
t.startTailing(matches)
// Stop tailing any files which no longer exist
toStopTailing := toStopTailing(matches, t.tails)
toStopTailing := toStopTailing(matches, t.readers)
t.stopTailingAndRemovePosition(toStopTailing)
return nil
@ -302,7 +302,7 @@ func (t *FileTarget) stopWatching(dirs map[string]struct{}) {
func (t *FileTarget) startTailing(ps []string) {
for _, p := range ps {
if _, ok := t.tails[p]; ok {
if _, ok := t.readers[p]; ok {
continue
}
@ -319,24 +319,48 @@ func (t *FileTarget) startTailing(ps []string) {
continue
}
level.Debug(t.logger).Log("msg", "tailing new file", "filename", p)
tailer, err := newTailer(t.metrics, t.logger, t.handler, t.positions, p, t.encoding)
if err != nil {
level.Error(t.logger).Log("msg", "failed to start tailer", "error", err, "filename", p)
continue
var reader Reader
if isCompressed(p) {
level.Debug(t.logger).Log("msg", "reading from compressed file", "filename", p)
decompressor, err := newDecompressor(t.metrics, t.logger, t.handler, t.positions, p, t.encoding)
if err != nil {
level.Error(t.logger).Log("msg", "failed to start decompressor", "error", err, "filename", p)
continue
}
reader = decompressor
} else {
level.Debug(t.logger).Log("msg", "tailing new file", "filename", p)
tailer, err := newTailer(t.metrics, t.logger, t.handler, t.positions, p, t.encoding)
if err != nil {
level.Error(t.logger).Log("msg", "failed to start tailer", "error", err, "filename", p)
continue
}
reader = tailer
}
t.tails[p] = tailer
t.readers[p] = reader
}
}
func isCompressed(p string) bool {
ext := filepath.Ext(p)
for format := range supportedCompressedFormats() {
if ext == format {
return true
}
}
return false
}
// stopTailingAndRemovePosition will stop the tailer and remove the positions entry.
// Call this when a file no longer exists and you want to remove all traces of it.
func (t *FileTarget) stopTailingAndRemovePosition(ps []string) {
for _, p := range ps {
if tailer, ok := t.tails[p]; ok {
tailer.stop()
t.positions.Remove(tailer.path)
delete(t.tails, p)
if reader, ok := t.readers[p]; ok {
reader.Stop()
t.positions.Remove(reader.Path())
delete(t.readers, p)
}
if h, ok := t.handler.(api.InstrumentedEntryHandler); ok {
h.UnregisterLatencyMetric(prometheus.Labels{client.LatencyLabel: p})
@ -347,18 +371,18 @@ func (t *FileTarget) stopTailingAndRemovePosition(ps []string) {
// pruneStoppedTailers removes any tailers which have stopped running from
// the list of active tailers. This allows them to be restarted if there were errors.
func (t *FileTarget) pruneStoppedTailers() {
toRemove := make([]string, 0, len(t.tails))
for k, t := range t.tails {
if !t.isRunning() {
toRemove := make([]string, 0, len(t.readers))
for k, t := range t.readers {
if !t.IsRunning() {
toRemove = append(toRemove, k)
}
}
for _, tr := range toRemove {
delete(t.tails, tr)
delete(t.readers, tr)
}
}
func toStopTailing(nt []string, et map[string]*tailer) []string {
func toStopTailing(nt []string, et map[string]Reader) []string {
// Make a set of all existing tails
existingTails := make(map[string]struct{}, len(et))
for file := range et {
@ -383,8 +407,8 @@ func toStopTailing(nt []string, et map[string]*tailer) []string {
func (t *FileTarget) reportSize(ms []string) {
for _, m := range ms {
// Ask the tailer to update the size if a tailer exists, this keeps position and size metrics in sync
if tailer, ok := t.tails[m]; ok {
err := tailer.markPositionAndSize()
if reader, ok := t.readers[m]; ok {
err := reader.MarkPositionAndSize()
if err != nil {
level.Warn(t.logger).Log("msg", "failed to get file size from tailer, ", "file", m, "error", err)
return

clients/pkg/promtail/targets/file/filetarget_test.go
@@ -76,7 +76,7 @@ func TestFileTargetSync(t *testing.T) {
if len(target.watches) != 0 {
t.Fatal("Expected watches to be 0 at this point in the test...")
}
if len(target.tails) != 0 {
if len(target.readers) != 0 {
t.Fatal("Expected tails to be 0 at this point in the test...")
}
@ -90,7 +90,7 @@ func TestFileTargetSync(t *testing.T) {
if len(target.watches) != 0 {
t.Fatal("Expected watches to be 0 at this point in the test...")
}
if len(target.tails) != 0 {
if len(target.readers) != 0 {
t.Fatal("Expected tails to be 0 at this point in the test...")
}
@ -106,7 +106,7 @@ func TestFileTargetSync(t *testing.T) {
assert.Equal(t, 1, len(target.watches),
"Expected watches to be 1 at this point in the test...",
)
assert.Equal(t, 1, len(target.tails),
assert.Equal(t, 1, len(target.readers),
"Expected tails to be 1 at this point in the test...",
)
require.Eventually(t, func() bool {
@ -123,7 +123,7 @@ func TestFileTargetSync(t *testing.T) {
assert.Equal(t, 1, len(target.watches),
"Expected watches to be 1 at this point in the test...",
)
assert.Equal(t, 2, len(target.tails),
assert.Equal(t, 2, len(target.readers),
"Expected tails to be 2 at this point in the test...",
)
@ -137,7 +137,7 @@ func TestFileTargetSync(t *testing.T) {
assert.Equal(t, 1, len(target.watches),
"Expected watches to be 1 at this point in the test...",
)
assert.Equal(t, 1, len(target.tails),
assert.Equal(t, 1, len(target.readers),
"Expected tails to be 1 at this point in the test...",
)
@ -151,7 +151,7 @@ func TestFileTargetSync(t *testing.T) {
assert.Equal(t, 0, len(target.watches),
"Expected watches to be 0 at this point in the test...",
)
assert.Equal(t, 0, len(target.tails),
assert.Equal(t, 0, len(target.readers),
"Expected tails to be 0 at this point in the test...",
)
require.Eventually(t, func() bool {
@ -228,7 +228,7 @@ func TestFileTargetPathExclusion(t *testing.T) {
if len(target.watches) != 0 {
t.Fatal("Expected watches to be 0 at this point in the test...")
}
if len(target.tails) != 0 {
if len(target.readers) != 0 {
t.Fatal("Expected tails to be 0 at this point in the test...")
}
@ -246,7 +246,7 @@ func TestFileTargetPathExclusion(t *testing.T) {
if len(target.watches) != 0 {
t.Fatal("Expected watches to be 0 at this point in the test...")
}
if len(target.tails) != 0 {
if len(target.readers) != 0 {
t.Fatal("Expected tails to be 0 at this point in the test...")
}
@ -264,7 +264,7 @@ func TestFileTargetPathExclusion(t *testing.T) {
assert.Equal(t, 2, len(target.watches),
"Expected watches to be 2 at this point in the test...",
)
assert.Equal(t, 3, len(target.tails),
assert.Equal(t, 3, len(target.readers),
"Expected tails to be 3 at this point in the test...",
)
require.Eventually(t, func() bool {
@ -285,7 +285,7 @@ func TestFileTargetPathExclusion(t *testing.T) {
assert.Equal(t, 1, len(target.watches),
"Expected watches to be 1 at this point in the test...",
)
assert.Equal(t, 1, len(target.tails),
assert.Equal(t, 1, len(target.readers),
"Expected tails to be 1 at this point in the test...",
)
require.Eventually(t, func() bool {
@ -360,13 +360,13 @@ func TestHandleFileCreationEvent(t *testing.T) {
Op: fsnotify.Create,
}
require.Eventually(t, func() bool {
return len(target.tails) == 1
return len(target.readers) == 1
}, time.Second*10, time.Millisecond*1, "Expected tails to be 1 at this point in the test...")
}
func TestToStopTailing(t *testing.T) {
nt := []string{"file1", "file2", "file3", "file4", "file5", "file6", "file7", "file11", "file12", "file15"}
et := make(map[string]*tailer, 15)
et := make(map[string]Reader, 15)
for i := 1; i <= 15; i++ {
et[fmt.Sprintf("file%d", i)] = nil
}
@ -386,7 +386,7 @@ func TestToStopTailing(t *testing.T) {
func BenchmarkToStopTailing(b *testing.B) {
nt := []string{"file1", "file2", "file3", "file4", "file5", "file6", "file7", "file11", "file12", "file15"}
et := make(map[string]*tailer, 15)
et := make(map[string]Reader, 15)
for i := 1; i <= 15; i++ {
et[fmt.Sprintf("file%d", i)] = nil
}

clients/pkg/promtail/targets/file/reader.go (new file)
@@ -0,0 +1,9 @@
package file
// Reader contains the set of expected calls the file target manager relies on.
type Reader interface {
Stop()
IsRunning() bool
Path() string
MarkPositionAndSize() error
}

clients/pkg/promtail/targets/file/tailer.go
@@ -120,7 +120,7 @@ func (t *tailer) updatePosition() {
for {
select {
case <-positionWait.C:
err := t.markPositionAndSize()
err := t.MarkPositionAndSize()
if err != nil {
level.Error(t.logger).Log("msg", "position timer: error getting tail position and/or size, stopping tailer", "path", t.path, "error", err)
err := t.tail.Stop()
@ -190,7 +190,7 @@ func (t *tailer) readLines() {
}
}
func (t *tailer) markPositionAndSize() error {
func (t *tailer) MarkPositionAndSize() error {
// Lock this update as there are 2 timers calling this routine, the sync in filetarget and the positions sync in this file.
t.posAndSizeMtx.Lock()
defer t.posAndSizeMtx.Unlock()
@ -216,7 +216,7 @@ func (t *tailer) markPositionAndSize() error {
return nil
}
func (t *tailer) stop() {
func (t *tailer) Stop() {
// stop can be called by two separate threads in filetarget, to avoid a panic closing channels more than once
// we wrap the stop in a sync.Once.
t.stopOnce.Do(func() {
@ -225,7 +225,7 @@ func (t *tailer) stop() {
<-t.posdone
// Save the current position before shutting down tailer
err := t.markPositionAndSize()
err := t.MarkPositionAndSize()
if err != nil {
level.Error(t.logger).Log("msg", "error marking file position when stopping tailer", "path", t.path, "error", err)
}
@ -242,7 +242,7 @@ func (t *tailer) stop() {
})
}
func (t *tailer) isRunning() bool {
func (t *tailer) IsRunning() bool {
return t.running.Load()
}
@ -263,3 +263,7 @@ func (t *tailer) cleanupMetrics() {
t.metrics.readBytes.DeleteLabelValues(t.path)
t.metrics.totalBytes.DeleteLabelValues(t.path)
}
func (t *tailer) Path() string {
return t.path
}

clients/pkg/promtail/targets/file/test_fixtures/onelinelog.log (new file)
@@ -0,0 +1 @@
5.202.214.160 - - [26/Jan/2019:19:45:25 +0330] "GET / HTTP/1.1" 200 30975 "https://www.zanbil.ir/" "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0" "-"

docs/sources/clients/promtail/_index.md
@@ -36,6 +36,46 @@ Just like Prometheus, `promtail` is configured using a `scrape_configs` stanza.
drop, and the final metadata to attach to the log line. Refer to the docs for
[configuring Promtail](configuration/) for more details.
### Support for compressed files
Promtail now has native support for ingesting compressed files through a mechanism that
relies on file extensions. If a discovered file has a recognized compression file
extension, Promtail will **lazily** decompress it and push the
parsed data to Loki. Important details are:
* It relies on the `\n` character to separate the data into different log lines.
* The maximum expected size of a log line within the compressed file is 4096 bytes.
* The data is decompressed in blocks of 4096 bytes, i.e. Promtail first fetches a block of 4096 bytes
from the compressed file and processes it. After processing the block and pushing the data to Loki,
it fetches the next 4096 bytes, and so on.
* It supports the following extensions:
- `.gz`: Data will be decompressed with the native Go gzip package (`compress/gzip`)
- `.z`: Data will be decompressed with the native Go zlib package (`compress/zlib`)
- `.bz2`: Data will be decompressed with the native Go bzip2 package (`compress/bzip2`)
- `.tar.gz`: Data will be decompressed the same way as `.gz` files.
However, because `tar` adds its metadata at the beginning of the
compressed file, **the first parsed line will contain that metadata together with
your log line**. This is illustrated in
`./clients/pkg/promtail/targets/file/decompresser_test.go`.
* The `.zip` extension isn't supported at the moment because it doesn't support some of the interfaces
Promtail requires. We have plans to add support for it in the near future.
* Decompression is quite CPU-intensive and involves a lot of allocations,
especially depending on the size of the file. You can expect the number
of garbage collection runs and the CPU usage to spike, but no memory leaks are
expected.
* Positions are supported. This means that if you interrupt Promtail after it has
parsed and pushed (for example) 45% of your compressed file's data, you can expect Promtail
to resume from the last scraped line and process the remaining 55% (see the sketch after this list).
* Since decompression and pushing can be very fast, depending on the size
of your compressed file Loki may rate-limit your ingestion. In that case you
can configure Promtail's [`limits` stage](https://grafana.com/docs/loki/latest/clients/promtail/stages/limit/) to slow the pace or increase the
[ingestion limits on Loki](https://grafana.com/docs/loki/latest/configuration/#limits_config).
* Log rotation **isn't supported at the moment**, mostly because it would require Promtail to
rely on file inodes instead of file names. If you'd like to see support for it, please create a new
GitHub issue asking for it and explaining your use case.
* If you would like to see support for a compression protocol that isn't listed here, please
create a new GitHub issue asking for it and explaining your use case.
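
For illustration, the position-based resume described above boils down to counting lines and skipping everything at or below the last saved position. Below is a rough Go sketch of that idea; it is not the actual Promtail code, and `lastPosition` and the file name are hypothetical placeholders for values Promtail would read from its positions file and scrape config.

```go
// Sketch: resume reading a decompressed stream from a saved line position.
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"os"
)

func main() {
	// Placeholder: Promtail would load this from its positions file.
	var lastPosition int64 = 4500 // e.g. 45% of a 10,000-line file already pushed

	f, err := os.Open("access.log.gz") // hypothetical compressed file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		panic(err)
	}

	scanner := bufio.NewScanner(gz)
	for line := int64(1); scanner.Scan(); line++ {
		if line <= lastPosition {
			continue // already pushed before the restart
		}
		// Promtail would push this line to Loki and advance the position.
		fmt.Println(scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "scan error:", err)
	}
}
```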
## Loki Push API
Promtail can also be configured to receive logs from another Promtail or any Loki client by exposing the [Loki Push API](../../api#post-lokiapiv1push) with the [loki_push_api](configuration#loki_push_api_config) scrape config.
