lambda-promtail: Add ability to ingest logs from S3 (#5065)

* Add ability to ingest logs from S3 on lambda-promtail

* fix ci

* fix typo

* bump golang and alpine version

* update changelog

* add s3 permissions on terraform

* use for_each instead of count

* fix typo

* improve function naming

* add documentation and an example of an S3 file path

* refactor logic to identify event type

* add missing iam permission to allow lambda to run inside a vpc

* fix typo

* allow lambda to access only specified s3 buckets

* configure a default log retention policy on log group

* add missing depends_on to make sure iam role is created before lambda function

* update docs

* fix label naming convention

* fix merge conflict

* use new backoff lib and update dependencies

* add option to limit batch size

* cache s3 client

* update docs and terraform

* address some feedback on PR

* fix typo
Authored by Andre Ziviani, committed via GitHub 4 years ago (commit 699fffe9e5, parent d787a0fe28)
Changed files:
1. CHANGELOG.md (1 line changed)
2. docs/sources/clients/lambda-promtail/_index.md (26 lines changed)
3. tools/lambda-promtail/Dockerfile (6 lines changed)
4. tools/lambda-promtail/Makefile (2 lines changed)
5. tools/lambda-promtail/go.mod (19 lines changed)
6. tools/lambda-promtail/go.sum (38 lines changed)
7. tools/lambda-promtail/lambda-promtail/cw.go (53 lines changed)
8. tools/lambda-promtail/lambda-promtail/main.go (105 lines changed)
9. tools/lambda-promtail/lambda-promtail/promtail.go (187 lines changed)
10. tools/lambda-promtail/lambda-promtail/s3.go (142 lines changed)
11. tools/lambda-promtail/main.tf (59 lines changed)
12. tools/lambda-promtail/variables.tf (14 lines changed)

@ -20,6 +20,7 @@
* [5081](https://github.com/grafana/loki/pull/5081) **SasSwart**: Add the option to configure memory ballast for Loki
* [5085](https://github.com/grafana/loki/pull/5085) **aknuds1**: Upgrade Cortex to [e0807c4eb487](https://github.com/cortexproject/cortex/compare/4e9fc3a2b5ab..e0807c4eb487) and Prometheus to [692a54649ed7](https://github.com/prometheus/prometheus/compare/2a3d62ac8456..692a54649ed7)
* [5057](https://github.com/grafana/loki/pull/5057) **cstyan**: Add a metric to Azure Blob Storage client to track total egress bytes
* [5065](https://github.com/grafana/loki/pull/5065) **AndreZiviani**: lambda-promtail: Add ability to ingest logs from S3
* [4950](https://github.com/grafana/loki/pull/4950) **DylanGuedes**: Implement common instance addr/net interface
* [4949](https://github.com/grafana/loki/pull/4949) **ssncferreira**: Add query `queueTime` metric to statistics and metrics.go
* [4938](https://github.com/grafana/loki/pull/4938) **DylanGuedes**: Implement ring status page for the distributor

@ -4,7 +4,7 @@ weight: 20
---
# Lambda Promtail
Grafana Loki includes [Terraform](https://www.terraform.io/) and [CloudFormation](https://aws.amazon.com/cloudformation/) for shipping Cloudwatch logs to Loki via a [lambda function](https://aws.amazon.com/lambda/). This is done via [lambda-promtail](https://github.com/grafana/loki/tree/master/tools/lambda-promtail) which processes cloudwatch events and propagates them to Loki (or a Promtail instance) via the push-api [scrape config](../promtail/configuration#loki_push_api_config).
Grafana Loki includes [Terraform](https://www.terraform.io/) and [CloudFormation](https://aws.amazon.com/cloudformation/) for shipping Cloudwatch and loadbalancer logs to Loki via a [lambda function](https://aws.amazon.com/lambda/). This is done via [lambda-promtail](https://github.com/grafana/loki/tree/master/tools/lambda-promtail) which processes cloudwatch events and propagates them to Loki (or a Promtail instance) via the push-api [scrape config](../promtail/configuration#loki_push_api_config).
## Deployment
@ -15,9 +15,9 @@ For both deployment types there are a few values that must be defined:
- basic auth username/password if the write address is a Loki endpoint and has authentication
- the lambda-promtail image, full ECR repo path:tag
The Terraform deployment also takes in an array of log group names, and can take arrays for VPC subnets and security groups.
The Terraform deployment also takes in an array of log group and bucket names, and can take arrays for VPC subnets and security groups.
There's also a flag to keep the log stream label when propagating the logs, which defaults to false. This can be helpful when the cardinality is too large, such as the case of a log stream per lambda invocation.
There's also a flag to keep the log stream label when propagating the logs from Cloudwatch, which defaults to false. This can be helpful when the cardinality is too large, such as the case of a log stream per lambda invocation.
In an effort to make deployment of lambda-promtail as simple as possible, we've created a [public ECR repo](https://gallery.ecr.aws/grafana/lambda-promtail) to publish our builds of lambda-promtail. Users are still able to clone this repo, make their own modifications to the Go code, and upload their own image to their own ECR repo if they wish.
@ -25,7 +25,7 @@ In an effort to make deployment of lambda-promtail as simple as possible, we've
Terraform:
```
terraform apply -var "lambda_promtail_image=<repo:tag>" -var "write_address=https://logs-prod-us-central1.grafana.net/loki/api/v1/push" -var "password=<password>" -var "username=<user>" -var 'log_group_names=["/aws/lambda/log-group-1", "/aws/lambda/log-group-2]'
terraform apply -var "lambda_promtail_image=<repo:tag>" -var "write_address=https://logs-prod-us-central1.grafana.net/loki/api/v1/push" -var "password=<password>" -var "username=<user>" -var 'log_group_names=["/aws/lambda/log-group-1", "/aws/lambda/log-group-2"]' -var 'bucket_names=["bucket-a", "bucket-b"]' -var 'batch_size=131072'
```
The first few lines of `main.tf` define the AWS region to deploy to, you are free to modify this or remove and deploy to
@ -37,7 +37,7 @@ provider "aws" {
To keep the log group label add `-var "keep_stream=true"`.
Note that the creation of subscription filter in the provided Terraform file only accepts an array of log group names, it does **not** accept strings for regex filtering on the logs contents via the subscription filters. We suggest extending the Terraform file to do so, or having lambda-promtail write to Promtail and using [pipeline stages](https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/).
Note that the creation of subscription filter on Cloudwatch in the provided Terraform file only accepts an array of log group names, it does **not** accept strings for regex filtering on the logs contents via the subscription filters. We suggest extending the Terraform file to do so, or having lambda-promtail write to Promtail and using [pipeline stages](https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/).
CloudFormation:
```
@ -72,13 +72,20 @@ For those using Cloudwatch and wishing to test out Loki in a low-risk way, this
Note: Propagating logs from Cloudwatch to Loki means you'll still need to _pay_ for Cloudwatch.
### Loadbalancer logs
This workflow allows ingesting AWS loadbalancer logs stored in S3 into Loki.
## Propagated Labels
Incoming logs can have the following special labels assigned to them, which can be used in [relabeling](../promtail/configuration/#relabel_config) or later stages in a Promtail [pipeline](../promtail/pipelines/):
- `__aws_log_type`: Where this log came from (Cloudwatch or S3).
- `__aws_cloudwatch_log_group`: The associated Cloudwatch Log Group for this log.
- `__aws_cloudwatch_log_stream`: The associated Cloudwatch Log Stream for this log (if `KEEP_STREAM=true`).
- `__aws_cloudwatch_owner`: The AWS ID of the owner of this event.
- `__aws_s3_log_lb`: The name of the loadbalancer.
- `__aws_s3_log_lb_owner`: The Account ID of the loadbalancer owner.
## Limitations
@ -100,11 +107,13 @@ For availability concerns, run a set of Promtails behind a load balancer.
Relevant if lambda-promtail is configured to write to Promtail. Since Promtail batches writes to Loki for performance, it's possible that Promtail will receive a log, issue a successful `204` http status code for the write, then be killed at a later time before it writes upstream to Loki. This should be rare, but is a downside this workflow has.
This lambda will flush logs once the accumulated batch reaches the default size of `131072` bytes (128KB); this threshold can be changed with the `BATCH_SIZE` environment variable, which is specified in bytes.
### Templating/Deployment
The current CloudFormation template is rudimentary. If you need to add VPC configs, extra log groups to monitor, subnet declarations, etc., you'll need to edit the template manually. If you need to subscribe to more than one Cloudwatch Log Group you'll also need to copy-paste that section of the template for each group.
The Terraform file is a bit more fleshed out, and can be configured to take in an array of log group names, as well as vpc configuration.
The Terraform file is a bit more fleshed out, and can be configured to take in an array of log group and bucket names, as well as vpc configuration.
The provided Terraform and CloudFormation files are meant to cover the default use case, and more complex deployments will likely require some modification and extension of the provided files.
@ -133,9 +142,14 @@ scrape_configs:
# Adds a label on all streams indicating it was processed by the lambda-promtail workflow.
promtail: 'lambda-promtail'
relabel_configs:
- source_labels: ['__aws_log_type']
target_label: 'log_type'
# Maps the cloudwatch log group into a label called `log_group` for use in Loki.
- source_labels: ['__aws_cloudwatch_log_group']
target_label: 'log_group'
# Maps the loadbalancer name into a label called `loadbalancer_name` for use in Loki.
- source_labels: ['__aws_s3_log_lb']
target_label: 'loadbalancer_name'
```
## Multiple Promtail Deployment

@ -1,4 +1,4 @@
FROM golang:1-alpine3.13 AS build-image
FROM golang:1-alpine3.14 AS build-image
COPY tools/lambda-promtail /src/lambda-promtail
WORKDIR /src/lambda-promtail
@ -9,10 +9,10 @@ RUN apk update && apk upgrade && \
apk add --no-cache bash git
RUN go mod download
RUN go build -tags lambda.norpc -ldflags="-s -w" lambda-promtail/main.go
RUN go build -o ./main -tags lambda.norpc -ldflags="-s -w" lambda-promtail/*.go
FROM alpine:3.13
FROM alpine:3.14
WORKDIR /app

@ -1,7 +1,7 @@
all: build docker
build:
GOOS=linux CGO_ENABLED=0 go build lambda-promtail/main.go
GOOS=linux CGO_ENABLED=0 go build -o ./main lambda-promtail/*.go
clean:
rm main

@ -4,14 +4,31 @@ go 1.17
require (
github.com/aws/aws-lambda-go v1.26.0
github.com/aws/aws-sdk-go-v2 v1.11.2
github.com/aws/aws-sdk-go-v2/config v1.11.1
github.com/aws/aws-sdk-go-v2/service/s3 v1.22.0
github.com/gogo/protobuf v1.3.2
github.com/golang/snappy v0.0.4
github.com/grafana/dskit v0.0.0-20220105080720-01ce9286d7d5
github.com/grafana/loki v1.6.2-0.20220128102010-431d018ec64f
github.com/prometheus/common v0.32.1
)
require (
github.com/HdrHistogram/hdrhistogram-go v1.1.2 // indirect
github.com/armon/go-metrics v0.3.9 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.0.0 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.6.5 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.8.2 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.2 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.0.2 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.3.2 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.5.0 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.5.2 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.9.2 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.7.0 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.12.0 // indirect
github.com/aws/smithy-go v1.9.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.1.2 // indirect
github.com/coreos/etcd v3.3.25+incompatible // indirect
@ -30,7 +47,6 @@ require (
github.com/golang/protobuf v1.5.2 // indirect
github.com/google/btree v1.0.1 // indirect
github.com/gorilla/mux v1.8.0 // indirect
github.com/grafana/dskit v0.0.0-20220105080720-01ce9286d7d5 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0 // indirect
github.com/hashicorp/consul/api v1.12.0 // indirect
github.com/hashicorp/errwrap v1.0.0 // indirect
@ -89,6 +105,7 @@ require (
google.golang.org/genproto v0.0.0-20211223182754-3ac035c7e7cb // indirect
google.golang.org/grpc v1.40.1 // indirect
google.golang.org/protobuf v1.27.1 // indirect
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect
)

@ -90,8 +90,9 @@ github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym
github.com/DATA-DOG/go-sqlmock v1.3.3/go.mod h1:f/Ixk793poVmq4qj/V1dPUg2JEAKC73Q5eFN3EC/SaM=
github.com/DATA-DOG/go-sqlmock v1.4.1/go.mod h1:f/Ixk793poVmq4qj/V1dPUg2JEAKC73Q5eFN3EC/SaM=
github.com/DataDog/datadog-go v3.2.0+incompatible/go.mod h1:LButxg5PwREeZtORoXG3tL4fMGNddJ+vMq1mwgfaqoQ=
github.com/HdrHistogram/hdrhistogram-go v1.1.0 h1:6dpdDPTRoo78HxAJ6T1HfMiKSnqhgRRqzCuPshRkQ7I=
github.com/HdrHistogram/hdrhistogram-go v1.1.0/go.mod h1:yDgFjdqOqDEKOvasDdhWNXYg9BVp4O+o5f6V/ehm6Oo=
github.com/HdrHistogram/hdrhistogram-go v1.1.2 h1:5IcZpTvzydCQeHzK4Ef/D5rrSqwxob0t8PQPMybUNFM=
github.com/HdrHistogram/hdrhistogram-go v1.1.2/go.mod h1:yDgFjdqOqDEKOvasDdhWNXYg9BVp4O+o5f6V/ehm6Oo=
github.com/Knetic/govaluate v3.0.1-0.20171022003610-9aa49832a739+incompatible/go.mod h1:r7JcOSlj0wfOMncg0iLm8Leh48TZaKVeNIfJntJ2wa0=
github.com/Masterminds/semver v1.4.2/go.mod h1:MB6lktGJrhw8PrUyiEoblNEGEQ+RzHPF078ddwwvV3Y=
github.com/Masterminds/sprig v2.16.0+incompatible/go.mod h1:y6hNFY5UBTIWBxnzTeuNhlNS5hqE0NB0E6fgfo2Br3o=
@ -173,6 +174,36 @@ github.com/aws/aws-sdk-go v1.38.35/go.mod h1:hcU610XS61/+aQV88ixoOzUoG7v3b31pl2z
github.com/aws/aws-sdk-go v1.40.11/go.mod h1:585smgzpB/KqRA+K3y/NL/oYRqQvpNJYvLm+LY1U59Q=
github.com/aws/aws-sdk-go v1.42.8/go.mod h1:585smgzpB/KqRA+K3y/NL/oYRqQvpNJYvLm+LY1U59Q=
github.com/aws/aws-sdk-go-v2 v0.18.0/go.mod h1:JWVYvqSMppoMJC0x5wdwiImzgXTI9FuZwxzkQq9wy+g=
github.com/aws/aws-sdk-go-v2 v1.11.2 h1:SDiCYqxdIYi6HgQfAWRhgdZrdnOuGyLDJVRSWLeHWvs=
github.com/aws/aws-sdk-go-v2 v1.11.2/go.mod h1:SQfA+m2ltnu1cA0soUkj4dRSsmITiVQUJvBIZjzfPyQ=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.0.0 h1:yVUAwvJC/0WNPbyl0nA3j1L6CW1CN8wBubCRqtG7JLI=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.0.0/go.mod h1:Xn6sxgRuIDflLRJFj5Ev7UxABIkNbccFPV/p8itDReM=
github.com/aws/aws-sdk-go-v2/config v1.11.1 h1:KXSjb7ZMLRtjxClFptukTYibiOqJS9NwBO+9WD3UMto=
github.com/aws/aws-sdk-go-v2/config v1.11.1/go.mod h1:VvfkzUhVtntSg1JfGFMSKS0CyiTZd3NqBxK5af4zsME=
github.com/aws/aws-sdk-go-v2/credentials v1.6.5 h1:ZrsO2js2v4T95rsCIWoAb/ck5+U1kwkizGdZHY+ni3s=
github.com/aws/aws-sdk-go-v2/credentials v1.6.5/go.mod h1:HWSOnsnqVMbLcWUmom6AN1cqhcLzLJ62AObW28CbYbU=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.8.2 h1:KiN5TPOLrEjbGCvdTQR4t0U4T87vVwALZ5Bg3jpMqPY=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.8.2/go.mod h1:dF2F6tXEOgmW5X1ZFO/EPtWrcm7XkW07KNcJUGNtt4s=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.2 h1:XJLnluKuUxQG255zPNe+04izXl7GSyUVafIsgfv9aw4=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.2/go.mod h1:SgKKNBIoDC/E1ZCDhhMW3yalWjwuLjMcpLzsM/QQnWo=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.0.2 h1:EauRoYZVNPlidZSZJDscjJBQ22JhVF2+tdteatax2Ak=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.0.2/go.mod h1:xT4XX6w5Sa3dhg50JrYyy3e4WPYo/+WjY/BXtqXVunU=
github.com/aws/aws-sdk-go-v2/internal/ini v1.3.2 h1:IQup8Q6lorXeiA/rK72PeToWoWK8h7VAPgHNWdSrtgE=
github.com/aws/aws-sdk-go-v2/internal/ini v1.3.2/go.mod h1:VITe/MdW6EMXPb0o0txu/fsonXbMHUU2OC2Qp7ivU4o=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.5.0 h1:lPLbw4Gn59uoKqvOfSnkJr54XWk5Ak1NK20ZEiSWb3U=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.5.0/go.mod h1:80NaCIH9YU3rzTTs/J/ECATjXuRqzo/wB6ukO6MZ0XY=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.5.2 h1:CKdUNKmuilw/KNmO2Q53Av8u+ZyXMC2M9aX8Z+c/gzg=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.5.2/go.mod h1:FgR1tCsn8C6+Hf+N5qkfrE4IXvUL1RgW87sunJ+5J4I=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.9.2 h1:GnPGH1FGc4fkn0Jbm/8r2+nPOwSJjYPyHSqFSvY1ii8=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.9.2/go.mod h1:eDUYjOYt4Uio7xfHi5jOsO393ZG8TSfZB92a3ZNadWM=
github.com/aws/aws-sdk-go-v2/service/s3 v1.22.0 h1:J78RE/YNohCGbUyIbc3hr+UwnttfOn2dJUkNfvDkT30=
github.com/aws/aws-sdk-go-v2/service/s3 v1.22.0/go.mod h1:lQ5AeEW2XWzu8hwQ3dCqZFWORQ3RntO0Kq135Xd9VCo=
github.com/aws/aws-sdk-go-v2/service/sso v1.7.0 h1:E4fxAg/UE8a6yiLZYv8/EP0uXKPPRImiMau4ift6S/g=
github.com/aws/aws-sdk-go-v2/service/sso v1.7.0/go.mod h1:KnIpszaIdwI33tmc/W/GGXyn22c1USYxA/2KyvoeDY0=
github.com/aws/aws-sdk-go-v2/service/sts v1.12.0 h1:7g0252k2TF3eA1DtfkTQB/tqI41YvbUPaolwTR0/ITc=
github.com/aws/aws-sdk-go-v2/service/sts v1.12.0/go.mod h1:UV2N5HaPfdbDpkgkz4sRzWCvQswZjdO1FfqCWl0t7RA=
github.com/aws/smithy-go v1.9.0 h1:c7FUdEqrQA1/UVKKCNDFQPNKGp4FQg3YW4Ck5SLTG58=
github.com/aws/smithy-go v1.9.0/go.mod h1:SObp3lf9smib00L/v3U2eAKG8FyQ7iLrJnQiAmR5n+E=
github.com/beevik/ntp v0.2.0/go.mod h1:hIHWr+l3+/clUnF44zdK+CWW7fO8dR5cIylAQ76NRpg=
github.com/benbjohnson/clock v1.1.0 h1:Q92kusRqC1XV2MjkWETPvjJVqKetz1OzxZB7mHJLju8=
github.com/benbjohnson/clock v1.1.0/go.mod h1:J11/hYXuz8f4ySSvYwY0FKfm+ezbsZBKZxNJlLklBHA=
@ -884,6 +915,7 @@ github.com/kr/fs v0.1.0/go.mod h1:FFnZGqtBN9Gxj7eW1uZ42v5BccTP0vu6NEaFoC2HwRg=
github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pretty v0.2.0/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pretty v0.2.1 h1:Fmg33tUaq4/8ym9TJN1x7sLJnHVwhP33CNkpYV/7rwI=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/pty v1.1.5/go.mod h1:9r2w37qlBe7rQ6e1fg1S/9xpWHSnaqNdHD3WcMdbPDA=
@ -1010,7 +1042,6 @@ github.com/nats-io/nkeys v0.1.0/go.mod h1:xpnFELMwJABBLVhffcfd1MZx6VsNRFpEugbxzi
github.com/nats-io/nkeys v0.1.3/go.mod h1:xpnFELMwJABBLVhffcfd1MZx6VsNRFpEugbxziKVo7w=
github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OSON2c=
github.com/ncw/swift v1.0.47/go.mod h1:23YIA4yWVnGwv2dQlN4bB7egfYX6YLn0Yo/S6zZO/ZM=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e h1:fD57ERR4JtEqsWbfPhv4DMiApHyliiK5xCTNVSPiaAs=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
github.com/nxadm/tail v1.4.4/go.mod h1:kenIhsEOeOJmVchQTgglprH7qJGnHDVpk1VPCcaMI8A=
github.com/oklog/oklog v0.3.2/go.mod h1:FCV+B7mhrz4o+ueLpx+KqkyXRGMWOYEvfiXtdGtbWGs=
@ -1969,8 +2000,9 @@ gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8
gopkg.in/check.v1 v1.0.0-20141024133853-64131543e789/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f h1:BLraFXnmrev5lT+xlilqcH8XK9/i0At2xKjWk4p6zsU=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/cheggaaa/pb.v1 v1.0.25/go.mod h1:V/YB90LKu/1FcN3WVnfiiE5oMCibMjukxqG/qStrOgw=
gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI=
gopkg.in/fsnotify.v1 v1.4.7/go.mod h1:Tz8NjZHkW78fSQdbUxIjBTcgA1z1m8ZHf0WmKUhAMys=

@ -0,0 +1,53 @@
package main
import (
"context"
"fmt"
"time"
"github.com/aws/aws-lambda-go/events"
"github.com/grafana/loki/pkg/logproto"
"github.com/prometheus/common/model"
)
func parseCWEvent(ctx context.Context, b *batch, ev *events.CloudwatchLogsEvent) error {
data, err := ev.AWSLogs.Parse()
if err != nil {
fmt.Println("error parsing log event: ", err)
return err
}
labels := model.LabelSet{
model.LabelName("__aws_cloudwatch_log_group"): model.LabelValue(data.LogGroup),
model.LabelName("__aws_cloudwatch_owner"): model.LabelValue(data.Owner),
}
if keepStream {
labels[model.LabelName("__aws_cloudwatch_log_stream")] = model.LabelValue(data.LogStream)
}
for _, event := range data.LogEvents {
timestamp := time.UnixMilli(event.Timestamp)
b.add(ctx, entry{labels, logproto.Entry{
Line: event.Message,
Timestamp: timestamp,
}})
}
return nil
}
func processCWEvent(ctx context.Context, ev *events.CloudwatchLogsEvent) error {
batch, _ := newBatch(ctx)
err := parseCWEvent(ctx, batch, ev)
if err != nil {
return err
}
err = sendToPromtail(ctx, batch)
if err != nil {
return err
}
return nil
}
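For context, `parseCWEvent` consumes the base64+gzip envelope that Cloudwatch subscription filters deliver. Below is a minimal, hypothetical harness (payload values are illustrative, not from this PR) that builds such an event and round-trips it through `Parse()`:

```
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"

	"github.com/aws/aws-lambda-go/events"
)

func main() {
	// JSON payload in the shape Cloudwatch delivers to subscription targets.
	payload := `{"messageType":"DATA_MESSAGE","owner":"123456789012","logGroup":"/aws/lambda/demo","logStream":"2022/01/24/[$LATEST]abc","subscriptionFilters":["lambdafunction_logfilter"],"logEvents":[{"id":"1","timestamp":1643000000000,"message":"hello"}]}`

	// Cloudwatch gzips the payload and base64-encodes it into the "data" field.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(payload))
	zw.Close()

	ev := events.CloudwatchLogsEvent{
		AWSLogs: events.CloudwatchLogsRawData{
			Data: base64.StdEncoding.EncodeToString(buf.Bytes()),
		},
	}

	// Parse() reverses the base64+gzip encoding, which is what parseCWEvent relies on.
	data, err := ev.AWSLogs.Parse()
	if err != nil {
		panic(err)
	}
	fmt.Println(data.LogGroup, data.LogEvents[0].Message) // /aws/lambda/demo hello
}
```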

@ -1,25 +1,18 @@
package main

import (
-	"bufio"
-	"bytes"
	"context"
+	"encoding/json"
	"errors"
	"fmt"
-	"io"
-	"net/http"
	"net/url"
	"os"
+	"strconv"
+	"strings"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
-	"github.com/gogo/protobuf/proto"
-	"github.com/golang/snappy"
-	"github.com/prometheus/common/model"
-
-	"github.com/grafana/loki/pkg/logproto"
-	"github.com/grafana/loki/pkg/util"
+	"github.com/aws/aws-sdk-go-v2/service/s3"
)
const (
@ -33,18 +26,22 @@ var (
	writeAddress       *url.URL
	username, password string
	keepStream         bool
+	batchSize          int
+	s3Clients          map[string]*s3.Client
)

func init() {
	addr := os.Getenv("WRITE_ADDRESS")
	if addr == "" {
-		panic(errors.New("required environmental variable WRITE_ADDRESS not present"))
+		panic(errors.New("required environmental variable WRITE_ADDRESS not present, format: https://<hostname>/loki/api/v1/push"))
	}

	var err error
	writeAddress, err = url.Parse(addr)
	if err != nil {
		panic(err)
	}

	fmt.Println("write address: ", writeAddress.String())

	username = os.Getenv("USERNAME")
@ -61,72 +58,54 @@ func init() {
		keepStream = true
	}
	fmt.Println("keep stream: ", keepStream)
-}

-func handler(ctx context.Context, ev events.CloudwatchLogsEvent) error {
-	data, err := ev.AWSLogs.Parse()
-	if err != nil {
-		fmt.Println("error parsing log event: ", err)
-		return err
-	}
-
-	labels := model.LabelSet{
-		model.LabelName("__aws_cloudwatch_log_group"): model.LabelValue(data.LogGroup),
-		model.LabelName("__aws_cloudwatch_owner"):     model.LabelValue(data.Owner),
-	}
-
-	if keepStream {
-		labels[model.LabelName("__aws_cloudwatch_log_stream")] = model.LabelValue(data.LogStream)
-	}
+	batch := os.Getenv("BATCH_SIZE")
+	batchSize = 131072
+	if batch != "" {
+		batchSize, _ = strconv.Atoi(batch)
+	}

-	stream := logproto.Stream{
-		Labels:  labels.String(),
-		Entries: make([]logproto.Entry, 0, len(data.LogEvents)),
-	}
+	s3Clients = make(map[string]*s3.Client)
+}

-	for _, entry := range data.LogEvents {
-		stream.Entries = append(stream.Entries, logproto.Entry{
-			Line: entry.Message,
-			// It's best practice to ignore timestamps from cloudwatch as promtail is responsible for adding those.
-			Timestamp: util.TimeFromMillis(entry.Timestamp),
-		})
-	}
+func checkEventType(ev map[string]interface{}) (interface{}, error) {
+	var s3Event events.S3Event
+	var cwEvent events.CloudwatchLogsEvent

-	buf, err := proto.Marshal(&logproto.PushRequest{
-		Streams: []logproto.Stream{stream},
-	})
-	if err != nil {
-		return err
-	}
+	types := [...]interface{}{&s3Event, &cwEvent}

-	// Push to promtail
-	buf = snappy.Encode(nil, buf)
-	req, err := http.NewRequest("POST", writeAddress.String(), bytes.NewReader(buf))
-	if err != nil {
-		fmt.Println("error: ", err)
-		return err
-	}
-	req.Header.Set("Content-Type", contentType)
+	j, _ := json.Marshal(ev)
+	reader := strings.NewReader(string(j))
+	d := json.NewDecoder(reader)
+	d.DisallowUnknownFields()

-	// If either is not empty both should be (see init), but just to be safe.
-	if username != "" && password != "" {
-		fmt.Println("adding basic auth to request")
-		req.SetBasicAuth(username, password)
-	}
+	for _, t := range types {
+		err := d.Decode(t)
+		if err == nil {
+			return t, nil
+		}
+		reader.Seek(0, 0)
+	}

-	resp, err := http.DefaultClient.Do(req.WithContext(ctx))
-	if err != nil {
-		fmt.Println("error: ", err)
-		return err
-	}
+	return nil, fmt.Errorf("unknown event type")
+}

-	if resp.StatusCode/100 != 2 {
-		scanner := bufio.NewScanner(io.LimitReader(resp.Body, maxErrMsgLen))
-		line := ""
-		if scanner.Scan() {
-			line = scanner.Text()
-		}
-		err = fmt.Errorf("server returned HTTP status %s (%d): %s", resp.Status, resp.StatusCode, line)
-		fmt.Println("error:", err)
-	}
+func handler(ctx context.Context, ev map[string]interface{}) error {
+	event, err := checkEventType(ev)
+	if err != nil {
+		fmt.Printf("invalid event: %v\n", ev)
+		return err
+	}

-	return err
-}
+	switch event.(type) {
+	case *events.S3Event:
+		return processS3Event(ctx, event.(*events.S3Event))
+	case *events.CloudwatchLogsEvent:
+		return processCWEvent(ctx, event.(*events.CloudwatchLogsEvent))
+	}
+
+	return err
+}
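The `DisallowUnknownFields` trick above is the crux of `checkEventType`: strict decoding fails whenever the payload carries a field the target struct doesn't declare, so only the matching event shape decodes cleanly. Here is a self-contained sketch of the same technique with stand-in types (`s3Like`/`cwLike` are illustrative, not the aws-lambda-go structs); it creates a fresh decoder per candidate rather than seeking the reader back:

```
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Stand-in event shapes for the sketch; the real code decodes into
// events.S3Event and events.CloudwatchLogsEvent from aws-lambda-go.
type s3Like struct {
	Records []json.RawMessage `json:"Records"`
}

type cwLike struct {
	AWSLogs struct {
		Data string `json:"data"`
	} `json:"awslogs"`
}

// sniff tries each candidate type in turn; DisallowUnknownFields makes a
// decode fail when the payload has fields the candidate doesn't declare.
func sniff(raw string) (interface{}, error) {
	for _, t := range []interface{}{&s3Like{}, &cwLike{}} {
		d := json.NewDecoder(strings.NewReader(raw))
		d.DisallowUnknownFields()
		if err := d.Decode(t); err == nil {
			return t, nil
		}
	}
	return nil, fmt.Errorf("unknown event type")
}

func main() {
	ev, err := sniff(`{"awslogs":{"data":"H4sIAAAA"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", ev) // *main.cwLike
}
```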

@ -0,0 +1,187 @@
package main
import (
"bufio"
"bytes"
"context"
"fmt"
"io"
"net/http"
"sort"
"strings"
"time"
"github.com/gogo/protobuf/proto"
"github.com/golang/snappy"
"github.com/grafana/dskit/backoff"
"github.com/grafana/loki/pkg/logproto"
"github.com/prometheus/common/model"
)
const (
timeout = 5 * time.Second
minBackoff = 100 * time.Millisecond
maxBackoff = 30 * time.Second
maxRetries = 10
reservedLabelTenantID = "__tenant_id__"
userAgent = "lambda-promtail"
)
type entry struct {
labels model.LabelSet
entry logproto.Entry
}
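// batch groups entries into one stream per unique label set; size tracks the
// accumulated line bytes so add knows when to flush.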
type batch struct {
streams map[string]*logproto.Stream
size int
}
func newBatch(ctx context.Context, entries ...entry) (*batch, error) {
b := &batch{
streams: map[string]*logproto.Stream{},
}
for _, entry := range entries {
err := b.add(ctx, entry)
if err != nil {
return b, err
}
}
return b, nil
}
func (b *batch) add(ctx context.Context, e entry) error {
labels := labelsMapToString(e.labels, reservedLabelTenantID)
stream, ok := b.streams[labels]
if !ok {
b.streams[labels] = &logproto.Stream{
Labels: labels,
Entries: []logproto.Entry{},
}
stream = b.streams[labels]
}
stream.Entries = append(stream.Entries, e.entry)
b.size += len(e.entry.Line)
if b.size > batchSize {
return b.flushBatch(ctx)
}
return nil
}
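// labelsMapToString serializes a label set (minus excluded labels such as the
// tenant ID) into a sorted, deterministic string, so identical label sets
// always map to the same stream key.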
func labelsMapToString(ls model.LabelSet, without ...model.LabelName) string {
lstrs := make([]string, 0, len(ls))
Outer:
for l, v := range ls {
for _, w := range without {
if l == w {
continue Outer
}
}
lstrs = append(lstrs, fmt.Sprintf("%s=%q", l, v))
}
sort.Strings(lstrs)
return fmt.Sprintf("{%s}", strings.Join(lstrs, ", "))
}
func (b *batch) encode() ([]byte, int, error) {
req, entriesCount := b.createPushRequest()
buf, err := proto.Marshal(req)
if err != nil {
return nil, 0, err
}
buf = snappy.Encode(nil, buf)
return buf, entriesCount, nil
}
func (b *batch) createPushRequest() (*logproto.PushRequest, int) {
req := logproto.PushRequest{
Streams: make([]logproto.Stream, 0, len(b.streams)),
}
entriesCount := 0
for _, stream := range b.streams {
req.Streams = append(req.Streams, *stream)
entriesCount += len(stream.Entries)
}
return &req, entriesCount
}
func (b *batch) flushBatch(ctx context.Context) error {
err := sendToPromtail(ctx, b)
if err != nil {
return err
}
b.streams = make(map[string]*logproto.Stream)
b.size = 0
return nil
}
func sendToPromtail(ctx context.Context, b *batch) error {
buf, _, err := b.encode()
if err != nil {
return err
}
backoff := backoff.New(ctx, backoff.Config{MinBackoff: minBackoff, MaxBackoff: maxBackoff, MaxRetries: maxRetries})
var status int
for {
// send uses `timeout` internally, so `context.Background` is good enough.
status, err = send(context.Background(), buf)
// Only retry 429s, 500s and connection-level errors.
if status > 0 && status != 429 && status/100 != 5 {
break
}
fmt.Printf("error sending batch, will retry, status: %d error: %s\n", status, err)
backoff.Wait()
// Make sure it sends at least once before checking for retry.
if !backoff.Ongoing() {
break
}
}
if err != nil {
fmt.Printf("Failed to send logs! %s\n", err)
return err
}
return nil
}
func send(ctx context.Context, buf []byte) (int, error) {
ctx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
req, err := http.NewRequest("POST", writeAddress.String(), bytes.NewReader(buf))
if err != nil {
return -1, err
}
req.Header.Set("Content-Type", contentType)
req.Header.Set("User-Agent", userAgent)
resp, err := http.DefaultClient.Do(req.WithContext(ctx))
if err != nil {
return -1, err
}
if resp.StatusCode/100 != 2 {
scanner := bufio.NewScanner(io.LimitReader(resp.Body, maxErrMsgLen))
line := ""
if scanner.Scan() {
line = scanner.Text()
}
err = fmt.Errorf("server returned HTTP status %s (%d): %s", resp.Status, resp.StatusCode, line)
}
return resp.StatusCode, err
}
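`sendToPromtail` leans on `github.com/grafana/dskit/backoff` for its retry loop. Below is a standalone sketch of the same pattern, with `flakySend` as a hypothetical stand-in for the HTTP push:

```
package main

import (
	"context"
	"errors"
	"fmt"
	"time"

	"github.com/grafana/dskit/backoff"
)

// flakySend is a stand-in for the real HTTP push: it fails twice, then succeeds.
func flakySend(attempt int) error {
	if attempt < 3 {
		return errors.New("server returned HTTP status 500")
	}
	return nil
}

func main() {
	b := backoff.New(context.Background(), backoff.Config{
		MinBackoff: 100 * time.Millisecond,
		MaxBackoff: 30 * time.Second,
		MaxRetries: 10,
	})

	attempt := 0
	for b.Ongoing() {
		attempt++
		if err := flakySend(attempt); err == nil {
			fmt.Println("sent after", attempt, "attempts")
			return
		} else {
			fmt.Println("send failed, backing off:", err)
		}
		b.Wait() // sleeps with exponential growth between MinBackoff and MaxBackoff
	}
	fmt.Println("giving up:", b.Err())
}
```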

@ -0,0 +1,142 @@
package main
import (
"bufio"
"compress/gzip"
"context"
"fmt"
"io"
"regexp"
"time"
"github.com/aws/aws-lambda-go/events"
"github.com/grafana/loki/pkg/logproto"
"github.com/prometheus/common/model"
"github.com/aws/aws-sdk-go-v2/aws"
"github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
var (
// regex that parses the log file name fields
// source: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html#access-log-file-format
// format: bucket[/prefix]/AWSLogs/aws-account-id/elasticloadbalancing/region/yyyy/mm/dd/aws-account-id_elasticloadbalancing_region_app.load-balancer-id_end-time_ip-address_random-string.log.gz
// example: my-bucket/AWSLogs/123456789012/elasticloadbalancing/us-east-1/2022/01/24/123456789012_elasticloadbalancing_us-east-1_app.my-loadbalancer.b13ea9d19f16d015_20220124T0000Z_0.0.0.0_2et2e1mx.log.gz
filenameRegex = regexp.MustCompile(`AWSLogs\/(?P<account_id>\d+)\/elasticloadbalancing\/(?P<region>[\w-]+)\/(?P<year>\d+)\/(?P<month>\d+)\/(?P<day>\d+)\/\d+\_elasticloadbalancing\_\w+-\w+-\d_(?:(?:app|nlb)\.*?)?(?P<lb>[a-zA-Z\-]+)`)
// regex that extracts the timestamp (RFC3339) from the log message
timestampRegex = regexp.MustCompile(`\w+ (?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+Z)`)
)
func getS3Object(ctx context.Context, labels map[string]string) (io.ReadCloser, error) {
var s3Client *s3.Client
if c, ok := s3Clients[labels["bucket_region"]]; ok {
s3Client = c
} else {
cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion(labels["bucket_region"]))
if err != nil {
return nil, err
}
s3Client = s3.NewFromConfig(cfg)
s3Clients[labels["bucket_region"]] = s3Client
}
obj, err := s3Client.GetObject(ctx,
&s3.GetObjectInput{
Bucket: aws.String(labels["bucket"]),
Key: aws.String(labels["key"]),
ExpectedBucketOwner: aws.String(labels["bucket_owner"]),
})
if err != nil {
fmt.Printf("Failed to get object %s from bucket %s on account %s: %s\n", labels["key"], labels["bucket"], labels["bucket_owner"], err)
return nil, err
}
return obj.Body, nil
}
func parseS3Log(ctx context.Context, b *batch, labels map[string]string, obj io.ReadCloser) error {
gzreader, err := gzip.NewReader(obj)
if err != nil {
return err
}
scanner := bufio.NewScanner(gzreader)
ls := model.LabelSet{
model.LabelName("__aws_log_type"): model.LabelValue("s3_lb"),
model.LabelName("__aws_s3_log_lb"): model.LabelValue(labels["lb"]),
model.LabelName("__aws_s3_log_lb_owner"): model.LabelValue(labels["account_id"]),
}
for scanner.Scan() {
log_line := scanner.Text()
match := timestampRegex.FindStringSubmatch(log_line)
if match == nil {
// skip lines that don't carry an RFC3339 timestamp
continue
}
timestamp, err := time.Parse(time.RFC3339, match[1])
if err != nil {
return err
}
b.add(ctx, entry{ls, logproto.Entry{
Line: log_line,
Timestamp: timestamp,
}})
}
return nil
}
func getLabels(record events.S3EventRecord) (map[string]string, error) {
labels := make(map[string]string)
labels["key"] = record.S3.Object.Key
labels["bucket"] = record.S3.Bucket.Name
labels["bucket_owner"] = record.S3.Bucket.OwnerIdentity.PrincipalID
labels["bucket_region"] = record.AWSRegion
match := filenameRegex.FindStringSubmatch(labels["key"])
for i, name := range filenameRegex.SubexpNames() {
if i != 0 && name != "" && match != nil {
labels[name] = match[i]
}
}
return labels, nil
}
func processS3Event(ctx context.Context, ev *events.S3Event) error {
batch, _ := newBatch(ctx)
for _, record := range ev.Records {
labels, err := getLabels(record)
if err != nil {
return err
}
obj, err := getS3Object(ctx, labels)
if err != nil {
return err
}
err = parseS3Log(ctx, batch, labels, obj)
if err != nil {
return err
}
}
err := sendToPromtail(ctx, batch)
if err != nil {
return err
}
return nil
}
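`getLabels` relies on Go's named capture groups: `SubexpNames()` yields the group names positionally, so matches can be folded into a map. A runnable sketch with a simplified regex (the real one also captures the date parts and the load balancer name):

```
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Same technique as getLabels: named capture groups become map keys.
	re := regexp.MustCompile(`AWSLogs/(?P<account_id>\d+)/elasticloadbalancing/(?P<region>[\w-]+)/`)
	key := "AWSLogs/123456789012/elasticloadbalancing/us-east-1/2022/01/24/123456789012_elasticloadbalancing_us-east-1_app.my-lb.b13e_20220124T0000Z_0.0.0.0_2et2e1mx.log.gz"

	labels := map[string]string{}
	match := re.FindStringSubmatch(key)
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" && match != nil {
			labels[name] = match[i]
		}
	}
	fmt.Println(labels["account_id"], labels["region"]) // 123456789012 us-east-1
}
```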

@ -34,11 +34,34 @@ resource "aws_iam_role_policy" "logs" {
],
"Effect" : "Allow",
"Resource" : "arn:aws:logs:*:*:*",
},
{
"Action" : [
"s3:GetObject",
],
"Effect" : "Allow",
"Resource" : [
for bucket in toset(var.bucket_names) : "arn:aws:s3:::${bucket}/*"
]
}
]
})
}
data "aws_iam_policy" "lambda_vpc_execution" {
name = "AWSLambdaVPCAccessExecutionRole"
}
resource "aws_iam_role_policy_attachment" "lambda_vpc_execution" {
role = aws_iam_role.iam_for_lambda.name
policy_arn = data.aws_iam_policy.lambda_vpc_execution.arn
}
resource "aws_cloudwatch_log_group" "lambda_promtail" {
name = "/aws/lambda/lambda_promtail"
retention_in_days = 14
}
resource "aws_lambda_function" "lambda_promtail" {
image_uri = var.lambda_promtail_image
function_name = "lambda_promtail"
@ -61,8 +84,14 @@ resource "aws_lambda_function" "lambda_promtail" {
USERNAME = var.username
PASSWORD = var.password
KEEP_STREAM = var.keep_stream
BATCH_SIZE = var.batch_size
}
}
depends_on = [
aws_iam_role_policy.logs,
aws_iam_role_policy_attachment.lambda_vpc_execution,
]
}
resource "aws_lambda_function_event_invoke_config" "lambda_promtail_invoke_config" {
@ -81,11 +110,33 @@ resource "aws_lambda_permission" "lambda_promtail_allow_cloudwatch" {
# However, if you need to provide an actual filter_pattern for a specific log group you should
# copy this block and modify it accordingly.
resource "aws_cloudwatch_log_subscription_filter" "lambdafunction_logfilter" {
name = "lambdafunction_logfilter_${var.log_group_names[count.index]}"
count = length(var.log_group_names)
log_group_name = var.log_group_names[count.index]
for_each = toset(var.log_group_names)
name = "lambdafunction_logfilter_${each.value}"
log_group_name = each.value
destination_arn = aws_lambda_function.lambda_promtail.arn
# required but can be empty string
filter_pattern = ""
depends_on = [aws_iam_role_policy.logs]
}
}
resource "aws_lambda_permission" "allow-s3-invoke-lambda-promtail" {
for_each = toset(var.bucket_names)
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.lambda_promtail.arn
principal = "s3.amazonaws.com"
source_arn = "arn:aws:s3:::${each.value}"
}
resource "aws_s3_bucket_notification" "push-to-lambda-promtail" {
for_each = toset(var.bucket_names)
bucket = each.value
lambda_function {
lambda_function_arn = aws_lambda_function.lambda_promtail.arn
events = ["s3:ObjectCreated:*"]
filter_prefix = "AWSLogs/"
filter_suffix = ".log.gz"
}
depends_on = [aws_lambda_permission.allow-s3-invoke-lambda-promtail]
}

@ -4,10 +4,16 @@ variable "write_address" {
default = "http://localhost:8080/loki/api/v1/push"
}
variable "bucket_names" {
type = list(string)
description = "List of S3 bucket names to create Event Notifications for."
default = []
}
variable "log_group_names" {
type = list(string)
description = "List of CloudWatch Log Group names to create Subscription Filters for."
default = [""]
default = []
}
variable "lambda_promtail_image" {
@ -35,6 +41,12 @@ variable "keep_stream" {
default = "false"
}
variable "batch_size" {
type = string
description = "Determines when to flush the batch of logs (bytes)."
default = ""
}
variable "lambda_vpc_subnets" {
type = list(string)
description = "List of subnet IDs associated with the Lambda function."
