Name the logger using lowercase characters, for example, `log.New("my-logger")`, using snake_case or kebab-case styling.
Prefix the logger name with an area name when using different loggers across a feature or related packages; for example, `log.New("plugin.loader")` and `log.New("plugin.client")`.
Start the log message with a capital letter, for example, `logger.Info("Hello world")` instead of `logger.Info("hello world")`. The log message should be an identifier for the log entry. Avoid parameterization in favor of key-value pairs for additional data.
To be consistent with Go identifiers, prefer using camelCase style when naming log keys; for example, `remoteAddr`.
Use the key `error` when logging Go errors; for example, `logger.Error("Something failed", "error", fmt.Errorf("BOOM"))`.
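Put together, these conventions look something like the following sketch. The logger name, message, and fields are illustrative; it assumes Grafana's `pkg/infra/log` package.

```go
package loader

import (
	"fmt"

	"github.com/grafana/grafana/pkg/infra/log"
)

// Logger named with an area prefix and lowercase, kebab/snake-case styling.
var logger = log.New("plugin.loader")

func loadPlugin(pluginID, remoteAddr string) {
	// The message starts with a capital letter and stays constant;
	// variable data goes into camelCase key-value pairs.
	logger.Info("Loading plugin", "pluginId", pluginID, "remoteAddr", remoteAddr)

	if err := doLoad(pluginID); err != nil {
		// Go errors are logged under the "error" key.
		logger.Error("Plugin loading failed", "error", err, "pluginId", pluginID)
	}
}

func doLoad(pluginID string) error {
	return fmt.Errorf("BOOM")
}
```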
### Validate and sanitize input coming from user input
If log messages or key/value pairs originate from user input, they should be validated and sanitized.
Be careful not to expose any sensitive information in log messages; for example, secrets and credentials. It's easy to do this by mistake if you include a struct as a value.
### Log levels
When should you use each log level?
- **Debug:** Informational messages of high frequency, less-important messages during normal operations, or both.
- **Info:** Informational messages of low frequency, important messages, or both.
- **Warning:** Use warning messages sparingly. If used, messages should be actionable.
- **Error:** Error messages indicating some operation failed (with an error) and the program didn't have a way to handle the error.
### Contextual logging
Use a contextual logger to include additional key/value pairs attached to `context.Context`, such as a `traceID`, so that you can correlate logs with traces, correlate logs with a common identifier, or both.
You must [Enable tracing in Grafana](#2-enable-tracing-in-grafana) to get a `traceID`.
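As a rough sketch, assuming the logger exposes a `FromContext` helper that picks up values such as `traceID` stored on the context:

```go
package teapot

import (
	"context"

	"github.com/grafana/grafana/pkg/infra/log"
)

var logger = log.New("teapot")

func brew(ctx context.Context, kind string) {
	// FromContext returns a logger that also includes key/value pairs
	// attached to the context (for example traceID), so the resulting
	// log lines can be correlated with traces.
	ctxLogger := logger.FromContext(ctx)
	ctxLogger.Debug("Brewing tea", "kind", kind)
}
```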
During development, it's convenient to enable certain log levels, such as `debug`, for specific loggers to minimize the generated log output and make it easier to find things. Refer to [[log.filters]](https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#filters) for information on how to set different levels for specific loggers.
You can also configure multiple loggers. For example:
```ini
[log]
; Illustrative example: set the debug level for two specific loggers.
filters = rendering:debug plugin.loader:debug
```
## Metrics

There are many possible types of metrics that can be tracked.
### Naming conventions
Use the namespace `grafana` to prefix any defined metric names with `grafana_`. This prefix makes it clear for operators that any metric named `grafana_*` belongs to Grafana.
Use snake_case style when naming metrics; for example, `http_request_duration_seconds` instead of `httpRequestDurationSeconds`.
Use snake_case style when naming labels; for example, `status_code` instead of `statusCode`.
If a metric type is a counter, name it with a `_total` suffix; for example, `http_requests_total`.
If a metric type is a histogram and you're measuring duration, name it with a `_<unit>` suffix; for example, `http_request_duration_seconds`.
If a metric type is a gauge, name it to denote that it's a value that can increase and decrease; for example, `http_request_in_flight`.
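A sketch of how these conventions look with the Prometheus client library (the metric and label names are illustrative):

```go
package example

import "github.com/prometheus/client_golang/prometheus"

var (
	// Counter: "grafana" namespace, snake_case name, "_total" suffix.
	httpRequestsTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
		Namespace: "grafana",
		Name:      "http_requests_total",
		Help:      "Total number of HTTP requests.",
	}, []string{"status_code"})

	// Histogram measuring a duration: named with a unit suffix ("_seconds").
	httpRequestDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Namespace: "grafana",
		Name:      "http_request_duration_seconds",
		Help:      "HTTP request duration in seconds.",
		Buckets:   prometheus.DefBuckets,
	}, []string{"status_code"})
)

func init() {
	prometheus.MustRegister(httpRequestsTotal, httpRequestDuration)
}
```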
### Label values and high cardinality
Be careful with what label values you accept or add. Using or allowing too many label values could result in [high cardinality problems](https://grafana.com/blog/2022/02/15/what-are-cardinality-spikes-and-why-do-they-matter/).
If label values originate from user input, they should be validated. Use `metricutil.SanitizeLabelName(<label value>)` from the `pkg/infra/metrics/metricutil` package to sanitize label names.

> **Important:** Only allow a pre-defined set of labels to minimize the risk of high cardinality problems. Be careful not to expose any sensitive information in label values such as secrets and credentials.
### Guarantee the existence of metrics
To guarantee the existence of metrics before any observations have happened, you can use the helper methods available in the `pkg/infra/metrics/metricutil` package.
### How to collect and visualize metrics locally
1. Ensure you have Docker installed and running on your machine.
1. Start Prometheus.
   ```bash
   make devenv sources=prometheus
   ```
1. Run Grafana, and then create a Prometheus data source if you do not have one yet. Set the server URL to `http://localhost:9090`, enable basic authentication, and enter the same authentication you have for local Grafana.
1. Use Grafana Explore or dashboards to query any exported Grafana metrics. You can also view them at `http://localhost:3000/metrics`.
## Traces
A distributed trace is data that tracks an application request as it flows through the various parts of an application.
### Usage
Grafana uses [OpenTelemetry](https://opentelemetry.io/) for distributed tracing. There's an interface `Tracer` in the `pkg/infra/tracing` package that implements the [OpenTelemetry Tracer interface](https://go.opentelemetry.io/otel/trace), which you can use to create traces and spans. To access a `Tracer`, you need to get it injected as a dependency into your service. Refer to [Services](services.md) for more details. For more information, refer to the [OpenTelemetry Go documentation](https://opentelemetry.io/docs/instrumentation/go/manual/).
For example:
```go
import (
	"context"
	"fmt"

	"go.opentelemetry.io/otel/attribute"

	"github.com/grafana/grafana/pkg/infra/tracing"
)

// Sketch: the exact Tracer and Span method signatures depend on the
// version of the pkg/infra/tracing package.
type MyService struct {
	tracer tracing.Tracer
}

func ProvideService(tracer tracing.Tracer) *MyService {
	return &MyService{tracer: tracer}
}

func (s *MyService) Hello(ctx context.Context, name string) (string, error) {
	// Start a span and make sure it's ended when the method returns.
	// Pass ctx on to downstream calls so child spans attach to this one.
	ctx, span := s.tracer.Start(ctx, "MyService.Hello")
	defer span.End()

	// Put variable data in span attributes rather than in the span name.
	span.SetAttributes(attribute.String("my_service.name", name))

	return fmt.Sprintf("Hello, %s!", name), nil
}
```
| Span name                 | Guidance                                                    |
| ------------------------- | ----------------------------------------------------------- |
| `get_account`             | Good, and `account_id=42` would make a nice Span attribute  |
| `get_account/{accountId}` | Also good (using the “HTTP route”)                          |
Span attribute and span event attributes should follow the [attribute naming specification from OpenTelemetry](https://opentelemetry.io/docs/reference/specification/common/attribute-naming/).

These are a few examples of good attributes:

- `service.version`
- `http.status_code`

Refer to [trace semantic conventions from OpenTelemetry](https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/) for additional conventions regarding well-known protocols and operations.
### Span names and high cardinality
Be careful with what span names you add or accept. Using or allowing too many span names can result in high cardinality problems.
### Validate and sanitize input coming from user input
If span names, attribute values, or event values originate from user input, they should be validated and sanitized. It's very important to only allow a pre-defined set of span names to minimize the risk of high cardinality problems.
Be careful not to expose any sensitive information in span names, attribute values, or event values, such as secrets and credentials.
### Span attributes
Consider using `attributes.<Type>("<key>", <value>)` instead of `attributes.Key("<key>").<Type>(<value>)` since it requires fewer characters and is easier to read.
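For instance, a small sketch assuming the OpenTelemetry `attribute` package:

```go
package example

import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

func setStatus(span trace.Span, code int) {
	// Preferred: shorter and easier to read.
	span.SetAttributes(attribute.Int("http.status_code", code))

	// Equivalent, but more verbose.
	span.SetAttributes(attribute.Key("http.status_code").Int(code))
}
```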
### How to collect, visualize and query traces (and correlate logs with traces) locally
#### 1. Start Jaeger

```bash
make devenv sources=jaeger
```
#### 2. Enable tracing in Grafana

To enable tracing in Grafana, you must set the address in your `config.ini` file:

```ini
[tracing.opentelemetry.jaeger]
address = http://localhost:14268/api/traces
```
#### 3. Search/browse collected logs and traces in Grafana Explore
You need provisioned `gdev-jaeger` and `gdev-loki` data sources. Refer to [developer dashboard and data sources](https://github.com/grafana/grafana/tree/main/devenv#developer-dashboards-and-data-sources) for setup instructions.
Open Grafana Explore, select the `gdev-loki` data source, and use the query `{filename="/var/log/grafana/grafana.log"} | logfmt`.
You can then inspect any log message that includes a `traceID` and from there click `gdev-jaeger` to split the view and inspect the trace in question.
#### 4. Search/browse collected traces in Jaeger UI
You can open `http://localhost:16686` to use the Jaeger UI for browsing and searching traces.
Typically, you have one or more interfaces that your service provides
in the root package, along with any types, errors, and other constants
that make sense for another service interacting with the teapot service
to use.
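For instance, a sketch of such a root package (the `teapot` service and all of its names are hypothetical):

```go
// Package teapot contains the public surface of the hypothetical teapot
// service: its interface, domain types, errors, and constants.
package teapot

import (
	"context"
	"errors"
)

// ErrTeapotEmpty can be checked for by other services.
var ErrTeapotEmpty = errors.New("teapot is empty")

// Tea is a domain type shared with consumers of the service.
type Tea struct {
	ID   int64
	Kind string
}

// Brewer is the interface the service provides to the rest of Grafana.
type Brewer interface {
	Brew(ctx context.Context, kind string) (*Tea, error)
}
```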
Avoid depending on other services when structuring the root package to
reduce the risk of running into circular dependencies.
### Sub-packages should depend on roots, not the other way around
Small to medium-sized packages should be able to have only a single
sub-package containing the implementation of the service. By moving the
implementation into a separate package we reduce the risk of triggering
circular dependencies.

> **Note:** In Go, circular dependencies are evaluated per package, and this structure logically moves it to be per type or function declaration.

Large packages may need to utilize multiple sub-packages at the discretion
of the implementor. Keep interfaces and domain types to the root
package.
### Try to name sub-packages for project-wide uniqueness
Prefix sub-packages with the service name or an abbreviation of the service name, whichever is more appropriate, to provide a unique package name.
This allows `teaimpl` to be distinguished from `coffeeimpl` without the need for package aliases, and encourages the use of the same name to reference your package throughout the codebase.
### A well-behaving service provides test doubles for itself
Other services may depend on your service, and it's good practice to
provide means for those services to set up a test instance of the
dependency as needed. Refer to Google's
[Testing on the Toilet: Know Your Test Doubles](https://testing.googleblog.com/2013/07/testing-on-toilet-know-your-test-doubles.html) for a brief
explanation of how we semantically aim to differentiate fakes, mocks,
and stubs within our codebase.
Place test doubles in a sub-package of your root package named
`<servicename>test` or `<service-abbreviation>test`, such that the `teapot` service may have the
`teapottest` or `teatest` sub-package.
A stub or mock may be sufficient if the service is not a dependency of a
lot of services, or if it's called primarily for side effects. In general,
though, prefer providing a fake that can be used as a
regular service without the need of complicated setup.
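For instance, a sketch of a fake in a `teatest` sub-package (the `Brewer` interface, package path, and names continue the hypothetical teapot example):

```go
// Package teatest provides test doubles for the hypothetical teapot service.
package teatest

import (
	"context"

	// Hypothetical root package holding the Brewer interface and Tea type.
	"github.com/grafana/grafana/pkg/services/teapot"
)

// FakeBrewer is a fake: a lightweight in-memory implementation that other
// services can use in their tests without complicated setup.
type FakeBrewer struct {
	ExpectedTea *teapot.Tea
	ExpectedErr error
	Brewed      []string
}

func (f *FakeBrewer) Brew(ctx context.Context, kind string) (*teapot.Tea, error) {
	// Record the call so tests can assert on it, then return the canned values.
	f.Brewed = append(f.Brewed, kind)
	return f.ExpectedTea, f.ExpectedErr
}
```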
### Separate store and logic
When building a new service, collect data validation, manipulation, scheduled
events, and so forth, in a service implementation. This implementation should
be built so that it is agnostic about its store.
The storage should be an interface that is not directly called from
outside the service and should be kept to a minimum complexity to
provide the functionality necessary for the service.
A litmus test for keeping the storage interface simple is whether an
in-memory implementation would be a feasible test double to build for
testing the service.
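For instance, a sketch of a store-agnostic implementation in a `teaimpl` sub-package (the package path, types, and methods continue the hypothetical teapot example):

```go
// Package teaimpl contains the implementation of the hypothetical teapot service.
package teaimpl

import (
	"context"
	"errors"

	// Hypothetical root package holding the Brewer interface and Tea type.
	"github.com/grafana/grafana/pkg/services/teapot"
)

// store is the minimal storage interface the service needs. It's unexported,
// so it isn't called from outside the service, and it's small enough that an
// in-memory test double is easy to build.
type store interface {
	Insert(ctx context.Context, t *teapot.Tea) error
}

// Service holds validation and business logic and is agnostic about which
// store implementation (SQL, in-memory, and so on) it's given.
type Service struct {
	store store
}

func (s *Service) Brew(ctx context.Context, kind string) (*teapot.Tea, error) {
	// Validation lives in the service, not in the store.
	if kind == "" {
		return nil, errors.New("kind is required")
	}
	t := &teapot.Tea{Kind: kind}
	if err := s.store.Insert(ctx, t); err != nil {
		return nil, err
	}
	return t, nil
}
```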
### Outside the service root
Some parts of the service definition remain outside the
service directory and reflect the legacy package hierarchy.
As of June 2022, the parts that remain outside the service are migrations and API endpoints.
#### Migrations
The `pkg/services/sqlstore/migrations` package contains all migrations for SQL
databases for all Grafana services except for Grafana Enterprise.
Migrations are written per the [database.md](database.md#migrations) document.
#### API endpoints
The `pkg/api/api.go` file contains the endpoint definitions for most of the Grafana HTTP API (not including Grafana Enterprise).
A Grafana _service_ encapsulates and exposes application logic to the rest of the application through a set of related operations.
Grafana uses [Wire](https://github.com/google/wire), which is a code generation tool that automates connecting components using [dependency injection](https://en.wikipedia.org/wiki/Dependency_injection). Wire represents dependencies between components as function parameters, which encourages explicit initialization instead of global variables.
Even though the services in Grafana do different things, they share a number of patterns. To better understand how a service works, let's build one from scratch!
Before a service can start communicating with the rest of Grafana, it needs to be registered with Wire. Refer to the `ProvideService` factory method in the following service example and note how it's being referenced in the `wire.go` example.
When you run Wire, it inspects the parameters of `ProvideService` and makes sure that all its dependencies have been wired up and initialized properly.
**Service example:**
```go
package myservice

import (
	"context"

	"github.com/grafana/grafana/pkg/setting"
)

// Service is a simplified sketch of a Grafana service; the configuration
// and dependencies shown here are illustrative.
type Service struct {
	cfg *setting.Cfg
}

// ProvideService is the factory function that Wire uses to initialize the
// service and inject its dependencies.
func ProvideService(cfg *setting.Cfg) *Service {
	return &Service{cfg: cfg}
}

// IsDisabled returns true if the service is disabled.
//
// Satisfies the registry.CanBeDisabled interface that guarantees
// that Run() isn't called if the service is disabled.
func (s *Service) IsDisabled() bool {
	return !s.cfg.IsServiceEnabled()
}

// Run runs the service in the background.
//
// Satisfies the registry.BackgroundService interface which
// guarantees that the service can be registered as a background service.
func (s *Service) Run(ctx context.Context) error {
	// Placeholder body: a real service typically runs a loop or serves
	// requests here until the context is canceled.
	<-ctx.Done()
	return ctx.Err()
}
```
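As a rough sketch of the `wire.go` side (the provider set name and package path are illustrative, not the actual Grafana wiring), registering the factory looks something like this:

```go
package server

import (
	"github.com/google/wire"

	// Hypothetical package path for the example service above.
	"github.com/grafana/grafana/pkg/services/myservice"
)

// Adding ProvideService to a Wire provider set makes the service available
// for injection into other services; Wire inspects the factory's parameters
// to work out which dependencies it needs.
var myServiceSet = wire.NewSet(
	myservice.ProvideService,
)
```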
A background service runs in the background for the whole lifecycle between Grafana startup and shutdown. To run your service in the background, it must satisfy the `registry.BackgroundService` interface. Pass it through to the `NewBackgroundServiceRegistry` call in the [`ProvideBackgroundServiceRegistry`](/pkg/registry/backgroundsvcs/background_services.go) function to register it.

For an example of the `Run` method, see the previous example.
## Disabled services
If you want to guarantee that a background service is not run by Grafana when certain criteria are met, or if a service is disabled, your service must satisfy the `registry.CanBeDisabled` interface. When the `service.IsDisabled` method returns `true`, Grafana won't call the `service.Run` method.
If you want to run certain initialization code whether or not the service is disabled, you need to handle this in the service factory method.
For an example of the `IsDisabled` method and custom initialization code when the service is disabled, see the previous implementation code.
## Run Wire (generate code)
Running `make run` calls `make gen-go` on the first run. The `gen-go` target, in turn, calls the Wire binary and generates the code in [`wire_gen.go`](/pkg/server/wire_gen.go) and [`wire_gen.go`](/pkg/cmd/grafana-cli/runner/wire_gen.go). The Wire binary is installed using [`bingo`](https://github.com/bwplotka/bingo), which downloads and installs all the tools needed, including the Wire binary at the specified version.
## OSS vs. Enterprise
Grafana OSS and Grafana Enterprise share code and dependencies. Grafana Enterprise overrides or extends certain OSS services.
There's a [`wireexts_oss.go`](/pkg/server/wireexts_oss.go) that has the `wireinject` and `oss` build tags as requirements. Here you can register services that might have other implementations, for example, Grafana Enterprise.
Similarly, there's a `wireexts_enterprise.go` file in the Enterprise source code repository where you can override or register other service implementations.
To extend an OSS background service, create a specific background interface for that type and inject that type to [`ProvideBackgroundServiceRegistry`](/pkg/registry/backgroundsvcs/background_services.go) instead of the concrete type. Next, add a Wire binding for that interface in [`wireexts_oss.go`](/pkg/server/wireexts_oss.go) and in the enterprise `wireexts` file.
## Methods
Any public method of a service should take `context.Context` as its first argument. If the method calls the bus, other services, or the database, propagate the context if possible.
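A minimal sketch (the method, store interface, and names are hypothetical):

```go
package teapot

import "context"

type teaStore interface {
	Delete(ctx context.Context, id int64) error
}

type Service struct {
	store teaStore
}

// DeleteTea takes context.Context as its first argument and propagates it
// to everything it calls: the bus, other services, and the database.
func (s *Service) DeleteTea(ctx context.Context, id int64) error {
	return s.store.Delete(ctx, id)
}
```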