Troubleshooting: instructions for loki + istio (#6205)


**What this PR does / why we need it**:
Add troubleshooting instructions for running Loki with Istio. It took a
long time to figure out how to make this work, so I hope it will help others.

**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [x] Documentation added
- [ ] Tests updated
- [ ] Is this an important fix or new feature? Add an entry in the
`CHANGELOG.md`.
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`

Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com>
pull/7003/head
Ludovic Cleroux 3 years ago committed by GitHub
parent 9c47b3084e
commit e2952f9ce1
  1. 25
      docs/sources/operations/troubleshooting.md

@@ -155,3 +155,28 @@ If you deploy with Helm, use the following command:
```bash
$ helm upgrade --install loki loki/loki --set "loki.tracing.jaegerAgentHost=YOUR_JAEGER_AGENT_HOST"
```

## Running Loki with Istio Sidecars

An Istio sidecar runs alongside the application container in a pod. It intercepts all traffic to and from the pod.
When a pod tries to communicate with another pod using a given protocol, Istio inspects the destination's service using [Protocol Selection](https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/).
This mechanism uses a convention on the port name (for example, `http-my-port` or `grpc-my-port`)
to determine how to handle the outgoing traffic. Istio can then perform operations such as authorization and smart routing.

This works fine when one pod communicates with another pod using a hostname.
However, Istio does not allow pods to communicate with other pods using IP addresses
unless the traffic type is `tcp`.

Loki internally uses DNS to resolve the IP addresses of its components,
and then sends requests directly to the IP addresses of those pods. The
Loki services define `grpc` (:9095) and `grpclb` (:9096) ports, so Istio treats
this as `grpc` traffic and does not allow Loki components to reach each other using
an IP address. As a result, the traffic fails and the ring remains unhealthy.

The solution to this issue is to add `appProtocol: tcp` to all of the `grpc`
(:9095) and `grpclb` (:9096) service ports of Loki components. This
overrides the Istio protocol selection and forces Istio to treat this traffic as raw `tcp`,
which allows pods to communicate using raw IP addresses.
This disables part of the Istio traffic interception mechanism,
but mTLS remains available. Pods can then communicate with each other
using IP addresses over gRPC.
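
As a rough illustration of the fix described above, a Service for a Loki component might look like the following sketch. The Service name and the selector label are hypothetical placeholders, not taken from any specific Loki chart; only the `grpc` (:9095) and `grpclb` (:9096) ports and the `appProtocol: tcp` setting come from the text above. Adapt the names to whatever your deployment actually uses.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: loki-ingester                       # hypothetical name; use your component's Service name
spec:
  selector:
    app.kubernetes.io/component: ingester   # hypothetical label; match your pod labels
  ports:
    - name: grpc
      port: 9095
      targetPort: 9095
      appProtocol: tcp                      # overrides Istio's name-based protocol selection
    - name: grpclb
      port: 9096
      targetPort: 9096
      appProtocol: tcp                      # same override for the grpclb port
```

The same `appProtocol: tcp` field would need to be set on the equivalent ports of every Loki component Service, since any Service that still advertises `grpc` is handled by Istio as gRPC traffic.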
