fix(loki-canary): Send to Loki after updating `totalEntries`. (#7211)

**What this PR does / why we need it**:
Fixes: #7142 (take a look for more details about the problem there)

After trying a few things to avoid this race and discussing it with Ed,
this small change seems to be more accurate and seems to fix the issue.

This unblocks rolling out the new canary in our internal cells.

Update: For some reason I thought the fix didn't resolve the original
issue when I tested it on an internal Loki dev cell almost a month ago.
But after testing it for more than 2 hours on the same cell, I realized
it is working fine (I will leave it running for a day just to confirm).

Verified the following:
```
{.... container="loki-canary"} |= "websocket missing"
```
and noticed it no longer happens after the fix.

Also verified the metrics
```
sum(increase(loki_canary_websocket_missing_entries_total{...}[$__range]))
```
It dropped to zero.


![Screenshot_2022-10-30_21-08-43](https://user-images.githubusercontent.com/3735252/198899443-e1f9ee08-3f6a-41f8-8817-0da64b423343.png)

![Screenshot_2022-10-30_21-08-29](https://user-images.githubusercontent.com/3735252/198899444-62f23a5e-b505-4b92-a717-cf8c5220e6ed.png)

Update 2: After running it for a whole day, it looks like two log
entries were missed on the websocket. My guess is that, since the
websocket is a long-lived connection, something can interrupt the
connection and lose the message.


![Screenshot_2022-10-31_10-32-20](https://user-images.githubusercontent.com/3735252/198976826-261af2bd-291e-40d9-aee4-01fdc0122ed3.png)

I also quickly checked how often this happens on other, bigger envs (say
ops), and it looks like it's not that uncommon.


![Screenshot_2022-10-31_10-30-56](https://user-images.githubusercontent.com/3735252/198976486-2458e31f-cb0b-4389-a098-0a828d96f2c4.png)

**Which issue(s) this PR fixes**:
Fixes https://github.com/grafana/loki/issues/7142

**Special notes for your reviewer**:
Tested it on internal loki dev cells

**Checklist**
- [x] Reviewed the `CONTRIBUTING.md` guide
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/upgrading/_index.md`
pull/7428/head
Kaviraj Kanagaraj 3 years ago committed by GitHub
parent 2881c52da4
commit ceb09efdf8
pkg/canary/writer/writer.go (3 changed lines)

@@ -94,12 +94,11 @@ func (w *Writer) run() {
 				w.pad = str.String()
 				w.prevTsLen = tsLen
 			}
+			w.sent <- t
 			_, err := fmt.Fprintf(w.w, LogEntry, ts, w.pad)
 			if err != nil {
 				level.Error(w.logger).Log("msg", "failed to write log entry", "error", err)
 			}
-			w.sent <- t
 		case <-w.quit:
 			return
 		}
