mirror of https://github.com/coturn/coturn
4 Commits (master)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
8c7d8fcb86
|
Enable --udp-recvmmsg by default on Linux (#1930)
## Summary

Flips the Linux default for `--udp-recvmmsg` from **off** to **on**. Operators opt out with `--udp-recvmmsg=false` (or `=0`).

> **Stacked on #1929.** This depends on the recvmmsg-scoping change in #1929 and is based on that branch, so the diff shows only the default-on change. GitHub will auto-retarget the base to `master` once #1929 merges. Merge #1929 first.

## Why this is now safe

The original objection to default-on (recorded in `docs/PerformanceIterationLog.md`) was the **per-session-relay-socket prealloc tax**: `--udp-recvmmsg` applied the 16-buffer batch path to every connected relay socket, which only ever carries one flow, so the churn ate the listener-side win.

#1929 scoped recvmmsg to **shared fan-in sockets only** (`udp_recvmmsg_eligible`: the client listener, plus the per-thread shared relay socket under `--multiplex-peer`). Per-session relay sockets now stay on the single-recv path regardless of the flag, so that tax is gone.

The one socket touched by default, the client listener, is a genuine fan-in point:

- batches whenever client concurrency is non-trivial (measured `avg_batch ≈ 16` under load), and
- costs little when idle (few packets ⇒ few prealloc cycles).

## What changed

- `mainrelay.c`: `turn_params.udp_recvmmsg` default `false → true` (Linux only).
- Removed the now-dead `--multiplex-peer` auto-enable block and the `udp_recvmmsg_set_explicitly` tracking it relied on; multiplex-peer gets its recvmmsg window from the default. The opt-out flows through the normal `get_bool_value` path.
- Help text, `man/man1/turnserver.1`, `examples/etc/turnserver.conf`, `CLAUDE.md`, and `docs/PerformanceIterationLog.md` updated for the new default + opt-out.

Per-session relay sockets and DTLS session sockets are unchanged.

## Validation

- **Format:** clang-format 15.0.7 clean.
- **macOS:** build + ctest 6/6 + `run_tests.sh` pass.
- **Linux (Docker, clean build):** ctest 5/5; `run_tests.sh`, `run_tests_conf.sh`, `run_tests_multiplex_peer.sh` all pass (no FAIL).
- **Runtime proof (loopback, `--udp-recvmmsg-log`):**
  - Default, no flag: recvmmsg active, `calls=13714 packets=219306 avg_batch=15.99`.
  - `--udp-recvmmsg=false`: zero recvmmsg activity; opt-out confirmed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
3 weeks ago |
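The scoping rule described in this commit (batch only on shared fan-in sockets) can be pictured with a small C sketch. This is not coturn's `udp_recvmmsg_eligible`; the enum, struct, and function names below are hypothetical and only restate the rule as the commit message words it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical socket roles, named for illustration only. */
enum sock_role {
  ROLE_CLIENT_LISTENER, /* shared fan-in: many clients, one socket */
  ROLE_SHARED_RELAY,    /* per-thread shared relay socket (--multiplex-peer) */
  ROLE_SESSION_RELAY,   /* connected per-session relay socket: one flow */
  ROLE_DTLS_SESSION     /* DTLS session socket: uses the SSL read path */
};

/* Hypothetical subset of the server configuration. */
struct relay_cfg {
  bool udp_recvmmsg;   /* default on for Linux after this change */
  bool multiplex_peer; /* --multiplex-peer */
};

/* Returns true when a socket of this role would use the batched
 * receive path, following the rule stated in the commit message. */
bool recvmmsg_eligible(const struct relay_cfg *cfg, enum sock_role role) {
  assert(cfg != NULL);
  if (!cfg->udp_recvmmsg) {
    return false; /* operator opted out with --udp-recvmmsg=false */
  }
  switch (role) {
  case ROLE_CLIENT_LISTENER:
    return true;
  case ROLE_SHARED_RELAY:
    return cfg->multiplex_peer;
  default:
    return false; /* per-session and DTLS sockets stay on single-recv */
  }
}
```

With the flag on and `--multiplex-peer` off, only the client listener qualifies; a per-session relay socket never does, whatever the flag says.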
|
|
5959ecfb13
|
Add UDP-GSO send path (--udp-gso) (#1907)
## Summary

- New `--udp-gso` flag (Linux, requires `--udp-sendmmsg`) collapses same-destination, same-size sendmmsg batches into a single `sendmsg` with a `UDP_SEGMENT` cmsg, so the kernel allocates one super-skb that traverses the network stack once and is segmented at egress instead of running `udp_sendmsg → ip_finish_output → __dev_queue_xmit` per datagram.
- Also wraps the relay-side `recvmmsg` callback loop in `udp_sendmmsg_batch_begin/end` so peer→client sends triggered inside a recv batch can also coalesce. Without that wrapping, the relay path issues one `sendto` per delivered datagram.
- Sticky-disable on `EINVAL/ENOPROTOOPT` for older kernels/NICs that lack UDP-GSO; one warning logged, then transparent fallback to the existing `sendmmsg` and `udp_send` paths.

## Why

The `--udp-recvmmsg` and `--udp-sendmmsg` follow-ups confirmed (see [docs/PerformanceIterationLog.md](docs/PerformanceIterationLog.md)) that on the relay flood workload the dominant cost is the per-datagram kernel TX path. mmsg-style batching reduces only the syscall entry/exit, not the per-skb stack traversal; UDP-GSO collapses both.

## Result

DigitalOcean nyc1 c-4, 30 s alternating A/B, `-Y packet -m 1`, eth1 TX as the authoritative server forwarding metric:

| Variant | eth1 RX | eth1 TX | sys CPU | idle CPU |
|---|---:|---:|---:|---:|
| baseline (no flags) | 322,091 | 127,445 | 22.9 % | 67.5 % |
| `--udp-recvmmsg --udp-sendmmsg --udp-gso` | 266,068 | **257,996** | 15.0 % | 78.7 % |
| baseline (no flags) | 309,475 | 125,573 | 20.9 % | 70.7 % |
| `--udp-recvmmsg --udp-sendmmsg --udp-gso` | 275,992 | **225,366** | 14.9 % | 74.3 % |

Mean server forwarding rate: **126.5 k → 241.7 k pps (+91 %, 1.91×)**, mean system CPU **21.9 % → 14.9 %**, about **2.8× CPU efficiency** (TX pps per system-CPU-%). Full perf-children comparison and methodology in the new section of [docs/PerformanceIterationLog.md](docs/PerformanceIterationLog.md).

## Notes for reviewers

- `--udp-gso` is opt-in and requires `--udp-sendmmsg` (the help text states the dependency). Without `--udp-sendmmsg` the batch state never accumulates and GSO has nothing to flush.
- GSO eligibility resets on every `_begin/_end`. Mixed-destination, mixed-size, or oversize batches transparently fall back through `sendmmsg` / `udp_send`.
- Rebased onto current `master`; the recvmmsg dependency is already merged via #1906.

## Test plan

- [x] `cmake --build build --target turnserver` (RelWithDebInfo + ASan local builds clean)
- [x] `ctest --test-dir build --output-on-failure`: 3/3 unit tests pass
- [x] `examples/run_tests.sh`: TCP/TLS/UDP pass; DTLS pre-existing failure on macOS environment, unrelated to this change
- [x] DigitalOcean A/B perf validation captured above
- [ ] Reviewer to confirm CI green on Linux build/test/CodeQL

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
1 month ago |
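The eligibility rule in the reviewer notes (same destination, same size, not oversize, otherwise fall back) can be sketched in plain C. This is only an illustration of the decision, not the code from the PR: the struct, the function name, and the byte limit are assumptions, and the real path would go on to build a `sendmsg` with a `UDP_SEGMENT` cmsg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one queued datagram (IPv4 only, for brevity). */
struct pending_dgram {
  uint32_t dst_addr; /* destination address */
  uint16_t dst_port; /* destination port */
  size_t len;        /* payload length in bytes */
};

/* Assumed upper bound on the coalesced payload: the largest UDP payload
 * that fits in one IPv4 datagram. The limit used by coturn may differ. */
#define GSO_MAX_TOTAL_BYTES 65507u

/* Returns true when the whole batch could go out as one segmented send;
 * false means "fall back to sendmmsg or per-datagram sends". */
bool gso_batch_eligible(const struct pending_dgram *batch, size_t n) {
  assert(batch != NULL || n == 0);
  if (n < 2) {
    return false; /* nothing to coalesce */
  }
  size_t total = 0;
  for (size_t i = 0; i < n; i++) {
    if (batch[i].dst_addr != batch[0].dst_addr ||
        batch[i].dst_port != batch[0].dst_port) {
      return false; /* mixed destination */
    }
    if (batch[i].len == 0 || batch[i].len != batch[0].len) {
      return false; /* mixed (or empty) segment size */
    }
    total += batch[i].len;
    if (total > GSO_MAX_TOTAL_BYTES) {
      return false; /* oversize batch */
    }
  }
  return true;
}
```

Three 1200-byte datagrams to one peer pass the check; changing one destination or one length, or queuing 64 of them (76,800 bytes), makes the batch fall back.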
|
|
a5005c4193
|
Relay recvmmsg (#1906)
## Summary

Extends the existing Linux-only `--udp-recvmmsg` flag from the UDP listener socket to also cover **connected per-session UDP relay sockets**, so steady-state client→relay and peer→relay traffic on plain UDP is read in batches of up to 16 datagrams per `recvmmsg(2)` instead of one `recvmsg` per packet. DTLS sessions still go through the SSL read path and are unchanged.

The flag stays **opt-in**: receive-side batching works correctly, but on the current `m=1` / `m=100` benchmarks throughput is flat to slightly negative: the bottleneck has moved past receive (see results below).

## What's in the change

- **Shared receive helpers** (`src/apps/relay/ns_ioalib_engine_impl.c`, `src/apps/relay/ns_ioalib_impl.h`):
  - `ioa_parse_udp_recvmsg_cmsg()`: single TTL/TOS/`IP_RECVERR` cmsg parser used by both `udp_recvfrom()` and the new batch path. Replaces the duplicated parser previously inlined in `dtls_listener.c` and `udp_recvfrom()`.
  - `ioa_init_recvmmsg_hdr()`: single initializer for `mmsghdr`/`iovec`/cmsg/source-address fields, also used by the listener.
  - New `IOA_UDP_RECVMMSG_MAX_BATCH = 16` constant; both listener and relay paths now share it.
- **Connected relay batch read** (`socket_udp_read_batch_recvmmsg` in `ns_ioalib_engine_impl.c`): called from `socket_input_worker` for non-SSL UDP sockets when `--udp-recvmmsg` is on. Allocates per-message `stun_buffer_list_elem`s, calls `recvmmsg(MSG_DONTWAIT)`, dispatches each datagram through the existing `read_cb` path, and falls back cleanly on `ENOSYS`/`EINVAL`/`EOPNOTSUPP` (auto-disables the flag) and on `EAGAIN`/short-batch (releases unused buffers).
- **Per-engine scratch state**: the `mmsghdr[16]` / `iovec[16]` / cmsg / src-addr arrays live on `ioa_engine`, not on every socket, which keeps memory flat at thousands of allocations.
- **TTL/TOS-sized cmsg buffers** in the listener: the listener previously over-allocated `64 KiB` per slot; it now uses the same TTL+TOS sizing as the relay path.
- **Opt-in occupancy stats** behind a new `--udp-recvmmsg-log` flag: every 10 s the relay logs `udp-recvmmsg stats: calls=… packets=… avg_batch=… wouldblock=… unavailable=… no_buffer=… hist_1=… hist_2=… hist_3_4=… hist_5_8=… hist_9_16=…`. Counters are always tracked (cheap); the periodic log is gated by the new flag so default operation is silent.
- **CLI plumbing**: `--udp-recvmmsg-log` long option in `mainrelay.c`/`mainrelay.h`, `cli_print_flag` entry in `turn_admin_server.c`, doc updates in `README.turnserver`.
- **Docs**: `docs/PerformanceIterationLog.md` records the iteration steps, validation, and two rounds of DigitalOcean A/B numbers. `CLAUDE.md` load-test instructions updated to mention the new flag and the `tot_recv_msgs` / `tot_recv_bytes` workaround.
|
1 month ago |
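The occupancy counters named in the stats line can be illustrated with a short C sketch. The field names mirror the log keys quoted above, but the struct, the functions, and the bucket boundaries (inferred from names such as `hist_3_4`) are assumptions rather than coturn's implementation; the `wouldblock`, `unavailable`, and `no_buffer` counters are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the batch limit of 16 described in the commit. */
#define RECVMMSG_MAX_BATCH 16

/* Hypothetical counter block; field names follow the logged keys. */
struct recvmmsg_stats {
  uint64_t calls;     /* successful recvmmsg calls */
  uint64_t packets;   /* datagrams returned across all calls */
  uint64_t hist_1;    /* calls that returned exactly 1 datagram */
  uint64_t hist_2;    /* exactly 2 */
  uint64_t hist_3_4;  /* 3 to 4 */
  uint64_t hist_5_8;  /* 5 to 8 */
  uint64_t hist_9_16; /* 9 to 16 */
};

/* Record one successful call that returned `batch` datagrams. */
void recvmmsg_stats_record(struct recvmmsg_stats *s, unsigned batch) {
  assert(batch >= 1 && batch <= RECVMMSG_MAX_BATCH);
  s->calls++;
  s->packets += batch;
  if (batch == 1) {
    s->hist_1++;
  } else if (batch == 2) {
    s->hist_2++;
  } else if (batch <= 4) {
    s->hist_3_4++;
  } else if (batch <= 8) {
    s->hist_5_8++;
  } else {
    s->hist_9_16++;
  }
}

/* Average datagrams per call; 0.0 when nothing has been recorded. */
double recvmmsg_stats_avg_batch(const struct recvmmsg_stats *s) {
  return s->calls ? (double)s->packets / (double)s->calls : 0.0;
}
```

Under this definition, the figures quoted for #1930 (`calls=13714 packets=219306`) work out to 219306 / 13714 ≈ 15.99, matching the logged `avg_batch`.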
|
|
69bc0e7351
|
Load generator mode in turnutils_uclient (#1894)
## Summary

Adds load-generator modes to `turnutils_uclient` for repeatable TURN server performance testing:

- Adds `-Y packet|alloc|invalid` load modes.
- Supports packet flood, allocation flood, and invalid-packet flood workflows.
- Adds unique local client ports for allocation flood mode.
- Removes default packet pacing in load-generator modes unless explicitly set.
- Adds helper scripts under `examples/loadtest/`.
- Documents load-test usage in `README.turnutils`, `man/man1/turnutils.1`, `CLAUDE.md`, and `docs/PerformanceIterationLog.md`.

The performance log captures DigitalOcean benchmark methodology, A/B lessons, hot-path findings, and future optimization candidates.
|
2 months ago |
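As a rough illustration of how a `-Y packet|alloc|invalid` argument and the pacing rule could be handled, here is a minimal C sketch. Everything in it is hypothetical: the enum, both function names, and the convention that a negative value means "pacing not set on the command line" are assumptions, not taken from `turnutils_uclient`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical load modes matching the three -Y values in the summary. */
enum load_mode {
  LOAD_MODE_NONE,    /* -Y not given: normal client behaviour */
  LOAD_MODE_PACKET,  /* packet flood */
  LOAD_MODE_ALLOC,   /* allocation flood */
  LOAD_MODE_INVALID, /* invalid-packet flood */
  LOAD_MODE_UNKNOWN  /* unrecognised argument */
};

/* Map the -Y argument to a mode; NULL means the option was absent. */
enum load_mode parse_load_mode(const char *arg) {
  if (arg == NULL) {
    return LOAD_MODE_NONE;
  }
  if (strcmp(arg, "packet") == 0) {
    return LOAD_MODE_PACKET;
  }
  if (strcmp(arg, "alloc") == 0) {
    return LOAD_MODE_ALLOC;
  }
  if (strcmp(arg, "invalid") == 0) {
    return LOAD_MODE_INVALID;
  }
  return LOAD_MODE_UNKNOWN;
}

/* Pacing rule from the summary: an explicit value always wins; otherwise
 * load-generator modes run unpaced and other modes keep the default.
 * explicit_ms < 0 is this sketch's marker for "not set". */
int effective_pacing_ms(enum load_mode mode, int explicit_ms, int default_ms) {
  assert(default_ms >= 0);
  if (explicit_ms >= 0) {
    return explicit_ms;
  }
  if (mode == LOAD_MODE_NONE || mode == LOAD_MODE_UNKNOWN) {
    return default_ms;
  }
  return 0; /* flood modes: no default pacing */
}
```

Passing the default as a parameter keeps the sketch independent of whatever pacing value the real client uses.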