Introduce AddPerCpu and StorePerCpu over an rseq-sharded per-CPU array.
On the x86-64 Linux fast path the update is committed by a hand-rolled
rseq critical section (non-atomic, migration-safe): addq for the 8-byte
accumulate, movq / movdqu for the 8- or 16-byte store. The kernel
restarts the sequence on preemption or migration, and only one thread
runs on a CPU at a time, so no atomic or lock is needed. Off the fast
path (other arches, no kernel rseq) the operation falls back to an
atomic on the slot indexed by sched_getcpu().
A naturally aligned 8-byte store is single-copy atomic on x86-64, so a
reader never observes it torn. The 16-byte movdqu store has no such
guarantee and may be observed torn, which is acceptable for a
last-writer-wins gauge.
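As a rough illustration of the fallback path only (not the rseq fast path, and not the code from this commit), the sketch below shards a counter into cache-line-padded atomic slots and picks the slot with sched_getcpu(). The class name `TShardedCounterSketch`, its methods, and the 64-byte padding are assumptions made for the example.

```cpp
#include <sched.h>   // sched_getcpu (Linux: glibc or musl)

#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the non-rseq fallback: one atomic slot per shard,
// indexed by the CPU the caller happens to be running on.
class TShardedCounterSketch {
public:
    explicit TShardedCounterSketch(std::size_t shardCount)
        : Shards_(shardCount == 0 ? 1 : shardCount)
    { }

    // The thread may migrate between sched_getcpu() and fetch_add(), so the
    // slot can belong to another CPU by then; the atomic keeps the update
    // correct regardless, it is only the locality that is lost.
    void Add(std::uint64_t delta) {
        int cpu = sched_getcpu();
        std::size_t index = cpu < 0 ? 0 : static_cast<std::size_t>(cpu) % Shards_.size();
        Shards_[index].Value.fetch_add(delta, std::memory_order_relaxed);
    }

    // Sums all shards; the result is a snapshot, not a linearizable total.
    std::uint64_t Sum() const {
        std::uint64_t total = 0;
        for (const auto& shard : Shards_) {
            total += shard.Value.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    // Padding to an assumed 64-byte cache line to avoid false sharing.
    struct alignas(64) TShard {
        std::atomic<std::uint64_t> Value{0};
    };

    std::vector<TShard> Shards_;
};
```

A caller would construct the counter with roughly one shard per CPU, call `Add` from any thread, and read `Sum` when a total is needed. The rseq fast path described above replaces the `fetch_add` with a plain addq inside a restartable sequence, which this sketch does not attempt.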
commit_hash:6250f6e9e35cf3895ebafe0b534ec12cca50b03b
|
|
|
Self-contained current-CPU-id reader backed by Linux **rseq** (restartable
sequences), with **no third-party dependency** (no librseq):
* The rseq ABI is hand-defined; the calling thread is registered lazily via the
rseq syscall.
* Fast path is a single inlined, **branch-free** thread-local read: the offset
always points at a readable `cpu_id`, either the glibc-owned area when glibc
registers rseq (>= 2.35, via the weak `__rseq_offset`/`__rseq_size`) or our own
area otherwise. An unregistered thread therefore reads `-1` and routes to the
slow path.
* Falls back to `sched_getcpu()` on Linux and to a constant `0` on Darwin and
Windows. Works on glibc **and musl** alike (librseq does not build on musl).
Fiber-TLS contract: the inlined read must be reached only via a non-inlinable,
fiber-switch-free frame (a virtual call or `YT_PREVENT_TLS_CACHING`).
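The read described above can be approximated as follows. This is a simplified sketch under stated assumptions, not the code from this change: it only uses the glibc-owned rseq area (found through the weak `__rseq_offset`/`__rseq_size` symbols), does not register its own area through the rseq syscall, and is not branch-free. The function names are hypothetical, and a plain `noinline` attribute stands in for the non-inlinable frame that the fiber-TLS contract asks for.

```cpp
#include <sched.h>   // sched_getcpu

#include <cstddef>
#include <cstdint>

// Exported by glibc >= 2.35. Declared weak so the program still links where
// they are absent (musl, older glibc); their addresses are then null.
extern "C" {
extern const std::ptrdiff_t __rseq_offset __attribute__((weak));
extern const unsigned int __rseq_size __attribute__((weak));
}

// Reads the thread pointer on the two architectures this sketch handles;
// returns nullptr elsewhere so the caller takes the slow path.
inline const char* ThreadPointerSketch() {
#if defined(__x86_64__)
    const char* tp;
    asm volatile("movq %%fs:0, %0" : "=r"(tp));
    return tp;
#elif defined(__aarch64__)
    const char* tp;
    asm volatile("mrs %0, tpidr_el0" : "=r"(tp));
    return tp;
#else
    return nullptr;
#endif
}

// noinline keeps the thread-local address from being cached in a caller
// that might later resume on another thread.
__attribute__((noinline)) int GetCurrentCpuIdSketch() {
    const char* tp = ThreadPointerSketch();
    if (tp != nullptr && &__rseq_offset != nullptr &&
        &__rseq_size != nullptr && __rseq_size != 0)
    {
        // struct rseq starts with u32 cpu_id_start (offset 0) followed by
        // u32 cpu_id (offset 4); negative values mean "not registered".
        auto cpuId = *reinterpret_cast<const volatile std::int32_t*>(
            tp + __rseq_offset + 4);
        if (cpuId >= 0) {
            return cpuId;
        }
    }
    return sched_getcpu();  // slow path
}
```

On a libc without these symbols the function always takes the `sched_getcpu()` path, which is where a self-registered rseq area, as described in the bullets above, would take over instead.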
#### Benchmark -- cost of one cpu-id read
| source | time / call |
|---|---|
| `GetCurrentCpuId()` (rseq) | **0.34 ns** |
| `sched_getcpu()` (vDSO) | 3.5 ns |
| `rdtscp` (what `TTscp::Get()` does) | 23 ns |

This is an alternative to the librseq-based review/13886037: it runs at the same
speed, but has no contrib dependency and also covers musl. The unit test pins the
thread to each allowed CPU in turn and asserts that the reported id matches.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
commit_hash:09d282c2f48755836b1cd68cedbffc3c6a662eed
|