ydb/library/cpp/yt/rseq/per_cpu.cpp, branch CLI_2.32.0

Mirror of YDB github repos
https://code.mastervirt.ru/ydb/atom?h=CLI_2.32.0
2026-06-19T12:12:00Z

Add lock-free per-CPU primitives to library/cpp/yt/rseq
2026-06-19T12:12:00Z
babenko <babenko@yandex-team.com>
2026-06-19T11:27:43Z
urn:sha1:89c0e29c8f9ba29ecdc736fefda87286482ac213

Introduce AddPerCpu and StorePerCpu over an rseq-sharded per-CPU array.

On the x86-64 Linux fast path the update is committed by a hand-rolled rseq critical section (non-atomic, migration-safe): addq for the 8-byte accumulate, movq / movdqu for the 8- or 16-byte store. The kernel restarts the sequence on preemption or migration, and only one thread runs on a CPU at a time, so no atomic or lock is needed.

Off the fast path (other arches, no kernel rseq) the operation falls back to an atomic on the slot indexed by sched_getcpu().

A naturally-aligned 8-byte store is single-copy atomic on x86-64, so it is never observed torn; the 16-byte store may be, which is acceptable for a last-writer-wins gauge.

commit_hash:6250f6e9e35cf3895ebafe0b534ec12cca50b03b
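
The fallback path described in the message (an atomic on the slot indexed by sched_getcpu()) can be sketched roughly as follows. This is a minimal illustration and not the code from per_cpu.cpp: the names TSlot, TPerCpuSlots, MaxCpuCount, AddPerCpuFallback, StorePerCpuFallback and SumPerCpu are hypothetical, as are the fixed shard count and the 64-byte padding. Only sched_getcpu() and std::atomic are real APIs here. The atomic is needed on this path because a thread can migrate between reading the CPU number and updating the slot, so two threads may touch the same slot concurrently.

```cpp
#include <sched.h>   // sched_getcpu (glibc, Linux-only)

#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical shard count; the real array is presumably sized differently.
constexpr std::size_t MaxCpuCount = 256;

// One slot per CPU, padded to an assumed 64-byte cache line
// so that neighbouring slots do not share a line.
struct alignas(64) TSlot
{
    std::atomic<std::int64_t> Value{0};
};

using TPerCpuSlots = std::array<TSlot, MaxCpuCount>;

// Picks the slot for the CPU the caller is running on right now.
// The answer may already be stale when it is used, hence the atomics below.
inline std::size_t CurrentSlotIndex()
{
    int cpu = sched_getcpu();
    // sched_getcpu returns -1 on failure; use slot 0 in that case.
    return cpu < 0 ? 0 : static_cast<std::size_t>(cpu) % MaxCpuCount;
}

// Fallback accumulate: relaxed atomic add on the current CPU's slot.
inline void AddPerCpuFallback(TPerCpuSlots& slots, std::int64_t delta)
{
    slots[CurrentSlotIndex()].Value.fetch_add(delta, std::memory_order_relaxed);
}

// Fallback store: relaxed atomic store, last writer wins.
inline void StorePerCpuFallback(TPerCpuSlots& slots, std::int64_t value)
{
    slots[CurrentSlotIndex()].Value.store(value, std::memory_order_relaxed);
}

// Reader side: sums all slots to obtain the accumulated total.
inline std::int64_t SumPerCpu(const TPerCpuSlots& slots)
{
    std::int64_t total = 0;
    for (const auto& slot : slots) {
        total += slot.Value.load(std::memory_order_relaxed);
    }
    return total;
}
```

A caller would invoke AddPerCpuFallback(slots, 1) on the hot path and SumPerCpu(slots) when reporting. The sum is only a snapshot: slots are read one by one while writers may still be running.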