<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ydb/library/cpp/yt/rseq/rseq.h, branch main</title>
<subtitle>Mirror of YDB github repos</subtitle>
<id>https://code.mastervirt.ru/ydb/atom?h=main</id>
<link rel='self' href='https://code.mastervirt.ru/ydb/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://code.mastervirt.ru/ydb/'/>
<updated>2026-06-22T21:50:43Z</updated>
<entry>
<title>YT-28458: Make per-CPU rseq fast path dlopen-safe</title>
<updated>2026-06-22T21:50:43Z</updated>
<author>
<name>babenko</name>
<email>babenko@yandex-team.com</email>
</author>
<published>2026-06-22T21:32:05Z</published>
<link rel='alternate' type='text/html' href='https://code.mastervirt.ru/ydb/commit/?id=6ce03d8a5070753a84defdb8c4c9b43d852ad65b'/>
<id>urn:sha1:6ce03d8a5070753a84defdb8c4c9b43d852ad65b</id>
<content type='text'>
Hardens `library/cpp/yt/rseq` for the case where it is linked into a dlopen'd,
position-independent module (e.g. a YQL UDF `.so`). Extracted from the profiling
work that enables the rseq fast path by default.

**TLS model.** The weak `__rseq_abi` uses the `global-dynamic` TLS model when `__PIC__` or
`__PIE__` is defined (`initial-exec` otherwise), mirroring `contrib/libs/tcmalloc`.
`initial-exec` needs a slot in the static TLS block reserved at startup, which
the loader cannot grant a module dlopen'd later — the module would fail to load
with "cannot allocate memory in static TLS block". This only changes the cold
`&amp;__rseq_abi` accesses; the hot path still reads `*(thread_pointer + CpuIdFieldOffset)`.
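
As a rough illustration of the TLS-model switch (not the actual `rseq.h` code), the sketch below picks the model with the preprocessor. `SketchRseqArea`, `SKETCH_TLS_MODEL` and the helper names are hypothetical, and the struct is a two-field stand-in for the real rseq ABI. It assumes GCC or Clang, which provide the `tls_model` attribute.

<![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical two-field stand-in for the rseq area; the real ABI struct is larger.
struct TRseqAreaSketch {
    std::uint32_t CpuIdStart;
    std::uint32_t CpuId;
};

// Position-independent builds may end up in a dlopen'd module, so they must not
// demand a static TLS slot; everything else keeps the cheaper initial-exec model.
#if defined(__PIC__) || defined(__PIE__)
    #define SKETCH_TLS_MODEL "global-dynamic"
#else
    #define SKETCH_TLS_MODEL "initial-exec"
#endif

// Weak definition, so another definition of the same symbol could take precedence.
// Both fields start as all-ones (-1 when read as a signed id), i.e. "not registered".
__thread TRseqAreaSketch SketchRseqArea
    __attribute__((weak, tls_model(SKETCH_TLS_MODEL))) = {0xffffffffu, 0xffffffffu};

// Cold-path access: taking the address is the operation whose cost the model affects.
TRseqAreaSketch* GetSketchRseqArea() {
    return &SketchRseqArea;
}

// Reports which model the preprocessor selected for this translation unit.
bool SketchUsesGlobalDynamic() {
    return std::strcmp(SKETCH_TLS_MODEL, "global-dynamic") == 0;
}
```
]]>

Calling `GetSketchRseqArea()` corresponds to the cold address computation; a hot path that caches an offset from the thread pointer would not go through it again.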

**Runtime safety probe `IsPerCpuFastPathSafe()`.** The cached thread-pointer
offset is valid only when `__rseq_abi` sits at a fixed offset from the thread
pointer — a glibc-owned area or the static TLS block (incl. tcmalloc), the common
case. When our `__rseq_abi` instead lands in a dlopen'd module's *dynamically
allocated* TLS, the offset is valid only on the thread that computed it; on other
threads the hot path's first store (`area-&gt;rseq_cs`) would corrupt unrelated
memory. The probe spawns one thread and checks — by pointer comparison, never
dereferencing the suspect offset — that the offset names that thread's rseq area;
if not, callers use the atomic fallback. The result is decided once and cached (one thread
spawn at first use).
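
The probe idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `SketchArea` and all function names are hypothetical, and `pthread_self()` stands in for the thread pointer on the assumption that it lies at a fixed distance from it (which holds on common Linux libcs).

<![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>
#include <thread>

// Hypothetical stand-in for the per-thread rseq area.
struct TRseqAreaSketch {
    std::uint32_t CpuIdStart;
    std::uint32_t CpuId;
};
thread_local TRseqAreaSketch SketchArea = {0, 0};

// Assumption: pthread_self() is at a fixed distance from the real thread pointer,
// so it can serve as the per-thread base address in this sketch.
std::uintptr_t ThreadBaseSketch() {
    return reinterpret_cast<std::uintptr_t>(pthread_self());
}

std::uintptr_t AreaAddressSketch() {
    return reinterpret_cast<std::uintptr_t>(&SketchArea);
}

// Computes the offset on the calling thread, then checks on a fresh thread that
// base + offset names that thread's own area. Only addresses are compared;
// the candidate address is never dereferenced.
bool ProbeOffsetIsThreadIndependent() {
    // Unsigned arithmetic wraps, so base + offset restores the address exactly.
    const std::uintptr_t offset = AreaAddressSketch() - ThreadBaseSketch();
    assert(ThreadBaseSketch() + offset == AreaAddressSketch());

    bool matches = false;
    std::thread probe([&] {
        matches = (ThreadBaseSketch() + offset == AreaAddressSketch());
    });
    probe.join();  // join() makes the write to `matches` visible here
    return matches;
}

// Decided once and cached: a function-local static is initialized exactly once,
// even if several threads race on the first call.
bool IsFastPathSafeSketch() {
    static const bool safe = ProbeOffsetIsThreadIndependent();
    return safe;
}
```
]]>

In a plain executable the area lives in static TLS, so the probe is expected to succeed; the failing case needs the variable to live in the dynamically allocated TLS of a dlopen'd module.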
commit_hash:633f58f500d9d097800da81f526c56283445ffc7
</content>
</entry>
<entry>
<title>Add library/cpp/yt/rseq: NYT::GetCurrentCpuId() via Linux rseq</title>
<updated>2026-06-13T22:35:19Z</updated>
<author>
<name>babenko</name>
<email>babenko@yandex-team.com</email>
</author>
<published>2026-06-13T22:15:56Z</published>
<link rel='alternate' type='text/html' href='https://code.mastervirt.ru/ydb/commit/?id=450b2fac082943e2029b3390c99501c365795d64'/>
<id>urn:sha1:450b2fac082943e2029b3390c99501c365795d64</id>
<content type='text'>
Self-contained current-CPU-id reader backed by Linux **rseq** (restartable
sequences), with **no third-party dependency** (no librseq):

* The rseq ABI is hand-defined; the calling thread is registered lazily via the
  rseq syscall.
* Fast path is a single inlined, **branch-free** thread-local read. The offset
  always points at a readable `cpu_id` -- the glibc-owned area when glibc registers
  rseq (&gt;= 2.35, via the weak `__rseq_offset`/`__rseq_size`), otherwise our own
  area -- so an unregistered thread reads `-1` and routes to the slow path.
* Falls back to `sched_getcpu()` (Linux) or `0` (darwin/windows). Works on glibc
  **and musl** alike (librseq does not build on musl).
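
The read-then-fallback shape described in the list above might look roughly like this. It is a simplified sketch with hypothetical names: `SketchCpuIdSlot` is an ordinary thread-local standing in for the rseq `cpu_id` field, which in the real mechanism the kernel keeps up to date once the thread is registered.

<![CDATA[
```cpp
#include <cassert>
#if defined(__linux__)
#include <sched.h>
#endif

// Hypothetical stand-in for the rseq cpu_id field; -1 means "thread not registered".
thread_local int SketchCpuIdSlot = -1;

// Slow path: ask the OS on Linux, report CPU 0 elsewhere.
int GetCpuIdSlowSketch() {
#if defined(__linux__)
    const int cpu = sched_getcpu();
    return cpu >= 0 ? cpu : 0;
#else
    return 0;
#endif
}

// Fast path: a single thread-local load. A negative value means the slot was
// never filled in, so the call is routed to the slow path instead.
inline int GetCurrentCpuIdSketch() {
    const int cpu = SketchCpuIdSlot;
    return cpu >= 0 ? cpu : GetCpuIdSlowSketch();
}
```
]]>

Because nothing updates the slot in this sketch, every call takes the slow path until the slot is set by hand; the sentinel is what lets the read itself stay unconditional.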

Fiber-TLS contract: the inlined read must be reached only via a non-inlinable,
fiber-switch-free frame (a virtual call or `YT_PREVENT_TLS_CACHING`).

#### Benchmark -- cost of one cpu-id read

| source | time / call |
|---|---|
| `GetCurrentCpuId()` (rseq) | **0.34 ns** |
| `sched_getcpu()` (vDSO) | 3.5 ns |
| `rdtscp` (what `TTscp::Get()` does) | 23 ns |

This is an alternative to the librseq-based review/13886037 -- same speed, but no
contrib dependency and it also covers musl. The unit test pins to each allowed CPU
and asserts the reported id matches.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
commit_hash:09d282c2f48755836b1cd68cedbffc3c6a662eed
</content>
</entry>
</feed>
