path: root/contrib/python/ipython/py3/IPython/utils/_sysinfo.py
author: babenko <[email protected]> 2026-06-24 13:14:54 +0300
committer: babenko <[email protected]> 2026-06-24 14:20:07 +0300
commit: a224436a8d395602cd14b8e5aa3cecb466ba126f
tree: ff0414f872bd9fc09da53ee4e2ea5610227ecd55 /contrib/python/ipython/py3/IPython/utils/_sysinfo.py
parent: f0f748358580ca75f9aedce7d9a6572b0c8f4c58
YT-28458: Add rseq-backed hot sensors
Rename `percpu.{cpp,h}` to `per_cpu.{cpp,h}` and update all includers.

Add an rseq-backed implementation of the hot per-CPU counter, time counter and gauge (`rseq_sensor_impl.{h,cpp}`): each update commits to the calling CPU's shard lock-free via an rseq critical section (`library/cpp/yt/rseq` `AddPerCpu`/`StorePerCpu`), with no atomic and no lock on the fast path. The shard array is sized to `NRseq::GetCpuCount()` and folded into the sensor allocation (`NewWithExtraSpace`); reads aggregate with `LoadPerCpu`. The gauge keeps last-writer-wins semantics. Linux-only.

The existing `TPerCpu{Counter,TimeCounter,Gauge}` stay (`per_cpu_sensor_impl.{h,cpp}`) as the atomic sharded fallback. The two are interchangeable and chosen per sensor at construction in `TSolomonRegistry`, so the hot Increment/Update path carries no dispatch:

* The rseq fast path is **off by default**; opt in via `singletons/solomon_registry/enable_rseq`. Even when on, a hot sensor uses it only in a process where the kernel rseq area sits at a fixed thread-pointer offset (tcmalloc/glibc-owned), per the rseq library's runtime safety probe (`NRseq::IsPerCpuFastPathSafe`). Everything else (notably a `dlopen`'d YQL UDF whose `__rseq_abi` lands in dynamically allocated TLS) uses the atomic sharded sensors.
* `TSolomonRegistry` is a reconfigurable singleton (`solomon_registry`) with an `enable_rseq` knob (default false), settable in static config and updatable via dynamic config.

Off Linux the rseq sensors do not exist and the registry uses the atomic sharded sensors for hot requests. The per-CPU summary is unchanged (TTscp + spinlock).

Unit tests cover the atomic sensors, the rseq sensors (Linux-only), and the simple sensors. The controller-agent memory-watchdog integration tests are made resilient to the (core-count-dependent) per-sensor footprint.
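For readers unfamiliar with the sharding idea, here is a minimal sketch of an atomic sharded counter in the spirit of the fallback described above. It is not the code from `per_cpu_sensor_impl.{h,cpp}`: the names `TShardedCounter` and `RunDemo` are hypothetical, and a hash of the thread id stands in for the current CPU index, which the real sensors presumably obtain differently. The rseq variant would replace the `fetch_add` with a plain store inside an rseq critical section; that part is not shown here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical illustration of an atomic sharded counter; not the YT class.
class TShardedCounter
{
public:
    explicit TShardedCounter(size_t shardCount)
        : ShardCount_(shardCount)
        , Shards_(std::make_unique<TShard[]>(shardCount))
    {
        assert(shardCount > 0);
    }

    // Hot path: one relaxed atomic add on the caller's shard.
    void Increment(int64_t delta)
    {
        // Assumption: the thread-id hash stands in for the current CPU index.
        size_t index = std::hash<std::thread::id>{}(std::this_thread::get_id()) % ShardCount_;
        Shards_[index].Value.fetch_add(delta, std::memory_order_relaxed);
    }

    // Cold path: a read aggregates over all shards.
    int64_t GetValue() const
    {
        int64_t total = 0;
        for (size_t i = 0; i < ShardCount_; ++i) {
            total += Shards_[i].Value.load(std::memory_order_relaxed);
        }
        return total;
    }

private:
    // One cache line per shard, so writers on different shards do not false-share.
    struct alignas(64) TShard
    {
        std::atomic<int64_t> Value{0};
    };

    const size_t ShardCount_;
    const std::unique_ptr<TShard[]> Shards_;
};

// Spawns threadCount writers, each adding 1 incrementsPerThread times,
// and returns the aggregated value after all writers have joined.
int64_t RunDemo(size_t threadCount, int64_t incrementsPerThread)
{
    TShardedCounter counter(std::max<size_t>(1, std::thread::hardware_concurrency()));
    std::vector<std::thread> threads;
    for (size_t i = 0; i < threadCount; ++i) {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int64_t j = 0; j < incrementsPerThread; ++j) {
                counter.Increment(1);
            }
        });
    }
    for (auto& thread : threads) {
        thread.join();
    }
    return counter.GetValue();
}
```

Calling `RunDemo(4, 10000)` sums four writers of 10000 increments each. The sketch sizes the shard array from `std::thread::hardware_concurrency()` purely for illustration; the commit sizes the rseq shard array from `NRseq::GetCpuCount()`.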
Benchmark (hot per-CPU path, 64-core host), atomic sharded vs rseq:

| Benchmark | atomic | rseq | speedup |
| --- | --- | --- | --- |
| BM_PerCpuCounter, threads:1 | 30.7 ns | 3.6 ns | ~8.5x |
| BM_PerCpuCounter, threads:16 | 30.9 ns | 4.4 ns | ~7x |
| BM_PerCpuGauge, threads:1 | 32.8 ns | 12.4 ns | ~2.6x |
| BM_PerCpuGauge, threads:16 | 32.7 ns | 12.5 ns | ~2.5x |

commit_hash:8c633d31f03b2cc862ed2217ae08342bf42adc52
Diffstat (limited to 'contrib/python/ipython/py3/IPython/utils/_sysinfo.py')
0 files changed, 0 insertions, 0 deletions