| author | babenko <[email protected]> | 2026-05-27 10:08:33 +0300 |
|---|---|---|
| committer | babenko <[email protected]> | 2026-05-27 10:59:32 +0300 |
| commit | 7e3ceef4b45703f7d682f6fc15b8a5fcd2e73ac1 (patch) | |
| tree | 7eb6db22a5c012bba7112c5748a10824bd19c147 /contrib/python/ipython/py2 | |
| parent | c9b2537b9cd2449550cc11e02abeedb68bca2f0e (diff) | |
Micro-optimize TCallback propagating storage handling
Adds Google Benchmark microbenchmarks for the `TCallback` / `TPropagatingStorage` hot paths under `yt/yt/core/actions/benchmarks` (a `TTraceContext` is constructed at startup so the realistic production switch handler is active in every benchmark), then optimizes the path so that `TCallback::Run()` pays less per invocation.
## Headline benchmark deltas
Release build, single thread, trace-context switch handler registered:
| Benchmark | Trunk | Opt | Δ |
|--------------------------------------|--------|--------|-------|
| `BM_Capture_NoStorage` | 34 ns | 31 ns | -7% |
| `BM_Capture_EmptyStorage` | 38 ns | 35 ns | -6% |
| `BM_Capture_NonEmptyStorage` | 38 ns | 36 ns | -5% |
| `BM_Capture_NoPropagate_WithStorage` | 24 ns | 26 ns | noise |
| `BM_Run_NullPS` | 30 ns | 16 ns | **-46%** |
| `BM_Run_EmptyPS` | 74 ns | 44 ns | **-41%** |
| `BM_Run_NonEmptyPS` | 76 ns | 44 ns | **-41%** |
| `BM_Run_NoPropagate` | 2 ns | 2 ns | unchanged |
| `BM_CaptureAndRun_NullPS` | 67 ns | 52 ns | **-23%** |
| `BM_CaptureAndRun_NonEmptyPS` | 112 ns | 79 ns | **-30%** |
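The `BIND_NO_PROPAGATE` path is unaffected: `BM_Run_NoPropagate` is unchanged, and the `BM_Capture_NoPropagate_WithStorage` difference is within noise.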
## Changes vs trunk
### Manager dispatch
- `TPropagatingStorageManager` loses its `LeakySingleton` wrapper and becomes a `constinit` static in `NDetail`. The per-call double-checked-locking pointer load on every `TCallback::Run` goes away; access is now a plain address.
- `TPropagatingStorageManager::SwitchPropagatingStorage` stays out-of-line in the cpp to avoid bloating every `TCallback::Run` instantiation.
- `TPropagatingStorageGuard`'s ctor and dtor move from the cpp into the inl, so the install/restore call site goes directly to `PropagatingStorageManager.SwitchPropagatingStorage(...)` without the trunk's `TPropagatingStorageManager::Get()->...` member-pointer indirection.
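The difference between the two access patterns can be illustrated with a minimal sketch. It is not the YT code: `TManagerSketch`, `GetLeakyManager`, and `ManagerSketch` are hypothetical stand-ins, and the lazy accessor only approximates what a leaky singleton typically does. The `SKETCH_CONSTINIT` macro is an assumption added so the snippet also compiles before C++20.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Use `constinit` where the compiler supports it (C++20); otherwise expand to nothing.
#if defined(__cpp_constinit)
#define SKETCH_CONSTINIT constinit
#else
#define SKETCH_CONSTINIT
#endif

// Hypothetical stand-in for the manager; not the real YT class.
class TManagerSketch
{
public:
    constexpr TManagerSketch() = default;

    // Installs `next` and returns the previously installed value.
    int Switch(int next)
    {
        int previous = Current_;
        Current_ = next;
        return previous;
    }

private:
    int Current_ = 0;
};

// "Before" (sketch): lazily created and never destroyed. Every access pays
// an atomic pointer load and a null check before reaching the object.
inline TManagerSketch* GetLeakyManager()
{
    static std::atomic<TManagerSketch*> instance{nullptr};
    static std::mutex lock;

    auto* result = instance.load(std::memory_order_acquire);
    if (!result) {
        std::lock_guard<std::mutex> guard(lock);
        result = instance.load(std::memory_order_relaxed);
        if (!result) {
            result = new TManagerSketch();
            instance.store(result, std::memory_order_release);
        }
    }
    return result;
}

// "After" (sketch): a constant-initialized static. No lazy-initialization
// check is needed, so callers reference the object's address directly.
inline SKETCH_CONSTINIT TManagerSketch ManagerSketch;
```

A call site changes from `GetLeakyManager()->Switch(x)` to `ManagerSketch.Switch(x)`; the second form requires the type to be constant-initializable, which is why the sketch gives it a `constexpr` constructor.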
### FLS pointer caching
- `TPropagatingStorageGuard` gains a `TFls*` cache populated on its first `SwitchPropagatingStorage` call and reused on its second. Drops one `Y_NO_INLINE` `GetCurrentFls()` call per `Run`.
- Inside `SwitchPropagatingStorage`, the `TFls*&` parameter is hoisted once into a local register at function entry so subsequent in-function uses don't go through the reference.
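The caching idea can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `TFlsSketch`, `GetCurrentFlsSketch`, `SwitchStorageSketch`, and `TGuardSketch` are hypothetical names, a `thread_local` object stands in for fiber-local storage, and an `int` stands in for the propagating storage. The sketch assumes the guard is constructed and destroyed on the same thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the fiber-local storage block.
struct TFlsSketch
{
    int Storage = 0;
};

// Counts how often the (normally out-of-line) lookup is performed.
inline int FlsLookupCount = 0;

// Stand-in for the non-inlined lookup that the cache is meant to avoid repeating.
inline TFlsSketch* GetCurrentFlsSketch()
{
    static thread_local TFlsSketch fls;
    ++FlsLookupCount;
    return &fls;
}

// The cache is passed by reference and filled on first use. It is copied into
// a local once at entry so later uses do not re-read through the reference.
inline int SwitchStorageSketch(TFlsSketch*& cachedFls, int next)
{
    TFlsSketch* fls = cachedFls;
    if (!fls) {
        fls = GetCurrentFlsSketch();
        cachedFls = fls;
    }
    int previous = fls->Storage;
    fls->Storage = next;
    return previous;
}

// Installs a value on construction and restores the old one on destruction.
// The second switch reuses the pointer cached by the first.
class TGuardSketch
{
public:
    explicit TGuardSketch(int storage)
        : Saved_(SwitchStorageSketch(Fls_, storage))
    { }

    ~TGuardSketch()
    {
        SwitchStorageSketch(Fls_, Saved_);
    }

    TGuardSketch(const TGuardSketch&) = delete;
    TGuardSketch& operator=(const TGuardSketch&) = delete;

private:
    TFlsSketch* Fls_ = nullptr; // Declared first so it is initialized before Saved_.
    int Saved_;
};

// Returns the number of lookups one install/restore pair performs.
inline int CountLookupsForOneGuard()
{
    int before = FlsLookupCount;
    {
        TGuardSketch guard(42);
    }
    return FlsLookupCount - before;
}
```

In this sketch one guard lifetime performs a single lookup instead of two, which mirrors the one dropped `GetCurrentFls()` call per `Run` described above.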
### Storage internals
- `TPropagatingStorage::Impl_` moves from `TIntrusivePtr<TImpl>` (private nested) to `TIntrusivePtr<TPropagatingStorageImpl>` (`NDetail`, defined in `propagating_storage-inl.h`). This lets the read paths (`IsNull`, `IsEmpty`, `FindRaw`, and the templated `Find` / `Has`) all inline at callers. Mutating ops (`ExchangeRaw`/`RemoveRaw`/`Clone`) stay out-of-line.
- `IsNull`, `IsEmpty`, `FindRaw`, `CurrentPropagatingStorage`, and `GetCurrentPropagatingStorage` move from cpp out-of-line definitions to inl inline definitions accordingly.
- `TPropagatingStorage`'s explicit special-member declarations are dropped (relying on implicit generation), except for the default ctor (`= default`) which has to stay declared because the private converting ctor would otherwise suppress it.
- `TPropagatingStorageImpl`'s copy ctor is hand-written (members are initialized explicitly) because `TRefCounted`'s copy ctor is deleted; this preserves trunk's behavior, in which `Clone()` carries the signal subscriber lists alongside the data map.
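The two special-member rules mentioned above can be shown in a small sketch. All names here (`TRefCountedSketch`, `TImplSketch`, `TStorageSketch`) are hypothetical, `std::shared_ptr` stands in for `TIntrusivePtr`, and the member layout is invented for illustration only.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a ref-counted base whose copy ctor is deleted.
class TRefCountedSketch
{
public:
    TRefCountedSketch() = default;
    TRefCountedSketch(const TRefCountedSketch&) = delete;
    TRefCountedSketch& operator=(const TRefCountedSketch&) = delete;

private:
    int RefCount_ = 1;
};

class TImplSketch
    : public TRefCountedSketch
{
public:
    TImplSketch() = default;

    // A defaulted copy ctor would be defined as deleted because the base is
    // not copyable. The base is therefore default-constructed (fresh
    // ref-count state) and each member is copied explicitly.
    TImplSketch(const TImplSketch& other)
        : TRefCountedSketch()
        , Data(other.Data)
        , Subscribers(other.Subscribers)
    { }

    std::map<std::string, int> Data;
    std::vector<std::string> Subscribers;
};

class TStorageSketch
{
public:
    // Must stay declared: any user-declared ctor, including the private one
    // below, suppresses the implicitly declared default ctor.
    TStorageSketch() = default;

    static TStorageSketch Create()
    {
        return TStorageSketch(std::make_shared<TImplSketch>());
    }

    bool IsNull() const
    {
        return !Impl_;
    }

    // Deep copy: data and subscriber lists are both carried over.
    TStorageSketch Clone() const
    {
        return IsNull()
            ? TStorageSketch()
            : TStorageSketch(std::make_shared<TImplSketch>(*Impl_));
    }

    TImplSketch* Impl() const
    {
        return Impl_.get();
    }

private:
    // Private converting ctor. Copy, move, and dtor are left implicit.
    TStorageSketch(std::shared_ptr<TImplSketch> impl)
        : Impl_(std::move(impl))
    { }

    std::shared_ptr<TImplSketch> Impl_;
};
```

Removing the `TStorageSketch() = default;` line makes `TStorageSketch null;` ill-formed, which is the situation the default-ctor exception above guards against.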
### Trace-context lookup
- `TryGetTraceContextFromPropagatingStorage` is now defined inline in `trace_context-inl.h` (one definition; the previous `Fast` variant is merged in). Hot-path callers (`OnPropagatingStorageAfterSwitch` and the `YT_ASSERT` in `OnPropagatingStorageBeforeSwitch`) inline the lookup. An out-of-line copy is force-emitted via `[[gnu::used]]` so the GDB fiber printer (`devtools/gdb/yt_fibers_printer.py`) can still resolve the symbol at runtime.
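A minimal sketch of the inline-plus-`[[gnu::used]]` pattern, assuming GCC or Clang (other compilers may ignore the attribute). The types and the `"trace_context"` key are hypothetical; only the attribute placement is the point.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for the trace context and the storage map.
struct TTraceContextSketch
{
    int Id;
};

using TStorageMapSketch = std::map<std::string, TTraceContextSketch>;

// Defined inline in a header so hot-path callers can inline the lookup.
// [[gnu::used]] asks the compiler to emit an out-of-line definition even when
// every call is inlined, so a debugger script can still find the symbol.
[[gnu::used]] inline const TTraceContextSketch* TryGetTraceContextSketch(
    const TStorageMapSketch& storage)
{
    auto it = storage.find("trace_context");
    return it == storage.end() ? nullptr : &it->second;
}
```

Without the attribute, an inline function whose calls are all inlined may have no symbol in the binary, and a debugger expression that calls it by name would fail to resolve.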
### Inlining nudges
- `TPropagateMixin::MakePropagatingStorageGuard` gains `Y_FORCE_INLINE` so it actually inlines into `TBindState::Run` rather than being emitted as a separate weak symbol per instantiation.
- `TPropagatingStorage::IsNull` is also marked `[[gnu::used]]` for the same GDB-symbol-resolution reason as `TryGetTraceContextFromPropagatingStorage`.
commit_hash:86793cebfc5ff891affab25dcc67d987e553854c
Diffstat (limited to 'contrib/python/ipython/py2')
0 files changed, 0 insertions, 0 deletions
