| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TTscp::GetApproximate takes the processor id from the rseq fast path
(GetCurrentCpuId) and the instant from a non-serializing rdtsc, instead of the
single serializing rdtscp of TTscp::Get.
TPerCpuGauge::Update switches to it: the per-shard timestamp only orders writes
across shards to pick the freshest value, so the lower precision is fine. Update
is virtual and now YT_PREVENT_TLS_CACHING -- the fiber-TLS boundary the inlined
rseq read needs.
#### Benchmark
sas2-2769 (glibc 2.31 + tcmalloc, rseq fast path), median of 5:
| primitive | time |
|---|---|
| TTscp::Get() (rdtscp) | 14.1 ns |
| TTscp::GetApproximate() (rseq + rdtsc) | 10.6 ns (-25%) |
commit_hash:b277b6551accd6d0b879f8ffb168bcbe8d9fbb74
|