| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
commit_hash:bfab0d0115b50949f66878004cf718b988575734
|
| |
|
|
| |
commit_hash:acb3e84437f5bdb125d7c1807847eb5edecbb11f
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GetMinValue()/GetMaxValue() are constexpr, but when called from a runtime
context for a large-domain enum, clang does not fold the min/max_element and
emits a runtime scan over the whole domain on every call. This is hot on the
master replay path: TEnumIndexedArray::operator[] bounds-checks against these
(e.g. TCypressManager::FindHandler), and TCompositeAutomaton::RememberReign
hits GetCurrentReign() = GetMaxValue() over the ~3300-entry EMasterReign domain
per mutation.
Bind the result to a constexpr local to force compile-time evaluation. Verified
by disasm on a 240-value sample enum: getmin() goes from a ~44-instruction
runtime scan to a single 'mov $const'. No behavior change.
Part of YT-28453 (master replay-speed optimizations).
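The constexpr-local technique can be sketched as follows. This is a minimal illustration, not the actual `GetMinValue()` implementation: the enum `ESample`, the `Domain` array and both function names are hypothetical stand-ins, and the real enum machinery is assumed to be considerably more involved.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical small enum standing in for a large-domain enum.
enum class ESample : int { A = 3, B = -7, C = 42, D = 11 };

inline constexpr std::array<ESample, 4> Domain{
    ESample::A, ESample::B, ESample::C, ESample::D};

// constexpr only permits compile-time evaluation, it does not require it:
// when called from a runtime context, the compiler may emit a scan over Domain.
constexpr ESample GetMinValueMaybeRuntime()
{
    return *std::min_element(Domain.begin(), Domain.end());
}

// The initializer of a constexpr local must be a constant expression, so the
// scan is evaluated at compile time and the body just returns the constant.
constexpr ESample GetMinValueForced()
{
    constexpr auto result = *std::min_element(Domain.begin(), Domain.end());
    return result;
}

static_assert(GetMinValueForced() == ESample::B);
```

Both functions return the same value; the difference is only in whether the compiler is obliged to compute it ahead of time.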
commit_hash:7cdb969e00ba219415d80c5c8c984aa8bbde99d2
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Profile-driven optimizations of the `Format` hot path, benchmarked against a representative master debug log (structured `"Key: %v"` messages dominated by GUIDs, strings, integers, bools and durations). Median improvements of ~15-20% across the workload, measured on a dedicated host.
Changes:
- `string_builder`: use `resize_uninitialized` in `DoReserve` to avoid zero-filling the buffer on every `Format` call.
- `format`: replace the per-argument `memchr` (`spec.Contains('n')`) with an inline scan, force-inline `RunFormatterAt`, and add a `FormatString` fast path for the common plain `%v` / empty spec.
- `guid`: rewrite `WriteGuidToBuffer` using a `clz`-derived digit count and a back-to-front fill instead of the per-magnitude branch cascade (cut from ~26% to ~12% of a GUID-heavy line). Validated against an `%x` reference over 2M random GUIDs plus edge cases.
Also adds `library/cpp/yt/string/benchmark` to track `Format` performance.
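The `clz`-derived digit count and back-to-front fill mentioned for `guid` can be sketched roughly as below. This is an illustration under assumptions, not the actual `WriteGuidToBuffer`: the names `HexDigitCount`, `WriteHex` and `WriteGuidSketch` are hypothetical, the part order is simply array order, and `__builtin_clz` is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Number of hex digits needed to print value without leading zeros (at least 1).
inline int HexDigitCount(uint32_t value)
{
    // OR-ing with 1 avoids calling clz on zero, which is undefined.
    int significantBits = 32 - __builtin_clz(value | 1);
    return (significantBits + 3) / 4;
}

// Writes value as lowercase hex and returns the pointer past the last digit.
inline char* WriteHex(char* ptr, uint32_t value)
{
    static constexpr char Digits[] = "0123456789abcdef";
    int count = HexDigitCount(value);
    // The length is known up front, so digits are filled from the last position
    // backwards and no per-magnitude branching is needed.
    for (int index = count - 1; index >= 0; --index) {
        ptr[index] = Digits[value & 0xf];
        value >>= 4;
    }
    return ptr + count;
}

// Writes four 32-bit parts separated by dashes (part order is an assumption).
inline char* WriteGuidSketch(char* ptr, const uint32_t (&parts)[4])
{
    for (int index = 0; index < 4; ++index) {
        if (index != 0) {
            *ptr++ = '-';
        }
        ptr = WriteHex(ptr, parts[index]);
    }
    return ptr;
}
```

A caller would provide a buffer of at least 35 bytes (four parts of up to 8 digits plus three dashes) and terminate the string itself if needed.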
### Benchmarks
Median ns/op (lower is better), pinned core on a dedicated Xeon E5-2650 v2, 9x1s repetitions. See `library/cpp/yt/string/benchmark`.
| Benchmark | What it formats | Before | After | Change |
| --- | --- | ---: | ---: | ---: |
| `ManyMixedArgs` | ~18 args: GUIDs, strings, duration, ints | 1030 | 833 | -19% |
| `StringAndTwoGuids` | literal prefix + two GUIDs | 233 | 185 | -21% |
| `IntAndGuid` | one int + one GUID | 205 | 179 | -13% |
| `ManyInts` | six integers | 389 | 340 | -13% |
| `Guid` | a single GUID | 156 | 131 | -16% |
| `String` | a single string | 139 | 104 | -25% |
| `Int` | a single integer | 142 | 120 | -15% |
| `NoArgs` | a literal with no arguments | 88.8 | 85.7 | -3% |
commit_hash:ce9957a06c3ff28b2889aa65fbbddf4ca444f9fe
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
## Motivation
Profiling the YT master Automaton thread showed TOriginAttributes::Capture (run on every non-OK TError) spending ~60% of its time in a getpid() syscall — uncached on glibc >= 2.25. NYT::GetCurrentThreadId() (gettid) feeds hot thread-affinity / log-manager checks on the same thread.
## Changes
- New library/cpp/yt/system/process_id.* with cached GetProcessId(); GetSystemThreadId() now caches the kernel tid in TLS. Both caches reset in the child after fork.
- Moved thread_name.{h,cpp} from misc to system.
- Removed GetCurrentProcessId/GetCurrentThreadId shims from yt/yt/core/misc/proc.{h,cpp}; migrated all callers to NYT::GetProcessId / NYT::GetSystemThreadId.
- TOriginAttributes::Capture uses the cached getters; recorded Tid is now the real kernel tid (matches perf/ps).
- Added microbenchmarks (library/cpp/yt/system/benchmarks, yt/yt/core/benchmarks/error.cpp).
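The caching scheme described above might look roughly like this on Linux. It is a sketch under assumptions rather than the contents of `process_id.*`: the `*Sketch` names and the use of `pthread_atfork` for the reset are hypothetical, and the real code may register or reset its caches differently.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

namespace {

std::atomic<pid_t> CachedProcessId{0};   // 0 means "not cached yet"
thread_local pid_t CachedThreadId = 0;   // kernel tid of the current thread

// Runs in the child after fork(). Only the forking thread survives there,
// so resetting its thread-local slot and the process-wide slot is enough.
void ResetCachesInChild()
{
    CachedProcessId.store(0, std::memory_order_relaxed);
    CachedThreadId = 0;
}

void EnsureForkHandlerRegistered()
{
    static const int registered = pthread_atfork(nullptr, nullptr, &ResetCachesInChild);
    (void)registered;
}

} // namespace

pid_t GetProcessIdSketch()
{
    pid_t pid = CachedProcessId.load(std::memory_order_relaxed);
    if (pid == 0) {
        // Slow path: one syscall, then every later call is a plain load.
        EnsureForkHandlerRegistered();
        pid = getpid();
        CachedProcessId.store(pid, std::memory_order_relaxed);
    }
    return pid;
}

pid_t GetSystemThreadIdSketch()
{
    if (CachedThreadId == 0) {
        EnsureForkHandlerRegistered();
        CachedThreadId = static_cast<pid_t>(syscall(SYS_gettid));
    }
    return CachedThreadId;
}
```

The fast path of each getter is a load and a compare, which is consistent in spirit with the sub-2 ns figures reported below, although this sketch was not benchmarked.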
## Microbenchmarks (release)
| | before | after |
|---|---|---|
| getpid | 101 ns | 0.33 ns |
| gettid | 102 ns | 1.64 ns |
| Capture | 161 ns | 50 ns |
| failed TError | 221 ns | 74 ns |
commit_hash:ee37ae57d61a5a2dd33daee935270f4bb93b7ff9
|
| |
|
|
| |
commit_hash:2d2808f61599fcfea314ad660585e984d50ffbb3
|
| |
|
|
| |
commit_hash:25c6545fed2bffe20f7a008a218b9245896926ec
|
| |
|
|
| |
commit_hash:c574736c9cbb7c6da6502dc751214d8d7f343568
|
| |
|
|
|
|
| |
Y_NO_SANITIZE("address")
commit_hash:30841b1871a64fd6b3cc1eebcc9e4d5f1281c4fa
|
| |
|
|
| |
commit_hash:5bb34cf1e8e039b59fff79917c694509fff4666c
|
| |
|
|
|
| |
Add an implementation of hash functions for 32-bit platforms
commit_hash:3247a0524d3b66d759bf5ebd598be84c8dfb5837
|
| |
|
|
| |
commit_hash:96d6c16b241e44c6cd7910b16864fd0d037c6e8b
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
+20-30% throughput on UpdateColumnarStatistics benchmark (with large statistics enabled)
- Do not factor in value.Id when calculating column digest
- Pack metadata directly instead of multiple HashCombine calls
- Use SplitMix64 finalizer for proper bit distribution
- Use cheaper xor with metadata instead of HashCombine
- Use XXH3 for strings
- Remove unnecessary copy
- Measured quality increased: on 20 (c=1..20) sequences `{nc | n \in [1..10^6]}` MAE dropped from ~36% to ~20%
HLL digests might temporarily suffer a 2x increase upon merging with the previously computed ones.
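For reference, the SplitMix64 finalizer combined with a metadata xor can be sketched as follows. The finalizer constants are the standard SplitMix64 ones; `PackMetadata`, `DigestSketch`, the metadata layout and the order of the xor relative to the finalizer are assumptions made for illustration, not the code from this commit.

```cpp
#include <cassert>
#include <cstdint>

// Standard SplitMix64 output mix: an invertible function that spreads
// input differences across all 64 bits.
constexpr uint64_t SplitMix64Finalize(uint64_t x)
{
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Hypothetical layout: value type and flags packed into a single word.
constexpr uint64_t PackMetadata(uint8_t type, uint8_t flags)
{
    return (static_cast<uint64_t>(type) << 56) | (static_cast<uint64_t>(flags) << 48);
}

// One xor with the packed metadata followed by one finalizer pass,
// in place of several chained hash-combine steps.
constexpr uint64_t DigestSketch(uint64_t payload, uint8_t type, uint8_t flags)
{
    return SplitMix64Finalize(payload ^ PackMetadata(type, flags));
}
```

Because the finalizer is a bijection, two inputs that differ after the xor always produce different digests; the sketch says nothing about how string values (hashed with XXH3 in the commit) are fed in.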
commit_hash:0bf661245cf1848ba9ef8b6c840c18dfd05bd2a4
|
| |
|
|
| |
commit_hash:f257ebdacfbf0549a0f55cc37df2c059629bac3a
|
| |
|
|
| |
commit_hash:8a1bcbd29a7a3e7dfb5a62379fd921e8d164331f
|
| |
|
|
|
|
| |
keys in TAsyncExpiringCache
commit_hash:94c7b2f6b585daa4f3ff011c701500987b972356
|
| |
|
|
| |
commit_hash:10106fd04f0a11a13521c191b1b9f6fd3a5b2422
|
| |
|
|
| |
commit_hash:3436f18ce66beb90bf8f89a674e715ac4b9a1098
|
| |
|
|
| |
commit_hash:b9ffae1ce4077a1f26ccbd0abf0596cae292d225
|
| |
|
|
| |
commit_hash:dac730c0d9dc052edce7dd7873c51687ea19082e
|
| |
|
|
| |
commit_hash:48c6dc49d8c0ffb3bbb5fa773dc38bdee243f3c3
|
| |
|
|
| |
commit_hash:7bda0c36d13d3a9c586f65b48d6f23f854c0e088
|
| |
|
|
|
|
|
|
|
|
|
| |
```
(dflt) thegeorg@jakku:~/arcadia@trunk$ ya tool c++ -E -dM - -msse4.2 < /dev/null | grep SSE4
#define __SSE4_1__ 1
#define __SSE4_2__ 1
```
On Windows, these macros are defined by means of ymake.core.conf.
commit_hash:ec670bbe09b73580df6c7acf4760fedce7597676
|
| |
|
|
| |
commit_hash:c29f08fc16d8bd974d4ce516af499de848607ab8
|
| |
|
|
| |
commit_hash:abf11126ef1a914939d506a79dd7c4f11df177f2
|
| |
|
|
|
|
|
|
|
| |
- Changelog entry
Type: feature
Component: master
Introduce TCompactTableSchema, which holds the wire protobuf representation of a schema and is lighter than TTableSchema
commit_hash:21801854b37fc25c5004ee01e5b79a3b3b6ea983
|
| |
|
|
|
| |
[nodiff:runtime]
commit_hash:1ba799aed1703ab7c6304b6da7090b3337f768dd
|
| |
|
|
| |
commit_hash:55e574599005f5286f646ebba93d5550325708bc
|
| |
|
|
| |
commit_hash:deadebefdfd81b6c737b9464435356b8f652e296
|
| |
|
|
| |
commit_hash:41c16027e2f796197b98307419a63da9fa3f1a88
|
| |
|
|
| |
commit_hash:7d3055f901a21e63f7860f443252a86e9895fd08
|
| |
|
|
| |
commit_hash:50f729d3716d8b1f5b852cfc008e228172fb79c4
|
| |
|
|
| |
commit_hash:5ce8019253cdb971d1af36350e3efa3a4ec8545c
|
| |
|
|
| |
commit_hash:17dbd6d7e5fc440afa2d816e245a73e25135bfb5
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Plan:
1) Remove `IAttributeDictionary` type from the public API. \+
2) Remove `Set` method from public API in favor of `operator<<=`. \+
3) Adopt `ConvertTo<T>` (or other name) CPO with proper extension into `NYT::NYson::ConvertTo` from `yt/core`.
4) Use CPO from (3) to eliminate direct dependency on `yt/core` of `Get/Find` methods from attributes API.
5) Adopt `ConvertToYsonString` (or other name) CPO with proper extension into `yt/core` customisations.
6) Use CPO from (5) to eliminate direct dependency on `yt/core` of `TErrorAttribute` ctor.
7) Swap attributes implementation to the one which doesn’t use `IAttributeDictionary`.
8) At this point `stripped_error*` can be moved to library/cpp/yt, and so can the methods `Get/Find/…` that recursively depend on the THROW macro.
9) Adjust CPO’s to work with `std::string` instead of `TYsonString` assuming text format to be used (maybe `TString` for now).
10) Remove dep of `library/cpp/yt/error` on `yson` entirely.
This PR addresses steps 3-4 of the plan. Below is a brief explanation of the design decisions.
We want a concept that detects whether a `ConvertTo` method exists and, if so, calls it. Inside a template, ordinary (non-ADL) unqualified lookup only sees declarations visible at the point of definition, so if we allowed non-ADL overloads to be found, the result would depend on inclusion order: if `ConvertTo` is included before our code, everything works, but in the reverse order the templated dispatch fails while a direct call still works. That is why we adopt niebloids: the name `ConvertTo` refers to a function object, which disables ADL for that name, and the object dispatches to customizations via the `TagInvoke` mechanism. TagInvoke design: <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1895r0.pdf>. TL;DR: we want behavior that does not depend on inclusion order.
The key difference now is that `ConvertTo` works consistently in both direct calls and calls from template function bodies, and it is no longer visible to the ADL part of unqualified name lookup.
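A stripped-down sketch of the niebloid plus `TagInvoke` pattern is shown below. It is illustrative only: the namespaces `NSketch` and `NUser`, the types `TConvertToFn` and `TCelsius`, and the `Describe` helper are hypothetical, and the real CPO in `library/cpp/yt` is assumed to be more elaborate (C++17 is enough for this version).

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace NSketch {

// ConvertTo<T> is an object, not a function template, so an unqualified
// call ConvertTo<T>(x) never triggers ADL for the name ConvertTo itself.
template <class T>
struct TConvertToFn
{
    // Customizations are found by ADL on TagInvoke; the trailing return type
    // makes the call operator SFINAE-friendly for detection.
    template <class TFrom>
    constexpr auto operator()(const TFrom& value) const
        -> decltype(TagInvoke(*this, value))
    {
        return TagInvoke(*this, value);
    }
};

template <class T>
inline constexpr TConvertToFn<T> ConvertTo{};

// Detection trait: is there a TagInvoke customization for (T, TFrom)?
template <class T, class TFrom>
inline constexpr bool CConvertsTo =
    std::is_invocable_v<const TConvertToFn<T>&, const TFrom&>;

} // namespace NSketch

// Generic code written before any customization is declared.
template <class TFrom>
std::string Describe(const TFrom& value)
{
    if constexpr (NSketch::CConvertsTo<std::string, TFrom>) {
        return NSketch::ConvertTo<std::string>(value);
    } else {
        return "<unconvertible>";
    }
}

namespace NUser {

struct TCelsius
{
    double Degrees;
};

// Declared after Describe; still found at instantiation through ADL on TCelsius.
inline std::string TagInvoke(NSketch::TConvertToFn<std::string>, const TCelsius& value)
{
    return std::to_string(static_cast<int>(value.Degrees)) + "C";
}

} // namespace NUser
```

Here `Describe(NUser::TCelsius{21.0})` dispatches to the user customization even though it is declared after the template, while `Describe(42)` takes the fallback branch because no `TagInvoke` overload exists for `int`.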
commit_hash:32af641bd0af559bfe670c2ceb36721fb4afc2dd
|
| |
|
|
| |
commit_hash:46d59ab3acbd313753d3e46f3a6f10a8ebc424d8
|
| |
|
|
| |
commit_hash:7f7600d332c3ddb5c8372e921bcba3b4fbed68f8
|
| |
|
|
| |
commit_hash:cbc39112d8384b8c4bcd2410f0a203466b400c10
|
| |
|
|
| |
commit_hash:14c7e42422af750383f04855b4a7ea6b267b92d2
|
| |
|
|
|
| |
(HIDDEN_URL
commit_hash:d9358ac48da1ab4a4ef9ccdbf7eb77a100cf3897
|
| |
|
|
| |
commit_hash:ceb575c0377d4a48c0507590d878e690e92f5c63
|
| |
|
|
| |
commit_hash:429a843ed1a0e0fe3a5bc7d237f586b6671b8997
|
| |
|
|
| |
commit_hash:3ef81205ed4cf9360829f834baa07c2fbf69b999
|
| |
|
|
|
|
|
|
|
| |
No description
---
b181413931eab2909c605b373dc858657e8bcb14
Pull Request resolved: https://github.com/ytsaurus/ytsaurus/pull/816
|
| | |
|
| |
|
|
| |
378099ca41e7698fba0ceda68b8d2b554e61b6ea
|
| |
|
|
|
|
| |
This reverts commit d9b67f1778da2d15dd94f7285afe4e3490a233ab, reversing
changes made to 461a09e0c18bd14cef7df8060e7f9537e3ad74b5.
92cdaf4185661b7058f6a30d5a532ad40b725345
|
| |
|
|
| |
d9b67f1778da2d15dd94f7285afe4e3490a233ab
|
| |
|
|
|
| |
https://en.cppreference.com/w/cpp/types/numeric_limits/lowest
349097f620987be824e9db00f76af89746741c75
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The new format is better optimized for VectorizedRead, which is needed in Sequoia.
Previously, the output for the BeginCopy verb was a serialized subtree plus the list of opaque children. In a non-Sequoia world, that is useful and easy to work with. However, in Sequoia, a subtree might not be contained on the same cell, which makes it difficult to use the old data transfer format.
Now it's a vector of nodes with clear separation between them. This change makes it easier to work with VectorizedRead, since the conversion between the BeginCopy result and the VectorizedRead(BeginCopy) result is straightforward. Additionally, this change helps with copying from Cypress to Sequoia, since it is now much easier to process each node individually (which is needed for Sequoia to work and not clog up one cell).
In the next PR, I'm planning on:
1. Making BeginCopy a read-request (removing snapshot locks). So the verb will only return meta-information about the requested nodes.
2. Adding a verb, executed before BeginCopy, where the locks would be taken and the general structure of the tree is returned.
017ec9971e8e0a611a7286ed748b6071cfc89048
|