path: root/library/cpp/yt/misc
Commit message | Author | Age | Files | Lines
* YT-22593: migrate assorted library/cpp/yt string APIs to std::string | babenko | 8 days | 1 | -1/+1
    commit_hash:bfab0d0115b50949f66878004cf718b988575734
* YT-28504: Support heterogeneous lookup in caches | babenko | 11 days | 2 | -2/+4
    commit_hash:acb3e84437f5bdb125d7c1807847eb5edecbb11f
* Fold TEnumTraits GetMinValue/GetMaxValue to compile time | babenko | 2026-06-14 | 1 | -6/+12
    GetMinValue()/GetMaxValue() are constexpr, but when called from a runtime context for a large-domain enum, clang does not fold the min/max_element and emits a runtime scan over the whole domain on every call.

    This is hot on the master replay path: TEnumIndexedArray::operator[] bounds-checks against these (e.g. TCypressManager::FindHandler), and TCompositeAutomaton::RememberReign hits GetCurrentReign() = GetMaxValue() over the ~3300-entry EMasterReign domain per mutation.

    Bind the result to a constexpr local to force compile-time evaluation. Verified by disasm on a 240-value sample enum: getmin() goes from a ~44-instruction runtime scan to a single 'mov $const'. No behavior change.

    Part of YT-28453 (master replay-speed optimizations).

    commit_hash:7cdb969e00ba219415d80c5c8c984aa8bbde99d2
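The constexpr-local trick described in the commit above can be illustrated with a minimal sketch. The enum, the traits struct and the `ScanMax` helper are hypothetical stand-ins, not the real `TEnumTraits`; C++17 is assumed.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for an enum with a registered domain.
enum class EColor : int { Red = 3, Green = 7, Blue = 5 };

struct TColorTraits
{
    static constexpr std::array<EColor, 3> Domain{EColor::Red, EColor::Green, EColor::Blue};

    // constexpr, but a compiler may still run this loop at run time
    // when it is called from a non-constant-evaluated context.
    static constexpr EColor ScanMax()
    {
        EColor result = Domain[0];
        for (EColor value : Domain) {
            if (static_cast<int>(value) > static_cast<int>(result)) {
                result = value;
            }
        }
        return result;
    }

    static constexpr EColor GetMaxValue()
    {
        // The initializer of a constexpr local must be a constant expression,
        // so the scan has to be evaluated during compilation.
        constexpr EColor result = ScanMax();
        return result;
    }
};

static_assert(TColorTraits::GetMaxValue() == EColor::Green, "folded at compile time");
```

A runtime call such as `TColorTraits::GetMaxValue()` then only has to return the precomputed constant; whether a given compiler would have folded `ScanMax()` on its own depends on the optimizer and the domain size.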
* Speed up NYT::Format | babenko | 2026-06-13 | 1 | -66/+17
    Profile-driven optimizations of the `Format` hot path, benchmarked against a representative master debug log (structured `"Key: %v"` messages dominated by GUIDs, strings, integers, bools and durations). Median improvements of ~15-20% across the workload, measured on a dedicated host.

    Changes:
    - `string_builder`: use `resize_uninitialized` in `DoReserve` to avoid zero-filling the buffer on every `Format` call.
    - `format`: replace the per-argument `memchr` (`spec.Contains('n')`) with an inline scan, force-inline `RunFormatterAt`, and add a `FormatString` fast path for the common plain `%v` / empty spec.
    - `guid`: rewrite `WriteGuidToBuffer` using a `clz`-derived digit count and a back-to-front fill instead of the per-magnitude branch cascade (cut from ~26% to ~12% of a GUID-heavy line). Validated against an `%x` reference over 2M random GUIDs plus edge cases.

    Also adds `library/cpp/yt/string/benchmark` to track `Format` performance.

    ### Benchmarks

    Median ns/op (lower is better), pinned core on a dedicated Xeon E5-2650 v2, 9x1s repetitions. See `library/cpp/yt/string/benchmark`.

    | Benchmark | What it formats | Before | After | Speedup |
    | --- | --- | ---: | ---: | ---: |
    | `ManyMixedArgs` | ~18 args: GUIDs, strings, duration, ints | 1030 | 833 | -19% |
    | `StringAndTwoGuids` | literal prefix + two GUIDs | 233 | 185 | -21% |
    | `IntAndGuid` | one int + one GUID | 205 | 179 | -13% |
    | `ManyInts` | six integers | 389 | 340 | -13% |
    | `Guid` | a single GUID | 156 | 131 | -16% |
    | `String` | a single string | 139 | 104 | -25% |
    | `Int` | a single integer | 142 | 120 | -15% |
    | `NoArgs` | a literal with no arguments | 88.8 | 85.7 | -3% |

    commit_hash:ce9957a06c3ff28b2889aa65fbbddf4ca444f9fe
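The `clz`-derived digit count with a back-to-front fill mentioned for `guid` can be sketched as follows. This is not the actual `WriteGuidToBuffer`; `WriteHexPart` and `HexPartToString` are hypothetical helpers, and the GCC/Clang builtin `__builtin_clz` is assumed to be available.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: writes `value` as lowercase hex without leading zeros
// and returns a pointer just past the last written character.
inline char* WriteHexPart(char* out, std::uint32_t value)
{
    static constexpr char Digits[] = "0123456789abcdef";
    // __builtin_clz(0) is undefined, so OR in the lowest bit:
    // zero then still produces exactly one digit.
    int significantBits = 32 - __builtin_clz(value | 1);
    int digitCount = (significantBits + 3) / 4;
    // Fill from the last digit towards the first; no per-magnitude branches.
    for (int index = digitCount - 1; index >= 0; --index) {
        out[index] = Digits[value & 0xF];
        value >>= 4;
    }
    return out + digitCount;
}

// Convenience wrapper used only for illustration.
inline std::string HexPartToString(std::uint32_t value)
{
    char buffer[8];
    char* end = WriteHexPart(buffer, value);
    return std::string(buffer, end);
}
```

Knowing the digit count up front lets the caller advance the output pointer once per part instead of branching on the magnitude of the value.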
* Cache process/thread id getters and use them in TError origin capture | babenko | 2026-06-07 | 3 | -76/+0
    ## Motivation

    Profiling the YT master Automaton thread showed TOriginAttributes::Capture (run on every non-OK TError) spending ~60% of its time in a getpid() syscall — uncached on glibc >= 2.25. NYT::GetCurrentThreadId() (gettid) feeds hot thread-affinity / log-manager checks on the same thread.

    ## Changes

    - New library/cpp/yt/system/process_id.* with cached GetProcessId(); GetSystemThreadId() now caches the kernel tid in TLS. Both caches reset in the child after fork.
    - Moved thread_name.{h,cpp} from misc to system.
    - Removed GetCurrentProcessId/GetCurrentThreadId shims from yt/yt/core/misc/proc.{h,cpp}; migrated all callers to NYT::GetProcessId / NYT::GetSystemThreadId.
    - TOriginAttributes::Capture uses the cached getters; recorded Tid is now the real kernel tid (matches perf/ps).
    - Added microbenchmarks (library/cpp/yt/system/benchmarks, yt/yt/core/benchmarks/error.cpp).

    ## Microbenchmarks (release)

    | | before | after |
    |---|---|---|
    | getpid | 101 ns | 0.33 ns |
    | gettid | 102 ns | 1.64 ns |
    | Capture | 161 ns | 50 ns |
    | failed TError | 221 ns | 74 ns |

    commit_hash:ee37ae57d61a5a2dd33daee935270f4bb93b7ff9
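A cached process id that is reset in the child after fork could look roughly like the sketch below. It assumes a POSIX system; `GetCachedProcessId` and its helpers are hypothetical names, and the real `GetProcessId` in `library/cpp/yt/system` may be implemented differently.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <unistd.h>

namespace {

// 0 means "not cached yet"; a real pid is never 0 for a running process.
std::atomic<pid_t> CachedPid{0};

// Runs in the child after fork so that the parent's pid is not reused.
void ResetCachedPid()
{
    CachedPid.store(0, std::memory_order_relaxed);
}

} // namespace

// Hypothetical sketch of a cached getter.
pid_t GetCachedProcessId()
{
    pid_t pid = CachedPid.load(std::memory_order_relaxed);
    if (pid == 0) {
        // Function-local static: the fork handler is registered exactly once.
        static const int registered = pthread_atfork(nullptr, nullptr, &ResetCachedPid);
        (void)registered;
        pid = getpid();
        CachedPid.store(pid, std::memory_order_relaxed);
    }
    return pid;
}
```

After the first call the getter is a single relaxed atomic load, which is the kind of saving the getpid row in the table reflects; the syscall is only repeated in a freshly forked child.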
* Add Abseil-compatible support hashers | babenko | 2026-04-27 | 2 | -0/+18
    commit_hash:2d2808f61599fcfea314ad660585e984d50ffbb3
* YT-27872: Refactor BIND to fix ODR violations | dann239 | 2026-04-03 | 2 | -1/+43
    commit_hash:25c6545fed2bffe20f7a008a218b9245896926ec
* YT-18571: Drop YT_ATTRIBUTE_NO_UNIQUE_ADDRESS in favor of Y_NO_UNIQUE_ADDRESS | babenko | 2026-03-23 | 1 | -6/+0
    commit_hash:c574736c9cbb7c6da6502dc751214d8d7f343568
* YT-18571: Drop YT_ATTRIBUTE_NO_SANITIZE_ADDRESS in favor of Y_NO_SANITIZE("address") | babenko | 2026-03-23 | 1 | -2/+0
    commit_hash:30841b1871a64fd6b3cc1eebcc9e4d5f1281c4fa
* [yt/misc] remove 64-bit requirement error directive | vasko | 2026-03-11 | 1 | -5/+0
    commit_hash:5bb34cf1e8e039b59fff79917c694509fff4666c
* [yt] make SplitMix & HashCombine bit-independent | vasko | 2026-03-05 | 3 | -13/+39
    Add implementations of the hash functions for 32-bit platforms.
    commit_hash:3247a0524d3b66d759bf5ebd598be84c8dfb5837
* YT-27436 Introduce PP_IS_EMPTY | dann239 | 2026-02-18 | 2 | -0/+21
    commit_hash:96d6c16b241e44c6cd7910b16864fd0d037c6e8b
* YT-27167: Better TBitwiseUnversionedValueHash | pavook | 2026-02-11 | 3 | -0/+29
    +20-30% throughput on UpdateColumnarStatistics benchmark (with large statistics enabled)

    - Do not factor in value.Id when calculating column digest
    - Pack metadata directly instead of multiple HashCombine calls
    - Use SplitMix64 finalizer for proper bit distribution
    - Use cheaper xor with metadata instead of HashCombine
    - Use XXH3 for strings
    - Remove unnecessary copy
    - Measured quality increased: on 20 (c=1..20) sequences `{nc | n \in [1..10^6]}` MAE dropped from ~36% to ~20%

    HLL digests might temporarily suffer a 2x increase upon merging with the previously computed ones.

    commit_hash:0bf661245cf1848ba9ef8b6c840c18dfd05bd2a4
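The combination of packed metadata, a cheap xor and the SplitMix64 finalizer listed above might be sketched like this. The finalizer constants are the standard SplitMix64 ones; the function names, the metadata layout (type and length) and the order of the xor relative to the finalizer are assumptions made for illustration, not the actual `TBitwiseUnversionedValueHash`.

```cpp
#include <cassert>
#include <cstdint>

// Standard SplitMix64 finalizer: a bijective mix of all 64 bits.
constexpr std::uint64_t SplitMix64Finalize(std::uint64_t value)
{
    value ^= value >> 30;
    value *= 0xbf58476d1ce4e5b9ULL;
    value ^= value >> 27;
    value *= 0x94d049bb133111ebULL;
    value ^= value >> 31;
    return value;
}

// Hypothetical value hash: pack the metadata into one word, xor it into
// the payload and run a single finalizer instead of several HashCombine calls.
constexpr std::uint64_t HashValueSketch(std::uint64_t data, std::uint8_t type, std::uint32_t length)
{
    std::uint64_t metadata = (static_cast<std::uint64_t>(type) << 32) | length;
    return SplitMix64Finalize(data ^ metadata);
}
```

Because the finalizer is a bijection, two inputs that differ only in the packed metadata are guaranteed to produce different hashes in this sketch.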
* YT-27244: range helpers move to library | panesher | 2026-02-05 | 6 | -0/+421
    commit_hash:f257ebdacfbf0549a0f55cc37df2c059629bac3a
* YT-22593: Switch enum ToString to std::string | dgolear | 2026-01-07 | 2 | -4/+4
    commit_hash:8a1bcbd29a7a3e7dfb5a62379fd921e8d164331f
* [core] YT-26666: Ensure shard index and bucket index independence for random keys in TAsyncExpiringCache | apachee | 2025-12-01 | 2 | -1/+11
    commit_hash:94c7b2f6b585daa4f3ff011c701500987b972356
* Cosmetics for strongtypedef | h0pless | 2025-11-26 | 2 | -22/+19
    commit_hash:10106fd04f0a11a13521c191b1b9f6fd3a5b2422
* Annotate BYREF attribute accessors with Y_LIFETIME_BOUND | babenko | 2025-11-12 | 1 | -11/+12
    commit_hash:3436f18ce66beb90bf8f89a674e715ac4b9a1098
* Add an option to remove comparison operators for strong typedef | h0pless | 2025-11-10 | 3 | -49/+68
    commit_hash:b9ffae1ce4077a1f26ccbd0abf0596cae292d225
* YT-19137: Make full_read a first class citizen | coteeq | 2025-08-18 | 3 | -1/+37
    commit_hash:dac730c0d9dc052edce7dd7873c51687ea19082e
* YT-18571: Cosmetics for YT_DEFINE_STRONG_TYPEDEF | h0pless | 2025-08-11 | 1 | -0/+1
    commit_hash:48c6dc49d8c0ffb3bbb5fa773dc38bdee243f3c3
* Add Save/Load methods for TStrongTypedef | yurial | 2025-06-13 | 2 | -0/+19
    commit_hash:7bda0c36d13d3a9c586f65b48d6f23f854c0e088
* yt: Use well-known macros to test if we are building with SSE4.2 | thegeorg | 2025-05-16 | 1 | -5/+0
    ```
    (dflt) thegeorg@jakku:~/arcadia@trunk$ ya tool c++ -E -dM - -msse4.2 < /dev/null | grep SSE4
    #define __SSE4_1__ 1
    #define __SSE4_2__ 1
    ```
    On Windows, these macros are defined by means of ymake.core.conf.
    commit_hash:ec670bbe09b73580df6c7acf4760fedce7597676
* Make some methods of smart enum being constexpr | hiddenpath | 2025-03-25 | 2 | -13/+14
    commit_hash:c29f08fc16d8bd974d4ce516af499de848607ab8
* YT: Allow serializing and deserializing plain enums to uint64 | dgolear | 2025-03-06 | 2 | -5/+24
    commit_hash:abf11126ef1a914939d506a79dd7c4f11df177f2
* YT-21910: Master compact table schema | cherepashka | 2025-03-06 | 1 | -0/+13
    - Changelog entry
    Type: feature
    Component: master
    Introduce TCompactTableSchema, which holds the wire protobuf schema representation and is lighter than TTableSchema.
    commit_hash:21801854b37fc25c5004ee01e5b79a3b3b6ea983
* YT-22593: More trivial TString->std::string migrations | babenko | 2025-02-22 | 3 | -4/+4
    [nodiff:runtime]
    commit_hash:1ba799aed1703ab7c6304b6da7090b3337f768dd
* Fix unaligned load/store UB in bus and zstd compression | nadya73 | 2025-02-10 | 2 | -54/+0
    commit_hash:55e574599005f5286f646ebba93d5550325708bc
* Babenkoed: Shadow Wars | sabdenovch | 2025-01-16 | 1 | -0/+17
    commit_hash:deadebefdfd81b6c737b9464435356b8f652e296
* Intermediate changes | robot-piglet | 2024-12-24 | 6 | -0/+134
    commit_hash:41c16027e2f796197b98307419a63da9fa3f1a88
* Introduce (any use) YT_STATIC_INITIALIZER | babenko | 2024-12-14 | 1 | -0/+19
    commit_hash:7d3055f901a21e63f7860f443252a86e9895fd08
* Fix various issues | pogorelov | 2024-12-06 | 5 | -231/+1
    commit_hash:50f729d3716d8b1f5b852cfc008e228172fb79c4
* YT-22885: Handle unknown values in (Try)CheckedEnumCast | babenko | 2024-11-16 | 3 | -3/+61
    commit_hash:5ce8019253cdb971d1af36350e3efa3a4ec8545c
* Intermediate changes | robot-piglet | 2024-11-08 | 2 | -2/+4
    commit_hash:17dbd6d7e5fc440afa2d816e245a73e25135bfb5
* YT-21233: Rewrite ConvertTo CPO using TagInvoke | arkady-e1ppa | 2024-11-08 | 5 | -0/+299
    Plan:
    1) Remove `IAttributeDictionary` type from the public API. +
    2) Remove `Set` method from public API in favor of `operator<<=`. +
    3) Adopt `ConvertTo<T>` (or other name) CPO with proper extension into `NYT::NYson::ConvertTo` from `yt/core`.
    4) Use CPO from (3) to eliminate direct dependency on `yt/core` of `Get/Find` methods from attributes API.
    5) Adopt `ConvertToYsonString` (or other name) CPO with proper extension into `yt/core` customisations.
    6) Use CPO from (5) to eliminate direct dependency on `yt/core` of `TErrorAttribute` ctor.
    7) Swap attributes implementation to the one which doesn’t use `IAttributeDictionary`.
    8) At this point `stripped_error*` can be moved to library/cpp/yt, and so can the methods `Get/Find/…` that recursively depend on the THROW macro.
    9) Adjust CPOs to work with `std::string` instead of `TYsonString`, assuming text format to be used (maybe `TString` for now).
    10) Remove dep of `library/cpp/yt/error` on `yson` entirely.

    This PR addresses steps 3-4 of the plan. Below is a brief explanation of the design decisions.

    We want to have a concept which detects if there is a `ConvertTo` method and, if so, tries calling it. Templates can only perform unqualified name lookup, and if we allow non-ADL overloads to be found, we would depend on inclusion order (if `ConvertTo` is included prior to our code, everything works fine, but if the order is reversed, templated dispatch would fail while a direct call would work just fine). That is why we adopt niebloids, which first disable ADL lookup of the name `ConvertTo` by directing it to a niebloid implemented via the `TagInvoke` mechanism. TagInvoke design: <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1895r0.pdf>.

    TL;DR: we want to have behavior which is consistent with inclusion order.

    The key difference now is that `ConvertTo` works consistently in both manual calls and calls from template function bodies, and is no longer visible to the ADL part of unqualified name lookup.

    commit_hash:32af641bd0af559bfe670c2ceb36721fb4afc2dd
* NaN-safe comparison and hashing | babenko | 2024-11-06 | 9 | -2/+294
    commit_hash:46d59ab3acbd313753d3e46f3a6f10a8ebc424d8
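The commit above gives no details, so the following is only a generic sketch of what NaN-safe equality and hashing usually mean for floating-point keys: all NaNs compare equal to each other and hash to the same value. `TNanSafeEqual` and `TNanSafeHash` are hypothetical names, not the helpers added by the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>

// Equality that treats every NaN as equal to every other NaN,
// which keeps the relation reflexive and usable in hash containers.
struct TNanSafeEqual
{
    bool operator()(double lhs, double rhs) const
    {
        if (std::isnan(lhs) || std::isnan(rhs)) {
            return std::isnan(lhs) && std::isnan(rhs);
        }
        return lhs == rhs;
    }
};

// Hash consistent with TNanSafeEqual: equal keys produce equal hashes.
struct TNanSafeHash
{
    std::size_t operator()(double value) const
    {
        if (std::isnan(value)) {
            // All NaN payloads and signs collapse to one arbitrary constant.
            return 0x7ff8u;
        }
        if (value == 0.0) {
            // -0.0 == +0.0, so normalize the sign before hashing.
            value = 0.0;
        }
        return std::hash<double>{}(value);
    }
};
```

With plain `operator==`, a NaN key inserted into a hash map can never be found again because it does not compare equal to itself; the pair above avoids that.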
* YT-22885: Refactor checked casts | babenko | 2024-11-03 | 8 | -51/+122
    commit_hash:7f7600d332c3ddb5c8372e921bcba3b4fbed68f8
* Add GetAllSetValue and IsKnownValue | babenko | 2024-10-27 | 3 | -5/+72
    commit_hash:cbc39112d8384b8c4bcd2410f0a203466b400c10
* YT-22885: DEFINE_ENUM_UNKNOWN_VALUE, string-related conversions | babenko | 2024-10-26 | 3 | -15/+49
    commit_hash:14c7e42422af750383f04855b4a7ea6b267b92d2
* YT: Support plain enum deserialization | dgolear | 2024-10-10 | 3 | -10/+33
    (HIDDEN_URL
    commit_hash:d9358ac48da1ab4a4ef9ccdbf7eb77a100cf3897
* Add an option to disable refcounted tracking | nastprol | 2024-09-25 | 1 | -3/+5
    commit_hash:ceb575c0377d4a48c0507590d878e690e92f5c63
* remove cpp/yt/misc no longer depends on cpp/yt/string | arkady-e1ppa | 2024-09-18 | 6 | -124/+0
    commit_hash:429a843ed1a0e0fe3a5bc7d237f586b6671b8997
* Intermediate changes | robot-piglet | 2024-09-18 | 1 | -11/+13
    commit_hash:3ef81205ed4cf9360829f834baa07c2fbf69b999
* fisco -> fiasco | Maxim Akhmedov | 2024-09-05 | 1 | -1/+1
    No description
    ---
    b181413931eab2909c605b373dc858657e8bcb14

    Pull Request resolved: https://github.com/ytsaurus/ytsaurus/pull/816
* Intermediate changes | robot-piglet | 2024-09-03 | 4 | -0/+228
* YT-22642: Fix unaligned access UB | babenko | 2024-08-31 | 2 | -0/+54
    378099ca41e7698fba0ceda68b8d2b554e61b6ea
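Unaligned-access UB of the kind named in the commit title is commonly avoided by copying bytes with `std::memcpy` instead of dereferencing a casted pointer. The helpers below are a generic sketch with hypothetical names and are not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reads a T from an address that may not satisfy alignof(T).
// Dereferencing reinterpret_cast<const T*>(ptr) would be undefined behavior
// for misaligned ptr; memcpy is well defined and usually compiles to a plain load.
template <class T>
T UnalignedLoad(const void* ptr)
{
    static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");
    T value;
    std::memcpy(&value, ptr, sizeof(T));
    return value;
}

// Writes a T to an address that may not satisfy alignof(T).
template <class T>
void UnalignedStore(void* ptr, const T& value)
{
    static_assert(std::is_trivially_copyable_v<T>, "T must be trivially copyable");
    std::memcpy(ptr, &value, sizeof(T));
}
```

For example, a 32-bit value can be stored at `buffer + 1` of a byte array and read back intact, which a direct `*reinterpret_cast<std::uint32_t*>(buffer + 1)` does not guarantee.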
* Revert "YT-21306: Add EnumHasDefaultValue" | dtorilov | 2024-08-20 | 2 | -18/+0
    This reverts commit d9b67f1778da2d15dd94f7285afe4e3490a233ab, reversing changes made to 461a09e0c18bd14cef7df8060e7f9537e3ad74b5.
    92cdaf4185661b7058f6a30d5a532ad40b725345
* YT-21306: Add EnumHasDefaultValue | dtorilov | 2024-08-08 | 2 | -0/+18
    d9b67f1778da2d15dd94f7285afe4e3490a233ab
* YTORM-1042 Fix casts for floating point | deep | 2024-07-09 | 1 | -2/+2
    https://en.cppreference.com/w/cpp/types/numeric_limits/lowest
    349097f620987be824e9db00f76af89746741c75
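The cppreference link in the commit above points at `std::numeric_limits<T>::lowest()`. For floating-point types `min()` is the smallest positive normal value, so a range check that uses it as the lower bound rejects every negative number. The sketch below shows the distinction; `FitsIntoFloating` is a hypothetical helper, not the code changed by the commit.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Checks whether a double can be cast to the floating-point type T
// without leaving T's finite range. NaN is reported as not fitting.
template <class T>
constexpr bool FitsIntoFloating(double value)
{
    static_assert(std::is_floating_point_v<T>, "sketch covers floating-point targets only");
    // lowest() is the most negative finite value; min() would be a tiny
    // positive number and must not be used as the lower bound here.
    return value >= static_cast<double>(std::numeric_limits<T>::lowest())
        && value <= static_cast<double>(std::numeric_limits<T>::max());
}

static_assert(std::numeric_limits<float>::min() > 0.0f, "min() is positive for float");
static_assert(std::numeric_limits<float>::lowest() < 0.0f, "lowest() is negative for float");
```

For integral types `min()` and `lowest()` coincide, which is why this mistake tends to surface only when a generic cast helper is instantiated for `float` or `double`.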
* YT-20614: Change cross-cell copy format (in preparation for Sequoia) | h0pless | 2024-07-08 | 1 | -0/+17
    The new format is better optimized for VectorizedRead, which is needed in Sequoia.

    Previously, the output for the BeginCopy verb was a serialized subtree plus the list of opaque children. In a non-Sequoia world, that is useful and easy to work with. However, in Sequoia, a subtree might not be contained on the same cell, which makes it difficult to use the old data transfer format.

    Now it's a vector of nodes with clear separation between them. This change makes it easier to work with VectorizedRead since the conversion between BeginCopy return and VectorizedRead(BeginCopy) return is very straightforward. Additionally, this change helps with copying from Cypress to Sequoia, since now it's much easier to process each node individually (which is needed for Sequoia to work and not clog up one cell).

    In the next PR, I'm planning on:
    1. Making BeginCopy a read-request (removing snapshot locks). So the verb will only return meta-information about the requested nodes.
    2. Adding a verb, executed before BeginCopy, where the locks would be taken and the general structure of the tree is returned.

    017ec9971e8e0a611a7286ed748b6071cfc89048