path: root/build/scripts/fatbinary_wrapper.py
Commit message | Author | Age | Files | Lines
* [build] cuda: Add .module_id sanity check to CUDA_SRCS | deshevoy | 2026-02-02 | 1 | -0/+11
    IDs generated by different cicc invocations should match
    ISSUE: commit_hash:7cd593cee44b31875e7166709d7614dcfa3f1f14
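The check described in this commit can be sketched as follows. This is a minimal illustration, not the code from fatbinary_wrapper.py: the function names `check_module_ids` and `read_module_ids`, and the assumption that each `.module_id` file holds a single text ID, are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Mapping


def check_module_ids(ids: Mapping[str, str]) -> str:
    """Return the shared module ID, or raise if the IDs disagree.

    `ids` maps a label (for example a file path) to the ID string
    produced by one compiler invocation. Hypothetical helper.
    """
    if not ids:
        raise ValueError("no module IDs given")
    unique = set(ids.values())
    if len(unique) != 1:
        details = ", ".join(f"{name}: {value!r}" for name, value in sorted(ids.items()))
        raise RuntimeError(f"module IDs differ between invocations: {details}")
    return unique.pop()


def read_module_ids(paths: Iterable[str]) -> Mapping[str, str]:
    """Read each file as text (assumed format: one ID per file)."""
    return {str(p): Path(p).read_text().strip() for p in paths}
```

A caller would pass the per-architecture files through `read_module_ids` and then `check_module_ids`, failing the build node on a mismatch.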
* [build] cuda: Introduce CUDA_SRCS macro utilizing parallelized device code compilation | deshevoy | 2026-02-01 | 1 | -0/+30
    Instead of a single graph node launching NVCC to compile .cu for both host and all device architectures CUDA_SRCS generates multiple nodes:
    - node per each device architecture producing PTX and CUBIN
    - node merging all PTX and CUBIN files into a single FATBIN blob
    - node producing .cpp with host code
    - node compiling host .cpp with embedded FATBIN blob
    CUDA_ARCHITECTURES variable is used to determine the list of architectures to compile device code for.
    ISSUE: commit_hash:0a4c2797dd238ae062482af30694df6978301278
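The node layout described in this commit message can be sketched as a small planner. This is only an illustration under stated assumptions: the `Node` type, the `plan_cuda_srcs` function, the node kind labels, the output file naming, and the `sm_80`-style architecture strings are all hypothetical and not taken from the build system itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    kind: str            # hypothetical label for the build step
    inputs: List[str]
    outputs: List[str]


def plan_cuda_srcs(source: str, architectures: List[str]) -> List[Node]:
    """Split one .cu compilation into independent graph nodes."""
    stem = source.rsplit(".", 1)[0]
    nodes: List[Node] = []
    device_outputs: List[str] = []

    # One node per device architecture, each producing PTX and CUBIN.
    for arch in architectures:
        outs = [f"{stem}.{arch}.ptx", f"{stem}.{arch}.cubin"]
        nodes.append(Node("device", [source], outs))
        device_outputs.extend(outs)

    # One node merging all PTX and CUBIN files into a single FATBIN blob.
    fatbin = f"{stem}.fatbin"
    nodes.append(Node("fatbin", device_outputs, [fatbin]))

    # One node producing the .cpp with host code.
    host_cpp = f"{stem}.host.cpp"
    nodes.append(Node("host_cpp", [source], [host_cpp]))

    # One node compiling the host .cpp with the embedded FATBIN blob.
    nodes.append(Node("host_compile", [host_cpp, fatbin], [f"{stem}.o"]))
    return nodes
```

Because the per-architecture nodes depend only on the source file, a build graph executor is free to run them in parallel, which is the point of the split.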