path: root/build/plugins/cuda.py
Commit message | Author | Age | Files | Lines
* [build] cuda: Fix CUB with CUDA_SRCS() | deshevoy | 2026-03-26 | 1 | -2/+12
  Specify `__CUDA_ARCH_LIST__` explicitly so that the CUB namespace stays the same across all nvcc invocations. commit_hash:2100ccb2307100378bcead498fd34cd11e44c566
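  To illustrate the commit above: nvcc defines `__CUDA_ARCH_LIST__` as a comma-separated, ascending list of `__CUDA_ARCH__` values for the architectures of one invocation, so per-architecture invocations would otherwise each see a different list. The sketch below shows one way a shared value could be derived from a list of architecture names. The function name `cuda_arch_list` and the idea of deriving the value this way are assumptions for illustration; the commit does not show how the plugin computes or passes it.

```python
import re


def cuda_arch_list(architectures):
    """Build a __CUDA_ARCH_LIST__-style value (hypothetical helper).

    Accepts names such as "sm_80", "sm_90a" or "sm_100f" and returns the
    matching __CUDA_ARCH__ values, deduplicated and sorted ascending.
    """
    values = set()
    for arch in architectures:
        # An optional "a" or "f" suffix marks architecture- or family-specific
        # features; it does not change the numeric __CUDA_ARCH__ value.
        match = re.fullmatch(r"(?:sm|compute)_(\d+)[af]?", arch)
        if match is None:
            raise ValueError(f"unrecognized architecture: {arch!r}")
        values.add(int(match.group(1)) * 10)
    return ",".join(str(value) for value in sorted(values))


if __name__ == "__main__":
    # Every per-architecture invocation would receive the same string.
    print(cuda_arch_list(["sm_90a", "sm_80", "sm_100f"]))
```

  Passing one precomputed string to every invocation keeps anything keyed on the list, such as the CUB namespace mentioned in the commit, identical across the separately compiled pieces.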
* [build] cuda: Disable some warnings when compiling host code | deshevoy | 2026-02-09 | 1 | -1/+8
  nvcc disables them implicitly. ISSUE: commit_hash:0b68decce1f030902bd770b8b98fc8102c97e738
* [build] cuda: Add .module_id sanity check to CUDA_SRCS | deshevoy | 2026-02-02 | 1 | -0/+1
  IDs generated by different cicc invocations should match. ISSUE: commit_hash:7cd593cee44b31875e7166709d7614dcfa3f1f14
* [build] cuda: Fix CUDA_SRCS to support architecture- and family-specific features | deshevoy | 2026-02-02 | 1 | -11/+3
  E.g. sm_90a or sm_100f. ISSUE: commit_hash:250df064a8abcac925db676565582b5ef05401bb
* [build] cuda: Fix | deshevoy | 2026-02-01 | 1 | -1/+1
  commit_hash:f73df3ec27f0695b21e7047ee465b15b201ea06b
* [build] cuda: Introduce CUDA_SRCS macro utilizing parallelized device code compilation | deshevoy | 2026-02-01 | 1 | -0/+45
  Instead of a single graph node launching NVCC to compile .cu for both host and all device architectures, CUDA_SRCS generates multiple nodes:
  - a node per device architecture producing PTX and CUBIN
  - a node merging all PTX and CUBIN files into a single FATBIN blob
  - a node producing .cpp with host code
  - a node compiling host .cpp with the embedded FATBIN blob
  The CUDA_ARCHITECTURES variable is used to determine the list of architectures to compile device code for.
  ISSUE: commit_hash:0a4c2797dd238ae062482af30694df6978301278
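  The node layout described in the CUDA_SRCS commit can be sketched as a small planning function. This is a sketch of the graph shape only, not the plugin's code: the `Node` class, the `plan_cuda_src` function, the node kind names, and all output file names are hypothetical, and no real build-system API is used.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One hypothetical build-graph node: a kind, its inputs, its outputs."""
    kind: str
    inputs: list
    outputs: list


def plan_cuda_src(src, architectures):
    """Return the nodes for one .cu file, following the commit description."""
    stem = src.rsplit(".", 1)[0]
    nodes = []
    device_outputs = []
    # One node per device architecture, each producing PTX and CUBIN.
    # These nodes share no outputs, so a build system may run them in parallel.
    for arch in architectures:
        outputs = [f"{stem}.{arch}.ptx", f"{stem}.{arch}.cubin"]
        nodes.append(Node("device", [src], outputs))
        device_outputs.extend(outputs)
    # One node merging all PTX and CUBIN files into a single FATBIN blob.
    fatbin = f"{stem}.fatbin"
    nodes.append(Node("fatbin", device_outputs, [fatbin]))
    # One node producing the .cpp with host code.
    host_cpp = f"{stem}.host.cpp"
    nodes.append(Node("host_cpp", [src], [host_cpp]))
    # One node compiling the host .cpp with the embedded FATBIN blob.
    nodes.append(Node("host_obj", [host_cpp, fatbin], [f"{stem}.o"]))
    return nodes


if __name__ == "__main__":
    # The architecture list stands in for the CUDA_ARCHITECTURES variable.
    for node in plan_cuda_src("kernel.cu", ["sm_80", "sm_90a"]):
        print(node.kind, node.inputs, "->", node.outputs)
```

  With N architectures the plan contains N + 3 nodes, and only the FATBIN merge and the final host compile have to wait for the device nodes.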