path: root/libavcodec/vulkan_ffv1.c
Commit message | Author | Age | Files | Lines
* vulkan/ffv1dec: fix FFVkSPIRVCompiler leak | averne | 2025-06-11 | 1 | -35/+23
* vulkan/ffv1dec: fix leak in FFVulkanDecodeShared | averne | 2025-06-11 | 1 | -0/+2
* vulkan_ffv1: pipe through slice decoding status | Lynne | 2025-05-20 | 1 | -17/+43
* ffv1enc_vulkan: switch to 2-line cache, unify prediction code | Lynne | 2025-05-20 | 1 | -1/+4
* vulkan_ffv1: add cached symbol reader for AMD | Lynne | 2025-04-14 | 1 | -1/+6
    Speeds up everything on AMD by 3x. This uses 32 local invocations to load state into cache, as well as to do the RCT faster.
* vulkan_ffv1: remove need for scratch data during setup | Lynne | 2025-04-14 | 1 | -22/+1
    This saves on some VRAM, but mainly allows for a more unified path.
* vulkan_ffv1: externalize extended lookup check | Lynne | 2025-04-14 | 1 | -0/+6
    8% speedup on NVIDIA at 4k.
* ffv1/vulkan: redo context count tracking and quant_table_idx management | Lynne | 2025-04-14 | 1 | -14/+8
    This commit also makes it possible for the encoder to choose a different quantization table on a per-slice basis, as well as adding this capability to the decoder. Also, this commit fully fixes decoding of context=1 encoded files.
* vulkan_ffv1: cache only 2 lines when decoding RGB | Lynne | 2025-04-14 | 1 | -202/+81
    This reduces the intermediate VRAM used for RGB decoding by a factor of 100 for 6k video. This also speeds the decoder up by 16% for 4k RGB24 and 31% for 6k video. This is equivalent to what the software decoder does, but with fewer pointers.
* vulkan_ffv1: improve buffer barrier correctness for slice state | Lynne | 2025-04-14 | 1 | -3/+2
    This is likely a nano-optimization, but it's more correct.
* vulkan_ffv1: fix reset shader dependencies | Lynne | 2025-04-14 | 1 | -19/+17
    Without a barrier upfront, the reset shader may read data fields not yet set by the setup shader.
* vulkan_ffv1: fallback to upload if mapping packet fails, fix fallback | Lynne | 2025-04-14 | 1 | -12/+7
    The commit which added support for host mapping accidentally broke the original upload route. Fix it for the (very few) drivers without host mapping.
* vulkan_ffv1: allocate just as much memory for slice state as needed | Lynne | 2025-04-14 | 1 | -4/+4
    Rather than always using the maximum allowed slices, just use the number of slices present in this frame.
* vulkan_ffv1: remove unused define | Lynne | 2025-04-14 | 1 | -2/+0
    Leftover debug macro.
* vulkan_ffv1: enable acceleration on Intel | Lynne | 2025-04-14 | 1 | -14/+0
    Fixed by previous commit.
* ffv1: add a Vulkan-based decoder | Lynne | 2025-03-17 | 1 | -0/+1317
    This patch adds a fully-featured level 3 and 4 decoder for FFV1, supporting Golomb and all Range coding variants, all pixel formats, and all features, except for the newly added floating-point formats. On a 6000 Ada, for 3840x2160 bgr0 content at 50 Mbps (standard desktop recording), it is able to do 400 fps. An Alder Lake with 24 threads can barely do 100 fps.
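
The "cache only 2 lines when decoding RGB" entry above says the decoder now keeps just two lines of intermediate samples, as the software decoder does. The following is a minimal CPU-side sketch of that idea in C, not the shader code from the commit. It assumes the residuals are already entropy-decoded, uses the median predictor over the left, top and top-left neighbours, and treats samples outside the frame as 0 (a simplification that may differ from the exact FFV1 border rules). The names `median3` and `reconstruct_two_line` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Median of three integers: max(min(a, b), min(max(a, b), c)). */
static int median3(int a, int b, int c)
{
    if (a > b) { int t = a; a = b; b = t; } /* ensure a <= b */
    if (b > c) b = c;                       /* b = min(b, c) */
    return a > b ? a : b;
}

/*
 * Hypothetical helper: rebuild a w x h plane from prediction residuals while
 * holding only two lines of samples in `cache`, which must have room for
 * 2 * (w + 1) entries. Each cached line has one extra entry at index 0 that
 * acts as the left border.
 * Assumption: out-of-frame neighbours are 0 (simplified border handling).
 */
static void reconstruct_two_line(const int16_t *resid, int16_t *out,
                                 int w, int h, int16_t *cache)
{
    assert(w > 0 && h > 0);

    int16_t *prev = cache;           /* line above the current one */
    int16_t *cur  = cache + (w + 1); /* line being reconstructed   */
    memset(cache, 0, 2 * (size_t)(w + 1) * sizeof(*cache));

    for (int y = 0; y < h; y++) {
        cur[0] = 0; /* left border of the current line */
        for (int x = 0; x < w; x++) {
            int L  = cur[x];      /* left neighbour     */
            int T  = prev[x + 1]; /* top neighbour      */
            int TL = prev[x];     /* top-left neighbour */
            int pred = median3(L, T, L + T - TL);

            cur[x + 1]     = (int16_t)(pred + resid[y * w + x]);
            out[y * w + x] = cur[x + 1];
        }
        /* Swap the two lines instead of keeping the whole plane around. */
        int16_t *tmp = prev; prev = cur; cur = tmp;
    }
}
```

With this layout the intermediate storage scales with the line width rather than with the frame height, which is the kind of saving the commit message describes for 6k video.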