path: root/libavutil/hwcontext.c
author: Niklas Haas <git@haasn.dev> 2025-07-11 20:48:28 +0200
committer: Niklas Haas <git@haasn.dev> 2025-07-17 12:26:06 +0200
commit: e44a1aaeecc14fc396e0c715969ddd3cc939933d
tree: 0aae4c0051f61d26dfdf0da0b114981fa0add2da /libavutil/hwcontext.c
parent: 91f2d146d418d536e14b0d0c2d32f81cb95f6b7f
avfilter/x86/scene_sad: add high bit depth AVX2/AVX512 version

Since psadbw only exists for 8-bits, we have to emulate it for 16-bit
inputs. The simplest sequence is to use a normal subtraction, which is
safe as long as the inputs do not exceed 32767 - so limit this
implementation to 15-bit inputs and below.

For 16-bit inputs, we could in theory instead use a pminw / pmaxw to
ensure the resulting difference does not overflow, but this is slower,
and also breaks the subsequent use of pmaddwd, so I opted to skip 16-bit
SIMD for now.

scene_sad10_c:        114175.6 ( 1.00x)
scene_sad10_avx2:       9617.7 (11.87x)
scene_sad10_avx512:     5208.8 (21.92x)
scene_sad12_c:        114537.8 ( 1.00x)
scene_sad12_avx2:       9614.0 (11.91x)
scene_sad12_avx512:     5186.3 (22.08x)
scene_sad14_c:        114113.9 ( 1.00x)
scene_sad14_avx2:       9612.9 (11.87x)
scene_sad14_avx512:     5186.0 (22.00x)
scene_sad15_c:        114108.9 ( 1.00x)
scene_sad15_avx2:       9612.3 (11.87x)
scene_sad15_avx512:     5186.4 (22.00x)
scene_sad16_c:        114136.0 ( 1.00x)
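
The 15-bit limit described above can be illustrated with a scalar C model of
the lane arithmetic. This is only a sketch under assumptions, not the code
from this commit: the function names (sad16_model, sad16_ref, wrap16, abs16)
are hypothetical, and it assumes the sequence is a wrapping 16-bit subtract
(psubw), followed by a 16-bit absolute value (pabsw) and a pmaddwd against a
vector of ones to widen and pair-sum into 32-bit lanes. The commit message
itself only names the subtraction and pmaddwd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret the low 16 bits of v as a signed
 * 16-bit lane, i.e. the wrap-around a psubw result would show. */
static int16_t wrap16(int32_t v)
{
    uint16_t u = (uint16_t)v;
    return (int16_t)(u >= 0x8000 ? (int32_t)u - 0x10000 : (int32_t)u);
}

/* Hypothetical helper: pabsw-like absolute value; INT16_MIN stays as is. */
static int16_t abs16(int16_t d)
{
    if (d == INT16_MIN)
        return d;
    return (int16_t)(d < 0 ? -d : d);
}

/* Scalar model of the assumed SIMD sequence (psubw, pabsw, pmaddwd by 1).
 * n must be even because pmaddwd consumes 16-bit lanes in pairs. */
static int64_t sad16_model(const uint16_t *a, const uint16_t *b, size_t n)
{
    int64_t sum = 0;
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        int16_t d0 = abs16(wrap16((int32_t)a[i]     - b[i]));
        int16_t d1 = abs16(wrap16((int32_t)a[i + 1] - b[i + 1]));
        /* pmaddwd with ones: signed 16x16 products, pair-summed to 32 bit */
        int32_t pair = (int32_t)d0 * 1 + (int32_t)d1 * 1;
        sum += pair;
    }
    return sum;
}

/* Plain reference SAD computed in wide integers; valid for any input. */
static int64_t sad16_ref(const uint16_t *a, const uint16_t *b, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        sum += d < 0 ? -d : d;
    }
    return sum;
}
```

For inputs no larger than 32767 every difference lies in [-32767, 32767], so
the model and the reference agree. With a full 16-bit sample such as 65535
against 0, the subtraction wraps to -1 and the model reports a difference of
1 instead of 65535, which is the overflow the commit avoids by restricting
the SIMD path to 15-bit inputs and below.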
Diffstat (limited to 'libavutil/hwcontext.c')
0 files changed, 0 insertions, 0 deletions