author | Niklas Haas <git@haasn.dev> | 2025-07-11 20:48:28 +0200 |
---|---|---|
committer | Niklas Haas <git@haasn.dev> | 2025-07-17 12:26:06 +0200 |
commit | e44a1aaeecc14fc396e0c715969ddd3cc939933d | |
tree | 0aae4c0051f61d26dfdf0da0b114981fa0add2da /doc/APIchanges | |
parent | 91f2d146d418d536e14b0d0c2d32f81cb95f6b7f | |
avfilter/x86/scene_sad: add high bit depth AVX2/AVX512 version
Since psadbw only exists for 8-bit elements, we have to emulate it for 16-bit
inputs. The simplest sequence is to use a normal subtraction, which is safe
as long as the inputs do not exceed 32767, so limit this implementation
to 15-bit inputs and below.
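
To illustrate the 32767 bound, here is a minimal scalar sketch in C. It is not the code from this commit: the function names are hypothetical, and taking the absolute value of the signed difference is an assumption about how the subtraction result is used. It models one 16-bit lane at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reinterpret a 16-bit pattern as a signed word, the way a signed SIMD
 * instruction would read a 16-bit lane. */
static int32_t as_signed16(uint16_t u)
{
    return u >= 0x8000 ? (int32_t)u - 65536 : (int32_t)u;
}

/* Lane-wise 16-bit subtraction with wraparound, result read as signed. */
static int32_t wrapping_sub16(uint16_t a, uint16_t b)
{
    return as_signed16((uint16_t)(a - b));
}

/* Hypothetical scalar model of a SAD over samples of at most 15 bits. */
static uint64_t sad_upto15_model(const uint16_t *a, const uint16_t *b, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] <= 32767 && b[i] <= 32767); /* 15-bit precondition */
        int32_t d = wrapping_sub16(a[i], b[i]); /* in [-32767, 32767] here */
        sum += (uint64_t)(d < 0 ? -d : d);
    }
    return sum;
}
```

With inputs up to 32767 the difference always fits in a signed 16-bit lane. With full 16-bit inputs the same subtraction wraps: in this model, wrapping_sub16(65535, 0) yields -1 rather than 65535.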
For 16-bit inputs, we could in theory instead use pminw / pmaxw to ensure
that the resulting difference does not overflow, but this is slower and also
breaks the subsequent use of pmaddwd, so I opted to skip 16-bit SIMD for
now.
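
The min/max alternative can be sketched in scalar C as follows. This is an illustration only, with hypothetical names. It assumes the accumulation step multiplies each difference by one using pmaddwd, which treats its 16-bit inputs as signed. That is presumably why a full-range unsigned difference breaks it.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference via max - min: never wraps, result in [0, 65535]. */
static uint16_t absdiff_minmax(uint16_t a, uint16_t b)
{
    uint16_t hi = a > b ? a : b;
    uint16_t lo = a > b ? b : a;
    return (uint16_t)(hi - lo);
}

/* Read a 16-bit pattern as a signed word. */
static int32_t as_signed16(uint16_t u)
{
    return u >= 0x8000 ? (int32_t)u - 65536 : (int32_t)u;
}

/* Model of one 32-bit output lane of a signed 16-bit multiply-add against
 * a vector of ones: x0 * 1 + x1 * 1, with x0 and x1 read as signed. */
static int32_t madd_ones_model(uint16_t x0, uint16_t x1)
{
    return as_signed16(x0) + as_signed16(x1);
}

/* Small self-check showing where the signed reading goes wrong. */
static void minmax_model_selfcheck(void)
{
    assert(absdiff_minmax(0, 65535) == 65535);          /* correct unsigned */
    assert(madd_ones_model(100, 200) == 300);           /* small values OK  */
    assert(madd_ones_model(absdiff_minmax(40000, 0), 0) == -25536);
}
```

Differences of 32768 and above are correct as unsigned words but are summed as negative values by the signed multiply-add, so a 16-bit path would need a different accumulation step.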
scene_sad10_c: 114175.6 ( 1.00x)
scene_sad10_avx2: 9617.7 (11.87x)
scene_sad10_avx512: 5208.8 (21.92x)
scene_sad12_c: 114537.8 ( 1.00x)
scene_sad12_avx2: 9614.0 (11.91x)
scene_sad12_avx512: 5186.3 (22.08x)
scene_sad14_c: 114113.9 ( 1.00x)
scene_sad14_avx2: 9612.9 (11.87x)
scene_sad14_avx512: 5186.0 (22.00x)
scene_sad15_c: 114108.9 ( 1.00x)
scene_sad15_avx2: 9612.3 (11.87x)
scene_sad15_avx512: 5186.4 (22.00x)
scene_sad16_c: 114136.0 ( 1.00x)