author | Krzysztof Pyrkosz <ffmpeg@szaka.eu> | 2025-03-03 22:00:23 +0100 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2025-03-04 10:16:44 +0200 |
commit | d765e5f043d981294303fe210d643c5156efeeb3 (patch) | |
tree | 5148d5fe2d3627a4706618cb9f50c68167b73e42 /libavformat/movenc.c | |
parent | 081c865867beaa43b9f3506613e165c8594f3f28 (diff) | |
swscale/aarch64: dotprod implementation of rgba32_to_Y
The idea is to split the 16-bit coefficients into lower and upper halves,
invoke udot for the lower half, shift by 8, and follow with udot for the
upper half.
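A scalar C sketch of the arithmetic behind the split, for illustration only.
It is not the assembly from this patch: udot_lane() is a hypothetical model of
one 32-bit lane of UDOT, the coefficient values are assumed example luma
weights, and the exact place where the real code applies the shift and the
rounding is not shown here. It only demonstrates that two 8-bit dot products
can reproduce a dot product with 16-bit coefficients.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit UDOT lane:
 * four u8 * u8 products are added to the accumulator. */
static uint32_t udot_lane(uint32_t acc, const uint8_t a[4], const uint8_t b[4])
{
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)a[i] * b[i];
    return acc;
}

/* Assumed example 16-bit weights in B, G, R, A order (alpha weight is 0).
 * These are illustrative values, not taken from the patch. */
static const uint16_t coeff[4] = { 3208, 16519, 8414, 0 };

/* Reference: multiply each byte by its full 16-bit coefficient. */
static uint32_t dot_direct(const uint8_t px[4])
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += (uint32_t)coeff[i] * px[i];
    return sum;
}

/* Split form: one 8-bit dot product per coefficient half,
 * with the upper-half result weighted by 256 (shift by 8). */
static uint32_t dot_split(const uint8_t px[4])
{
    uint8_t lo[4], hi[4];

    for (int i = 0; i < 4; i++) {
        lo[i] = coeff[i] & 0xff;  /* lower 8 bits of the coefficient */
        hi[i] = coeff[i] >> 8;    /* upper 8 bits of the coefficient */
    }

    uint32_t low  = udot_lane(0, px, lo);
    uint32_t high = udot_lane(0, px, hi);

    return low + (high << 8);
}
```

For any pixel, dot_split() and dot_direct() return the same sum, since
c = c_lo + 256 * c_hi for every coefficient c.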
Benchmark on A78:
bgra_to_y_128_c: 682.0 ( 1.00x)
bgra_to_y_128_neon: 181.2 ( 3.76x)
bgra_to_y_128_dotprod: 117.8 ( 5.79x)
bgra_to_y_1080_c: 5742.5 ( 1.00x)
bgra_to_y_1080_neon: 1472.5 ( 3.90x)
bgra_to_y_1080_dotprod: 906.5 ( 6.33x)
bgra_to_y_1920_c: 10194.0 ( 1.00x)
bgra_to_y_1920_neon: 2589.8 ( 3.94x)
bgra_to_y_1920_dotprod: 1573.8 ( 6.48x)
Signed-off-by: Martin Storsjö <martin@martin.st>