author | Krzysztof Pyrkosz <ffmpeg@szaka.eu> | 2025-03-01 13:59:00 +0100 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2025-03-02 01:17:29 +0200 |
commit | 38929b824bcc4b3307af3e0711c5c03b823a83e3 (patch) | |
tree | 15b70b7fbc7f912ee6c2dd8eb8a3986c55f384af /libavutil/timecode.c | |
parent | 76b18100177843f443276e02a9592bbae9cd09c8 (diff) | |
download | ffmpeg-38929b824bcc4b3307af3e0711c5c03b823a83e3.tar.gz |
swscale/aarch64: Refactor hscale_16_to_15__fs_4
This patch removes the use of stack for temporary state and replaces
interleaved ld4 loads with ld1.
Before/after:

A78
hscale_16_to_15__fs_4_dstW_8_neon: 86.8 ( 1.72x)
hscale_16_to_15__fs_4_dstW_24_neon: 147.5 ( 2.73x)
hscale_16_to_15__fs_4_dstW_128_neon: 614.0 ( 3.14x)
hscale_16_to_15__fs_4_dstW_144_neon: 680.5 ( 3.18x)
hscale_16_to_15__fs_4_dstW_256_neon: 1193.2 ( 3.19x)
hscale_16_to_15__fs_4_dstW_512_neon: 2305.0 ( 3.27x)

hscale_16_to_15__fs_4_dstW_8_neon: 86.0 ( 1.74x)
hscale_16_to_15__fs_4_dstW_24_neon: 106.8 ( 3.78x)
hscale_16_to_15__fs_4_dstW_128_neon: 404.0 ( 4.81x)
hscale_16_to_15__fs_4_dstW_144_neon: 451.8 ( 4.80x)
hscale_16_to_15__fs_4_dstW_256_neon: 760.5 ( 5.06x)
hscale_16_to_15__fs_4_dstW_512_neon: 1520.0 ( 5.01x)

A72
hscale_16_to_15__fs_4_dstW_8_neon: 156.8 ( 1.52x)
hscale_16_to_15__fs_4_dstW_24_neon: 217.8 ( 2.52x)
hscale_16_to_15__fs_4_dstW_128_neon: 906.8 ( 2.90x)
hscale_16_to_15__fs_4_dstW_144_neon: 1014.5 ( 2.91x)
hscale_16_to_15__fs_4_dstW_256_neon: 1751.5 ( 2.96x)
hscale_16_to_15__fs_4_dstW_512_neon: 3469.3 ( 2.97x)

hscale_16_to_15__fs_4_dstW_8_neon: 151.2 ( 1.54x)
hscale_16_to_15__fs_4_dstW_24_neon: 173.4 ( 3.15x)
hscale_16_to_15__fs_4_dstW_128_neon: 660.0 ( 3.98x)
hscale_16_to_15__fs_4_dstW_144_neon: 735.7 ( 4.00x)
hscale_16_to_15__fs_4_dstW_256_neon: 1273.5 ( 4.09x)
hscale_16_to_15__fs_4_dstW_512_neon: 2488.2 ( 4.16x)

Signed-off-by: Martin Storsjö <martin@martin.st>
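
For orientation, here is a minimal scalar sketch of the kind of computation a 4-tap ("fs_4") 16-bit to 15-bit horizontal scaler performs. It is not the code touched by this commit. The function name `hscale_4tap_ref`, its signature, and the explicit shift parameter `sh` are hypothetical and simplified for illustration. It assumes each output sample reads four consecutive input samples starting at `filterPos[i]` and four consecutive coefficients starting at `filter[4 * i]`.

```c
#include <stdint.h>

/* Hypothetical scalar reference for a 4-tap horizontal scaler.
 * The name, signature and explicit shift `sh` are illustrative
 * assumptions, not the swscale API. */
static void hscale_4tap_ref(int16_t *dst, int dstW, const uint16_t *src,
                            const int16_t *filter, const int32_t *filterPos,
                            int sh)
{
    for (int i = 0; i < dstW; i++) {
        const uint16_t *s = src + filterPos[i]; /* first input sample for output i */
        const int16_t  *f = filter + 4 * i;     /* four coefficients for output i */
        int val = 0;

        /* Multiply-accumulate over the four taps. */
        for (int j = 0; j < 4; j++)
            val += s[j] * f[j];

        /* Scale down and clamp to the 15-bit maximum. */
        val >>= sh;
        dst[i] = val > 32767 ? 32767 : (int16_t)val;
    }
}
```

In this sketch both the four input samples and the four coefficients for one output are contiguous in memory, which is the access pattern a vectorized version has to load and combine.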