author | Krzysztof Pyrkosz <ffmpeg@szaka.eu> | 2025-02-28 22:21:50 +0100 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2025-03-02 01:17:53 +0200 |
commit | e8d4c559871ef93fc94a8efb8144f1738eba4c62 (patch) | |
tree | 5825c03dfa1a205177ceb5a8508924738758b5b7 /libavutil/timecode.c | |
parent | 38929b824bcc4b3307af3e0711c5c03b823a83e3 (diff) | |
download | ffmpeg-e8d4c559871ef93fc94a8efb8144f1738eba4c62.tar.gz |
avcodec/aarch64/ac3dsp_neon.S: Optimize ac3_sum_square_butterfly_int32_neon
Instead of calculating a^2, b^2, (a+b)^2 and (a-b)^2, calculate only
a^2, b^2 and 2*a*b in each iteration and derive the latter two,
(a+b)^2 and (a-b)^2, from these three at the end.
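The identity behind this change is (a±b)^2 = a^2 + b^2 ± 2*a*b, which also holds for the sums over all elements. The following is a minimal scalar C sketch of that idea, not the NEON code from this commit. The function names and the `sum[4]` output layout are assumptions made for illustration, and the sketch accumulates a*b and doubles it once at the end rather than accumulating 2*a*b per iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward form: four multiply-accumulates per element.
 * Hypothetical name and layout: sum[0..3] receive the sums of
 * a^2, b^2, (a+b)^2 and (a-b)^2 respectively. */
static void sum_square_butterfly_ref(int64_t sum[4], const int32_t *a,
                                     const int32_t *b, size_t len)
{
    sum[0] = sum[1] = sum[2] = sum[3] = 0;
    for (size_t i = 0; i < len; i++) {
        int64_t l = a[i], r = b[i];
        int64_t m = l + r, s = l - r;
        sum[0] += l * l;
        sum[1] += r * r;
        sum[2] += m * m;
        sum[3] += s * s;
    }
}

/* Derived form: three multiply-accumulates per element.
 * The sums of (a+b)^2 and (a-b)^2 are reconstructed after the loop.
 * Assumes the inputs are small enough that the 64-bit accumulators
 * do not overflow. */
static void sum_square_butterfly_derived(int64_t sum[4], const int32_t *a,
                                         const int32_t *b, size_t len)
{
    int64_t aa = 0, bb = 0, ab = 0;
    for (size_t i = 0; i < len; i++) {
        int64_t l = a[i], r = b[i];
        aa += l * l;
        bb += r * r;
        ab += l * r;
    }
    sum[0] = aa;
    sum[1] = bb;
    sum[2] = aa + bb + 2 * ab; /* sum of (a+b)^2 */
    sum[3] = aa + bb - 2 * ab; /* sum of (a-b)^2 */
}
```

Both functions should produce identical results for any input that does not overflow the accumulators; the derived form simply trades one multiply per element for two additions after the loop.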
Before and after (for each core, the first line is before and the second is after):
A78
ac3_sum_square_bufferfly_int32_neon: 484.8 ( 2.00x)
ac3_sum_square_bufferfly_int32_neon: 468.2 ( 2.08x)
A72
ac3_sum_square_bufferfly_int32_neon: 793.6 ( 1.26x)
ac3_sum_square_bufferfly_int32_neon: 527.3 ( 1.92x)
Signed-off-by: Martin Storsjö <martin@martin.st>