ffmpeg - Mirror of FFmpeg git repo


author	Martin Storsjö <martin@martin.st>	2019-02-01 00:12:46 +0200
committer	Martin Storsjö <martin@martin.st>	2019-02-19 11:46:28 +0200
commit	7e42d5f0ab2aeac811fd01e122627c9198b13f01 (patch)
tree	7a04b33369318adf05a6a859f80519abff58b71d /libavcodec/amfenc.h
parent	49f9c4272c4029b57ff300d908ba03c6332fc9c4 (diff)
download	ffmpeg-7e42d5f0ab2aeac811fd01e122627c9198b13f01.tar.gz

aarch64: vp8: Optimize vp8_idct_add_neon for aarch64

The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithmetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:             Cortex A53    A72    A73
vp8_idct_add_neon:        79.7   67.5   65.0
After:
vp8_idct_add_neon:        67.7   64.8   66.7

Signed-off-by: Martin Storsjö <martin@martin.st>
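For readers unfamiliar with the two instructions named above, the following is a rough scalar sketch of the step they implement: uaddw widens 8-bit destination pixels and adds them to the 16-bit residual, and sqxtun narrows the signed 16-bit sums back to 8 bits with unsigned saturation. This is not the FFmpeg implementation; the function names `clamp_u8` and `add_residual_4x4` are hypothetical, and the residual is assumed to be the already-computed 4x4 inverse transform output.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar equivalent of sqxtun for one lane: saturate a signed value to 0..255. */
static uint8_t clamp_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * Hypothetical helper: add a 4x4 block of 16-bit residuals to the
 * destination pixels and clamp. The widening add mirrors uaddw,
 * the clamp mirrors sqxtun.
 */
static void add_residual_4x4(uint8_t *dst, ptrdiff_t stride,
                             const int16_t res[16])
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            int sum = dst[y * stride + x] + res[y * 4 + x]; /* widen and add */
            dst[y * stride + x] = clamp_u8(sum);            /* saturating narrow */
        }
    }
}
```

A 4x4 block only has four pixels per row, so each row fills half of a 64-bit vector. Running the add and narrow once per row on half-filled vectors is presumably where the count of 4 uaddw and 4 sqxtun comes from, in exchange for not having to pack two rows into one register first.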

Diffstat (limited to 'libavcodec/amfenc.h')

0 files changed, 0 insertions, 0 deletions

