author Martin Storsjö <martin@martin.st> 2017-01-14 13:22:30 +0200
committer Martin Storsjö <martin@martin.st> 2017-02-24 00:03:09 +0200
commit 575e31e931e4178e9f1e24407503c9b4ec0ef9ba
tree f1d574cfcc51f393f8a4918929f7776fb97e5318 /tests
parent 3bf9c48320f25f3d5557485b0202f22ae60748b0
arm: vp9lpf: Implement the mix2_44 function with one single filter pass
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.

The wd=8 filters would require too many temporary registers for
processing all 16 pixels at once though.

Before:                             Cortex A7      A8      A9     A53
vp9_loop_filter_mix2_v_44_16_neon:      289.7   256.2   237.5   181.2
After:
vp9_loop_filter_mix2_v_44_16_neon:      221.2   150.5   177.7   138.0

Signed-off-by: Martin Storsjö <martin@martin.st>
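
The before/after figures in the commit message can be turned into per-core speedup ratios. The following is a minimal sketch, not part of the commit: the numbers are copied from the message, the units are not stated there, and the code assumes lower is better. The names `CORES`, `BEFORE`, `AFTER`, `speedups` and `RATIOS` are hypothetical and used for illustration only.

```python
# Benchmark figures for vp9_loop_filter_mix2_v_44_16_neon, copied from the
# commit message. The message does not state units; this sketch assumes
# that lower values are better.
CORES = ["Cortex A7", "A8", "A9", "A53"]
BEFORE = [289.7, 256.2, 237.5, 181.2]
AFTER = [221.2, 150.5, 177.7, 138.0]


def speedups(before, after):
    """Return the before/after ratio for each core, in order."""
    return [b / a for b, a in zip(before, after)]


# Map each core label to its speedup ratio.
RATIOS = dict(zip(CORES, speedups(BEFORE, AFTER)))

if __name__ == "__main__":
    for core, ratio in RATIOS.items():
        print(f"{core}: {ratio:.2f}x")
```

Under that assumption, every listed core comes out faster after the change, with the largest ratio on the A8.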
Diffstat (limited to 'tests')
0 files changed, 0 insertions, 0 deletions