author | Martin Storsjö <martin@martin.st> | 2017-01-14 13:22:30 +0200 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2017-03-11 13:14:51 +0200 |
commit | a88db8b9a016fe47997029e3653cdac4777994b4 (patch) | |
tree | 3f38e334c94cb0fec00c77abc4738410fc79966e /libavcodec/arm/audiodsp_init_neon.c | |
parent | f32690a298badbf2df66319e9b38236ad3d3e321 (diff) | |
arm: vp9lpf: Implement the mix2_44 function with one single filter pass
For this case, with 8 inputs but only changing 4 of them, we can fit
all 16 input pixels into a q register, and still have enough temporary
registers for doing the loop filter.
The wd=8 filters, however, would require too many temporary registers
to process all 16 pixels at once.
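
As a rough illustration of the restructuring described above, here is a minimal scalar C sketch that contrasts two 8-pixel passes with one 16-pixel pass using per-lane thresholds. Everything in it is an assumption made for illustration: `filter_column` is a hypothetical stand-in rather than the real VP9 wd=4 arithmetic, only the `E` threshold is modelled, and the packing of the two halves' thresholds into the low and high bytes of `E` is assumed, not taken from this commit. The actual change is NEON assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the wd=4 filter (NOT the real VP9 math):
 * if the step across the edge is within the threshold, pull p0 and q0
 * a quarter of the way toward each other. */
static void filter_column(uint8_t *p0, uint8_t *q0, int E)
{
    int diff = *q0 - *p0;
    if (abs(diff) <= E) {
        int adj = diff / 4;
        *p0 = (uint8_t)(*p0 + adj);
        *q0 = (uint8_t)(*q0 - adj);
    }
}

/* Filter 8 adjacent columns with one shared threshold. */
static void filter_8(uint8_t *p0, uint8_t *q0, int E)
{
    for (int i = 0; i < 8; i++)
        filter_column(p0 + i, q0 + i, E);
}

/* Old shape: two separate 8-column passes, one threshold each.
 * Assumed packing: low byte of E for the first half, high byte for the second. */
static void mix2_two_passes(uint8_t *p0, uint8_t *q0, int E)
{
    assert(E >= 0 && E <= 0xffff);
    filter_8(p0, q0, E & 0xff);
    filter_8(p0 + 8, q0 + 8, E >> 8);
}

/* New shape: expand the thresholds to one value per lane, then run a
 * single pass over all 16 columns (one q register's worth of bytes). */
static void mix2_single_pass(uint8_t *p0, uint8_t *q0, int E)
{
    uint8_t e_lane[16];

    assert(E >= 0 && E <= 0xffff);
    for (int i = 0; i < 16; i++)
        e_lane[i] = (uint8_t)(i < 8 ? (E & 0xff) : (E >> 8));
    for (int i = 0; i < 16; i++)
        filter_column(p0 + i, q0 + i, e_lane[i]);
}
```

Both functions should give identical output for the same input. The single-pass form is the shape that maps onto 16-lane vector instructions, at the cost of keeping every intermediate value 16 lanes wide.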

| | Cortex A7 | A8 | A9 | A53 |
|---|---|---|---|---|
| Before: vp9_loop_filter_mix2_v_44_16_neon | 289.7 | 256.2 | 237.5 | 181.2 |
| After: vp9_loop_filter_mix2_v_44_16_neon | 221.2 | 150.5 | 177.7 | 138.0 |

This is cherrypicked from libav commit
575e31e931e4178e9f1e24407503c9b4ec0ef9ba.
Signed-off-by: Martin Storsjö <martin@martin.st>