field | value | date |
---|---|---|
author | Martin Storsjö <martin@martin.st> | 2017-01-02 22:08:41 +0200 |
committer | Martin Storsjö <martin@martin.st> | 2017-02-24 00:03:44 +0200 |
commit | 65aa002d54433154a6924dc13e498bec98451ad0 | |
tree | eae3fb9f11250ef3040e2743b6b6b03b13a6a7e4 /tests/ref/fate/dds-dx10-bc1 | |
parent | 402546a17233a8815307df9e14ff88cd70424537 | |
aarch64: vp9itxfm: Avoid reloading the idct32 coefficients
The idct32x32 function actually pushed d8-d15 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.
After this, we can still skip pushing d12-d15.
Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3
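To put the two numbers above in proportion, here is a minimal sketch that computes the relative saving. The figures are copied from this message; the unit is not stated here, and the helper name `relative_saving` is purely illustrative and not part of FFmpeg.

```python
# Timings copied from the commit message; the unit is not stated there.
before = 8128.3
after = 8053.3


def relative_saving(before: float, after: float) -> float:
    """Return the fraction of the original time saved by the change."""
    return (before - after) / before


saving = relative_saving(before, after)
# Report the absolute difference and the saving as a percentage.
print(f"saved {before - after:.1f} ({saving:.2%})")
```

Under these numbers the saving comes out a little under 1% for the full 32x32 case.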
Signed-off-by: Martin Storsjö <martin@martin.st>