author | Martin Storsjö <martin@martin.st> | 2016-11-23 10:56:12 +0200 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2017-02-09 12:31:40 +0200 |
commit | 0331c3f5e8cb6e6b53fab7893e91d1be1bfa979c (patch) | |
tree | 8d2882a85649c521a944642c9f3e6621fd7ce7a7 /libavcodec/libx265.c | |
parent | c546147db07d16a76c2fb698d2e8a3057f393475 (diff) | |
download | ffmpeg-0331c3f5e8cb6e6b53fab7893e91d1be1bfa979c.tar.gz |
arm: vp9itxfm: Make the larger core transforms standalone functions
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from
15324 to 12388 bytes.
This causes a small slowdown of a few tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.
| Before | Cortex A7 | A8 | A9 | A53 |
|---|---|---|---|---|
| vp9_inv_dct_dct_16x16_sub4_add_neon | 2063.4 | 1516.0 | 1719.5 | 1245.1 |
| vp9_inv_dct_dct_16x16_sub16_add_neon | 3279.3 | 2454.5 | 2525.2 | 1982.3 |
| vp9_inv_dct_dct_32x32_sub4_add_neon | 10750.0 | 7955.4 | 8525.6 | 6754.2 |
| vp9_inv_dct_dct_32x32_sub32_add_neon | 18574.0 | 17108.4 | 14216.7 | 12010.2 |

| After | Cortex A7 | A8 | A9 | A53 |
|---|---|---|---|---|
| vp9_inv_dct_dct_16x16_sub4_add_neon | 2060.8 | 1608.5 | 1735.7 | 1262.0 |
| vp9_inv_dct_dct_16x16_sub16_add_neon | 3211.2 | 2443.5 | 2546.1 | 1999.5 |
| vp9_inv_dct_dct_32x32_sub4_add_neon | 10682.0 | 8043.8 | 8581.3 | 6810.1 |
| vp9_inv_dct_dct_32x32_sub32_add_neon | 18522.4 | 17277.4 | 14286.7 | 12087.9 |
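
For readers who want to compare the figures above directly, the following is a minimal sketch (not part of the commit) that computes the per-core cycle difference and the relative object size reduction. The numbers are copied from this message; the helper names `cycle_deltas` and `size_reduction_percent` are hypothetical and exist only for this illustration.

```python
# Cycle counts copied from the Before/After figures in this commit message.
# Column order: Cortex A7, A8, A9, A53.
CORES = ("A7", "A8", "A9", "A53")

BEFORE = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": (2063.4, 1516.0, 1719.5, 1245.1),
    "vp9_inv_dct_dct_16x16_sub16_add_neon": (3279.3, 2454.5, 2525.2, 1982.3),
    "vp9_inv_dct_dct_32x32_sub4_add_neon": (10750.0, 7955.4, 8525.6, 6754.2),
    "vp9_inv_dct_dct_32x32_sub32_add_neon": (18574.0, 17108.4, 14216.7, 12010.2),
}

AFTER = {
    "vp9_inv_dct_dct_16x16_sub4_add_neon": (2060.8, 1608.5, 1735.7, 1262.0),
    "vp9_inv_dct_dct_16x16_sub16_add_neon": (3211.2, 2443.5, 2546.1, 1999.5),
    "vp9_inv_dct_dct_32x32_sub4_add_neon": (10682.0, 8043.8, 8581.3, 6810.1),
    "vp9_inv_dct_dct_32x32_sub32_add_neon": (18522.4, 17277.4, 14286.7, 12087.9),
}


def cycle_deltas(before, after):
    """Return {function: {core: after - before}}; positive means slower."""
    return {
        name: {
            core: round(after[name][i] - before[name][i], 1)
            for i, core in enumerate(CORES)
        }
        for name in before
    }


def size_reduction_percent(old_bytes, new_bytes):
    """Relative size reduction, in percent of the old size."""
    return 100.0 * (old_bytes - new_bytes) / old_bytes


if __name__ == "__main__":
    # Object size of libavcodec/arm/vp9itxfm_neon.o before and after.
    print(f"object size: {size_reduction_percent(15324, 12388):.1f}% smaller")
    for name, per_core in cycle_deltas(BEFORE, AFTER).items():
        print(name, per_core)
```

Positive deltas indicate a slowdown after the change and negative deltas a speedup. Benchmark noise is not modelled here, so small differences should be read with caution.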
Signed-off-by: Martin Storsjö <martin@martin.st>