author | sunyuechi <sunyuechi@iscas.ac.cn> | 2024-05-30 23:26:53 +0800 |
---|---|---|
committer | Rémi Denis-Courmont <remi@remlab.net> | 2024-06-12 18:38:41 +0300 |
commit | 8d9fb7b5cf0a3b32e2e59be5c32cbe0dab84567b (patch) | |
tree | d6fcb030ec056b8cd61dbeded03b7dab88fc5209 /tests/ref/fate/g2m4 | |
parent | d72a5fe719c01da07af30e4402a7c3cd994b4cfc (diff) | |
download | ffmpeg-8d9fb7b5cf0a3b32e2e59be5c32cbe0dab84567b.tar.gz |
lavc/vp8dsp: R-V V put_bilin_h v unroll
Since len < 64, there are enough registers to unroll the loop
directly (a4 is even).
Unrolling has a further benefit in the vertical case: it needs one
load fewer than the horizontal case.
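The vertical-case saving can be illustrated with a minimal scalar C sketch. This is not the RVV assembly from the commit: the function names `bilin_v_ref` and `bilin_v_unroll2` and the explicit `w` parameter are hypothetical, the unroll factor of two is an assumption, and the filter is assumed to be the usual VP8 bilinear form (weights `8 - my` and `my`, rounding by 4, shift by 3). When two output rows are produced per iteration, the middle source row is loaded once and used for both, so three row loads replace four.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: one output row per iteration,
 * two source rows are read for every output row. */
static void bilin_v_ref(uint8_t *dst, ptrdiff_t dstride,
                        const uint8_t *src, ptrdiff_t sstride,
                        int w, int h, int my)
{
    int a = 8 - my, b = my;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = (a * src[x] + b * src[x + sstride] + 4) >> 3;
        dst += dstride;
        src += sstride;
    }
}

/* Hypothetical unrolled variant: two output rows per iteration.
 * Assumes h is even. Row r1 is loaded once and shared by both
 * output rows, so each iteration reads three rows instead of four. */
static void bilin_v_unroll2(uint8_t *dst, ptrdiff_t dstride,
                            const uint8_t *src, ptrdiff_t sstride,
                            int w, int h, int my)
{
    int a = 8 - my, b = my;

    for (int y = 0; y < h; y += 2) {
        const uint8_t *r0 = src;
        const uint8_t *r1 = src + sstride;
        const uint8_t *r2 = src + 2 * sstride;

        for (int x = 0; x < w; x++) {
            int p0 = r0[x], p1 = r1[x], p2 = r2[x];

            dst[x]           = (a * p0 + b * p1 + 4) >> 3;
            dst[x + dstride] = (a * p1 + b * p2 + 4) >> 3;
        }
        dst += 2 * dstride;
        src += 2 * sstride;
    }
}
```

In the horizontal case the two taps come from the same row at offsets `x` and `x + 1`, so under these assumptions there is no row to share between consecutive output rows, which is consistent with the commit message's remark that the vertical case saves a load.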
function | old C908 | old X60 | new C908 | new X60 |
---|---|---|---|---|
vp8_put_bilin4_h_c | 6.2 | 5.5 | 6.2 | 5.5 |
vp8_put_bilin4_h_rvv_i32 | 2.2 | 2.0 | 1.5 | 1.5 |
vp8_put_bilin4_v_c | 6.5 | 5.7 | 6.2 | 5.7 |
vp8_put_bilin4_v_rvv_i32 | 2.2 | 2.0 | 1.2 | 1.5 |
vp8_put_bilin8_h_c | 24.2 | 21.5 | 24.2 | 21.5 |
vp8_put_bilin8_h_rvv_i32 | 5.2 | 4.7 | 3.5 | 3.5 |
vp8_put_bilin8_v_c | 24.5 | 21.7 | 24.5 | 21.7 |
vp8_put_bilin8_v_rvv_i32 | 5.2 | 4.7 | 3.5 | 3.2 |
vp8_put_bilin16_h_c | 48.0 | 42.7 | 48.0 | 42.7 |
vp8_put_bilin16_h_rvv_i32 | 5.7 | 5.0 | 5.2 | 4.5 |
vp8_put_bilin16_v_c | 48.2 | 43.0 | 48.2 | 42.7 |
vp8_put_bilin16_v_rvv_i32 | 5.7 | 5.2 | 4.5 | 4.2 |
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
Diffstat (limited to 'tests/ref/fate/g2m4')
0 files changed, 0 insertions, 0 deletions