author | Rémi Denis-Courmont <remi@remlab.net> | 2024-05-12 19:26:29 +0300 |
---|---|---|
committer | Rémi Denis-Courmont <remi@remlab.net> | 2024-05-21 21:16:30 +0300 |
commit | d452db8410256550d500864ef7ba5917d6afc864 (patch) | |
tree | 1ecf31b953c9318daf0de265da2c6d3f1ccb83ee /libavcodec/mips/hpeldsp_msa.c | |
parent | 6093367147b72c0f3c78a61597a3452f7f0e510e (diff) | |
download | ffmpeg-d452db8410256550d500864ef7ba5917d6afc864.tar.gz |
lavc/vc1dsp: R-V V vc1_unescape_buffer
Notes:
- The loop is biased toward the case with no escape bytes, as that should be most common.
- The input byte array is slid rather than the (8 times smaller) bit-mask,
as RISC-V V does not provide a bit-mask (or bit-wise) slide instruction.
- There are two comparisons with 0 per iteration, for the same reason.
- In case of match, bytes are copied until the first match, and the loop is
restarted after the escape byte. Vector compression (vcompress.vm) could
discard all escape bytes but that is slower if escape bytes are rare.
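
As a rough scalar illustration of the "copy until the first match, then restart after the escape byte" strategy described in the notes above, the following C sketch may help. It is not the RVV assembly from this commit: `find_escape` and `unescape_sketch` are hypothetical names, and the escape rule (a 0x03 byte preceded by two zero bytes and followed by a byte below 4) is assumed here to match the VC-1 rule that the C reference applies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the index of the first escape byte in
 * src[0..size), or size if there is none. Assumed rule: an escape byte
 * is 0x03 preceded by two zero bytes and followed by a byte below 4. */
static size_t find_escape(const uint8_t *src, size_t size)
{
    for (size_t i = 2; i + 1 < size; i++)
        if (src[i] == 3 && src[i - 1] == 0 && src[i - 2] == 0 &&
            src[i + 1] < 4)
            return i;
    return size;
}

/* Hypothetical sketch: copy src to dst while dropping escape bytes.
 * dst must have room for size bytes. Returns the output length. */
static size_t unescape_sketch(const uint8_t *src, size_t size, uint8_t *dst)
{
    size_t out = 0;

    while (size > 0) {
        size_t n = find_escape(src, size);

        /* Common case: no escape byte, so the whole remainder is copied. */
        memcpy(dst + out, src, n);
        out += n;
        if (n == size)
            break;

        /* Skip the escape byte and restart the search right after it. */
        src  += n + 1;
        size -= n + 1;
    }
    return out;
}
```

With the input bytes `00 00 03 01 AA`, the sketch copies the two leading zeroes, drops the `03`, and restarts on `01 AA`, so four bytes are written. The fast path (no match in the remaining input) is a single bulk copy, which mirrors the bias toward escape-free data mentioned in the first note.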
Further optimisations should be possible, e.g.:
- processing 2 bytes fewer per iteration to get rid of 2 slides,
- taking a short cut if the input vector contains fewer than 2 zeroes.
But this is a good starting point:
T-Head C908:
vc1dsp.vc1_unescape_buffer_c: 12749.5
vc1dsp.vc1_unescape_buffer_rvv_i32: 6009.0
SpacemiT X60:
vc1dsp.vc1_unescape_buffer_c: 11038.0
vc1dsp.vc1_unescape_buffer_rvv_i32: 2061.0
Diffstat (limited to 'libavcodec/mips/hpeldsp_msa.c')
0 files changed, 0 insertions, 0 deletions