author | Rémi Denis-Courmont <remi@remlab.net> | 2023-12-16 10:02:08 +0200 |
---|---|---|
committer | Rémi Denis-Courmont <remi@remlab.net> | 2023-12-21 17:54:08 +0200 |
commit | db32f75c635c5783b76e7c3fd8060548d0917180 (patch) | |
tree | ee33f660b6621d1bad99c4d06600824263730328 /libavdevice/v4l2-common.c | |
parent | 04cb30750878d0f7df35ede5e261914d48dbb31d (diff) | |
lavc/opusdsp: simplify R-V V postfilter
This skips the round-trip through a scalar register for the sliding 'x'
coefficients, improving performance by about 5%. The trick here is that
the vector slide-up instruction leaves the destination vector's elements
at indices below the slide offset unchanged.
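To illustrate the preserved-elements behaviour, here is a minimal scalar model in Python. It is only a sketch: it ignores masking, tail policy and vstart, assumes vl equals the list length, and the `history` and `samples` values are made up for the example rather than taken from the postfilter code.

```python
def vslideup(vd, vs2, offset):
    """Scalar model of RVV vslideup: result[i + offset] = vs2[i].

    Elements of vd at indices below `offset` are left untouched.
    Assumption: vl == len(vd); masking and tail policy are ignored.
    """
    out = list(vd)
    for i in range(offset, len(vd)):
        out[i] = vs2[i - offset]
    return out


def vfslide1up(vs2, scalar):
    """Scalar model of vfslide1up.vf: element 0 is taken from a scalar register."""
    return [scalar] + list(vs2[:-1])


# Hypothetical data: element 0 of the destination already holds an older sample.
history = [9.0, 0.0, 0.0, 0.0]
samples = [1.0, 2.0, 3.0, 4.0]

# The older sample survives the slide-up, so no scalar register is needed.
via_vector = vslideup(history, samples, 1)
# The previous approach has to feed the same value in through a scalar.
via_scalar = vfslide1up(samples, 9.0)
```

In this model both calls give the same vector, but only `vfslide1up` needs the older sample to be available in a scalar register first.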
The switch from vfslide1up.vf to vslideup.vi also makes it possible to
eliminate data dependencies between consecutive slides. Since the
specifications recommend sticking to power-of-two offsets, we could slide
as follows:
    vslideup.vi v8, v0, 2
    vslideup.vi v4, v0, 1
    vslideup.vi v12, v8, 1
    vslideup.vi v16, v8, 2
However, on the device under test, this seems to make performance slightly
worse, so this is left for (in)validation with better future hardware.
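As a rough sanity check of the power-of-two scheme, the Python sketch below compares it with a serial chain of offset-1 slides. The serial chain is an assumption used for comparison, not necessarily the committed sequence, and the destination contents (`dst`) are placeholder values. Only elements at or above the total offset are compared, because the lower elements depend on what each destination register held beforehand.

```python
def vslideup(vd, vs2, offset):
    """Scalar model of RVV vslideup (assumes vl == len(vd), no masking)."""
    out = list(vd)
    for i in range(offset, len(vd)):
        out[i] = vs2[i - offset]
    return out


VL = 8
v0 = [float(i + 1) for i in range(VL)]
# Placeholder destination contents, distinct per register for visibility.
dst = {k: [-float(k)] * VL for k in (1, 2, 3, 4)}

# Assumed serial chain: each slide depends on the previous one (depth 4).
chain = {1: vslideup(dst[1], v0, 1)}
chain[2] = vslideup(dst[2], chain[1], 1)
chain[3] = vslideup(dst[3], chain[2], 1)
chain[4] = vslideup(dst[4], chain[3], 1)

# Power-of-two scheme from the message: v8, v4, v12, v16 (depth 2).
tree = {2: vslideup(dst[2], v0, 2)}       # vslideup.vi v8,  v0, 2
tree[1] = vslideup(dst[1], v0, 1)         # vslideup.vi v4,  v0, 1
tree[3] = vslideup(dst[3], tree[2], 1)    # vslideup.vi v12, v8, 1
tree[4] = vslideup(dst[4], tree[2], 2)    # vslideup.vi v16, v8, 2

# Elements at indices >= k agree; lower ones depend on the destinations.
same_upper = all(chain[k][k:] == tree[k][k:] for k in (1, 2, 3, 4))
```

In this model the longest dependency path drops from four slides to two, which is the potential benefit. Whether it pays off depends on the hardware, as noted above.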