author: Rémi Denis-Courmont <remi@remlab.net> 2023-12-16 10:02:08 +0200
committer: Rémi Denis-Courmont <remi@remlab.net> 2023-12-21 17:54:08 +0200
commit: db32f75c635c5783b76e7c3fd8060548d0917180
tree: ee33f660b6621d1bad99c4d06600824263730328
parent: 04cb30750878d0f7df35ede5e261914d48dbb31d
lavc/opusdsp: simplify R-V V postfilter
This skips the round-trip through a scalar register for the sliding 'x' coefficients, improving performance by about 5%. The trick here is that the vector slide-up instruction preserves the elements of the destination vector below the slide offset.

The switch from vfslide1up.vf to vslideup.vi also eliminates the data dependencies between consecutive slides. Since the specifications recommend sticking to power-of-two offsets, we could slide as follows:

    vslideup.vi v8, v0, 2
    vslideup.vi v4, v0, 1
    vslideup.vi v12, v8, 1
    vslideup.vi v16, v8, 2

However, on the device under test, this seems to make performance slightly worse, so this is left for (in)validation with future, better hardware.
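For readers unfamiliar with the two instructions, their element-wise behaviour can be modelled in plain C. This is only a scalar sketch of the semantics as described in the RISC-V vector specification, not the assembly from this patch; the function names `slide_up` and `fslide1up` are made up for illustration, and masking, `vstart` and tail policy are ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical scalar model of "vslideup.vi vd, vs2, offset".
 * Elements vd[0] .. vd[offset - 1] are left untouched, which is the
 * property the commit message relies on. The instruction does not allow
 * vd and vs2 to overlap, hence the assertion.
 */
static void slide_up(float *vd, const float *vs2, size_t offset, size_t vl)
{
    assert(vd != vs2);
    for (size_t i = offset; i < vl; i++)
        vd[i] = vs2[i - offset];
}

/*
 * Hypothetical scalar model of "vfslide1up.vf vd, vs2, rs1".
 * Element 0 is filled from a scalar floating-point register, so the
 * value has to be available in a scalar register first.
 */
static void fslide1up(float *vd, const float *vs2, float rs1, size_t vl)
{
    assert(vd != vs2);
    if (vl == 0)
        return;
    vd[0] = rs1;
    for (size_t i = 1; i < vl; i++)
        vd[i] = vs2[i - 1];
}
```

In this model, `slide_up(v12, v8, 1, vl)` applied after `slide_up(v8, v0, 2, vl)` yields the same elements at indices 3 and above as a direct slide of `v0` by 3. That is why the power-of-two sequence is functionally equivalent, at the cost of making v12 and v16 depend on v8.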
Diffstat (limited to 'Makefile')
0 files changed, 0 insertions, 0 deletions