path: root/tests/checkasm/float_dsp.c
author: Rémi Denis-Courmont <remi@remlab.net> 2023-11-02 21:08:56 +0200
committer: Rémi Denis-Courmont <remi@remlab.net> 2023-11-06 22:09:30 +0200
commit: adc87a5f7c99c89364b89e51d7c42fcb36cf15b0
tree: 3e2437f743423144c756461a7e66f0abbe0af9ed /tests/checkasm/float_dsp.c
parent: 02594c8c011ca2cdc20334e9bb812ec8f6f37cf3
lavc/opusdsp: rewrite R-V V postfilter
This uses a more traditional approach, allowing processing of up to
period minus two elements per iteration. This also allows the algorithm
to work for any vector length.

As the T-Head C908 device under test can load 16 elements per loop
iteration, there is unsurprisingly a small performance drop when the
period is minimal and the parallelism is capped at 13 elements:

Before:
postfilter_15_c:          21222.2
postfilter_15_rvv_f32:    22007.7
postfilter_512_c:         20189.7
postfilter_512_rvv_f32:   22004.2
postfilter_1022_c:        20189.7
postfilter_1022_rvv_f32:  22004.2

After:
postfilter_15_c:          20189.5
postfilter_15_rvv_f32:     7057.2
postfilter_512_c:         20189.5
postfilter_512_rvv_f32:    5667.2
postfilter_1022_c:        20192.7
postfilter_1022_rvv_f32:   5667.2
Diffstat (limited to 'tests/checkasm/float_dsp.c')
0 files changed, 0 insertions, 0 deletions