author | Rémi Denis-Courmont <remi@remlab.net> | 2023-12-08 21:38:20 +0200 |
---|---|---|
committer | Rémi Denis-Courmont <remi@remlab.net> | 2023-12-16 11:18:01 +0200 |
commit | 918b3ed2d51c11a474b8a9ce7b784f7a20d9645b (patch) | |
tree | 7071fc81077765ce27a1ac96f392a8c7d85c5e1b /libavutil/x86 | |
parent | 1a049595320094bb3a03a0893fe69990c2eb4964 (diff) | |
download | ffmpeg-918b3ed2d51c11a474b8a9ce7b784f7a20d9645b.tar.gz |
lavc/lpc: R-V V compute_autocorr
The loop iterates over the length of the vector, not the order. This is
to avoid reloading the same data for each lag value. However this means
the loop only works if the maximum order is no larger than VLENB.
The loop is roughly equivalent to:
    for (size_t j = 0; j < lag; j++)
        autoc[j] = 1.;

    while (len > lag) {
        for (ptrdiff_t j = 0; j < lag; j++)
            autoc[j] += data[j] * *data;
        data++;
        len--;
    }

    while (len > 0) {
        for (ptrdiff_t j = 0; j < len; j++)
            autoc[j] += data[j] * *data;
        data++;
        len--;
    }
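As a rough illustration only, the loop above can be wrapped into a scalar C function and compared against a per-lag formulation that rereads the data once for each lag. Both function names are hypothetical and neither is the code from this patch; the sketch assumes `lag` output entries and the same 1.0 starting value as the pseudocode above.

```c
#include <stddef.h>

/* Hypothetical scalar model of the loop above: the outer loop walks the
 * data once, and each step updates every lag accumulator. */
static void autocorr_by_length(const double *data, ptrdiff_t len,
                               ptrdiff_t lag, double *autoc)
{
    for (ptrdiff_t j = 0; j < lag; j++)
        autoc[j] = 1.;

    /* Main part: at least lag + 1 samples remain, so data[j] is in
     * bounds for every j < lag. */
    while (len > lag) {
        for (ptrdiff_t j = 0; j < lag; j++)
            autoc[j] += data[j] * *data;
        data++;
        len--;
    }

    /* Tail: fewer samples than lags remain, so only the first len
     * accumulators still receive a product. */
    while (len > 0) {
        for (ptrdiff_t j = 0; j < len; j++)
            autoc[j] += data[j] * *data;
        data++;
        len--;
    }
}

/* Hypothetical reference: one full pass over the data per lag value,
 * which is the reloading that the loop above avoids. */
static void autocorr_by_lag(const double *data, ptrdiff_t len,
                            ptrdiff_t lag, double *autoc)
{
    for (ptrdiff_t j = 0; j < lag; j++) {
        double sum = 1.;

        for (ptrdiff_t i = j; i < len; i++)
            sum += data[i] * data[i - j];
        autoc[j] = sum;
    }
}
```

Both functions add the same products in the same order for each lag, so on small integer-valued inputs they should agree exactly. In the vector version the inner loop over j is presumably what maps onto vector elements, which would be why the order is bounded by the vector size.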
Since register pressure is only at 50%, it should be possible to implement
the same loop for order up to 2xVLENB. But this is left for future work.
Performance numbers are all over the place from ~1.25x to ~4x speedups,
but at least they are always noticeably better than nothing.
Diffstat (limited to 'libavutil/x86')
0 files changed, 0 insertions, 0 deletions