author | John Cox <jc@kynesim.co.uk> | 2023-07-04 14:04:44 +0000 |
---|---|---|
committer | Martin Storsjö <martin@martin.st> | 2023-07-06 00:21:05 +0300 |
commit | 697533e76dbea8cc7fd6a0642bc60050cc05ead8 (patch) | |
tree | 6e42886cd5d23cbfbb42ac12941e7dc49a79b1cb /tests/ref/seek/vsynth_lena-rgb | |
parent | 94cb94a2c0910d364a7181fc5cc0e9556b777d0a (diff) | |
download | ffmpeg-697533e76dbea8cc7fd6a0642bc60050cc05ead8.tar.gz |
avfilter/vf_bwdif: Add a filter_line3 method for optimisation
Add an optional filter_line3 to the available optimisations.
filter_line3 is equivalent to filter_line, memcpy, filter_line.
filter_line shares quite a number of loads and some calculations
with its next iteration, and testing shows that an aarch64 NEON
filter_line3 performs 30% better than two filter_line calls
and a memcpy.
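The equivalence described above can be sketched with a toy kernel. This is only an illustration: toy_filter_line and toy_filter_line3 are hypothetical names, and the averaging filter stands in for the real bwdif kernel, whose signature and arithmetic are not reproduced here. It shows the reference behaviour that a fused three-line routine would be expected to match.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-line deinterlace kernel (NOT the bwdif
 * filter): rebuilds one missing line as the rounded average of the lines
 * directly above and below it in cur. */
static void toy_filter_line(uint8_t *dst, const uint8_t *cur,
                            int stride, int w)
{
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((cur[x - stride] + cur[x + stride] + 1) >> 1);
}

/* Reference composition of a three-line call:
 * filter line 0, copy line 1 unchanged, filter line 2. */
static void toy_filter_line3(uint8_t *dst, int dstride,
                             const uint8_t *cur, int stride, int w)
{
    toy_filter_line(dst, cur, stride, w);
    memcpy(dst + dstride, cur + stride, (size_t)w);
    toy_filter_line(dst + 2 * dstride, cur + 2 * stride, stride, w);
}
```

In this sketch the copied middle line is an input to both filtered lines, which is the kind of shared load the message refers to; an optimised version could read it once instead of three times.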
Adds a test for vf_bwdif filter_line3 to checkasm.
Rounds job start lines down to a multiple of 4. This means that if
filter_line3 exists, filter_line is no longer occasionally called
once at the end of a slice depending on thread count. The final slice
may do up to 3 extra lines, but filter_edge is faster than filter_line,
so it is unlikely to create any noticeable thread load variation.
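The rounding of job start lines can be sketched as below. This is a minimal sketch, not the code from the patch: the helper name toy_job_start and the even split of h lines across nb_jobs are assumptions made for illustration.

```c
/* Hypothetical helper: first line handled by job jobnr when h lines are
 * split across nb_jobs jobs, rounded down to a multiple of 4.
 * A job index at or past nb_jobs yields h, so job N's end can be taken
 * as the start of job N + 1. */
static int toy_job_start(int jobnr, int nb_jobs, int h)
{
    if (jobnr >= nb_jobs)
        return h;
    return ((h * jobnr) / nb_jobs) & ~3;
}
```

With h = 100 and 3 jobs, the unrounded boundaries 33 and 66 become 32 and 64, so the slices are 0-32, 32-64 and 64-100; the last slice absorbs the lines that rounding removed from the earlier boundaries.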
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
Diffstat (limited to 'tests/ref/seek/vsynth_lena-rgb')
0 files changed, 0 insertions, 0 deletions