author | Lynne <dev@lynne.ee> | 2022-05-20 10:00:43 +0200 |
---|---|---|
committer | Lynne <dev@lynne.ee> | 2022-05-20 10:12:34 +0200 |
commit | 82a68a8771ca39564f6a74e0f875d6852e7a0c2a (patch) | |
tree | 92992c7067cf22f88b701ec6a143496592cac03c /libavutil/softfloat_tables.h | |
parent | 41a558fea06cc0a23b8d2d0dfb03ef6a25cf5100 (diff) | |
download | ffmpeg-82a68a8771ca39564f6a74e0f875d6852e7a0c2a.tar.gz |
x86/tx_float: remove vgatherdpd usage
Compared to individual loads, its performance ranges from just as fast
(Skylake), to a few percent slower (Alderlake), to 8% slower (Zen 3), to
completely disastrous (older/other CPUs).
Sadly, gathers never panned out fast on x86, even with the benefit of time and
implementation experience.
This also saves a register, as there's no need to fill an additional
mask register.
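As a rough illustration of the trade-off, here is a minimal C sketch of what the replacement amounts to. It is not the actual assembly from this commit. It assumes the gather was used to fetch 64-bit elements (here, complex float pairs) through a table of 32-bit indices, which is the element and index size vgatherdpd operates on. The type `cpx` and the helper `load4_permuted` are hypothetical names chosen for this example.

```c
#include <stdint.h>
#include <string.h>

/* One complex float: 64 bits, the element size vgatherdpd moves.
 * (Hypothetical type, for illustration only.) */
typedef struct { float re, im; } cpx;

/* Hypothetical helper: fetch four index-permuted complex values with
 * four individual 64-bit loads. A gather would do the same in one
 * instruction, but needs an extra vector register holding the mask,
 * which must be refilled before each use. */
static void load4_permuted(cpx dst[4], const cpx *src, const int32_t idx[4])
{
    for (int i = 0; i < 4; i++)
        memcpy(&dst[i], &src[idx[i]], sizeof(cpx)); /* one 64-bit load each */
}
```

Calling `load4_permuted(out, src, idx)` with `idx = {0, 4, 2, 6}` copies `src[0]`, `src[4]`, `src[2]` and `src[6]` into `out[0..3]`, with no mask register involved.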
Zen 3 (16384-point transform):
Before: 1561050 decicycles in av_tx (fft), 131072 runs, 0 skips
After: 1449621 decicycles in av_tx (fft), 131072 runs, 0 skips
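To relate these two measurements to the percentages quoted above, a small C sketch of the arithmetic follows. The helper name `slowdown_percent` is made up for this example. It takes the faster run as the baseline, which is one reasonable reading of "N% slower".

```c
/* Relative slowdown of `slow` versus `fast`, in percent,
 * using the faster measurement as the baseline. */
static double slowdown_percent(double slow, double fast)
{
    return (slow - fast) / fast * 100.0;
}
```

With `slow = 1561050` and `fast = 1449621`, this gives roughly 7.7%, in line with the "8% slower (Zen 3)" figure.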
Alderlake:
2% slower on big transforms (65536), to 1% (131072), to a few percent for smaller
sizes.