author | Justin Ruggles <justin.ruggles@gmail.com> | 2012-05-03 15:23:32 -0400 |
---|---|---|
committer | Justin Ruggles <justin.ruggles@gmail.com> | 2012-05-09 16:17:59 -0400 |
commit | 5cc6d5244d4ec89b3ac855abff4a3d19caee22f1 | |
tree | 7c3643f535a67203b60bd001f1382b9b1282d3af /libavutil/x86/x86util.asm | |
parent | 0b45334a5880d6e2a4b3642adcd5feab8a27a150 | |
lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX
The current SSE version is slower than the MMX version on Athlon64 and Sandy
Bridge, but the SSE4 and AVX versions are faster on Sandy Bridge.
Diffstat (limited to 'libavutil/x86/x86util.asm')
-rw-r--r-- | libavutil/x86/x86util.asm | 7 |
1 files changed, 3 insertions, 4 deletions
diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index 55f4a936e2..508f24e2b5 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -42,10 +42,9 @@
 %endmacro
 
 %macro SBUTTERFLYPS 3
-    movaps   m%3, m%1
-    unpcklps m%1, m%2
-    unpckhps m%3, m%2
-    SWAP %2, %3
+    unpcklps m%3, m%1, m%2
+    unpckhps m%1, m%1, m%2
+    SWAP %1, %3, %2
 %endmacro
 
 %macro TRANSPOSE4x4B 5
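
Note on the SBUTTERFLYPS hunk: the rewritten macro should leave the same values in its three named registers as the old one, while using the three-operand instruction form and dropping the explicit movaps. The Python sketch below is a lane-level model to check that claim, not code from the tree. The function names and the dict-of-lists register model are hypothetical. It also assumes that a three-argument SWAP in x86inc.asm acts as a chain of pairwise swaps (swap %1 with %3, then %3 with %2). SWAP only renames registers at assembly time, so swapping the contents stored under each name models it adequately.

```python
# Lane-level model of SBUTTERFLYPS. Each "register" is a list of 4 floats,
# lane 0 first. All names here are illustrative, not part of FFmpeg.

def unpcklps(a, b):
    """Interleave the two low lanes of a and b: [a0, b0, a1, b1]."""
    return [a[0], b[0], a[1], b[1]]

def unpckhps(a, b):
    """Interleave the two high lanes of a and b: [a2, b2, a3, b3]."""
    return [a[2], b[2], a[3], b[3]]

def sbutterflyps_old(regs, r1, r2, r3):
    """Model of the macro before this commit (two-operand form)."""
    regs = {k: list(v) for k, v in regs.items()}
    regs[r3] = list(regs[r1])                  # movaps   m%3, m%1
    regs[r1] = unpcklps(regs[r1], regs[r2])    # unpcklps m%1, m%2
    regs[r3] = unpckhps(regs[r3], regs[r2])    # unpckhps m%3, m%2
    regs[r2], regs[r3] = regs[r3], regs[r2]    # SWAP %2, %3
    return regs

def sbutterflyps_new(regs, r1, r2, r3):
    """Model of the macro after this commit (three-operand form)."""
    regs = {k: list(v) for k, v in regs.items()}
    regs[r3] = unpcklps(regs[r1], regs[r2])    # unpcklps m%3, m%1, m%2
    regs[r1] = unpckhps(regs[r1], regs[r2])    # unpckhps m%1, m%1, m%2
    # SWAP %1, %3, %2 -- ASSUMPTION: a chain of pairwise swaps.
    regs[r1], regs[r3] = regs[r3], regs[r1]    # swap %1 <-> %3
    regs[r3], regs[r2] = regs[r2], regs[r3]    # swap %3 <-> %2
    return regs

if __name__ == "__main__":
    start = {0: [0.0, 1.0, 2.0, 3.0],          # a
             1: [10.0, 11.0, 12.0, 13.0],      # b
             2: [-1.0, -1.0, -1.0, -1.0]}      # scratch
    old = sbutterflyps_old(start, 0, 1, 2)
    new = sbutterflyps_new(start, 0, 1, 2)
    print(old == new)
    print(new[0], new[1])
```

Under these assumptions both versions leave the low interleave in %1 and the high interleave in %2, with the original second operand left in the scratch register %3.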