path: root/libavutil/x86
Commit message | Author | Age | Files | Lines
* x86/cpu: implement get/set_eflags using intrinsics | Ronald S. Bultje | 2012-07-10 | 1 | -0/+14
  Signed-off-by: Diego Biurrun <[email protected]>
  Signed-off-by: Martin Storsjö <[email protected]>
* x86/cpu: implement support for cpuid through intrinsics | Ronald S. Bultje | 2012-07-10 | 1 | -0/+14
  Signed-off-by: Martin Storsjö <[email protected]>
* x86/cpu: implement support for xgetbv through intrinsics | Ronald S. Bultje | 2012-07-10 | 1 | -0/+11
  Signed-off-by: Martin Storsjö <[email protected]>
* x86/timer: implement an intrinsic-based version for rdtsc (AV_READ_TIME). | Ronald S. Bultje | 2012-07-07 | 1 | -0/+8
* x86inc: add SPLATB_LOAD, SPLATB_REG, PSHUFLW macros | Loren Merritt | 2012-07-05 | 1 | -4/+37
  Signed-off-by: Diego Biurrun <[email protected]>
* x86inc: modify ALIGN to not generate long nops on i586 | Loren Merritt | 2012-07-05 | 1 | -0/+4
  Signed-off-by: Diego Biurrun <[email protected]>
* x86: cpu: clean up check for cpuid instruction support | Mans Rullgard | 2012-07-01 | 1 | -20/+16
  This adds macros for accessing the EFLAGS register and uses these
  instead of coding the entire check in inline asm.
  Signed-off-by: Mans Rullgard <[email protected]>
* x86: cpu: whitespace (mostly) cosmetics | Mans Rullgard | 2012-06-25 | 1 | -34/+36
  This adds whitespace around operators, aligns line continuation
  backslashes, and breaks long lines.
  Also fixes an ifdef halfway through a statement. The one line of
  duplication this saved is not worth the ugliness.
  Signed-off-by: Mans Rullgard <[email protected]>
* x86: place some inline asm under #if HAVE_INLINE_ASM | Ronald S. Bultje | 2012-06-25 | 2 | -0/+5
  Signed-off-by: Mans Rullgard <[email protected]>
* x86: Add CPU flag for the i686 cmov instruction | Diego Biurrun | 2012-06-23 | 1 | -0/+2
* float_dsp: add x86-optimized functions for vector_fmac_scalar() | Justin Ruggles | 2012-06-18 | 2 | -0/+54
* Add a float DSP framework to libavutil | Justin Ruggles | 2012-06-08 | 3 | -0/+99
  Move vector_fmul() from DSPContext to AVFloatDSPContext.
* x86: Avoid movs on BUTTERFLYPS when in AVX mode | Vitor Sessak | 2012-05-29 | 1 | -7/+6
  Signed-off-by: Janne Grunau <[email protected]>
* lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX | Justin Ruggles | 2012-05-09 | 1 | -4/+3
  The current SSE version is slower than the MMX version on Athlon64 and
  Sandy Bridge, but the SSE4 and AVX versions are faster on Sandy Bridge.
* Add libavresample | Justin Ruggles | 2012-04-24 | 1 | -0/+9
  This is a new library for audio sample format, channel layout, and
  sample rate conversion.
* x86inc: support AVX abstraction for 2-operand instructions | Loren Merritt | 2012-04-18 | 1 | -2/+8
  Add cvtdq2ps and cvtps2dq to the AVX instruction list.
  Signed-off-by: Justin Ruggles <[email protected]>
* build: Move all arch OBJS declarations into arch subdirectory Makefiles. | Diego Biurrun | 2012-04-12 | 1 | -0/+1
* x86inc improvements for 64-bit | Henrik Gramner | 2012-04-11 | 1 | -104/+114
  Add support for all x86-64 registers
  Prefer caller-saved register over callee-saved on WIN64
  Support up to 15 function arguments
  Also (by Ronald S. Bultje)
  Fix up our asm to work with new x86inc.asm.
  Signed-off-by: Ronald S. Bultje <[email protected]>
  Signed-off-by: Justin Ruggles <[email protected]>
* x86inc: add *mp named argument support to DEFINE_ARGS. | Ronald S. Bultje | 2012-03-14 | 1 | -0/+2
* x86inc: don't "bake" stack_offset in named arguments. | Loren Merritt | 2012-03-03 | 1 | -1/+4
  Signed-off-by: Ronald S. Bultje <[email protected]>
* x86inc: support yasm -f win64 flag also. | Haruhiko Yamagata | 2012-02-08 | 1 | -0/+2
  This sets __OUTPUT_FORMAT__ to win64 instead of win32, even though
  both (through -m amd64) produce 64-bit binary code.
  Signed-off-by: Ronald S. Bultje <[email protected]>
* x86inc: allow manual use of WIN64_SPILL_XMM. | Henrik Gramner | 2012-02-08 | 1 | -4/+5
  Functions using INIT_MMX may still access XMM registers through direct
  means (xmm0-15). Therefore, they still need to be marked for clobber
  so they can be properly saved/restored.
  Signed-off-by: Ronald S. Bultje <[email protected]>
* win64: add a XMM clobber test configure option. | Ronald S. Bultje | 2012-02-02 | 1 | -0/+71
  This will be useful to test more aggressively for failures to mark XMM
  registers as clobbered in Win64 builds, and prevent regressions thereof.
  Based on a patch by Ramiro Polla <[email protected]>
* x86inc.asm: fix typo. | Ronald S. Bultje | 2012-01-27 | 1 | -1/+1
  Assemblers don't understand ! in %if statements.
* config.asm: change %ifdef directives to %if directives. | Ronald S. Bultje | 2012-01-27 | 2 | -14/+16
  This allows combining multiple conditionals in a single statement.
* mpegaudiodec: optimized iMDCT transform | Vitor Sessak | 2012-01-08 | 1 | -0/+2
  Signed-off-by: Ronald S. Bultje <[email protected]>
* x86: bswap: remove test for bswap instruction | Mans Rullgard | 2011-12-12 | 1 | -7/+0
  Firstly, this test never worked as intended, always reporting success.
  Secondly, bswap is available from 486 onward and can thus be assumed
  present.
  Signed-off-by: Mans Rullgard <[email protected]>
* bswap: make generic implementation more compiler-friendly | Mans Rullgard | 2011-12-12 | 1 | -0/+4
  With these changes, gcc 4.5 and later recognise it as a bswap and use
  the proper instructions on ARM and x86. On x86, the 16-bit bswap is
  recognised from gcc 4.1.
  Signed-off-by: Mans Rullgard <[email protected]>
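For illustration, here is a minimal C sketch of the kind of generic byte swap the commit above talks about. It is not the code from the commit: the names `bswap16_generic` and `bswap32_generic` are hypothetical. Plain shifts and masks like these are the sort of pattern a compiler may match to a single byte-swap instruction, though whether it does depends on the compiler version and flags.

```c
#include <stdint.h>

/* Hypothetical name: swap the two bytes of a 16-bit value. */
static inline uint16_t bswap16_generic(uint16_t x)
{
    /* x is promoted to int, so the shifted value fits; the cast
     * truncates the result back to 16 bits. */
    return (uint16_t)((x >> 8) | (x << 8));
}

/* Hypothetical name: reverse the four bytes of a 32-bit value. */
static inline uint32_t bswap32_generic(uint32_t x)
{
    return ((x << 24) & 0xff000000u) |
           ((x <<  8) & 0x00ff0000u) |
           ((x >>  8) & 0x0000ff00u) |
           ((x >> 24) & 0x000000ffu);
}
```

Inspecting the compiler's assembly output (for example with `-S`) is the way to check whether a particular build actually collapses these into one instruction.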
* x86 cpuid: set vendor union members separately | Sean McGovern | 2011-12-08 | 1 | -1/+4
  Solaris Studio (suncc) has difficulty with filling in members of a
  union. Instead, let's retrieve and store the cpuid() results separately.
  This is still a compiler bug, however this fix does not cause a
  regression on other platforms.
  Signed-off-by: Janne Grunau <[email protected]>
* x86inc: Flag shufps as a floating-point instruction for the AVX emulation code. | Vitor Sessak | 2011-11-27 | 1 | -1/+1
  Without this, code like "shufps m0, m1, m2, 0xaa" would not work in
  CPUs not supporting SSE2.
  Signed-off-by: Ronald S. Bultje <[email protected]>
* x86inc: use sse versions of common macros instead of sse2 when applicable | Justin Ruggles | 2011-11-06 | 1 | -0/+5
* Update x86inc.asm to latest x264 version, and add AVX symmetry. | Loren Merritt | 2011-11-05 | 1 | -183/+340
  We keep INIT_AVX (for backwards compatibility).
  3arg AVX ops with a memory arg can only have it in src2, whereas SSE
  emulation of 3arg prefers to have it in src1 (i.e. the mov). So, if
  the op is symmetric and the wrong one is memory, swap them.
* fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm | Justin Ruggles | 2011-10-21 | 1 | -0/+12
* x86: XOP/FMA4 CPU detection support | Jason Garrett-Glaser | 2011-09-26 | 1 | -0/+9
* cpu detection: avoid a signed overflow | Sean McGovern | 2011-09-03 | 1 | -1/+1
  1<<31 overflows because 1 is signed, so force it to unsigned.
  Signed-off-by: Ronald S. Bultje <[email protected]>
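The overflow described in the commit above can be sketched in C as follows. This is an illustrative example only, not the patched line from the commit; `bit_mask` and `has_bit` are hypothetical helpers. With a 32-bit `int`, `1 << 31` shifts into the sign bit, which is undefined behaviour in C, whereas an unsigned operand makes the shift well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask for one bit of a 32-bit register value. */
static inline uint32_t bit_mask(unsigned bit)
{
    assert(bit < 32);
    /* UINT32_C(1) is unsigned, so shifting by 31 is well defined.
     * Writing (1 << 31) with a signed 32-bit int would overflow. */
    return UINT32_C(1) << bit;
}

/* Hypothetical helper: nonzero when the given bit is set in reg. */
static inline int has_bit(uint32_t reg, unsigned bit)
{
    return (reg & bit_mask(bit)) != 0;
}
```

A call such as `has_bit(edx, 31)` then tests the top bit of a register value without relying on signed overflow.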
* Move clipd macros to x86util.asm. | Ronald S. Bultje | 2011-08-17 | 1 | -0/+33
  This allows sharing them between multiple .asm files.
* Move x86util.asm from libavcodec/ to libavutil/. | Ronald S. Bultje | 2011-08-12 | 1 | -0/+542
  This allows using it in swscale also.
* Move x86inc.asm to libavutil/. | Ronald S. Bultje | 2011-08-12 | 1 | -0/+905
  This allows using it in libswscale/ also.
* bswap: use native types for av_bwap16(). | Jason Garrett-Glaser | 2011-04-22 | 1 | -2/+2
  This prevents a call to bytestream_get_be16() using a movzwl both
  before and after the ror instruction, which is obviously inefficient.
  Arm uses the same trick also.
  Sintel decoding goes from (avg+SD) 9.856 +/- 0.003 to 9.797 +/- 0.003 sec.
  Signed-off-by: Ronald S. Bultje <[email protected]>
* cosmetics: indentation | Justin Ruggles | 2011-03-22 | 1 | -8/+8
* Add a CPU flag for the Atom processor. | Justin Ruggles | 2011-03-22 | 1 | -2/+10
  The Atom has SSSE3 support, which is useful in many cases, but
  sometimes the SSSE3 version is slower than the SSE2 equivalent on the
  Atom, but is generally faster on other processors supporting SSSE3.
  This flag allows for selectively disabling certain SSSE3 functions on
  the Atom.
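The selective disabling described above might look roughly like the following C sketch. Everything here is an assumption for illustration: the `HYP_CPU_FLAG_*` names and their values, the `impl` enum, and `pick_impl` are hypothetical and are not libavutil's actual flags or dispatch code.

```c
/* Hypothetical flag bits; the values are made up for this sketch. */
#define HYP_CPU_FLAG_SSE2   0x01u
#define HYP_CPU_FLAG_SSSE3  0x02u
#define HYP_CPU_FLAG_ATOM   0x04u

/* Which implementation of some function gets selected. */
enum impl { IMPL_C, IMPL_SSE2, IMPL_SSSE3 };

/* Pick the fastest version for a function whose SSSE3 variant is
 * assumed to be slower than the SSE2 one on Atom. */
static enum impl pick_impl(unsigned flags)
{
    enum impl best = IMPL_C;

    if (flags & HYP_CPU_FLAG_SSE2)
        best = IMPL_SSE2;
    /* Use SSSE3 only when the CPU is not flagged as an Atom. */
    if ((flags & HYP_CPU_FLAG_SSSE3) && !(flags & HYP_CPU_FLAG_ATOM))
        best = IMPL_SSSE3;

    return best;
}
```

Functions whose SSSE3 version is also faster on Atom would simply omit the Atom test, so the extra flag only affects the cases where it matters.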
* Replace FFmpeg with Libav in licence headers | Mans Rullgard | 2011-03-19 | 5 | -20/+20
  Signed-off-by: Mans Rullgard <[email protected]>
* x86: use raw opcode for xgetbv instruction | Mans Rullgard | 2011-02-20 | 1 | -1/+1
  This allows the CPU detection to work with assemblers not supporting
  the xgetbv mnemonic. These include clang and some BSD versions.
  All AVX code will be written for yasm, where the main assembler is not
  involved.
  Signed-off-by: Mans Rullgard <[email protected]>
* x86: check for AVX support | Mans Rullgard | 2011-02-20 | 1 | -0/+12
  This adds configure and runtime checks for AVX support on x86 CPUs.
  Signed-off-by: Mans Rullgard <[email protected]>
* Add check for Athlon64 and similar AMD processors with slow SSE2. | Justin Ruggles | 2011-02-11 | 1 | -1/+16
  Signed-off-by: Ronald S. Bultje <[email protected]>
* consolidate .gitignore patterns into a single file | Janne Grunau | 2011-01-18 | 1 | -3/+0
  Signed-off-by: Janne Grunau <[email protected]>
* convert svn:ignore properties to .gitignore files | Janne Grunau | 2011-01-17 | 1 | -0/+3
  Signed-off-by: Janne Grunau <[email protected]>
* Add missing #include <string.h> in x86/cpu.c | Måns Rullgård | 2010-09-09 | 1 | -0/+1
  Originally committed as revision 25088 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Clean up av_get_cpu_flag() | Måns Rullgård | 2010-09-09 | 1 | -6/+1
  Instead of defining functions in per-arch header files included by the
  main cpu.c, define them normally and call them from the generic one.
  Originally committed as revision 25084 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Move mm_support() from libavcodec to libavutil, make it a public | Stefano Sabatini | 2010-09-08 | 1 | -0/+128
  function and rename it to av_get_cpu_flags().
  Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk