path: root/libavutil/x86/x86util.asm
Commit message  Author  Age  Files  Lines
* x86: port PSIGNW to cpuflags  James Almer  2015-09-11  1  -5/+5
| | | | | Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
* x86: move XOP emulation code back to x86inc  James Almer  2015-08-03  1  -19/+0
| | | | | | | | | | Only two functions that use xop multiply-accumulate instructions where the first operand is the same as the fourth actually took advantage of the macros. This further reduces differences with x264's x86inc. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
* x86/swr: add SSE2/AVX pack_8ch functions  James Almer  2014-12-30  1  -0/+37
| | | | | | Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
* v210enc: Add SIMD optimised 8-bit and 10-bit encoders  Kieran Kunhya  2014-11-26  1  -0/+5
| | | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hevc_deblock: improve 8bit transpose store macros  James Almer  2014-08-03  1  -0/+9
| | | | | | | Up to four fewer instructions, depending on function and instruction set. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/hevc_idct: replace old and unused idct functions  James Almer  2014-07-26  1  -1/+3
| Only 8-bit and 10-bit idct_dc() functions are included (adding others should be trivial).
|
| Benchmarks on an Intel Core i5-4200U:
|
|   idct8x8_dc    SSE2  MMXEXT  C
|   cycles        22    26      57
|
|   idct16x16_dc  AVX2  SSE2    C
|   cycles        27    32      249
|
|   idct32x32_dc  AVX2  SSE2    C
|   cycles        62    126     1375
|
| Signed-off-by: James Almer <jamrial@gmail.com>
| Reviewed-by: Mickaël Raulet <mraulet@gmail.com>
| Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86util: add and use RSHIFT/LSHIFT macros  Christophe Gisquet  2014-06-15  1  -0/+16
| | | | | | | Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
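The byte-count convention in the commit above can be illustrated with a small C model. This is only a sketch, not the macro bodies from x86util.asm: the type `xmm_model` and the four function names are hypothetical. It relies on the fact that the MMX quadword shifts (`psrlq`/`psllq`) take a bit count while the SSE2 whole-register shifts (`psrldq`/`pslldq`) take a byte count, so a macro that accepts bytes has to scale by 8 only on the MMX path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit MMX register.
 * psrlq/psllq shift by BITS, so the byte count is scaled by 8.
 * A shift of the full register width or more yields zero. */
static uint64_t mmx_rshift_bytes(uint64_t reg, unsigned bytes)
{
    return bytes >= 8 ? 0 : reg >> (8 * bytes);
}

static uint64_t mmx_lshift_bytes(uint64_t reg, unsigned bytes)
{
    return bytes >= 8 ? 0 : reg << (8 * bytes);
}

/* Hypothetical model of a 128-bit XMM register; b[0] is the least
 * significant byte. psrldq/pslldq shift by BYTES directly. */
typedef struct { uint8_t b[16]; } xmm_model;

static xmm_model xmm_rshift_bytes(xmm_model reg, unsigned bytes)
{
    xmm_model out = {{0}};
    for (unsigned i = 0; i + bytes < 16; i++)
        out.b[i] = reg.b[i + bytes];   /* higher bytes move down */
    return out;
}

static xmm_model xmm_lshift_bytes(xmm_model reg, unsigned bytes)
{
    xmm_model out = {{0}};
    for (unsigned i = bytes; i < 16; i++)
        out.b[i] = reg.b[i - bytes];   /* lower bytes move up */
    return out;
}
```

With this convention a caller writes the same byte count regardless of register width, and only the wrapper decides whether the count needs converting to bits.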
* x86: hpeldsp: better factorization  Christophe Gisquet  2014-05-29  1  -1/+9
| | | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/dsputilenc: implement SSE2 versions of pix_{sum16, norm1}  James Almer  2014-05-28  1  -0/+5
| | | | | Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86: move horizontal add macros to x86util  James Almer  2014-04-17  1  -0/+33
| | | | | | | | | Also port relevant AVX2/XOP optimizations from x264 with permission to relicense to LGPL from the corresponding authors Signed-off-by: James Almer <jamrial@gmail.com> Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86: Move XOP emulation to x86util  James Almer  2014-02-24  1  -0/+19
| | | | | | | | | | | We need the emulation to support the cases where the first argument is the same as the fourth. To achieve this a fifth argument working as a temporary may be needed. Emulation that doesn't obey the original instruction semantics can't be in x86inc. Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
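The aliasing problem described in the commit above can be sketched in C. This is a scalar model under stated assumptions, not the macro from x86util.asm: the register file `regs`, the constant `NUM_REGS`, and the three function names are hypothetical, and each register is reduced to a single 32-bit lane. A multiply-accumulate computes dst = a * b + c; a mov/multiply/add fallback overwrites dst before reading c, so it breaks when dst and c are the same register unless a temporary is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file: regs[i] models one 32-bit lane of register i. */
enum { NUM_REGS = 8 };

/* Native semantics: every source is read before dst is written. */
static void macs_native(uint32_t regs[NUM_REGS], int dst, int a, int b, int c)
{
    uint32_t result = regs[a] * regs[b] + regs[c];
    regs[dst] = result;
}

/* Naive fallback (mov dst, a; mul dst, b; add dst, c).
 * When dst == c, the addend has already been overwritten by the product. */
static void macs_naive(uint32_t regs[NUM_REGS], int dst, int a, int b, int c)
{
    regs[dst]  = regs[a];
    regs[dst] *= regs[b];
    regs[dst] += regs[c];
}

/* Fallback with a fifth operand used as scratch: the product is built in
 * tmp, so c is still intact when it is added, even if dst == c. */
static void macs_with_tmp(uint32_t regs[NUM_REGS], int dst, int a, int b,
                          int c, int tmp)
{
    regs[tmp]  = regs[a];
    regs[tmp] *= regs[b];
    regs[tmp] += regs[c];
    regs[dst]  = regs[tmp];
}
```

With regs = {0, 2, 3, 10} and dst = c = 3, the native form and the temporary-based fallback both give 2 * 3 + 10 = 16, while the naive fallback gives 6 + 6 = 12.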
* Merge commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497'  Michael Niedermayer  2013-10-14  1  -1/+3
|\ | | | | | | | | | | | | * commit 'c6908d6b4b377a04a5d055ba874bdbcf06c80497': x86inc: FMA3/4 Support Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: FMA3/4 Support  Jason Garrett-Glaser  2013-10-14  1  -1/+3
| | | | | | | | Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* | Merge commit '206895708ea2b464755d340e44501daf9a07c310'  Michael Niedermayer  2013-10-14  1  -0/+11
|\| | | | | | | | | | | | | * commit '206895708ea2b464755d340e44501daf9a07c310': x86inc: Remove our FMA4 support Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: Remove our FMA4 support  Derek Buitenhuis  2013-10-14  1  -0/+11
| | | | | | | | | | | | | | | | This is so we can sync to x264's version of FMA4 support. This partially reverts commit 79687079a97a039c325ab79d7a95920d800b791f. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* | Merge commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d'  Michael Niedermayer  2013-01-19  1  -0/+1
|\| | | | | | | | | | | | | * commit 'd633d12b2cc999cee3ac25bf9a810fe7ff03726d': x86inc: Add cvisible macro for C functions with public prefix Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: Add cvisible macro for C functions with public prefix  Diego Biurrun  2013-01-18  1  -0/+1
| | | | | | | | | | | | This allows defining externally visible library symbols. Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b'  Michael Niedermayer  2013-01-19  1  -1/+1
|\| | | | | | | | | | | | | | | | | | | | | * commit 'ef5d41a5534b65f03d02f2e11a503ab8416bfc3b': x86inc: Rename "program_name" to "private_prefix" configure: Run SHFLAGS through ldflags_filter() Conflicts: configure Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: Rename "program_name" to "private_prefix"  Diego Biurrun  2013-01-18  1  -1/+1
| | | | | | | | | | | | | | The new name is more descriptive and will allow defining a separate public prefix for externally visible library symbols. Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit 'dae1d507af94261bafd3b11549884e5d1eca590e'  Michael Niedermayer  2013-01-16  1  -0/+8
|\| | | | | | | | | | | | | | | | | * commit 'dae1d507af94261bafd3b11549884e5d1eca590e': x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags vf_fps: add final flushed frames to the dropped frame count rv34_parser: Adjust #if for disabling individual parsers Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: Add PAVGB macro to abstract pavgb/pavgusb instruction via cpuflags  Diego Biurrun  2013-01-15  1  -0/+8
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2013-01-15  1  -8/+6
|\| | | | | | | | | | | | | * qatar/master: x86: ABSB2: port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: ABSB2: port to cpuflags  Diego Biurrun  2013-01-15  1  -8/+6
| |
* | Merge commit '094a7405e5d8463d7d167d893e04934ec1a84ecd'  Michael Niedermayer  2013-01-15  1  -6/+5
|\| | | | | | | | | | | | | | | * commit '094a7405e5d8463d7d167d893e04934ec1a84ecd': x86: ABSB: port to cpuflags sdp: Include SRTP crypto params if using the srtp protocol Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: ABSB: port to cpuflags  Diego Biurrun  2013-01-15  1  -6/+5
| |
* | Merge commit 'd8c772de53d29afb1bada88afa859fce8489c668'  Michael Niedermayer  2013-01-15  1  -16/+13
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * commit 'd8c772de53d29afb1bada88afa859fce8489c668': nutdec: Always return a value from nut_read_timestamp() configure: Make warnings from -Wreturn-type fatal errors x86: ABS2: port to cpuflags vdpau: Remove av_unused attribute from function declaration h264: fix ff_generate_sliding_window_mmcos() prototype. Conflicts: configure libavformat/nutdec.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: ABS2: port to cpuflags  Diego Biurrun  2013-01-14  1  -16/+13
| |
* | Merge commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0'  Michael Niedermayer  2013-01-07  1  -12/+9
|\| | | | | | | | | | | | | | | * commit '5b4dfbffc258f90a7d2540d21209ac23afcf7cd0': x86: ABS1: port to cpuflags v210x: cosmetics, reformat Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: ABS1: port to cpuflags  Diego Biurrun  2013-01-06  1  -12/+9
| |
* | Merge commit '9d5c62ba5b586c80af508b5914934b1c439f6652'  Michael Niedermayer  2012-12-06  1  -0/+11
|\| | | | | | | | | | | | | | | | | | | | | | | | | * commit '9d5c62ba5b586c80af508b5914934b1c439f6652': lavu/opt: do not filter out the initial sign character except for flags eval: treat dB as decibels instead of decibytes float_dsp: add vector_dmul_scalar() to multiply a vector of doubles Conflicts: libavutil/eval.c tests/ref/fate/eval Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * float_dsp: add vector_dmul_scalar() to multiply a vector of doubles  Justin Ruggles  2012-12-05  1  -0/+11
| | | | | | | | Include x86-optimized versions for SSE2 and AVX.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-11-19  1  -16/+6
|\| | | | | | | | | | | | | | | * qatar/master: x86: h264_intrapred: Fix C function names in comments x86: SPLATD: port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: SPLATD: port to cpuflags  Diego Biurrun  2012-11-18  1  -16/+6
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-11-14  1  -1/+1
|\| | | | | | | | | | | | | | | | | | | | | | | * qatar/master: x86: mmx2 ---> mmxext in asm constructs Conflicts: libavcodec/x86/h264_chromamc_10bit.asm libavcodec/x86/h264_deblock.asm libavcodec/x86/h264dsp_init.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: mmx2 ---> mmxext in asm constructs  Diego Biurrun  2012-11-14  1  -1/+1
| |
* | Merge commit '802713c4e7b41bc2deed754d78649945c3442063'  Michael Niedermayer  2012-11-12  1  -0/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | * commit '802713c4e7b41bc2deed754d78649945c3442063': mss2: prevent potential uninitialized reads mss2: reindent after last commit mss2: fix handling of unmasked implicit WMV9 rectangles configure: add lavu dependency to lavr/lavfi .pc files x86inc: Set program_name outside of x86inc.asm Conflicts: configure Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc: Set program_name outside of x86inc.asm  Diego Biurrun  2012-11-11  1  -0/+1
| | | | | | | | This reduces the local difference to the x264 upstream version.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-11-10  1  -8/+8
|\| | | | | | | | | | | | | | | * qatar/master: x86: PALIGNR: port to cpuflags x86: h264_qpel_10bit: port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: PALIGNR: port to cpuflags  Diego Biurrun  2012-11-09  1  -8/+8
| |
* | Merge commit '930e26a3ea9d223e04bac4cdde13697cec770031'  Michael Niedermayer  2012-11-05  1  -12/+10
|\| | | | | | | | | | | | | | | | | | | | | | | | | * commit '930e26a3ea9d223e04bac4cdde13697cec770031': x86: h264qpel: Only define mmxext QPEL functions if H264QPEL is enabled x86: PABSW: port to cpuflags x86: vc1dsp: port to cpuflags rtmp: Use av_strlcat instead of strncat Conflicts: libavcodec/x86/h264_qpel.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: PABSW: port to cpuflags  Diego Biurrun  2012-11-05  1  -12/+10
| |
* | Merge commit '9221efef7968463f3e3d9ce79ea72eaca082e73f'  Michael Niedermayer  2012-11-03  1  -0/+12
|\| | | | | | | | | | | | | | | | | | | | | | | * commit '9221efef7968463f3e3d9ce79ea72eaca082e73f': lavf: fix av_interleaved_write_frame() doxy. lavf: clarify the lifetime of demuxed packets. avconv: do not free muxed packet on streamcopy. crc: move doxy to the header vf_drawtext: do not use deprecated av_tree_node_size x86: Refactor PSWAPD fallback implementations and port to cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: Refactor PSWAPD fallback implementations and port to cpuflags  Diego Biurrun  2012-11-02  1  -0/+12
| |
* | Merge commit '9a07c1332cfe092b57b5758f22b686ca58806c60'  Michael Niedermayer  2012-11-03  1  -5/+5
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * commit '9a07c1332cfe092b57b5758f22b686ca58806c60': parser: Move Doxygen documentation to the header files PGS subtitles: Expose forced flag x86: PMINUB: port to cpuflags Conflicts: libavcodec/avcodec.h libavcodec/pgssubdec.c libavcodec/version.h libavcodec/x86/ac3dsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: PMINUB: port to cpuflags  Diego Biurrun  2012-11-02  1  -5/+5
| |
* | Merge commit '9ce02e14f01de50fcc6f7f459544b140be66d615'  Michael Niedermayer  2012-11-03  1  -0/+2
|\| | | | | | | | | | | | | | | | | | | | | | | * commit '9ce02e14f01de50fcc6f7f459544b140be66d615': x86: ac3dsp: port to cpuflags x86util: Add cpuflags_mmxext alias for cpuflags_mmx2 x86inc: Only define program_name if the macro is unset Conflicts: libavcodec/x86/ac3dsp.asm Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86util: Add cpuflags_mmxext alias for cpuflags_mmx2  Diego Biurrun  2012-11-02  1  -0/+2
| | | | | | | | "mmxext" is a more sensible name and more common in outside projects.
| * x86: Fix assembly with NASM  Dave Yeo  2012-10-31  1  -1/+1
| | | | | | | | | | | | | | Unlike YASM, NASM only looks for include files in the current directory, not in the directory that included files reside in. Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge commit 'be923ed659016350592acb9b3346f706f8170ac5'  Michael Niedermayer  2012-10-31  1  -3/+3
|\| | | | | | | | | | | | | | | * commit 'be923ed659016350592acb9b3346f706f8170ac5': x86: fmtconvert: port to cpuflags x86: MMX2 ---> MMXEXT in macro names Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: MMX2 ---> MMXEXT in macro names  Diego Biurrun  2012-10-31  1  -3/+3
| |