ffmpeg - Mirror of FFmpeg git repo

	Commit message	Author	Age	Files	Lines
*	x86: use the new helper macros where useful	James Almer	2016-02-14	2	-2/+2
\| \| \| \| \|	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86: add some more helper macros to check for slow cpuflags	James Almer	2016-02-14	1	-0/+4
\| \| \| \| \|	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/cpu: set avxslow cpuflag on btver2 CPUs	James Almer	2016-02-07	1	-6/+4
\| \| \| \| \| \| \|	They are also slow when using 256-bit wide registers. Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/emms: empty the mmx state unconditionally on supported targets	James Almer	2016-02-04	1	-0/+6
\| \| \| \| \|	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
*	all: Add missing header guards	Timothy Gu	2016-01-28	1	-0/+5
\|
*	x86inc: Add debug symbols indicating sizes of compiled functions	Geza Lore	2016-01-21	1	-0/+23
\| \| \| \| \| \| \| \| \|	Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they can get confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata, e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF.
*	x86inc: Avoid creating unnecessary local labels	Henrik Gramner	2016-01-21	1	-2/+4
\| \| \| \| \| \| \| \| \| \|	The REP_RET workaround is only needed on old AMD cpus, and the labels clutter up the symbol table and confuse debugging/profiling tools, so use EQU to create SHN_ABS symbols instead of creating local labels. Furthermore, skip the workaround completely in functions that definitely won't run on such cpus. Note that EQU is just creating a local label when using nasm instead of yasm. This is probably a bug, but at least it doesn't break anything.
*	x86inc: Simplify AUTO_REP_RET	Henrik Gramner	2016-01-21	1	-4/+2
\| \| \| \| \| \|	cpuflags is never undefined any more, it's set to 0 instead. Also fix an incorrect comment.
*	x86inc: Use more consistent indentation	Henrik Gramner	2016-01-21	1	-67/+67
\|
*	x86inc: Preserve arguments when allocating stack space	Henrik Gramner	2016-01-21	1	-2/+5
\| \| \| \| \| \|	When allocating stack space with a larger alignment than the known stack alignment a temporary register is used for storing the stack pointer. Ensure that this isn't one of the registers used for passing arguments.
*	x86inc: Improve FMA instruction handling	Henrik Gramner	2016-01-21	1	-40/+37
\| \| \| \| \| \| \| \| \| \| \| \|	* Correctly handle FMA instructions with memory operands. * Print a warning if FMA instructions are used without the correct cpuflag. * Simplify the instantiation code. * Clarify documentation. Only the last operand in FMA3 instructions can be a memory operand. When converting FMA4 instructions to FMA3 instructions we can utilize the fact that multiply is a commutative operation and reorder operands if necessary to ensure that a memory operand is used only as the last operand.
*	x86inc: Be more verbose in assertion failures	Henrik Gramner	2016-01-21	1	-1/+1
\|
*	x86/intmath: disable sse av_clip functions when using ICC	James Almer	2016-01-21	1	-2/+2
\| \| \| \| \| \| \| \| \|	It seems to miscompile them. Should fix fate-ra-288 and fate-twinvq. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/fixed_dsp: add ff_butterflies_fixed_sse2	James Almer	2016-01-16	3	-0/+85
\| \| \| \| \|	Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	lavu/x86/lls: add fma3 optimizations for update_lls	Ganesh Ajjanagadde	2016-01-15	2	-2/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This improves accuracy (very slightly) and speed for processors having fma3. Sample benchmark (fate flac-16-lpc-cholesky, Haswell): old: 5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips 5951528 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips new: 5252410 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips 5232869 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips Tested with FATE and --disable-fma3, also examined contents of lavu/lls-test. Reviewed-by: James Almer <jamrial@gmail.com> Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
*	x86/intmath: add missing early clobber to output operands	James Almer	2016-01-15	1	-2/+2
\| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/float_dsp: zero extend offset from ff_scalarproduct_float_sse	James Almer	2016-01-08	1	-3/+3
\| \| \| \| \|	Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/float_dsp: zero extend len from ff_butterflies_float_sse implicitly	James Almer	2016-01-08	1	-4/+1
\| \| \| \| \|	Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/float_dsp: remove len check from ff_butterflies_float_sse	James Almer	2016-01-08	1	-3/+0
\| \| \| \| \| \| \|	The function documentation explicitly mentions it needs to be a multiple of 4. Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86/intmath: add sse optimized av_clipf and av_clipd	James Almer	2016-01-07	1	-0/+33
\| \| \| \| \| \|	Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
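For readers unfamiliar with the clip helpers, here is a minimal scalar sketch of what a float clip computes. The name `clipf_sketch` is hypothetical and this is not the libavutil code; the SSE variant added by the commit above presumably maps the two comparisons onto min/max instructions.

```c
/* Hypothetical scalar sketch of a float clip, for illustration only.
 * Assumes amin <= amax. */
static float clipf_sketch(float a, float amin, float amax)
{
    if (a < amin)
        return amin;   /* below the range: clamp to the lower bound */
    if (a > amax)
        return amax;   /* above the range: clamp to the upper bound */
    return a;          /* already inside [amin, amax] */
}
```

A call such as `clipf_sketch(x, 0.0f, 1.0f)` keeps `x` inside the unit interval.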
*	avutil/x86/bswap: Remove warning about bswap intrinsics with msvc.	Matt Oliver	2015-11-23	1	-0/+3
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/intmath: Fix intrinsic header include when using newer gcc with older icc.	Matt Oliver	2015-11-12	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/bswap: Add msvc bswap instrinsics.	Matt Oliver	2015-11-12	1	-1/+24
\| \| \| \| \| \|	This adds msvc optimisations as well as fixing an error in icl whereby it will generate invalid code otherwise. Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/intmath: Disable use of tzcnt on older intel compilers.	Matt Oliver	2015-11-11	1	-1/+1
\| \| \| \| \| \| \|	ICC versions older than at least 12.1.6 don't have the tzcnt intrinsics. Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/intmath: Correct intrinsic headers for older compilers.	Matt Oliver	2015-11-09	1	-2/+6
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/intmath: Add missing header.	Matt Oliver	2015-11-01	1	-0/+3
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	avutil/x86/intmath: Use tzcnt in place of bsf.	Matt Oliver	2015-10-31	1	-39/+15
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
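The practical difference between the two instructions is the zero-input case: BSF leaves its destination undefined when the source is 0, while TZCNT returns the operand width. A portable sketch of the TZCNT-style semantics follows; the name `ctz32_sketch` is hypothetical and the loop is only a reference model, not how libavutil implements it.

```c
#include <stdint.h>

/* Hypothetical reference model of a 32-bit trailing-zero count with
 * TZCNT-style behaviour: an input of 0 yields 32 (the operand width). */
static int ctz32_sketch(uint32_t v)
{
    int n = 0;

    if (v == 0)
        return 32;          /* defined result, unlike BSF */
    while (!(v & 1)) {      /* shift until the lowest set bit reaches bit 0 */
        v >>= 1;
        n++;
    }
    return n;
}
```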
*	lavu: add AESNI CPU flag	Rodger Combs	2015-10-28	3	-6/+12
\|
*	lavu/intmath.h: Move x86 only msvc/icl functions to x86 specific header.	Matt Oliver	2015-10-19	1	-0/+20
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	lavu/intmath.h: Add msvc/icl ctzll optimisations.	Matt Oliver	2015-10-19	1	-0/+35
\| \| \| \|	Signed-off-by: Matt Oliver <protogonoi@gmail.com>
*	x86inc: Make cpuflag() and notcpuflag() return 0 or 1	Henrik Gramner	2015-10-01	1	-2/+3
\| \| \| \|	Makes it possible to use them in arithmetic expressions.
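The idea of a flag test that yields exactly 0 or 1 can be illustrated in C (the actual change is to NASM macros in x86inc, which are not reproduced here). The flag values and function names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical flag bits, chosen only for this illustration. */
enum { FLAG_SSE2 = 1u << 4, FLAG_AVX = 1u << 14 };

/* Normalized test: returns exactly 0 or 1, never the raw masked value. */
static int has_flag(uint32_t flags, uint32_t mask)
{
    return (flags & mask) == mask;
}

/* Because the result is 0 or 1, it can take part in arithmetic:
 * 16 bytes per vector by default, 32 when the AVX bit is set. */
static int vector_bytes(uint32_t flags)
{
    return 16 + 16 * has_flag(flags, FLAG_AVX);
}
```

With a raw `flags & mask` result the multiplication would scale by the bit value instead of by 1, which is why normalization matters.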
*	avutil/attributes: add AV_GCC_VERSION_AT_MOST	James Almer	2015-09-18	1	-4/+4
\| \| \| \| \|	Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86: port PSIGNW to cpuflags	James Almer	2015-09-11	1	-5/+5
\| \| \| \| \|	Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	avutil/x86/asm: rename REG_SP to REG_sp	Ganesh Ajjanagadde	2015-08-22	1	-2/+3
\| \| \| \| \| \| \| \| \|	REG_SP is defined by Solaris system headers. This fixes a sea of warnings while building on Solaris: http://fate.ffmpeg.org/report.cgi?time=20150820233505&slot=x86-opensolaris-gcc4.3 Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	x86inc: warn if XOP integer FMA instruction emulation is impossible	Anton Mitrofanov	2015-08-05	1	-1/+3
\| \| \| \|	Signed-off-by: Henrik Gramner <henrik@gramner.com>
*	x86inc: Drop SECTION_TEXT macro	Henrik Gramner	2015-08-04	2	-13/+1
\| \| \| \| \|	The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
*	x86inc: Support arbitrary stack alignments	Henrik Gramner	2015-08-04	1	-22/+40
\| \| \| \| \| \|	Change ALLOC_STACK to always align the stack before allocating stack space for consistency. Previously alignment would occur either before or after allocating stack space depending on whether manual alignment was required or not.
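The "align first, then allocate" order described above can be modelled with plain pointer arithmetic. This is only a sketch under stated assumptions: `alloc_stack_sketch` is a hypothetical name, `align` is assumed to be a power of two, and rounding the size up to the alignment is an assumption of this model rather than a statement about what ALLOC_STACK emits.

```c
#include <stdint.h>

/* Hypothetical model: align the stack pointer down, then reserve space.
 * The stack grows towards lower addresses; align must be a power of two. */
static uintptr_t alloc_stack_sketch(uintptr_t sp, uintptr_t size, uintptr_t align)
{
    sp &= ~(align - 1);                        /* step 1: align the pointer */
    sp -= (size + align - 1) & ~(align - 1);   /* step 2: reserve a padded size */
    return sp;                                 /* result is still aligned */
}
```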
*	x86: move XOP emulation code back to x86inc	James Almer	2015-08-03	2	-19/+16
\| \| \| \| \| \| \| \| \| \|	Only two functions that use xop multiply-accumulate instructions where the first operand is the same as the fourth actually took advantage of the macros. This further reduces differences with x264's x86inc. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
*	x86inc: Various minor backports from x264	Henrik Gramner	2015-08-03	1	-11/+21
\| \| \| \| \|	Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	x86inc: Disable vpbroadcastq workaround in newer yasm versions	Henrik Gramner	2015-08-03	1	-9/+11
\| \| \| \| \| \| \|	The bug was fixed in 1.3.0, so only perform the workaround in earlier versions. Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	x86/float_dsp: add missing colon to labels	James Almer	2015-07-26	1	-1/+1
\| \| \| \| \| \|	Silences warnings with Nasm Signed-off-by: James Almer <jamrial@gmail.com>
*	avutil/x86/bswap: force inline asm versions with ICC	James Almer	2015-07-18	1	-1/+1
\| \| \| \| \| \| \| \|	Recent ICC versions that define GCC as >= 4.5 (like ICC 13) apparently can't optimize the generic C versions of av_bswap*() on their own. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
*	Merge commit 'd1a6cb195f610978ba5d2351e60f938f7f261d59'	Michael Niedermayer	2015-07-09	1	-1/+6
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'd1a6cb195f610978ba5d2351e60f938f7f261d59': x86: Serialize rdtsc in read_time() Merged-by: Michael Niedermayer <michaelni@gmx.at>
\| *	x86: Serialize rdtsc in read_time()	Henrik Gramner	2015-07-09	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improves the accuracy of measurements, especially in short sections. To quote the Intel 64 and IA-32 Architectures Software Developer's Manual: "The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed. If software requires RDTSC to be executed only after all previous instructions have completed locally, it can either use RDTSCP (if the processor supports that instruction) or execute the sequence LFENCE;RDTSC." SSE2 is a requirement for lfence so only use it on SSE2-capable systems. Prefer lfence;rdtsc over rdtscp since rdtscp is supported on fewer systems. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
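The LFENCE;RDTSC sequence quoted above can be sketched with compiler intrinsics. This is an illustration, not the code from the commit: `read_time_serialized` is a hypothetical name, the intrinsics `_mm_lfence()` and `__rdtsc()` are assumed to come from `<x86intrin.h>` on GCC or Clang, and the non-x86 fallback simply returns 0.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__SSE2__)
#include <x86intrin.h>

/* Hypothetical helper: wait for earlier instructions to complete locally,
 * then read the time-stamp counter. */
static uint64_t read_time_serialized(void)
{
    _mm_lfence();       /* LFENCE needs SSE2, hence the guard above */
    return __rdtsc();   /* RDTSC itself is not serializing */
}
#else
/* Fallback for targets without SSE2 or without RDTSC: no timing available. */
static uint64_t read_time_serialized(void)
{
    return 0;
}
#endif
```

Taking the difference of two such readings around a short section gives a cycle estimate that is less affected by out-of-order execution.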
\| *	x86: check for AV_CPU_FLAG_AVXSLOW where useful	James Almer	2015-05-31	2	-2/+2
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* \|	avutil/x86/intmath: add missing check for inline assembly	James Almer	2015-06-27	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
* \|	avutil/x86/intmath: use bzhi gcc builtin in av_mod_uintp2()	James Almer	2015-06-27	1	-0/+7
\| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com>
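BZHI zeroes the bits of its source from a given index upward, which matches the "keep the low p bits" operation that a mod-by-power-of-two helper performs. A generic C sketch of that operation follows; `mod_uintp2_sketch` is a hypothetical name, and the sketch assumes `p` is smaller than the bit width of `unsigned`.

```c
/* Hypothetical sketch: a modulo 2^p, i.e. keep only the low p bits of a.
 * Assumes p < 32 so the shift is well defined for a 32-bit unsigned. */
static unsigned mod_uintp2_sketch(unsigned a, unsigned p)
{
    return a & ((1U << p) - 1);   /* mask with p low bits set */
}
```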
* \|	x86: check for AV_CPU_FLAG_AVXSLOW where useful	James Almer	2015-06-01	2	-3/+3
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* \|	Merge commit 'cae39851201b7781f1262e1c23627b45e6e80bb4'	Michael Niedermayer	2015-05-31	1	-0/+18
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	* commit 'cae39851201b7781f1262e1c23627b45e6e80bb4': x86: Add helper macros to check for slow cpuflags Merged-by: Michael Niedermayer <michaelni@gmx.at>
\| *	x86: Add helper macros to check for slow cpuflags	James Almer	2015-05-31	1	-0/+18
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
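The purpose of the "slow" cpuflag helpers can be sketched as follows: an instruction set counts as usable-and-fast only when its flag is set and the matching SLOW flag is clear. The macro names and flag values below are hypothetical and chosen for illustration; they are not copied from libavutil.

```c
/* Hypothetical flag values for this sketch. */
#define FLAG_AVX      0x4000
#define FLAG_AVXSLOW  0x8000000

/* Feature present at all. */
#define CPU_HAS(flags, f)      (((flags) & FLAG_##f) != 0)

/* Feature present and not marked slow on this CPU. */
#define CPU_HAS_FAST(flags, f) \
    (CPU_HAS(flags, f) && !((flags) & FLAG_##f##SLOW))
```

An init function could then guard an AVX code path with `CPU_HAS_FAST(flags, AVX)` so that CPUs reporting AVX but flagged as slow keep using the narrower implementation.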