ffmpeg - Mirror of FFmpeg git repo

	Commit message (Collapse)	Author	Age	Files	Lines
*	libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI	bwang30	2022-11-14	1	-0/+3
\|	This commit enabled assembly code with intel AVX512 VNNI and added unit test for sobel filter
\|	sobel_c: 4537
\|	sobel_avx512icl: 2136
\|	Signed-off-by: bwang30 <bin.wang@intel.com>
\|	Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
*	checkasm: add a verbose check function for uint32_t data	James Darnley	2022-11-04	1	-0/+1
\|
*	lavu/riscv: CPU flag for the Zbb extension	Rémi Denis-Courmont	2022-10-05	1	-0/+1
\|	Unfortunately, it is common, and will remain so, that the Bit manipulations are not enabled at compilation time. This is an official policy for Debian ports in general (though they do not support RISC-V officially as of yet) to stick to the minimal target baseline, which does not include the B extension or even its Zbb subset.
\|	For inline helpers (CPOP, REV8), compiler builtins (CTZ, CLZ) or even plain C code (MIN, MAX, MINU, MAXU), run-time detection seems impractical. But at least it can work for the byte-swap DSP functions.
*	lavu/cpu: CPU flags for the RISC-V Vector extension	Rémi Denis-Courmont	2022-09-27	1	-0/+4
\|	RVV defines a total of 12 different extensions, including:
\|	- 5 different instruction subsets:
\|	  - Zve32x: 8-, 16- and 32-bit integers,
\|	  - Zve32f: Zve32x plus single precision floats,
\|	  - Zve64x: Zve32x plus 64-bit integers,
\|	  - Zve64f: Zve32f plus Zve64x,
\|	  - Zve64d: Zve64f plus double precision floats.
\|	- 6 different vector lengths:
\|	  - Zvl32b (embedded only),
\|	  - Zvl64b (embedded only),
\|	  - Zvl128b,
\|	  - Zvl256b,
\|	  - Zvl512b,
\|	  - Zvl1024b,
\|	- and the V extension proper: equivalent to Zve64f and Zvl128b.
\|	In total, there are 6 different possible sets of supported instructions (including the empty set), but for convenience we allocate one bit for each type set: up-to-32-bit ints (RVV_I32), floats (RVV_F32), 64-bit ints (RVV_I64) and doubles (RVV_F64).
\|	When the vector size is needed, it can be retrieved by reading the unprivileged read-only vlenb CSR. This should probably be a separate helper macro if needed at a later point.
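The mapping described in the commit message above, from the five Zve* instruction subsets to the four per-type flag bits, can be modelled roughly as follows. This is only an illustrative sketch: the `RVVFlag` class, the `SUBSET_FLAGS` table, the `flags_for` helper and the bit positions are assumptions made for the example and are not FFmpeg's actual definitions.

```python
from enum import IntFlag


class RVVFlag(IntFlag):
    # Hypothetical bit positions, chosen only for this sketch.
    I32 = 1 << 0  # integers up to 32 bits
    F32 = 1 << 1  # single precision floats
    I64 = 1 << 2  # 64-bit integers
    F64 = 1 << 3  # double precision floats


# Each subset builds on the ones named in the commit message.
SUBSET_FLAGS = {
    "Zve32x": RVVFlag.I32,
    "Zve32f": RVVFlag.I32 | RVVFlag.F32,
    "Zve64x": RVVFlag.I32 | RVVFlag.I64,
    "Zve64f": RVVFlag.I32 | RVVFlag.F32 | RVVFlag.I64,
    "Zve64d": RVVFlag.I32 | RVVFlag.F32 | RVVFlag.I64 | RVVFlag.F64,
}


def flags_for(subset: str) -> RVVFlag:
    """Return the flag set for a subset name, or the empty set if unknown."""
    return SUBSET_FLAGS.get(subset, RVVFlag(0))
```

Counting the distinct values in the table plus the empty set gives the six possible sets of supported instructions that the message mentions.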
*	lavu/cpu: detect RISC-V base extensions	Rémi Denis-Courmont	2022-09-27	1	-0/+4
\|	This introduces compile-time and run-time CPU detection on RISC-V. In practice, I doubt that FFmpeg will ever see a RISC-V CPU without all of I, F and D extensions, and if it does, it probably won't have run-time detection. So the flags are essentially always set.
\|	But as things stand, checkasm wants them that way. Compare the ARMV8 flag on AArch64. We are nowhere near running short on CPU flag bits.
*	x86/lpc: implement a new Welch windowing function	Lynne	2022-09-21	1	-0/+3
\|	Old one was written with the assumption only even inputs would be given. This very messy replacement supports even and odd inputs, and supports AVX2 for extra speed. The buffers given are usually quite big (4k samples), so the speedup is worth it. The new SSE version is still faster than the old inline asm version by 33%.
\|	Also checkasm is provided to make sure this monstrosity works.
\|	This fixes some FATE tests.
*	tests/checkasm: add a test for VorbisDSPContext	James Almer	2022-09-19	1	-0/+3
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	avcodec/flacdsp: Split encoder-only parts into a ctx of its own	Andreas Rheinhardt	2022-08-05	1	-1/+1
\|	Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
*	lavc/aarch64: motion estimation functions in neon	Swinney, Jonathan	2022-06-28	1	-0/+3
\|	- ff_pix_abs16_neon
\|	- ff_pix_abs16_xy2_neon
\|	In direct micro benchmarks of these ff functions versus their C implementations, these functions performed as follows on AWS Graviton 3.
\|	ff_pix_abs16_neon:
\|	pix_abs_0_0_c: 141.1
\|	pix_abs_0_0_neon: 19.6
\|	ff_pix_abs16_xy2_neon:
\|	pix_abs_0_3_c: 269.1
\|	pix_abs_0_3_neon: 39.3
\|	Tested with:
\|	./tests/checkasm/checkasm --test=motion --bench --disable-linux-perf
\|	Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	checkasm: Add idctdsp add/put-pixels-clamped tests	Ben Avison	2022-04-01	1	-0/+3
\|	Signed-off-by: Ben Avison <bavison@riscosopen.org>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	checkasm: Add vc1dsp in-loop deblocking filter tests	Ben Avison	2022-04-01	1	-0/+3
\|	Note that the benchmarking results for these functions are highly dependent upon the input data. Therefore, each function is benchmarked twice, corresponding to the best and worst case complexity of the reference C implementation. The performance of a real stream decode will fall somewhere between these two extremes.
\|	Signed-off-by: Ben Avison <bavison@riscosopen.org>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	configure: Use a separate config_components.h header for $ALL_COMPONENTS	Martin Storsjö	2022-03-16	1	-0/+1
\|	This avoids unnecessary rebuilds of most source files if only the list of enabled components has changed, but not the other properties of the build, set in config.h.
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	avutil/cpu: add AVX512 Icelake flag	Wu Jianhua	2022-03-10	1	-17/+18
\|	Signed-off-by: Wu Jianhua <jianhua.wu@intel.com>
\|	Reviewed-by: Henrik Gramner <henrik@gramner.com>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	swscale/x86/output.asm: add x86-optimized planar gbr yuv2anyX functions	Mark Reid	2022-01-11	1	-0/+1
\|	changes since v2:
\|	* fixed label
\|	changes since v1:
\|	* remove vex instruction on sse4 path
\|	* some load/pack macros use fewer instructions
\|	* fixed some typos
\|	Benchmarks, one row per <function>_full_X_4_512_<variant>:
\|	function	c	sse2	sse4	avx2
\|	yuv2gbrp	12757.6	8946.6	5138.6	3889.6
\|	yuv2gbrap	15368.6	11916.1	6294.6	3477.1
\|	yuv2gbrp9be	14381.6	9139.1	5150.1	2834.6
\|	yuv2gbrp9le	12990.1	9118.1	5132.1	2833.1
\|	yuv2gbrp10be	14401.6	9133.1	5126.1	2837.6
\|	yuv2gbrp10le	12718.1	9106.1	5120.1	2826.1
\|	yuv2gbrap10be	18535.6	33617.6	6264.1	3422.1
\|	yuv2gbrap10le	16724.1	11787.1	6282.1	3441.6
\|	yuv2gbrp12be	13723.6	9128.1	7997.6	2844.1
\|	yuv2gbrp12le	12257.1	9107.6	5142.6	2837.6
\|	yuv2gbrap12be	18511.1	12156.6	6251.1	3444.6
\|	yuv2gbrap12le	16687.1	11785.1	6243.6	3446.1
\|	yuv2gbrp14be	13690.6	9120.6	5138.1	2843.1
\|	yuv2gbrp14le	14995.6	9119.1	5126.1	2843.1
\|	yuv2gbrp16be	12367.1	8233.6	4820.1	2666.6
\|	yuv2gbrp16le	10904.1	8214.1	4824.1	2629.1
\|	yuv2gbrap16be	26569.6	10884.1	5488.1	3272.1
\|	yuv2gbrap16le	14010.1	10562.1	5463.6	3255.1
\|	yuv2gbrpf32be	14524.1	8552.6	4636.1	2474.6
\|	yuv2gbrpf32le	13060.6	9682.6	4298.1	2453.1
\|	yuv2gbrapf32be	18629.6	11363.1	15201.6	3727.1
\|	yuv2gbrapf32le	16677.6	10221.6	5693.6	3656.6
\|	Reviewed-by: Paul B Mahol <onemda@gmail.com>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	avutil: [loongarch] Add support for loongarch SIMD.	Shiyou Yin	2021-12-15	1	-0/+3
\|	LSX and LASX are loongarch SIMD extensions. They are enabled by default if the compiler supports them, and can be disabled with '--disable-lsx' '--disable-lasx'.
\|	Change-Id: Ie2608ea61dbd9b7fffadbf0ec2348bad6c124476
\|	Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
\|	Reviewed-by: guxiwei <guxiwei-hf@loongson.cn>
\|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
*	checkasm: collapse hevc pel tests	J. Dekker	2021-08-24	1	-10/+1
\|	Also add to `make fate-checkasm' target.
\|	Signed-off-by: J. Dekker <jdek@itanimul.li>
*	lavu/kperf: use ff_thread_once()	J. Dekker	2021-07-21	1	-5/+1
\|	Signed-off-by: J. Dekker <jdek@itanimul.li>
*	lavu/checkasm: add (private) kperf timing for macOS	J. Dekker	2021-07-20	1	-2/+12
\|	Signed-off-by: J. Dekker <jdek@itanimul.li>
*	checkasm: add av_tx FFT SIMD testing code	Lynne	2021-04-24	1	-0/+1
\|	This sadly required making changes to the code itself, due to the same context needing to be reused for both versions. The lookup table had to be duplicated for both versions.
*	checkasm: add hevc_pel tests	Josh Dekker	2021-01-25	1	-0/+10
\|	Co-authored-by: Niklas Haas <git@haasn.xyz>
\|	Signed-off-by: Josh Dekker <josh@itanimul.li>
*	libavutil: Detect MMI and MSA flags for MIPS	Jiaxun Yang	2020-07-23	1	-0/+3
\|	Add MMI & MSA runtime detection for MIPS.
\|	Basically there are two code paths. For systems that natively support the CPUCFG instruction, or whose kernel emulates that instruction, we'll sense this feature from HWCAP and report the flags according to the values read from CPUCFG. For systems that have no CPUCFG (or do not export it in HWCAP), we'll parse /proc/cpuinfo instead.
\|	Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
\|	Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
\|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
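The two detection paths described in the commit message above can be sketched as a simple fallback. This is a rough model, not the code from the commit: the function names, the `MMI`/`MSA` bit values, the sample text, and the assumption that the fallback looks at an "ASEs implemented" line containing `msa` and `loongson-mmi` tokens are all illustrative, and the CPUCFG result is passed in as a plain integer rather than read from hardware.

```python
# Hypothetical flag bits for this sketch; not FFmpeg's AV_CPU_FLAG_* values.
MMI = 1 << 0
MSA = 1 << 1


def flags_from_cpuinfo(text: str) -> int:
    """Fallback path: scan /proc/cpuinfo-style text for SIMD ASE names."""
    flags = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "ASEs implemented":
            continue
        ases = value.split()
        if "loongson-mmi" in ases:
            flags |= MMI
        if "msa" in ases:
            flags |= MSA
    return flags


def detect_flags(hwcap_has_cpucfg: bool, cpucfg_flags: int, cpuinfo_text: str) -> int:
    """Prefer CPUCFG when HWCAP advertises it, otherwise parse cpuinfo."""
    if hwcap_has_cpucfg:
        return cpucfg_flags
    return flags_from_cpuinfo(cpuinfo_text)


# Made-up sample input for illustration only.
SAMPLE_CPUINFO = "cpu model\t\t: Example CPU\nASEs implemented\t: vz msa loongson-mmi\n"
```

On a real system the text would come from reading `/proc/cpuinfo`; the sample string above merely stands in for it.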
*	checkasm: add hscale test	Josh de Kock	2020-05-15	1	-0/+1
\|	This tests the hscale 8bpp to 14/18bpp functions with different filter sizes.
\|	Signed-off-by: Josh de Kock <josh@itanimul.li>
*	checkasm: add function to check and diff memory	Martin Storsjö	2020-05-15	1	-0/+42
\|	This was ported from dav1d (c950e7101bdf5f7117bfca816984a21e550509f0).
\|	Signed-off-by: Josh de Kock <josh@itanimul.li>
*	checkasm: Check HAVE_GETSTDHANDLE here as well	Martin Storsjö	2020-01-24	1	-3/+3
\|	This was missed in 63418e374fcf26.
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	checkasm/vf_eq: add test for vf_eq	Ting Fu	2019-09-26	1	-0/+3
\|	Signed-off-by: Ting Fu <ting.fu@intel.com>
\|	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
*	checkasm: add opusdsp tests	Lynne	2019-09-11	1	-0/+3
\|
*	checkasm/vf_gblur: add test for horiz_slice simd	Ruiling Song	2019-06-12	1	-0/+3
\|	Signed-off-by: Ruiling Song <ruiling.song@intel.com>
*	checkasm: add test for v210dec	James Darnley	2019-05-02	1	-0/+3
\|
*	tests/checkasm/checkasm: Do not define an unused function.	Carl Eugen Hoyos	2019-01-31	1	-0/+2
\|	Fixes the following warning:
\|	tests/checkasm/checkasm.c:615:12: warning: 'bench_init_ffmpeg' defined but not used
*	checkasm: add an af_afir test	James Almer	2019-01-03	1	-0/+3
\|	Reviewed-by: Paul B Mahol <onemda@gmail.com>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm: add vf_nlmeans test for ssd_integral_image	Clément Bœsch	2018-05-08	1	-0/+3
\|
*	tests/checkasm/checkasm: Provide verbose failure information on float_near_abs_eps() failures	Michael Niedermayer	2018-04-14	1	-1/+5
\|	This will make understanding failures and adjusting EPS easier
\|	Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
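The idea behind the commit above, an absolute-epsilon float comparison that reports the offending values when it fails, might look roughly like this. It is a sketch only: the Python signature, the `verbose` parameter and the exact message wording are assumptions for illustration and are not taken from checkasm.c.

```python
import sys


def float_near_abs_eps(a: float, b: float, eps: float, verbose: bool = True) -> bool:
    """Return True when |a - b| is below eps; otherwise report the values."""
    abs_diff = abs(a - b)
    if abs_diff < eps:
        return True
    if verbose:
        # Printing both operands, the difference and EPS shows how far off
        # the result was, which helps when deciding whether EPS is too tight.
        print(
            f"test failed comparing {a:g} with {b:g} "
            f"(abs diff={abs_diff:g} with EPS={eps:g})",
            file=sys.stderr,
        )
    return False
```

A call such as `float_near_abs_eps(1.0, 1.01, 1e-3)` fails and prints both operands along with the difference, whereas a silent boolean result would leave the reader guessing how much tolerance is missing.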
*	checkasm/swscale : add test for rgb shuffle_bytes func	Martin Vignali	2018-03-24	1	-0/+3
\|
*	checkasm/hevc_sao : add hevc_sao for checkasm	Yingming Fan	2018-03-07	1	-0/+1
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm : add test for losslessvideoencdsp for diff bytes and sub_left_pred	Martin Vignali	2018-01-28	1	-0/+3
\|
*	checkasm: support for AVX-512 functions	James Darnley	2017-12-24	1	-0/+1
\|
*	Revert "checkasm/vf_interlace : add test for lowpass_line 8 and 16"	James Almer	2017-12-19	1	-3/+0
\|	This reverts commit adff97be5e2ff51c0bb66080c2f904ed40b6c571.
\|	It currently fails on Windows targets.
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm/vf_interlace : add test for lowpass_line 8 and 16	Martin Vignali	2017-12-19	1	-0/+3
\|
*	checkasm/vf_hflip : add test for vf_hflip byte and short simd	Martin Vignali	2017-12-13	1	-0/+3
\|
*	checkasm/vf_threshold : add checkasm test for threshold8	Martin Vignali	2017-12-03	1	-0/+3
\|
*	checkasm : add test for huffyuvdsp add_int16	Martin Vignali	2017-11-21	1	-0/+3
\|
*	checkasm : add utvideodsp test	Martin Vignali	2017-11-21	1	-0/+3
\|
*	checkasm: add an exrdsp test	James Almer	2017-09-17	1	-0/+3
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm: use perf API on Linux ARM*	Clément Bœsch	2017-09-08	1	-20/+87
\|	On ARM platforms, accessing the PMU registers requires special user access permissions. Since there is no other way to get accurate timers, the current implementation of timers in FFmpeg relies on these registers. Unfortunately, enabling user access to these registers on Linux is not trivial, and generally involves compiling a random and unreliable github kernel module, or somehow patching your kernel.
\|	Such a module is very unlikely to reach the upstream anytime soon. Quoting Robin Murphin from ARM:
\|	> Say you do give userspace direct access to the PMU; now run two or more
\|	> programs at once that believe they can use the counters for their own
\|	> "minimal-overhead" profiling. Have fun interpreting those results...
\|	>
\|	> And that's not even getting into the implications of scheduling across
\|	> different CPUs, CPUidle, etc. where the PMU state is completely beyond
\|	> userspace's control. In general, the plan to provide userspace with
\|	> something which might happen to just about work in a few corner cases,
\|	> but is meaningless, misleading or downright broken in all others, is to
\|	> never do so.
\|	As a result, the alternative is to use the Performance Monitoring Linux API which makes use of these registers internally (assuming the PMU of your ARM board is supported in the kernel, which is definitely not a given...).
\|	While the Linux API is obviously cross platform, it does have a significant overhead which needs to be taken into account. As a result, that mode is only weakly enabled on ARM platforms exclusively.
\|	Note on the non-flexibility of the implementation: the timers (native FFmpeg vs Linux API) are selected at compilation time to prevent the need for function calls, which would result in a negative impact on the cycle counters.
*	checkasm: add a g722dsp test	James Almer	2017-07-13	1	-0/+3
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm: add sbrdsp tests	Matthieu Bouron	2017-07-03	1	-0/+1
\|
*	checkasm: add AAC PS tests	Clément Bœsch	2017-06-28	1	-0/+3
\|	This includes various fixes and improvements from James Almer.
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm: add float_dsp tests	James Almer	2017-06-14	1	-0/+20
\|	Ported from libavutil/tests/float_dsp.c
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	checkasm: add fixed_dsp tests	James Almer	2017-04-11	1	-0/+3
\|	Tested-by: Michael Niedermayer <michael@niedermayer.cc>
\|	Signed-off-by: James Almer <jamrial@gmail.com>
*	Merge commit '9498237049d15812cecb79df47b196c73013908b'	Clément Bœsch	2017-03-31	1	-10/+18
\|\	* commit '9498237049d15812cecb79df47b196c73013908b':
\| \|	  checkasm: Add --test parameter to check only specific components
\| \|	Merged-by: Clément Bœsch <cboesch@gopro.com>