path: root/libavcodec/x86
Commit message | Author | Age | Files | Lines
* Remove unnecessary libavutil/(avutil|common|internal).h inclusions
  Author: Andreas Rheinhardt, Age: 2022-02-24, Files: 3, Lines: -3/+7
  Some of these were made possible by moving several common macros to libavutil/macros.h. While at it, also improve the other headers a bit.
  Reviewed-by: Martin Storsjö <martin@martin.st>
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avutil/avassert: Don't include avutil.h
  Author: Andreas Rheinhardt, Age: 2022-02-24, Files: 1, Lines: -0/+2
  Reviewed-by: Martin Storsjö <martin@martin.st>
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Remove obsolete version.h inclusions
  Author: Andreas Rheinhardt, Age: 2022-02-24, Files: 1, Lines: -2/+0
  Forgotten in e7bd47e657bbf9e1ce9915e93bc80cb1a29fb7f3.
  Reviewed-by: Martin Storsjö <martin@martin.st>
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* lavc/mpeg*: drop the XvMC hwaccel code
  Author: Anton Khirnov, Age: 2022-02-15, Files: 1, Lines: -4/+0
  XvMC was last relevant over 10 years ago, if ever. There is no reason to use it today.
* avcodec/mpegvideo: Move encoder-only stuff to a new header
  Author: Andreas Rheinhardt, Age: 2022-02-13, Files: 1, Lines: -1/+1
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* lavc/x86/h264_qpel: stop unnecessarily including h264dec
  Author: Anton Khirnov, Age: 2022-01-26, Files: 1, Lines: -1/+0
* av(codec|device): Don't cast pointers to int
  Author: Andreas Rheinhardt, Age: 2021-11-21, Files: 3, Lines: -8/+8
  C99/C11 6.3.2.3 5: "Any pointer type may be converted to an integer type. [...] If the result cannot be represented in the integer type, the behavior is undefined." So stop casting pointers to int; use uintptr_t instead.
  Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
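The point about C99/C11 6.3.2.3 can be illustrated with a minimal sketch. This is not code from the commit; the helper name is_aligned is hypothetical. It tests a pointer's alignment by converting through uintptr_t, which is wide enough to hold a pointer value, whereas int may be too narrow on 64-bit targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): returns 1 if ptr is
 * aligned to `align` bytes. `align` must be a power of two.
 * Converting through uintptr_t instead of int keeps the pointer value
 * representable, so the conversion stays out of undefined behavior. */
static int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}
```

A call such as is_aligned(buf, 16) can then guard an aligned code path. The same pattern with (int)ptr could truncate the address on LP64 and LLP64 systems.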
* get_cabac_inline_x86: Don't inline if 32-bit clang on windows
  Author: Christopher Degawa, Age: 2021-08-19, Files: 1, Lines: -2/+7
  Fixes https://trac.ffmpeg.org/ticket/8903
  Relevant: https://github.com/msys2/MINGW-packages/discussions/9258
  Signed-off-by: Christopher Degawa <ccom@randomderp.com>
  Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/h264dsp, h264idct: Fix lengths of array parameters
  Author: Andreas Rheinhardt, Age: 2021-08-08, Files: 1, Lines: -2/+2
  Fixes many -Warray-parameter warnings from GCC 11.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Remove/replace some unnecessary avcodec.h inclusions
  Author: Andreas Rheinhardt, Age: 2021-07-22, Files: 5, Lines: -4/+13
  Also remove other unnecessary headers and include headers directly while at it.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Remove unnecessary mem.h inclusions
  Author: Andreas Rheinhardt, Age: 2021-07-22, Files: 9, Lines: -9/+1
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avutil/internal, swresample/audioconvert: Remove cpu.h inclusions
  Author: Andreas Rheinhardt, Age: 2021-07-22, Files: 1, Lines: -1/+0
  These inclusions are not necessary, as cpu.h is already included wherever it is needed (via direct inclusion or via the arch-specific headers). Also remove other unnecessary cpu.h inclusions from ordinary non-headers.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec: Remove deprecated old encode/decode APIs
  Author: Andreas Rheinhardt, Age: 2021-04-27, Files: 1, Lines: -35/+0
  Deprecated in commits 7fc329e2dd6226dfecaa4a1d7adf353bf2773726 and 31f6a4b4b83aca1d73f3cfc99ce2b39331970bf3.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
* Include attributes.h directly
  Author: Andreas Rheinhardt, Age: 2021-04-19, Files: 16, Lines: -0/+17
  Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/x86: add cfhdenc SIMD
  Author: Paul B Mahol, Age: 2021-02-27, Files: 3, Lines: -0/+482
* avcodec: add missing FF_API_OLD_ENCDEC wrappers to xmm clobber functions
  Author: James Almer, Age: 2021-02-26, Files: 1, Lines: -15/+17
  Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/x86/constants: Remove unused ff_pw_17
  Author: Andreas Rheinhardt, Age: 2021-02-24, Files: 2, Lines: -5/+0
  Unused since 80944df720da98d6e5ee0e355db5814735914ec9.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avcodec/x86/diracdsp_init: Reuse macro
  Author: Andreas Rheinhardt, Age: 2021-02-24, Files: 1, Lines: -36/+2
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avcodec/x86/diracdsp_init: Simplify macro
  Author: Andreas Rheinhardt, Age: 2021-02-24, Files: 1, Lines: -13/+13
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avcodec/x86/diracdsp_init: Make functions only used here static
  Author: Andreas Rheinhardt, Age: 2021-02-24, Files: 1, Lines: -32/+23
  This allowed removing the forward declarations. Because compilers expect declarations for all functions they encounter, even within blocks disabled via "if (0 && foo)", one has to use a real #if in ff_diracdsp_init_x86.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
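The difference between a constant-false "if" and a real #if can be sketched as follows. This is an illustrative example, not the code from ff_diracdsp_init_x86; the macro HAVE_FAST_PATH and the functions fast_path, slow_path and compute are hypothetical names.

```c
#include <assert.h>

#define HAVE_FAST_PATH 0   /* hypothetical configuration macro */

#if HAVE_FAST_PATH
/* Static function that only exists when the feature is enabled. */
static int fast_path(int x) { return x * 2; }
#endif

static int slow_path(int x) { return x + x; }

int compute(int x)
{
    assert(x < (1 << 30));   /* keep the doubled value within int range */
    /* Writing `if (HAVE_FAST_PATH) return fast_path(x);` here would not
     * compile when HAVE_FAST_PATH is 0: the branch is dead, but the
     * compiler still needs a declaration of fast_path to parse the call.
     * The preprocessor removes the call before the compiler sees it. */
#if HAVE_FAST_PATH
    return fast_path(x);
#else
    return slow_path(x);
#endif
}
```

With non-static functions, a forward declaration in a header keeps the constant-false "if" form compiling. Once the functions are static and defined only under #if, the call site has to be guarded the same way.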
* avcodec/x86/diracdsp_init: Remove unused MMX functions
  Author: Andreas Rheinhardt, Age: 2021-02-24, Files: 1, Lines: -7/+0
  Unused since a1f3b18bf55f106c974eacb1dc831be4d2bd5277, yet as nonstatic functions the compiler can't detect this, so that these functions aren't stripped and no warning is emitted.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* avcodec/cabac_functions, x86/cabac: Include stddef.h
  Author: Andreas Rheinhardt, Age: 2021-02-04, Files: 1, Lines: -0/+2
  Fixes checkheaders after 8c01eb0a315fec8f09ba6210ce8b0296de6cc784.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
* ac3enc_fixed: drop unnecessary fixed-point DSP code
  Author: Lynne, Age: 2021-01-14, Files: 2, Lines: -309/+1
* lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bump
  Author: Anton Khirnov, Age: 2021-01-01, Files: 4, Lines: -1/+4
  They are not properly namespaced and not intended for public use.
* lavu: move LOCAL_ALIGNED from internal.h to mem_internal.h
  Author: Anton Khirnov, Age: 2021-01-01, Files: 9, Lines: -0/+10
  That is a more appropriate place for it.
* avcodec/mpegaudiodsp: Make ff_mpadsp_init() thread-safe
  Author: Andreas Rheinhardt, Age: 2020-11-24, Files: 1, Lines: -3/+6
  The only thing missing for this is to make ff_mpadsp_init_x86() thread-safe; it currently isn't because a static table is initialized every time ff_mpadsp_init() is called (when ARCH_X86 is true). Solve this by initializing this table only once, namely together with the ordinary non-arch-specific tables. This also allows reusing their AVOnce.
  Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
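The idea of initializing a shared static table exactly once can be sketched as below. This is not the FFmpeg implementation: the commit reuses an existing AVOnce, whereas this sketch swaps in a small hand-rolled once-guard built on C11 atomics so that it needs no threading library. All names (run_once, init_tables, window_table, dsp_init_sketch) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

enum { ONCE_IDLE, ONCE_RUNNING, ONCE_DONE };

static atomic_int once_state;        /* zero-initialized, i.e. ONCE_IDLE */
static float      window_table[16];  /* hypothetical shared static table */
static int        init_calls;        /* counts how often the initializer ran */

static void init_tables(void)
{
    for (int i = 0; i < 16; i++)
        window_table[i] = (float)i * 0.5f;
    init_calls++;
}

/* Run fn exactly once, no matter how many threads call run_once. */
static void run_once(atomic_int *state, void (*fn)(void))
{
    int expected = ONCE_IDLE;

    if (atomic_load_explicit(state, memory_order_acquire) == ONCE_DONE)
        return;                      /* fast path: already initialized */
    if (atomic_compare_exchange_strong(state, &expected, ONCE_RUNNING)) {
        fn();                        /* this caller won the race */
        atomic_store_explicit(state, ONCE_DONE, memory_order_release);
    } else {
        /* Another caller is initializing: wait until it has finished. */
        while (atomic_load_explicit(state, memory_order_acquire) != ONCE_DONE)
            ;
    }
}

void dsp_init_sketch(void)
{
    run_once(&once_state, init_tables);
    assert(atomic_load(&once_state) == ONCE_DONE);
}
```

Calling dsp_init_sketch() repeatedly, from one or several threads, fills window_table a single time. Production code would normally prefer a blocking primitive over the busy-wait used here.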
* x86/cfhddsp: zero extend int arguments
  Author: James Almer, Age: 2020-08-28, Files: 1, Lines: -10/+6
  If taken from the stack, they may have garbage in the upper bits otherwise. Also, there are only 8 arguments, so don't attempt to load 11. Fixes SIGSEGV crashes on some targets.
  Reviewed-by: durandal_1707
  Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/x86/cfhddsp: try to fix build on x32
  Author: Paul B Mahol, Age: 2020-08-26, Files: 1, Lines: -7/+7
* avcodec/cfhd: add x86 SIMD
  Author: Paul B Mahol, Age: 2020-08-26, Files: 3, Lines: -0/+755
  Overall speed change for 1920x1080, yuv422p10le, 60fps: from 0.19x to 0.343x
* x86/h264_deblock: fix warning about trailing empty parameter
  Author: James Almer, Age: 2020-07-12, Files: 1, Lines: -1/+1
  Fixes part of ticket #8771
  Signed-off-by: James Almer <jamrial@gmail.com>
* pixblockdsp, avdct: Add get_pixels_unaligned
  Author: Martin Storsjö, Age: 2020-05-13, Files: 1, Lines: -2/+6
  Use this in vf_spp.c, where the get_pixels operation is done on unaligned source addresses.
  Hook up the x86 (mmx and sse) versions of get_pixels to this function pointer, as those implementations seem to support unaligned use.
  This fixes fate-filter-spp on armv7.
  Signed-off-by: Martin Storsjö <martin@martin.st>
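What an alignment-agnostic get_pixels has to do can be sketched in plain C. This is an illustration under assumptions, not the FFmpeg function: the name get_pixels_unaligned_sketch and its exact signature are hypothetical. It widens an 8x8 block of 8-bit pixels to 16-bit coefficients and loads each row through memcpy, which is valid for any source address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy an 8x8 block of 8-bit pixels into a block of
 * 64 16-bit values. `pixels` may have any alignment; `stride` is the
 * distance in bytes between the starts of consecutive rows. */
static void get_pixels_unaligned_sketch(int16_t block[64],
                                        const uint8_t *pixels,
                                        ptrdiff_t stride)
{
    assert(block && pixels);
    for (int row = 0; row < 8; row++) {
        uint8_t tmp[8];
        memcpy(tmp, pixels, sizeof(tmp));   /* alignment-agnostic row load */
        for (int col = 0; col < 8; col++)
            block[row * 8 + col] = tmp[col];
        pixels += stride;
    }
}
```

A SIMD version that uses aligned 8-byte or 16-byte loads would fault or misbehave on some platforms when the source pointer is odd, which is why a separate function pointer for the unaligned case is useful.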
* lavc/x86/hevc_add_res: Fix coeff overflow in ADD_RES_SSE_16_32_8
  Author: Linjie Fu, Age: 2020-03-27, Files: 1, Lines: -19/+21
  Fix overflow for coeff -32768 in function ADD_RES_SSE_16_32_8 with no performance drop (SSE2/AVX/AVX2).

  ./checkasm --test=hevc_add_res --bench

  Mainline:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_32x32_8_sse2: 127.5
      hevc_add_res_32x32_8_avx: 127.0
      hevc_add_res_32x32_8_avx2: 86.5

  Add overflow test case:
    - hevc_add_res.add_residual [FAILED]

  After:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_32x32_8_sse2: 126.8
      hevc_add_res_32x32_8_avx: 128.3
      hevc_add_res_32x32_8_avx2: 86.8

  Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
  Signed-off-by: Linjie Fu <linjie.fu@intel.com>
  Signed-off-by: Anton Khirnov <anton@khirnov.net>
* lavc/x86/hevc_add_res: Fix overflow in ADD_RES_SSE_8_8
  Author: Linjie Fu, Age: 2020-03-27, Files: 1, Lines: -23/+22
  Fix overflow for coeff -32768 in function ADD_RES_SSE_8_8 with no performance drop.

  ./checkasm --test=hevc_add_res --bench

  Mainline:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_8x8_8_sse2: 15.5

  Add overflow test case:
    - hevc_add_res.add_residual [FAILED]

  After:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_8x8_8_sse2: 15.5

  Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
  Signed-off-by: Linjie Fu <linjie.fu@intel.com>
  Signed-off-by: Anton Khirnov <anton@khirnov.net>
* lavc/x86/hevc_add_res: Fix overflow in ADD_RES_MMX_4_8
  Author: Linjie Fu, Age: 2020-03-27, Files: 1, Lines: -12/+11
  Fix overflow for coeff -32768 in function ADD_RES_MMX_4_8 with no performance drop.

  ./checkasm --test=hevc_add_res --bench

  Mainline:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_4x4_8_mmxext: 15.5

  Add overflow test case:
    - hevc_add_res.add_residual [FAILED]

  After:
    - hevc_add_res.add_residual [OK]
      hevc_add_res_4x4_8_mmxext: 15.0

  Signed-off-by: Xu Guangxin <guangxin.xu@intel.com>
  Signed-off-by: Linjie Fu <linjie.fu@intel.com>
  Signed-off-by: Anton Khirnov <anton@khirnov.net>
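The three hevc_add_res commit messages above do not spell out where the overflow for coeff -32768 occurs. One plausible mechanism, assumed here purely for illustration, is an 8-bit add-residual routine that splits each 16-bit coefficient into a positive part and a negated part and then applies unsigned saturating add and subtract. Negating -32768 in 16 bits wraps back to -32768, so the negated part is lost. The scalar model below uses hypothetical names throughout and does not reproduce the actual assembly or its fix.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to 0..255, like packing signed 16-bit values to unsigned bytes. */
static uint8_t sat_u8(int v) { return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v; }

/* Wrap to 16-bit two's complement, like a packed 16-bit subtract would. */
static int wrap16(int v)
{
    v &= 0xFFFF;
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Reference behavior: add the residual in full precision, then clip. */
static uint8_t add_res_reference(uint8_t dst, int16_t coeff)
{
    return sat_u8(dst + coeff);
}

/* Assumed split-sign variant: correct for every coeff except -32768,
 * where the 16-bit negation wraps and the negative part becomes 0. */
static uint8_t add_res_split_sign(uint8_t dst, int16_t coeff)
{
    uint8_t pos = sat_u8(coeff);            /* 0 when coeff is negative */
    uint8_t neg = sat_u8(wrap16(-coeff));   /* 0 when coeff is positive */
    int v = dst + pos;
    if (v > 255) v = 255;                   /* unsigned saturating add */
    v -= neg;
    if (v < 0) v = 0;                       /* unsigned saturating subtract */
    assert(v >= 0 && v <= 255);
    return (uint8_t)v;
}
```

For dst = 100 and coeff = -32768 the reference yields 0 while the split-sign model leaves the pixel at 100. A mismatch of that kind is what a checkasm test case feeding -32768 would be expected to flag.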
* avcodec/x86/diracdsp: Fix high bits on Windows x86_64
  Author: Michael Niedermayer, Age: 2020-01-31, Files: 1, Lines: -1/+1
  Found-by: james
* avcodec/x86/diracdsp: Fix incorrect src addressing in dequant_subband_32()
  Author: Michael Niedermayer, Age: 2020-01-30, Files: 1, Lines: -1/+2
  Fixes: Segfault (not reproducible with asm, which made this hard to debug)
  Fixes: decoding errors
  Fixes: 19854/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_DIRAC_fuzzer-5729372837511168
  Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
  Reviewed-by: Paul B Mahol <onemda@gmail.com>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* vp4: prevent unaligned memory access in loop filter
  Author: Peter Ross, Age: 2019-10-30, Files: 1, Lines: -2/+2
  VP4 applies a loop filter during motion compensation, so the block offset will often be unaligned. This produces a bus error on some platforms, namely ARMv7 NEON.
  This patch adds an unaligned version of the loop filter function pointer to VP3DSPContext.
  Reported-by: Mike Melanson <mike@multimedia.cx>
  Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* x85/opusdsp: enable the functions on all FMA3 CPUs
  Author: James Almer, Age: 2019-09-11, Files: 1, Lines: -1/+1
  It's not using ymm registers, so limiting it to CPUs with fast AVX is not necessary.
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/opusdps: clear the high bits from some gprs
  Author: James Almer, Age: 2019-09-11, Files: 1, Lines: -3/+4
  Fixes checkasm on systems like win64.
  Reviewed-by: Lynne
  Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/Makefile: add missing pngdsp dependency to the lscr decoder
  Author: James Almer, Age: 2019-05-14, Files: 1, Lines: -0/+2
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/v210dec: use named registers
  Author: James Almer, Age: 2019-05-03, Files: 1, Lines: -16/+16
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/v210dec: don't reserve more xmm regs than needed
  Author: James Almer, Age: 2019-05-03, Files: 1, Lines: -1/+1
  Prevents pointless register saving on win64 for the sse3 and avx versions of the function.
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/v210dec: remove duplicate load instruction
  Author: James Almer, Age: 2019-05-03, Files: 1, Lines: -1/+0
  Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec/x86/v210: fix operands of vpblendd used in new avx2 code
  Author: James Darnley, Age: 2019-05-02, Files: 1, Lines: -1/+1
  Assembly failed when using yasm rather than nasm.
* libavcodec Adding ff_v210_planar_unpack AVX2
  Author: Michael Stoner, Age: 2019-05-02, Files: 2, Lines: -16/+64
  Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck
  AVX2 is 1.4x faster than AVX
* x86/opusdsp: replace loads with shuffles
  Author: Lynne, Age: 2019-04-26, Files: 1, Lines: -6/+3
  Has a slight speedup. Can't be carried over to aarch64, since it has no shufps-like instruction.
  Reviewed-by: Paul B Mahol <onemda@gmail.com>
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/opusdsp: fix WIN64 return value
  Author: Lynne, Age: 2019-04-01, Files: 1, Lines: -3/+2
  Signed-off-by: James Almer <jamrial@gmail.com>
* x86/opusdsp: implement FMA3 accelerated postfilter and deemphasis
  Author: Lynne, Age: 2019-04-01, Files: 3, Lines: -0/+151
  58893 decicycles in deemphasis_c, 130548 runs, 524 skips
  9475 decicycles in deemphasis_fma3, 130686 runs, 386 skips
  -> 6.21x speedup

  24866 decicycles in postfilter_c, 65386 runs, 150 skips
  5268 decicycles in postfilter_fma3, 65505 runs, 31 skips
  -> 4.72x speedup

  Total decoder speedup: ~14%

  Deemphasis SIMD based on the following unrolling:

  const float c1 = CELT_EMPH_COEFF, c2 = c1*c1, c3 = c2*c1, c4 = c3*c1;
  float state = coeff;

  for (int i = 0; i < len; i += 4) {
      y[0] = x[0] + c1*state;
      y[1] = x[1] + c2*state + c1*x[0];
      y[2] = x[2] + c3*state + c1*x[1] + c2*x[0];
      y[3] = x[3] + c4*state + c1*x[2] + c2*x[1] + c3*x[0];

      state = y[3];
      y += 4;
      x += 4;
  }
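The unrolling quoted in the commit message follows from expanding the first-order recurrence y[i] = x[i] + c1*y[i-1] four steps at a time. The sketch below, written for illustration only, places the plain recurrence next to the unrolled form so the two can be compared. EMPH_COEFF is an assumed stand-in value for CELT_EMPH_COEFF, and the function names are hypothetical.

```c
#include <assert.h>

#define EMPH_COEFF 0.85f   /* assumed stand-in for CELT_EMPH_COEFF */

/* Plain recurrence: each output depends on the previous output. */
static float deemphasis_scalar(float *y, const float *x, float state, int len)
{
    for (int i = 0; i < len; i++)
        state = y[i] = x[i] + EMPH_COEFF * state;
    return state;
}

/* Four outputs per iteration, each expressed only in terms of the inputs
 * and the state carried in from the previous group of four. */
static float deemphasis_unrolled(float *y, const float *x, float state, int len)
{
    const float c1 = EMPH_COEFF, c2 = c1 * c1, c3 = c2 * c1, c4 = c3 * c1;

    assert(len % 4 == 0);
    for (int i = 0; i < len; i += 4) {
        y[0] = x[0] + c1 * state;
        y[1] = x[1] + c2 * state + c1 * x[0];
        y[2] = x[2] + c3 * state + c1 * x[1] + c2 * x[0];
        y[3] = x[3] + c4 * state + c1 * x[2] + c2 * x[1] + c3 * x[0];

        state = y[3];
        y += 4;
        x += 4;
    }
    return state;
}

/* Largest absolute difference between two float arrays. */
static float max_abs_diff(const float *a, const float *b, int len)
{
    float worst = 0.0f;
    for (int i = 0; i < len; i++) {
        float d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

The unrolled form removes the dependency between neighboring outputs inside a group of four, which is what lets the four results be computed as one vector with multiply-add operations. Both versions should agree up to floating-point rounding.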
* celt_pvq_init: only build when CONFIG_OPUS_ENCODER is enabled
  Author: Lynne, Age: 2019-03-31, Files: 2, Lines: -3/+0
  The entire function was defined away before.
* x86/opus_dsp: rename to celt_pvq
  Author: Lynne, Age: 2019-03-31, Files: 3, Lines: -4/+4
  It's only used in the encoder and in CELT's PVQ.