path: root/libswscale/utils.c
Commit message (Author, Age, Files, Lines)
...
* swscale/input: add UYVA input support (James Almer, 2024-10-08, 1 file, -0/+1)
    Signed-off-by: James Almer <jamrial@gmail.com>
* swscale/input: add AYUV input support (James Almer, 2024-10-08, 1 file, -0/+1)
    Signed-off-by: James Almer <jamrial@gmail.com>
* swscale/internal: turn cascaded_tmp into an array (Niklas Haas, 2024-10-07, 1 file, -8/+8)
    Slightly more convenient to access from the new wrapping code.
    Sponsored-by: Sovereign Tech Fund
    Signed-off-by: Niklas Haas <git@haasn.dev>
* swscale/utils: fix leak on threaded ctx init failure (Niklas Haas, 2024-07-14, 1 file, -2/+1)
    This count gets incremented after init succeeds, when it should be incremented after *alloc* succeeds. Otherwise, we leak the context on failure.
    There are no negative consequences of incrementing for allocated-but-not-initialized contexts, as the only functions that reference it will, in the worst case, simply behave as if called on allocated-but-not-initialized contexts, which is in line with expected behavior when sws_init_context() fails.
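The counting rule described in that message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from utils.c: Parent, Slice, slice_alloc(), slice_init() and the live_allocs counter are all hypothetical stand-ins. The point it shows is that the cleanup loop only sees entries that were counted, so the count has to go up as soon as allocation succeeds.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the libswscale types. */
typedef struct Slice { int initialized; } Slice;

typedef struct Parent {
    Slice **slices;
    int     nb_slices;   /* entries that parent_free() will release */
} Parent;

static int live_allocs = 0;   /* outstanding Slice allocations */

static Slice *slice_alloc(void)
{
    Slice *s = calloc(1, sizeof(*s));
    if (s)
        live_allocs++;
    return s;
}

static void slice_free(Slice *s)
{
    if (s) {
        live_allocs--;
        free(s);
    }
}

/* Fails for the slice at index fail_at to simulate an init error. */
static int slice_init(Slice *s, int index, int fail_at)
{
    if (index == fail_at)
        return -1;
    s->initialized = 1;
    return 0;
}

static void parent_free(Parent *p)
{
    for (int i = 0; i < p->nb_slices; i++)
        slice_free(p->slices[i]);
    free(p->slices);
    p->slices    = NULL;
    p->nb_slices = 0;
}

static int parent_init(Parent *p, int n, int fail_at)
{
    p->nb_slices = 0;
    p->slices    = calloc(n, sizeof(*p->slices));
    if (!p->slices)
        return -1;
    for (int i = 0; i < n; i++) {
        p->slices[i] = slice_alloc();
        if (!p->slices[i])
            return -1;
        p->nb_slices++;   /* count right after alloc, before init */
        if (slice_init(p->slices[i], i, fail_at) < 0)
            return -1;    /* the failed slice is already counted */
    }
    return 0;
}
```

If the increment were moved below the slice_init() call, the slice whose init failed would never be visited by parent_free() and would leak.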
* swscale: remove unconditional #define DITHER1XBPP (Ramiro Polla, 2024-07-10, 1 file, -4/+0)
    This seems to have had a use in the past, but it is now defined unconditionally.
* swscale/aarch64: add neon {lum,chr}ConvertRange (Ramiro Polla, 2024-06-18, 1 file, -1/+3)
    chrRangeFromJpeg_8_c: 29.2      chrRangeFromJpeg_8_neon: 19.5
    chrRangeFromJpeg_24_c: 80.5     chrRangeFromJpeg_24_neon: 34.0
    chrRangeFromJpeg_128_c: 413.7   chrRangeFromJpeg_128_neon: 156.0
    chrRangeFromJpeg_144_c: 471.0   chrRangeFromJpeg_144_neon: 174.2
    chrRangeFromJpeg_256_c: 842.0   chrRangeFromJpeg_256_neon: 305.5
    chrRangeFromJpeg_512_c: 1699.0  chrRangeFromJpeg_512_neon: 608.0
    chrRangeToJpeg_8_c: 51.7        chrRangeToJpeg_8_neon: 22.7
    chrRangeToJpeg_24_c: 149.7      chrRangeToJpeg_24_neon: 38.0
    chrRangeToJpeg_128_c: 761.7     chrRangeToJpeg_128_neon: 176.7
    chrRangeToJpeg_144_c: 866.2     chrRangeToJpeg_144_neon: 198.7
    chrRangeToJpeg_256_c: 1516.5    chrRangeToJpeg_256_neon: 348.7
    chrRangeToJpeg_512_c: 3067.2    chrRangeToJpeg_512_neon: 692.7
    lumRangeFromJpeg_8_c: 24.0      lumRangeFromJpeg_8_neon: 17.0
    lumRangeFromJpeg_24_c: 56.7     lumRangeFromJpeg_24_neon: 21.0
    lumRangeFromJpeg_128_c: 294.5   lumRangeFromJpeg_128_neon: 76.7
    lumRangeFromJpeg_144_c: 332.5   lumRangeFromJpeg_144_neon: 86.7
    lumRangeFromJpeg_256_c: 586.0   lumRangeFromJpeg_256_neon: 152.2
    lumRangeFromJpeg_512_c: 1190.0  lumRangeFromJpeg_512_neon: 298.0
    lumRangeToJpeg_8_c: 31.7        lumRangeToJpeg_8_neon: 19.5
    lumRangeToJpeg_24_c: 83.5       lumRangeToJpeg_24_neon: 24.2
    lumRangeToJpeg_128_c: 440.5     lumRangeToJpeg_128_neon: 91.0
    lumRangeToJpeg_144_c: 504.2     lumRangeToJpeg_144_neon: 101.0
    lumRangeToJpeg_256_c: 879.7     lumRangeToJpeg_256_neon: 177.2
    lumRangeToJpeg_512_c: 1794.2    lumRangeToJpeg_512_neon: 354.0
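For readers unfamiliar with what the benchmarked functions compute: "ToJpeg" converts limited (video) range samples to full (JPEG) range, and "FromJpeg" goes the other way. The sketch below shows the textbook 8-bit luma arithmetic only. It is an assumption-laden illustration, not the swscale routine, which operates in place on higher-precision intermediate samples with its own fixed-point constants; the function names here are hypothetical.

```c
#include <stdint.h>

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Limited range (16..235) to full range (0..255), rounded to nearest. */
static uint8_t luma_to_full(uint8_t y)
{
    return clip_u8(((y - 16) * 255 + 109) / 219);
}

/* Full range (0..255) to limited range (16..235), rounded to nearest. */
static uint8_t luma_to_limited(uint8_t y)
{
    return clip_u8((y * 219 + 127) / 255 + 16);
}

/* In-place conversion of one line, mirroring the (buffer, width) shape
 * suggested by the benchmark names above. */
static void luma_line_to_full(uint8_t *dst, int width)
{
    for (int i = 0; i < width; i++)
        dst[i] = luma_to_full(dst[i]);
}
```

The per-sample work is a multiply, an add and a clip, which is why it maps well onto SIMD and why the gains in the table grow with line width.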
* swscale/x86: add sse2 and avx2 {lum,chr}ConvertRange (Ramiro Polla, 2024-06-16, 1 file, -0/+2)
    chrRangeFromJpeg_8_c: 22.3      chrRangeFromJpeg_8_sse2: 13.3     chrRangeFromJpeg_8_avx2: 13.3
    chrRangeFromJpeg_24_c: 72.8     chrRangeFromJpeg_24_sse2: 22.3    chrRangeFromJpeg_24_avx2: 17.5
    chrRangeFromJpeg_128_c: 345.5   chrRangeFromJpeg_128_sse2: 106.0  chrRangeFromJpeg_128_avx2: 57.8
    chrRangeFromJpeg_144_c: 380.5   chrRangeFromJpeg_144_sse2: 118.5  chrRangeFromJpeg_144_avx2: 62.3
    chrRangeFromJpeg_256_c: 646.3   chrRangeFromJpeg_256_sse2: 218.8  chrRangeFromJpeg_256_avx2: 109.0
    chrRangeFromJpeg_512_c: 1461.5  chrRangeFromJpeg_512_sse2: 426.5  chrRangeFromJpeg_512_avx2: 211.5
    chrRangeToJpeg_8_c: 37.8        chrRangeToJpeg_8_sse2: 10.5       chrRangeToJpeg_8_avx2: 14.0
    chrRangeToJpeg_24_c: 114.3      chrRangeToJpeg_24_sse2: 23.5      chrRangeToJpeg_24_avx2: 16.3
    chrRangeToJpeg_128_c: 633.5     chrRangeToJpeg_128_sse2: 107.5    chrRangeToJpeg_128_avx2: 55.0
    chrRangeToJpeg_144_c: 758.3     chrRangeToJpeg_144_sse2: 132.0    chrRangeToJpeg_144_avx2: 64.5
    chrRangeToJpeg_256_c: 1345.0    chrRangeToJpeg_256_sse2: 218.0    chrRangeToJpeg_256_avx2: 105.3
    chrRangeToJpeg_512_c: 2524.0    chrRangeToJpeg_512_sse2: 417.0    chrRangeToJpeg_512_avx2: 218.8
    lumRangeFromJpeg_8_c: 11.8      lumRangeFromJpeg_8_sse2: 11.0     lumRangeFromJpeg_8_avx2: 10.3
    lumRangeFromJpeg_24_c: 38.5     lumRangeFromJpeg_24_sse2: 15.5    lumRangeFromJpeg_24_avx2: 12.5
    lumRangeFromJpeg_128_c: 232.3   lumRangeFromJpeg_128_sse2: 60.0   lumRangeFromJpeg_128_avx2: 26.8
    lumRangeFromJpeg_144_c: 259.5   lumRangeFromJpeg_144_sse2: 65.3   lumRangeFromJpeg_144_avx2: 29.0
    lumRangeFromJpeg_256_c: 464.5   lumRangeFromJpeg_256_sse2: 107.5  lumRangeFromJpeg_256_avx2: 54.0
    lumRangeFromJpeg_512_c: 897.5   lumRangeFromJpeg_512_sse2: 224.5  lumRangeFromJpeg_512_avx2: 109.8
    lumRangeToJpeg_8_c: 17.8        lumRangeToJpeg_8_sse2: 11.0       lumRangeToJpeg_8_avx2: 11.8
    lumRangeToJpeg_24_c: 56.3       lumRangeToJpeg_24_sse2: 11.0      lumRangeToJpeg_24_avx2: 12.5
    lumRangeToJpeg_128_c: 333.8     lumRangeToJpeg_128_sse2: 53.3     lumRangeToJpeg_128_avx2: 26.5
    lumRangeToJpeg_144_c: 375.5     lumRangeToJpeg_144_sse2: 60.8     lumRangeToJpeg_144_avx2: 29.0
    lumRangeToJpeg_256_c: 652.0     lumRangeToJpeg_256_sse2: 109.5    lumRangeToJpeg_256_avx2: 53.5
    lumRangeToJpeg_512_c: 1284.3    lumRangeToJpeg_512_sse2: 218.0    lumRangeToJpeg_512_avx2: 108.3
* swscale: [LA] Optimize range convert for yuvj420p. (Shiyou Yin, 2024-04-11, 1 file, -1/+5)
    Reviewed-by: 陈昊 <chenhao@loongson.cn>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/utils: Fix xInc overflow (Michael Niedermayer, 2024-04-04, 1 file, -1/+1)
    Fixes: signed integer overflow: 2 * 1073741824 cannot be represented in type 'int'
    Fixes: 67802/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6249515855183872
    Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
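The reported failure, 2 * 1073741824 in type int, is the classic case of a product computed in a type too narrow for it. The log does not show the one-line fix itself, so the sketch below is only one common remedy, assumed for illustration: widen to 64 bits before multiplying and check the range before narrowing. double_checked() is a hypothetical helper, not a function in utils.c.

```c
#include <limits.h>
#include <stdint.h>

/* Stores 2 * x in *out and returns 1 if the result fits in an int;
 * returns 0 and leaves *out untouched otherwise. */
static int double_checked(int x, int *out)
{
    int64_t wide = 2 * (int64_t)x;   /* multiply in 64 bits: no overflow */

    if (wide > INT_MAX || wide < INT_MIN)
        return 0;
    *out = (int)wide;
    return 1;
}
```

With x = 1073741824 (2^30) the helper reports failure instead of invoking undefined behavior, while x = 1073741823 still yields 2147483646.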
* avutil/common: Don't auto-include mem.h (Andreas Rheinhardt, 2024-03-31, 1 file, -0/+1)
    There are lots of files that don't need it: The number of object files that actually need it went down from 2011 to 884 here.
    Keep it for external users in order to not cause breakages.
    Also improve the other headers a bit while just at it.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/swscale_internal: Don't export internal function (Andreas Rheinhardt, 2024-03-31, 1 file, -40/+43)
    sws_alloc_set_opts() can actually be made internal to utils.c. This commit does so.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* libswscale/utils: Fix bayer to yuvj (Michael Niedermayer, 2024-02-21, 1 file, -1/+2)
    Fixes: out of array access.
    Earlier code assumes that an unscaled bayer to yuvj420 converter exists, but the later code then skips yuvj420.
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* Revert "swscale: fix sws_setColorspaceDetails after sws_init_context" (Michael Niedermayer, 2024-02-21, 1 file, -3/+7)
    Suggested by: Niklas Haas in Ticket10824
    Fixes: Assertion failure
    Fixes: Ticket10824
    This reverts commit cedf589c09c567b72bf4c1a58db53d94622567e1.
* swscale/utils: Allocate more dithererror (Michael Niedermayer, 2024-02-21, 1 file, -1/+1)
    Fixes: out of array read
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale: don't omit ff_sws_init_range_convert for high-bit (Niklas Haas, 2023-11-09, 1 file, -3/+1)
    This was a complete hack seemingly designed to work around a different bug, which was fixed in the previous commit. As such, there is no more reason not to do this, as it simply breaks changing color range in sws_setColorspaceDetails for no reason.
* swscale: fix sws_setColorspaceDetails after sws_init_context (Niklas Haas, 2023-11-09, 1 file, -7/+3)
    More commonly, this fixes the case of sws_setColorspaceDetails after sws_getContext, since the latter implies sws_init_context.
    The problem here is that sws_init_context sets up the range conversion and fast path tables based on the values of srcRange/dstRange at init time. This may result in locking in a "wrong" path (either using the unscaled fast path when range conversion is later required, or using the scaled slow path when range conversion becomes no longer required).
    There are two ways out:
    1. Always initialize range conversion and unscaled converters, even if they will be unused, and extend the runtime check.
    2. Re-do initialization if the values change after sws_setColorspaceDetails.
    I opted for approach 1 because it was simpler and easier to reason about.
    Reword the av_log message to make it clear that this special converter is not necessarily used, depending on whether or not there is range conversion or YUV matrix conversion going on.
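Approach 1 from that message can be pictured with a toy model. Everything below is hypothetical (Ctx, ctx_init(), ctx_set_ranges(), ctx_pick_path() are not libswscale names), and the real commit was later reverted, as the 2024-02-21 entry above records. The sketch only illustrates the idea of preparing both paths at init time and deferring the choice to a runtime check on the current range values.

```c
#include <assert.h>

/* Toy model, not the libswscale context; every name is hypothetical. */
enum Path { PATH_UNSCALED_FAST, PATH_SCALED_SLOW };

typedef struct Ctx {
    int src_range;          /* 0 = limited, 1 = full */
    int dst_range;
    int unscaled_ready;     /* unscaled converter prepared at init */
    int range_conv_ready;   /* range converters prepared at init */
} Ctx;

static void ctx_init(Ctx *c, int src_range, int dst_range)
{
    c->src_range = src_range;
    c->dst_range = dst_range;
    /* Approach 1: prepare both, even if one is unused right now. */
    c->unscaled_ready   = 1;
    c->range_conv_ready = 1;
}

/* Stands in for changing the ranges after initialization. */
static void ctx_set_ranges(Ctx *c, int src_range, int dst_range)
{
    c->src_range = src_range;
    c->dst_range = dst_range;
}

/* Extended runtime check: decide per call from the current ranges. */
static enum Path ctx_pick_path(const Ctx *c)
{
    assert(c->unscaled_ready && c->range_conv_ready);
    return c->src_range == c->dst_range ? PATH_UNSCALED_FAST
                                        : PATH_SCALED_SLOW;
}
```

Because nothing is locked in at init, changing the ranges afterwards flips the chosen path in either direction without re-initializing the context.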
* swscale: add GBRAP14 format support (Paul B Mahol, 2023-09-28, 1 file, -0/+6)
* avutil/internal: Don't auto-include emms.h (Andreas Rheinhardt, 2023-09-04, 1 file, -0/+1)
    Instead include emms.h wherever it is needed.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* configure: Set WIN32_LEAN_AND_MEAN at configure time (L. E. Segovia, 2023-08-14, 1 file, -1/+0)
    Including winsock2.h or windows.h without WIN32_LEAN_AND_MEAN causes bzlib.h to parse as nonsense, due to an instance of #define char small in rpcndr.h.
    See: https://stackoverflow.com/a/27794577
    Signed-off-by: L. E. Segovia <amy@amyspark.me>
    Signed-off-by: Martin Storsjö <martin@martin.st>
* lsws: add in/out support for the new 12-bit 2-plane 422 and 444 pixfmts (Lynne, 2023-05-29, 1 file, -0/+4)
* swscale/la: Optimize the functions of the swscale series with lsx. (Lu Wang, 2023-05-25, 1 file, -1/+2)
    ./configure --disable-lasx
    ffmpeg -i ~/media/1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480 -pix_fmt bgra -y /dev/null -an
    before: 91fps
    after: 160fps
    Reviewed-by: Shiyou Yin <yinshiyou-hf@loongson.cn>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* sws/utils.c: Do not uselessly call initFilter() when unscaling (Tomas Härdin, 2023-02-08, 1 file, -31/+31)
* swscale/utils: Fix indentation (Andreas Rheinhardt, 2022-11-24, 1 file, -10/+10)
    Forgotten after c1eb3e7fecdc270e03a700d61ef941600a6af491.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Derive range from YUVJ-pix-fmt only once (Andreas Rheinhardt, 2022-11-24, 1 file, -8/+10)
    Currently, it is done once per slice-thread, leading to one warning per slice-thread in case a YUVJ pixel format has been originally used.
    This also fixes the anomaly that said parameters are only updated for the user-facing context (whose values are retrievable via av_opt_get()) if slice-threading is not in use.
    Fixes ticket #9860.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Move functions to avoid forward declarations (Andreas Rheinhardt, 2022-11-24, 1 file, -207/+200)
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Avoid calling ff_thread_once() unnecessarily (Andreas Rheinhardt, 2022-11-24, 1 file, -3/+4)
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Don't allocate AVFrames for slice contexts (Andreas Rheinhardt, 2022-11-24, 1 file, -10/+5)
    Only the parent context's AVFrames are ever used.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Factor initializing single slice context out (Andreas Rheinhardt, 2022-11-24, 1 file, -10/+21)
    Initializing slice threads currently uses the function (sws_init_context()) that is also used for initializing user-facing contexts with the only difference being that nb_threads is set to one before initializing the slice contexts.
    Yet sws_init_context() also initializes lots of stuff that is not slice-dependent, i.e. (src|dst)Range. This currently only works because the code sets these fields to the same values for all slice contexts. This is not nice; even worse, it entails that log messages are printed once per slice context (and therefore fill the screen).
    This commit lays the groundwork to fix this.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/utils: Remove obsolete 3DNow reference (Andreas Rheinhardt, 2022-11-09, 1 file, -2/+0)
    swscale does not use 3DNow any more since commit 608319a311a31f7d85333a7b08286c00be38eab6.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* swscale/la: Optimize hscale functions with lasx. (Hao Chen, 2022-09-10, 1 file, -1/+12)
    ffmpeg -i 1_h264_1080p_30fps_3Mbps.mp4 -f rawvideo -s 640x480 -y /dev/null -an
    before: 101fps
    after: 138fps
    Signed-off-by: Hao Chen <chenhao@loongson.cn>
    Reviewed-by: yinshiyou-hf@loongson.cn
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: add support for Y210LE and Y212LE (Philip Langdale, 2022-09-10, 1 file, -2/+2)
* swscale/output: add support for XV30LE (Philip Langdale, 2022-09-10, 1 file, -1/+1)
* swscale/output: add support for XV36LE (Philip Langdale, 2022-09-10, 1 file, -1/+1)
* swscale/output: add support for P012 (Philip Langdale, 2022-09-10, 1 file, -2/+2)
    This generalises the existing P010 support.
* swscale/input: add support for Y212LE (Philip Langdale, 2022-09-06, 1 file, -0/+1)
* swscale/input: add support for XV30LE (Philip Langdale, 2022-09-06, 1 file, -0/+1)
* swscale/input: add support for P012 (Philip Langdale, 2022-09-06, 1 file, -0/+2)
    As we now have three of these formats, I added macros to generate the conversion functions.
* swscale/input: add support for XV36LE (Philip Langdale, 2022-09-06, 1 file, -0/+1)
* libswscale: add support for VUYX format (Philip Langdale, 2022-08-25, 1 file, -0/+1)
    As we already have support for VUYA, I figured I should do the small amount of work to support VUYX as well. That means a little refactoring to share code.
* swscale/input: add rgbaf16 input support (Timo Rothenpieler, 2022-08-19, 1 file, -0/+2)
    This is by no means perfect, since at least ddagrab will return scRGB data with values outside of 0.0f to 1.0f for HDR values. Its primary purpose is to be able to work with the format at all.
* libswscale: Enable hscale_avx2 for all input sizes. (Alan Kelly, 2022-08-18, 1 file, -3/+16)
    ff_shuffle_filter_coefficients shuffles the tail as required.
    Signed-off-by: Anton Khirnov <anton@khirnov.net>
* swscale/output: add VUYA output support (James Almer, 2022-08-07, 1 file, -1/+1)
    Signed-off-by: James Almer <jamrial@gmail.com>
* swscale/input: add VUYA input support (James Almer, 2022-08-05, 1 file, -0/+1)
    Reviewed-by: Philip Langdale <philipl@overt.org>
    Signed-off-by: James Almer <jamrial@gmail.com>
* swscale: add NV16 input/output (Matthieu Bouron, 2022-07-19, 1 file, -0/+1)
    Signed-off-by: Anton Khirnov <anton@khirnov.net>
* all: Replace if (ARCH_FOO) checks by #if ARCH_FOO (Andreas Rheinhardt, 2022-06-15, 1 file, -3/+4)
    This is more spec-compliant because it does not rely on dead-code elimination by the compiler. Especially MSVC has problems with this, as can be seen in
    https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/296373.html or
    https://ffmpeg.org/pipermail/ffmpeg-devel/2022-May/297022.html
    This commit does not eliminate every instance where we rely on dead code elimination: It only tackles branching to the initialization of arch-specific dsp code, not e.g. all uses of CONFIG_ and HAVE_ checks. But maybe it is already enough to compile FFmpeg with MSVC with whole-program optimizations enabled (if one does not disable too many components).
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
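The difference between the two styles can be shown with a small sketch. ARCH_FOO is the placeholder from the commit subject; DSP, add_c(), add_foo() and dsp_init_foo() are hypothetical names chosen for illustration and do not appear in FFmpeg.

```c
/* ARCH_FOO would normally come from the build configuration;
 * default it to 0 so this sketch compiles on its own. */
#ifndef ARCH_FOO
#define ARCH_FOO 0
#endif

typedef struct DSP {
    int (*add)(int a, int b);
} DSP;

static int add_c(int a, int b)
{
    return a + b;
}

#if ARCH_FOO
/* Only declared and linked in builds for the hypothetical FOO arch. */
int add_foo(int a, int b);

static void dsp_init_foo(DSP *d)
{
    d->add = add_foo;
}
#endif

static void dsp_init(DSP *d)
{
    d->add = add_c;
    /* The older style, `if (ARCH_FOO) dsp_init_foo(d);`, still makes the
     * compiler see a call to dsp_init_foo() when ARCH_FOO is 0, so the
     * build only links if that dead call is optimized away. With #if,
     * the call is never part of the translation unit at all. */
#if ARCH_FOO
    dsp_init_foo(d);
#endif
}
```

With the preprocessor form, a build where ARCH_FOO is 0 has no reference to the arch-specific symbols, regardless of optimization level.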
* swscale/aarch64: add hscale specializations (Jonathan Swinney, 2022-05-28, 1 file, -1/+1)
    This patch adds code to support specializations of the hscale function and adds a specialization for filterSize == 4.
    ff_hscale8to15_4_neon is a complete rewrite. Since the main bottleneck here is loading the data from src, this data is loaded a whole block ahead and stored back to the stack to be loaded again with ld4. This arranges the data for most efficient use of the vector instructions and removes the need for completion adds at the end. The number of iterations of the C per iteration of the assembly is increased from 4 to 8, but because of the prefetching, there must be a special section without prefetching when dstW < 16.
    This improves speed on Graviton 2 (Neoverse N1) dramatically in the case where previously fs=8 would have been required.
    before: hscale_8_to_15__fs_8_dstW_512_neon: 1962.8
    after : hscale_8_to_15__fs_4_dstW_512_neon: 1220.9
    Signed-off-by: Jonathan Swinney <jswinney@amazon.com>
    Signed-off-by: Martin Storsjö <martin@martin.st>
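To make "a specialization for filterSize == 4" concrete, here is a plain C sketch of a horizontal filter in a generic form and in a form with the inner loop unrolled for four taps. It is an illustration under assumptions, not the NEON code or the swscale C reference: the function names are hypothetical, and the 7-bit shift and 15-bit clip are assumed from the "8to15" in the function name.

```c
#include <stdint.h>

static int16_t clip15(int v)
{
    return v > 32767 ? 32767 : (int16_t)v;
}

/* Generic form: filterSize taps per output sample. */
static void hscale_generic(int16_t *dst, int dstW, const uint8_t *src,
                           const int16_t *filter, const int32_t *filterPos,
                           int filterSize)
{
    for (int i = 0; i < dstW; i++) {
        int val = 0;
        for (int j = 0; j < filterSize; j++)
            val += src[filterPos[i] + j] * filter[filterSize * i + j];
        dst[i] = clip15(val >> 7);
    }
}

/* Specialized form: the tap count is fixed at 4, so the inner loop
 * disappears and the four products can be scheduled together. */
static void hscale_fs4(int16_t *dst, int dstW, const uint8_t *src,
                       const int16_t *filter, const int32_t *filterPos)
{
    for (int i = 0; i < dstW; i++) {
        const uint8_t *s = src + filterPos[i];
        const int16_t *f = filter + 4 * i;
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];
        dst[i] = clip15(val >> 7);
    }
}

/* Dispatch on the filter size, falling back to the generic loop. */
static void hscale(int16_t *dst, int dstW, const uint8_t *src,
                   const int16_t *filter, const int32_t *filterPos,
                   int filterSize)
{
    if (filterSize == 4)
        hscale_fs4(dst, dstW, src, filter, filterPos);
    else
        hscale_generic(dst, dstW, src, filter, filterPos, filterSize);
}
```

A fixed tap count is what lets a vectorized version avoid padding a 4-tap filter up to 8 taps, which is the case the before/after numbers in the message compare.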
* lib*/version: Move library version functions into files of their own (Andreas Rheinhardt, 2022-05-10, 1 file, -20/+0)
    This avoids having to rebuild big files every time FFMPEG_VERSION changes (which it does with every commit).
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* libswscale: Split version.h (Martin Storsjö, 2022-03-16, 1 file, -0/+1)
    Signed-off-by: Martin Storsjö <martin@martin.st>
* swscale: Take the destination range into account for yuv->rgb->yuv conversions (Martin Storsjö, 2022-02-25, 1 file, -3/+8)
    The range parameters need to be set up before calling sws_init_context (which selects which fastpaths can be used; this gets called by sws_getContext); solely passing them via sws_setColorspaceDetails isn't enough.
    This fixes producing full range YUV range output when doing YUV->YUV conversions between different YUV color spaces.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* libswscale: Re-factor ff_shuffle_filter_coefficients. (Alan Kelly, 2022-02-17, 1 file, -30/+36)
    Make the code more readable and follow the style guide.
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>