aboutsummaryrefslogtreecommitdiffstats
path: root/libswscale/ppc
Commit message (Collapse)AuthorAgeFilesLines
* swscale/ppc: remove hScale8To19_vsxLynne2023-05-201-60/+0
| | | | Fails checkasm on a Power9 system.
* sws: rename SwsContext.swscale to convert_unscaledAnton Khirnov2021-07-031-2/+2
| | | | That function pointer is now used only for unscaled conversion.
* Include attributes.h directlyAndreas Rheinhardt2021-04-191-0/+1
| | | | | | | | Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't use include it any more after the currently deprecated functions are removed, so include attributes.h directly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environmentsChip Kerchner2021-02-221-0/+10
| | | | | | | | | Add inline function for vec_xl if VSX is not supported. vec_xl intrinsic is only available on POWER 7 or higher. Fixes ticket #8750. Signed-off-by: Andriy Gelman <andriy.gelman@gmail.com>
* lsws/ppc/yuv2rgb: Fix transparency converting from yuv->rgb32.Carl Eugen Hoyos2021-01-241-4/+4
| | | | | | Based on 68363b69 by Reimar Döffinger. Fixes ticket #9077.
* lavu/mem: move the DECLARE_ALIGNED macro family to mem_internal on next+1 bumpAnton Khirnov2021-01-011-0/+1
| | | | They are not properly namespaced and not intended for public use.
* lavu: move LOCAL_ALIGNED from internal.h to mem_internal.hAnton Khirnov2021-01-012-0/+3
| | | | That is a more appropriate place for it.
* swscale: Fix AltiVec/VSX build with recent GCCDaniel Kolesa2019-10-041-4/+8
| | | | | | | | | | | | | The argument to vec_splat_u16 must be a literal. By making the function always inline and marking the arguments const, gcc can turn those into literals, and avoid build errors like: swscale_vsx.c:165:53: error: argument 1 must be a 5-bit signed literal Fixes #7861. Signed-off-by: Daniel Kolesa <daniel@octaforge.org> Signed-off-by: Lauri Kasanen <cand@gmx.com>
* swscale: Replace illegal vector keyword usage in altivec codeDaniel Kolesa2019-10-042-459/+459
| | | | | | | | | | | | | | | | | | | | | While this technically compiles in current ffmpeg, this is only because ffmpeg is compiled in strict ISO C mode, which disables the builtin 'vector' keyword for AltiVec/VSX. Instead this gets replaced with a macro inside altivec.h, which defines vector to be actually __vector, which accepts random types. Normally, the vector keyword should be used only with plain scalar non-typedef types, such as unsigned int. But we have the vec_(s|u)(8|16|32) macros, which can be used in a portable manner, in util_altivec.h in libavutil. This is also consistent with other AltiVec/VSX code elsewhere in the tree. Fixes #7861. Signed-off-by: Daniel Kolesa <daniel@octaforge.org> Signed-off-by: Lauri Kasanen <cand@gmx.com>
* lsws/ppc/yuv2rgb_altivec: Replace vec_lvsl/vec_perm with vec_xlChip Kerchner2019-08-131-20/+4
| | | | | | | | | | gcc 6.x and 7.x generate wrong code for little endian machines for the vec_lvsl/vec_perm instruction combos in some cases. The bug was fixed in version 8.x If these instructions are replaced with vec_xl, the problem goes away for all versions of the compilers. Fixes ticket #7124.
* swscale: Add support for NV24 and NV42Philip Langdale2019-05-122-4/+2
| | | | | | | | | | | The implementation is pretty straight-forward. Most of the existing NV12 codepaths work regardless of subsampling and are re-used as is. Where necessary I wrote the slightly different NV24 versions. Finally, the one thing that confused me for a long time was the asm specific x86 path that did an explicit exclusion check for NV12. I replaced that with a semi-planar check and also updated the equivalent PPC code, which Lauri kindly checked.
* swscale/ppc: Shorten power8 tests via a varLauri Kasanen2019-05-071-13/+14
|
* swscale/ppc: VSX-optimize hScale16To*Lauri Kasanen2019-05-071-0/+159
| | | | | | | | | | | | | | | | | | ./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \ -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw ./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \ -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p -nostats test.raw 32-bit mul, power8 only 2x speedup for hScale8To19_vsx (x86 SSE2 is 2.37): 30896 UNITS in hscale, 8192 runs, 0 skips 63956 UNITS in hscale, 8192 runs, 0 skips 2.06 for hScale16To15_vsx: 30531 UNITS in hscale, 8192 runs, 0 skips 63161 UNITS in hscale, 8192 runs, 0 skips
* swscale/ppc: IndentLauri Kasanen2019-05-071-6/+6
|
* swscale/ppc: VSX-optimize hScale8To19Lauri Kasanen2019-05-071-1/+63
| | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \ -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw 2.26 speedup (x86 SSE2 is 2.32): 23772 UNITS in hscale, 4096 runs, 0 skips 53862 UNITS in hscale, 4096 runs, 0 skips
* swscale/ppc: VSX-optimize hscale_fastLauri Kasanen2019-04-301-0/+196
| | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \ -s 2400x720 -f rawvideo -vframes 5 -pix_fmt abgr -nostats test.raw 4.27 speedup for hyscale_fast: 24796 UNITS in hyscale_fast, 4096 runs, 0 skips 5797 UNITS in hyscale_fast, 4096 runs, 0 skips 4.48 speedup for hcscale_fast: 19911 UNITS in hcscale_fast, 4095 runs, 1 skips 4437 UNITS in hcscale_fast, 4096 runs, 0 skips
* swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2Lauri Kasanen2019-04-111-0/+188
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \ -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 32-bit mul, power8 only. ~2x speedup: rgb24 24431 UNITS in yuv2packed2, 16384 runs, 0 skips 13783 UNITS in yuv2packed2, 16383 runs, 1 skips bgr24 24396 UNITS in yuv2packed2, 16384 runs, 0 skips 14059 UNITS in yuv2packed2, 16384 runs, 0 skips rgba 26815 UNITS in yuv2packed2, 16383 runs, 1 skips 12797 UNITS in yuv2packed2, 16383 runs, 1 skips bgra 27060 UNITS in yuv2packed2, 16384 runs, 0 skips 13138 UNITS in yuv2packed2, 16384 runs, 0 skips argb 26998 UNITS in yuv2packed2, 16384 runs, 0 skips 12728 UNITS in yuv2packed2, 16381 runs, 3 skips bgra 26651 UNITS in yuv2packed2, 16384 runs, 0 skips 13124 UNITS in yuv2packed2, 16384 runs, 0 skips This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version is also heavily inaccurate, while the vsx version has high accuracy.
* swscale/ppc: VSX-optimize yuv2rgb_full_XLauri Kasanen2019-04-071-0/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \ -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 32-bit mul, power8 only. ~6.4x speedup: rgb24 214278 UNITS in yuv2packedX, 16384 runs, 0 skips 33249 UNITS in yuv2packedX, 16384 runs, 0 skips bgr24 214616 UNITS in yuv2packedX, 16384 runs, 0 skips 33233 UNITS in yuv2packedX, 16384 runs, 0 skips rgba 214517 UNITS in yuv2packedX, 16384 runs, 0 skips 33271 UNITS in yuv2packedX, 16384 runs, 0 skips bgra 214973 UNITS in yuv2packedX, 16384 runs, 0 skips 33397 UNITS in yuv2packedX, 16384 runs, 0 skips argb 214613 UNITS in yuv2packedX, 16384 runs, 0 skips 33310 UNITS in yuv2packedX, 16384 runs, 0 skips bgra 214637 UNITS in yuv2packedX, 16384 runs, 0 skips 33330 UNITS in yuv2packedX, 16384 runs, 0 skips
* swscale/ppc: VSX-optimize yuv2rgb_full_2Lauri Kasanen2019-04-071-0/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \ -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 32-bit mul, power8 only. ~4x speedup: rgb24 52763 UNITS in yuv2packed2, 16384 runs, 0 skips 13453 UNITS in yuv2packed2, 16384 runs, 0 skips bgr24 53144 UNITS in yuv2packed2, 16384 runs, 0 skips 13616 UNITS in yuv2packed2, 16384 runs, 0 skips rgba 52796 UNITS in yuv2packed2, 16384 runs, 0 skips 12904 UNITS in yuv2packed2, 16384 runs, 0 skips bgra 52732 UNITS in yuv2packed2, 16384 runs, 0 skips 13262 UNITS in yuv2packed2, 16384 runs, 0 skips argb 52661 UNITS in yuv2packed2, 16384 runs, 0 skips 12879 UNITS in yuv2packed2, 16384 runs, 0 skips bgra 52662 UNITS in yuv2packed2, 16384 runs, 0 skips 12932 UNITS in yuv2packed2, 16384 runs, 0 skips
* swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_1Lauri Kasanen2019-04-071-95/+330
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \ -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 32-bit mul, power8 only. 1.8-2.3x speedup: rgb24 18192 UNITS in yuv2packed1, 32767 runs, 1 skips 9983 UNITS in yuv2packed1, 32760 runs, 8 skips bgr24 18665 UNITS in yuv2packed1, 32766 runs, 2 skips 9925 UNITS in yuv2packed1, 32763 runs, 5 skips rgba 20239 UNITS in yuv2packed1, 32767 runs, 1 skips 8794 UNITS in yuv2packed1, 32759 runs, 9 skips bgra 20354 UNITS in yuv2packed1, 32768 runs, 0 skips 8770 UNITS in yuv2packed1, 32761 runs, 7 skips argb 20185 UNITS in yuv2packed1, 32768 runs, 0 skips 8761 UNITS in yuv2packed1, 32761 runs, 7 skips bgra 20360 UNITS in yuv2packed1, 32766 runs, 2 skips 8759 UNITS in yuv2packed1, 32764 runs, 4 skips This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version is also heavily inaccurate, while the vsx version has high accuracy.
* swscale/ppc: VSX-optimize yuv2422_XLauri Kasanen2019-03-311-0/+104
| | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \ -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 7.2x speedup: yuyv422 126354 UNITS in yuv2packedX, 16384 runs, 0 skips 16383 UNITS in yuv2packedX, 16382 runs, 2 skips yvyu422 117669 UNITS in yuv2packedX, 16384 runs, 0 skips 16271 UNITS in yuv2packedX, 16379 runs, 5 skips uyvy422 117310 UNITS in yuv2packedX, 16384 runs, 0 skips 16226 UNITS in yuv2packedX, 16382 runs, 2 skips
* swscale/ppc: VSX-optimize yuv2422_2Lauri Kasanen2019-03-311-0/+69
| | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \ -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 5.1x speedup: yuyv422 19339 UNITS in yuv2packed2, 16384 runs, 0 skips 3718 UNITS in yuv2packed2, 16383 runs, 1 skips yvyu422 19438 UNITS in yuv2packed2, 16384 runs, 0 skips 3800 UNITS in yuv2packed2, 16380 runs, 4 skips uyvy422 19128 UNITS in yuv2packed2, 16384 runs, 0 skips 3721 UNITS in yuv2packed2, 16380 runs, 4 skips
* swscale/ppc: VSX-optimize yuv2422_1Lauri Kasanen2019-03-311-0/+149
| | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \ -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - 15.3x speedup: yuyv422 14513 UNITS in yuv2packed1, 32768 runs, 0 skips 949 UNITS in yuv2packed1, 32767 runs, 1 skips yvyu422 14516 UNITS in yuv2packed1, 32767 runs, 1 skips 943 UNITS in yuv2packed1, 32767 runs, 1 skips uyvy422 14530 UNITS in yuv2packed1, 32767 runs, 1 skips 941 UNITS in yuv2packed1, 32766 runs, 2 skips
* swscale/ppc: VSX-optimize yuv2rgb_fullLauri Kasanen2019-03-271-0/+291
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \ -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \ -cpuflags 0 -v error - This uses 32-bit mul, so POWER8 only. The following output formats get about 4.5x speedup: rgb24 39980 UNITS in yuv2packed1, 32768 runs, 0 skips 8774 UNITS in yuv2packed1, 32768 runs, 0 skips bgr24 40069 UNITS in yuv2packed1, 32768 runs, 0 skips 8772 UNITS in yuv2packed1, 32766 runs, 2 skips rgba 39759 UNITS in yuv2packed1, 32768 runs, 0 skips 8681 UNITS in yuv2packed1, 32767 runs, 1 skips bgra 39729 UNITS in yuv2packed1, 32768 runs, 0 skips 8696 UNITS in yuv2packed1, 32766 runs, 2 skips argb 39766 UNITS in yuv2packed1, 32768 runs, 0 skips 8672 UNITS in yuv2packed1, 32766 runs, 2 skips bgra 39784 UNITS in yuv2packed1, 32768 runs, 0 skips 8659 UNITS in yuv2packed1, 32767 runs, 1 skips
* swscale/ppc: Add av_unused to template vars only used in one includerLauri Kasanen2019-03-201-10/+11
|
* swscale/ppc: Clean up some mixed decl warningsLauri Kasanen2019-03-203-10/+11
|
* libswscale/ppc: VSX-optimize 9-16 bit yuv2planeXLauri Kasanen2019-02-052-6/+184
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16be \ -s 1920x1728 -f null -vframes 100 -v error -nostats - 9-14 bit funcs get about 6x speedup, 16-bit gets about 15x. Fate passes, each format tested with an image to video conversion. Only POWER8 includes 32-bit vector multiplies, so POWER7 is locked out of the 16-bit function. This includes the vec_mulo/mule functions too, not just vmuluwm. With TIMER_REPORT skips disabled: yuv420p9le 12412 UNITS in planarX, 131072 runs, 0 skips 73136 UNITS in planarX, 131072 runs, 0 skips yuv420p9be 12481 UNITS in planarX, 131072 runs, 0 skips 73410 UNITS in planarX, 131072 runs, 0 skips yuv420p10le 12322 UNITS in planarX, 131072 runs, 0 skips 72546 UNITS in planarX, 131072 runs, 0 skips yuv420p10be 12291 UNITS in planarX, 131072 runs, 0 skips 72935 UNITS in planarX, 131072 runs, 0 skips yuv420p12le 12316 UNITS in planarX, 131072 runs, 0 skips 72708 UNITS in planarX, 131072 runs, 0 skips yuv420p12be 12319 UNITS in planarX, 131072 runs, 0 skips 72577 UNITS in planarX, 131072 runs, 0 skips yuv420p14le 12259 UNITS in planarX, 131072 runs, 0 skips 72516 UNITS in planarX, 131072 runs, 0 skips yuv420p14be 12440 UNITS in planarX, 131072 runs, 0 skips 72962 UNITS in planarX, 131072 runs, 0 skips yuv420p16le 10548 UNITS in planarX, 131072 runs, 0 skips 73429 UNITS in planarX, 131072 runs, 0 skips yuv420p16be 10634 UNITS in planarX, 131072 runs, 0 skips 150959 UNITS in planarX, 131072 runs, 0 skips Signed-off-by: Lauri Kasanen <cand@gmx.com>
* swscale/output: Altivec-optimize float yuv2plane1Lauri Kasanen2018-12-261-2/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This function wouldn't benefit from VSX instructions, so I put it under altivec. ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt grayf32le \ -f null -vframes 100 -v error -nostats - 3743 UNITS in planar1, 65495 runs, 41 skips -cpuflags 0 23511 UNITS in planar1, 65530 runs, 6 skips grayf32be 4647 UNITS in planar1, 65449 runs, 87 skips -cpuflags 0 28608 UNITS in planar1, 65530 runs, 6 skips The native speedup is 6.28133, and the bswapping one 6.15623. Fate passes, each format tested with an image to video conversion. Signed-off-by: Lauri Kasanen <cand@gmx.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: VSX-optimize 16-bit yuv2plane1Lauri Kasanen2018-12-141-0/+59
| | | | | | | | | | | | | | | | | | ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p16le \ -f null -vframes 100 -v error -nostats - 2120 UNITS in planar1, 65393 runs, 143 skips -cpuflags 0 19157 UNITS in planar1, 65512 runs, 24 skips 9.03632 speedup, 16be similarly. Fate passes, each format tested with an image to video conversion. Signed-off-by: Lauri Kasanen <cand@gmx.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: VSX-optimize nbps yuv2plane1Lauri Kasanen2018-12-121-0/+83
| | | | | | | | | | | | | | | | | | | ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p9le \ -f null -vframes 100 -v error -nostats - Speedups: yuv2plane1_9BE_vsx 11.2042 yuv2plane1_9LE_vsx 11.156 yuv2plane1_10BE_vsx 9.89428 yuv2plane1_10LE_vsx 10.3637 yuv2plane1_12BE_vsx 9.71923 yuv2plane1_12LE_vsx 11.0404 yuv2plane1_14BE_vsx 10.1763 yuv2plane1_14LE_vsx 11.2728 Fate passes, each format tested with an image to video conversion. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/ppc: Move VSX-using code to its own fileLauri Kasanen2018-12-044-281/+392
| | | | | | | | Passes fate on LE (with "lavc/jrevdct: Avoid an aliasing violation" applied). Signed-off-by: Lauri Kasanen <cand@gmx.com> Tested-by: Michael Kostylev on BE Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale/output: Altivec-optimize yuv2plane1_8Lauri Kasanen2018-11-261-0/+53
| | | | | | | | | | | | | | | | | | | | | ./ffmpeg_g -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p \ -f null -vframes 100 -v error -nostats - 1158 UNITS in planar1, 65528 runs, 8 skips -cpuflags 0 19082 UNITS in planar1, 65533 runs, 3 skips 16.48 speedup ratio. On x86, SSE2 is ~7. Curiously, the Power C version takes as many cycles as the x86 SSE2 version, yikes it's fast. Note that this function uses VSX instructions, but is not marked so. This is because several existing functions also make that mistake. I'll submit a patch moving them once this is reviewed. Signed-off-by: Lauri Kasanen <cand@gmx.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* libswscale: Adds conversions from/to float gray format.Sergey Lavrushkin2018-08-141-0/+1
| | | | Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* swscale: Drop is9_OR_10BPS() use, its name is not correctMichael Niedermayer2016-11-101-1/+1
| | | | | Found-by: Luca Barbato Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* Revert "PPC64: Add versions of functions in libswscale/input.c optimized for ↵Ronald S. Bultje2016-07-112-438/+0
| | | | | | | POWER8 VSX SIMD." This reverts commit 1df908f33f658979b32599489ca6f1a39821013c. The expected performance improvements are essentially non-existent.
* PPC64: Add versions of functions in libswscale/input.c optimized for POWER8 ↵Dan Parrot2016-06-302-0/+438
| | | | | | | | | | VSX SIMD. This patch addresses Trac ticket #5570. The optimized functions are in file libswscale/ppc/input_vsx.c. Each optimized function name is a concatenation of the corresponding name in libswscale/input.c with suffix _vsx. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* Merge commit '0f40c9098498ad90dbbd2380eb4269015e84bde4'Clément Bœsch2016-06-211-1/+0
|\ | | | | | | | | | | | | * commit '0f40c9098498ad90dbbd2380eb4269015e84bde4': Drop pointless assert.h #includes Merged-by: Clément Bœsch <clement@stupeflix.com>
| * Drop pointless assert.h #includesDiego Biurrun2016-05-031-1/+0
| |
* | swscale: cleanup unused codePedro Arthur2016-03-311-2/+2
| | | | | | | | | | Removed previous swscale code under '#ifndef NEW_FILTER' and removed unused fields of SwsContext
* | Merge commit '29c2d06d67724e994980045afa055c6c34611b30'Derek Buitenhuis2016-02-241-1/+0
|\| | | | | | | | | | | | | * commit '29c2d06d67724e994980045afa055c6c34611b30': cosmetics: Drop empty comment lines Merged-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
| * cosmetics: Drop empty comment linesDiego Biurrun2016-02-181-1/+0
| |
| * ppc: Restrict some Altivec implementations to Big EndianLuca Barbato2015-05-311-1/+1
| | | | | | | | | | In Little Endian the vec_ld/vec_st operations work as expected only for byte-vectors.
| * ppc: don't return a value from a function declared voidSean McGovern2013-08-301-1/+1
| | | | | | | | Signed-off-by: Martin Storsjö <martin@martin.st>
* | swscale/ppc/yuv2rgb_altivec: POWER LE support in the macros vec_unh() and ↵Rong Yan2015-07-161-0/+5
| | | | | | | | | | | | vec_unl() Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* | swscale/ppc/swscale_altivec.c: POWER LE support in yuv2planeX_8() delete ↵Rong Yan2015-04-271-9/+2
| | | | | | | | | | | | | | | | | | | | macro GET_VF() it was wrong GCC tool had a bug of PPC intrinsic interpret, which has been fixed in GCC 4.9.1. This bug lead to errors in two of our previous patches. We found this when we update our GCC tools to 4.9.1 and by reading the related info on GCC website. We fix our previous error in two separate commits Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | ppc: libswscale: use LOCAL_ALIGNED instead of DECLARE_ALIGNEDChristophe Gisquet2015-03-141-2/+2
| | | | | | | | | | | | The later may yield incorrect code for on-stack variables. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | libswscale/ppc/swscale_altivec.c : fix hScale_altivec_real() ↵Rong Yan2014-11-121-94/+140
| | | | | | | | | | | | | | | | yuv2planeX_16_altivec() yuv2planeX_8() for little endian add marcos GET_LS() GET_VF() LOAD_FILTER() LOAD_L1() GET_VF4() FIRST_LOAD() UPDATE_PTR() LOAD_SRCV() LOAD_SRCV8() GET_VFD() for POWER LE Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | ppc: don't return a value from a function declared voidSean McGovern2013-08-301-1/+1
| | | | | | | | Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'a6b650118543e1580e872896d8976042b7c32d01'Michael Niedermayer2013-08-301-2/+5
|\| | | | | | | | | | | | | * commit 'a6b650118543e1580e872896d8976042b7c32d01': ppc: cosmetics: Consistently format CPU flag detection invocations Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * ppc: cosmetics: Consistently format CPU flag detection invocationsDiego Biurrun2013-08-291-2/+5
| |