aboutsummaryrefslogtreecommitdiffstats
path: root/libavcodec/x86
Commit message (Collapse)AuthorAgeFilesLines
* Fix compilation with --disable-everything.Carl Eugen Hoyos2012-05-241-1/+1
|
* Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-231-0/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: os_support: Define SHUT_RD, SHUT_WR and SHUT_RDWR on OS/2 http: Add support for reading http POST reply headers http: Add http_shutdown() for ending writing of posts tcp: Allow signalling end of reading/writing avio: Add a function for signalling end of reading/writing lavfi: fix comment, audio is supported now. lavfi: fix incorrect comment. lavfi: remove avfilter_null_* from public API on next bump. lavfi: remove avfilter_default_* from public API on next bump. lavfi: deprecate default config_props() callback and refactor avfilter_config_links() avfiltergraph: smarter sample format selection. avconv: rename transcode_audio/video to decode_audio/video. asyncts: reset delta to 0 when it's not used. x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code. dwt: return errors from ff_slice_buffer_init() Conflicts: ffmpeg.c libavfilter/avfilter.c libavfilter/avfilter.h libavfilter/formats.c libavfilter/version.h libavfilter/vf_blackframe.c libavfilter/vf_drawtext.c libavfilter/vf_fade.c libavfilter/vf_format.c libavfilter/vf_showinfo.c libavfilter/video.c libavfilter/video.h Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: lavc: use %if HAVE_AVX guards around AVX functions in yasm code.Justin Ruggles2012-05-221-0/+8
| | | | | | | | | | | | This is needed for older versions of yasm/nasm that do not support AVX. Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-222-137/+128
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: dwt: check malloc calls ppc: Drop unused header regs.h af_resample: remove an extra space in the log output Convert vector_fmul range of functions to YASM and add AVX versions lavfi: add an audio split filter lavfi: rename vf_split.c to split.c Conflicts: doc/filters.texi libavcodec/ppc/regs.h libavfilter/Makefile libavfilter/allfilters.c libavfilter/f_split.c libavfilter/split.c libavfilter/version.h libavfilter/vf_split.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Convert vector_fmul range of functions to YASM and add AVX versionsKieran Kunhya2012-05-212-137/+128
| | | | | | | | Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-164-32/+49
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: fate: Work around non-standard wc implementations at more places fate: work around non-standard wc implementations x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions. ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16 fate: use standard diff options tta: Fix comment about channel number; TTA supports >2 channels. avfilter: Move ff_get_ref_perms_string() to where it is used. build: Add 'check' target to run all compile and test targets. indeo3: validate new frame size before resetting decoder indeo3: when freeing buffers, set pointers referencing them to NULL as well indeo3: initialise pixel planes on allocation indeo3: ensure that decoded cell data is in 7-bit range as presumed by decoder fate: rename psx-str-v3-mdec to mdec-v3 fate: convert psx-str to a demuxer test lavf: add mdec to is_intra_only() list Conflicts: doc/developer.texi libavcodec/indeo3.c libavfilter/video.c libavformat/utils.c tests/fate/demux.mak tests/fate/video.mak tests/lavf-regression.sh tests/ref/vsynth1/cljr tests/ref/vsynth1/ffvhuff tests/ref/vsynth2/cljr tests/ref/vsynth2/ffvhuff Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: rv40: Mark rv40_weight functions as MMX2; they use MMX2 instructions.Michael Kostylev2012-05-152-6/+6
| |
| * ac3dsp: simplify x86 versions of ac3_max_msb_abs_int16Justin Ruggles2012-05-152-26/+43
| | | | | | | | | | | | Simplifies the code by using cpuflags and a new macro. Also fixes the invalid use of the MMX2 pshufw operation in the MMX-only function.
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-141-22/+23
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: (25 commits) vcr1: Add vcr1_ prefixes to all static functions with generic names. vcr1: Fix return type of common_init to match the function pointer signature. vcr1enc: Replace obsolete get_bit_count by put_bits_count/flush_put_bits. motion-test: remove disabled code gxfenc: remove disabled half-implemented MJPEG tag x86: use more standard construct for setting ASM functions in FFT code fate: westwood-aud: disable decoding fate: caf: disable decoding fate: film-cvid: drop pcm audio and rename test fate: d-cinema-demux: drop unnecessary flags fate: split off dpcm-interplay from interplay-mve tests fate: rename funcom-iss to adpcm-ima-iss fate: rename cryo-apc to adpcm-ima-apc fate: rename adpcm-psx-str-v3 to adpcm-xa fate: split off adpcm-ms-mono test from dxa-feeble fate: split off adpcm-ima-ws test from vqa-cc fate: add adpcm-ima-smjpeg test fate: split off adpcm-ima-amv from amv test fate: separate bmv audio and video tests fate: separate delphine-cin audio and video tests ... Conflicts: doc/platform.texi libavcodec/vcr1.c tests/fate/audio.mak tests/fate/demux.mak tests/fate/video.mak tests/ref/fate/ea-mad-pcm-planar tests/ref/fate/interplay-mve-16bit tests/ref/fate/interplay-mve-8bit tests/ref/fate/mtv tests/ref/fate/qtrle-1bit tests/ref/fate/qtrle-2bit tests/ref/fate/truemotion1-15 tests/ref/fate/truemotion1-24 tests/ref/fate/vqa-cc Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: use more standard construct for setting ASM functions in FFT codeVitor Sessak2012-05-141-22/+23
| | | | | | | | Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-132-13/+0
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: lavfi: autoinsert resample filter when necessary. lavfi: add lavr-based audio resampling filter. x86: vc1: drop MMX loop filter implementation, which uses MMX2 instructions. Conflicts: configure doc/filters.texi libavcodec/x86/vc1dsp_mmx.c libavfilter/Makefile libavfilter/allfilters.c libavfilter/avfiltergraph.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: vc1: drop MMX loop filter implementation, which uses MMX2 instructions.Michael Kostylev2012-05-122-13/+0
| |
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-05-104-3/+480
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: (25 commits) rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MC ape: Use unsigned integer maths arm: dsputil: fix overreads in put/avg_pixels functions h264: K&R formatting cosmetics for header files (part II/II) h264: K&R formatting cosmetics for header files (part I/II) rtmp: Implement check bandwidth notification. rtmp: Support 'rtmp_swfurl', an option which specifies the URL of the SWF player. rtmp: Support 'rtmp_flashver', an option which overrides the version of the Flash plugin. rtmp: Support 'rtmp_tcurl', an option which overrides the URL of the target stream. cmdutils: Add fallback case to switch in check_stream_specifier(). sctp: be consistent with socket option level configure: Add _XOPEN_SOURCE=600 to Solaris preprocessor flags. vcr1enc: drop pointless empty encode_init() wrapper function vcr1: drop pointless write-only AVCodecContext member from VCR1Context vcr1: group encoder code together to save #ifdefs vcr1: cosmetics: K&R prettyprinting, typos, parentheses, dead code, comments mov: make one comment slightly more specific lavr: replace the SSE version of ff_conv_fltp_to_flt_6ch() with SSE4 and AVX lavfi: move audio-related functions to a separate file. lavfi: remove some audio-related function from public API. ... Conflicts: cmdutils.c libavcodec/h264.h libavcodec/h264_mvpred.h libavcodec/vcr1.c libavfilter/avfilter.c libavfilter/avfilter.h libavfilter/defaults.c libavfilter/internal.h Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * rv40dsp x86: MMX/MMX2/3DNow/SSE2/SSSE3 implementations of MCChristophe Gisquet2012-05-104-3/+480
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code mostly inspired by vp8's MC, however: - its MMX2 horizontal filter is worse because it can't take advantage of the coefficient redundancy - that same coefficient redundancy allows better code for non-SSSE3 versions Benchmark (rounded to tens of unit): V8x8 H8x8 2D8x8 V16x16 H16x16 2D16x16 C 445 358 985 1785 1559 3280 MMX* 219 271 478 714 929 1443 SSE2 131 158 294 425 515 892 SSSE3 120 122 248 387 390 763 End result is overall around a 15% speedup for SSSE3 version (on 6 sequences); all loop filter functions now take around 55% of decoding time, while luma MC dsp functions are around 6%, chroma ones are 1.3% and biweight around 2.3%. Signed-off-by: Diego Biurrun <diego@biurrun.de>
| * snowdsp: explicitily state instruction size.Ronald S. Bultje2012-05-021-2/+2
| | | | | | | | Fixes a compile error with clang at -O0.
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-293-1/+81
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: 4xm: fix invalid array indexing rv34dsp: factorize a multiplication in the noround inverse transform rv40: perform bitwise checks in loop filter rv34: remove inline keyword from rv34_decode_block(). rv40: change a logical test into a bitwise one. rv34: remove constant parameter rv40: don't always do the full prev_type search dsputil x86: revert a test back to its previous value rv34dsp x86: implement MMX2 inverse transform Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil x86: revert a test back to its previous valueChristophe GISQUET2012-04-281-1/+1
| | | | | | | | | | | | Commit 356ee8d caused the initial inversion. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * rv34dsp x86: implement MMX2 inverse transformChristophe Gisquet2012-04-282-0/+80
| | | | | | | | | | | | 141 cycles down to 51. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * h264: new assembly version of get_cabac for x86_64 with PICRoland Scheidegger2012-04-282-23/+120
| | | | | | | | | | | | | | | | | | | | This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. get_cabac() gets about 40% faster, for an overall speedup of about 5%. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * h264: use one table instead of several for cabac functionsRoland Scheidegger2012-04-282-14/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | The reason is this is easier for PIC code (in particular on darwin...). Keep the old names as pointers (static in cabac_functions.h so gcc knows these are just immediate offsets) so the c code can nicely stay the same (alternatively could use offsets directly in the functions needing the tables). This should produce the same code as before with non-pic and better code (confirmed) with pic. The assembly uses the new table but still won't work for PIC case. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * h264: (trivial) remove unneeded macro argument in x86/cabac.hRoland Scheidegger2012-04-281-3/+3
| | | | | | | | Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | h264: new assembly version of get_cabac for x86_64 with PICRoland Scheidegger2012-04-282-24/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. There is a surprisingly large performance improvement over the c version (more so than the generated assembly seems to suggest) just in get_cabac, I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big, I measured roughly 5% on a test clip on a K8 and a Core2. Hopefully it still compiles on x86 32bit... Now that only one table is used, there's some chance even darwin as compiles this (apparently the label arithmetic used previously doesn't work if it involves symbols defined in a different file, thanks to Ronald S. Bultje for helping me with this). Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | h264: use one table instead of several for cabac functionsRoland Scheidegger2012-04-282-14/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | The reason is this is easier for PIC code (in particular on darwin...). Keep the old names as pointers (static in cabac_functions.h so gcc knows these are just immediate offsets) so the c code can nicely stay the same (alternatively could use offsets directly in the functions needing the tables). This should produce the same code as before with non-pic and better code (confirmed) with pic. The assembly uses the new table but still won't work for PIC case. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | h264: (trivial) remove unneeded macro argument in x86/cabac.hRoland Scheidegger2012-04-281-3/+3
| | | | | | | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | lowres2 support.Michael Niedermayer2012-04-221-1/+1
| | | | | | | | | | | | | | | | | | The new lowres support is limited to decoders where lowres decoding is possible in high quality. I was not able to measure any speed difference, but if one is found the 2-3 lines that might affect speed can be made compile time conditional Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-221-1/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: ARM: allow runtime masking of CPU features dsputil: remove unused functions mov: Treat keyframe indexes as 1-origin if starting at non-zero. mov: Take stps entries into consideration also about key_off. Remove lowres video decoding Conflicts: ffmpeg.c ffplay.c libavcodec/arm/vp8dsp_init_arm.c libavcodec/libopenjpegdec.c libavcodec/mjpegdec.c libavcodec/mpegvideo.c libavcodec/utils.c libavformat/mov.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Remove lowres video decodingMans Rullgard2012-04-211-1/+1
| | | | | | | | | | | | | | This feature is complex, of questionable utility, and slows down normal decoding. Signed-off-by: Mans Rullgard <mans@mansr.com>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-211-7/+0
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: avcodec: remove AVCodecContext.dsp_mask avconv: fix a segfault when default encoder for a format doesn't exist. utvideo: general cosmetics aac: Handle HE-AACv2 when sniffing a channel order. movenc: Support high sample rates in isomedia formats by setting the sample rate field in stsd to 0. xxan: Remove write-only variable in xan_decode_frame_type0(). ivi_common: Initialize a variable at declaration in ff_ivi_decode_blocks(). Conflicts: ffmpeg.c libavcodec/utvideo.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avcodec: remove AVCodecContext.dsp_maskMans Rullgard2012-04-211-7/+0
| | | | | | | | | | | | | | | | This removes all references to AVCodecContext.dsp_mask and marks it for eviction at the next version bump. It has been superseded by av_set_cpu_flag_mask() which, unlike this field, works everywhere. Signed-off-by: Mans Rullgard <mans@mansr.com>
* | Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)"Michael Niedermayer2012-04-212-124/+24
| | | | | | | | | | | | This broke compilation on darwin, revert until a better solution is found. This reverts commit a812b599b504b39a8021827da89d5e23fb361cc9.
* | h264: assembly version of get_cabac for x86_64 with PIC (v4)Roland Scheidegger2012-04-212-24/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register. There is a surprisingly large performance improvement over the c version (more so than the generated assembly seems to suggest) just in get_cabac, I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big, I measured roughly 5% on a test clip on a K8 and a Core2. Hopefully it still compiles on x86 32bit... v2: incorporated feedback from Loren Merritt to avoid rip-relative movs for every table, and got rid of unnecessary @GOTPCREL. v3: apply similar fixes to the the decode_significance functions, and use same macro arguments for non-pic case. v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect the c code to be faster otherwise since both cmov and sbb suck hard on a Prescott, even can't construct the mask with a 64bit shift as that's just as terrible - it's quite difficult to find usable instructions on that chip...). This is tested to work but not on a P4, in theory it _should_ be fast there. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-171-4/+4
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: indeo3: add parens around some macro arguments h264: use proper PROLOGUE statement for a function using 8 registers. doc: Update sample Vim config with suitable (function) indentation settings. dv: Merge dvquant.h into dvdata.c where all other DV tables reside. dv: Move static tables only used in one place to where they are used. graphparser: set next to NULL on an entry extracted from inputs list doc/filters: update documentation. avconv: flush decoders immediately after an EOF. avconv: send EOF to vsrc_buffer. avconv: reindent. Conflicts: doc/filters.texi ffmpeg.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: use proper PROLOGUE statement for a function using 8 registers.Ronald S. Bultje2012-04-161-4/+4
| | | | | | | | Fixes crashes when using biweight on win64.
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-141-1/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: vsrc_buffer: fix check from 7ae7c41. libxvid: Reorder functions to avoid forward declarations; make functions static. libxvid: drop some pointless dead code wmal: vertical alignment cosmetics wmal: Warn about missing bitstream splicing feature and ask for sample. wmal: Skip seekable_frame_in_packet. wmal: Drop unused variable num_possible_block_size. avfiltergraph: make the AVFilterInOut alloc/free API public graphparser: allow specifying sws flags in the graph description. graphparser: fix the order of connecting unlabeled links. graphparser: add avfilter_graph_parse2(). vsrc_buffer: allow using a NULL buffer to signal EOF. swscale: handle last pixel if lines have an odd width. qdm2: fix a dubious pointer cast WMAL: Do not try to read rawpcm coefficients if bits is invalid mov: Fix detecting there is no sync sample. tiffdec: K&R cosmetics avf: has_duration does not check the global one dsputil: fix optimized emu_edge function on Win64. Conflicts: doc/APIchanges libavcodec/libxvid_rc.c libavcodec/libxvidff.c libavcodec/tiff.c libavcodec/wmalosslessdec.c libavfilter/avfiltergraph.h libavfilter/graphparser.c libavfilter/version.h libavfilter/vsrc_buffer.c libswscale/output.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: fix optimized emu_edge function on Win64.Ronald S. Bultje2012-04-131-1/+1
| | | | | | | | | | | | | | | | Recent register allocation changes (x86inc.asm update) changed the register order and thus opcodes for the inner loops. One of them became >128bytes, which confuses other parts of this function where it jumps to fixed-offset positions to extend the edge by fixed amounts. A simple register change fixes this.
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-132-6/+8
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: swscale: K&R formatting cosmetics (part II) tiffdec: Add a malloc check and refactor another. faxcompr: Check malloc results and unify return path configure: escape colons in values written to config.fate ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSE matroska: Fix leaking memory allocated for laces. pthread: Fix crash due to fctx->delaying not being cleared. vp3: Assert on invalid filter_limit values. h264: fix 10bit biweight functions after recent x86inc.asm fixes. ffv1: Fix size mismatch in encode_line. movenc: Remove a dead initialization git-howto: Explain how to avoid Windows line endings in git checkouts. build: Move all arch OBJS declarations into arch subdirectory Makefiles. Conflicts: configure libavcodec/vp3.c libavformat/matroskadec.c libavutil/Makefile libswscale/Makefile libswscale/swscale.c libswscale/swscale_internal.h libswscale/utils.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSEJustin Ruggles2012-04-121-2/+4
| | | | | | | | | | | | Fixes ac3-encode and eac3-encode FATE test failures with SSE2 disabled. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * h264: fix 10bit biweight functions after recent x86inc.asm fixes.Ronald S. Bultje2012-04-121-4/+4
| | | | | | | | | | This should have been updated in the x86inc.asm update, but was accidently forgotten.
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-1312-228/+204
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: libxvid: remove disabled code qdm2: make a table static const qdm2: simplify bitstream reader setup for some subpacket types qdm2: use get_bits_left() build: Consistently handle conditional compilation for all optimization OBJS. avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics msrle: convert MS RLE decoding function to bytestream2. x86inc improvements for 64-bit Conflicts: common.mak libavcodec/avpacket.c libavcodec/bfi.c libavcodec/msrledec.c libavcodec/qdm2.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * build: Consistently handle conditional compilation for all optimization OBJS.Diego Biurrun2012-04-121-3/+2
| |
| * x86inc improvements for 64-bitHenrik Gramner2012-04-1110-221/+198
| | | | | | | | | | | | | | | | | | | | | | | | Add support for all x86-64 registers Prefer caller-saved register over callee-saved on WIN64 Support up to 15 function arguments Also (by Ronald S. Bultje) Fix up our asm to work with new x86inc.asm. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-102-69/+68
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: (22 commits) rv40dsp x86: use only one register, for both increment and loop counter rv40dsp: implement prescaled versions for biweight. avconv: use default channel layouts when they are unknown avconv: parse channel layout string nutdec: K&R formatting cosmetics vda: Signal 4 byte NAL headers to the decoder regardless of what's in the extradata mem: Consistently return NULL for av_malloc(0) vf_overlay: implement poll_frame() vf_scale: support named constants for sws flags. lavc doxy: add all installed headers to doxy groups. lavc doxy: add avfft to the main lavc group. lavc doxy: add remaining avcodec.h functions to a misc doxygen group. lavc doxy: add AVPicture functions to a doxy group. lavc doxy: add resampling functions to a doxy group. lavc doxy: replace \ with / lavc doxy: add encoding functions to a doxy group. lavc doxy: add decoding functions to a doxy group. lavc doxy: fix formatting of AV_PKT_DATA_{PARAM_CHANGE,H263_MB_INFO} lavc doxy: add AVPacket-related stuff to a separate doxy group. lavc doxy: add core functions/definitions to a doxy group. ... Conflicts: ffmpeg.c libavcodec/avcodec.h libavcodec/vda.c libavcodec/x86/rv40dsp.asm libavfilter/vf_scale.c libavformat/nutdec.c libavutil/mem.c tests/ref/acodec/pcm_s24daud Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * rv40dsp x86: use only one register, for both increment and loop counterChristophe GISQUET2012-04-101-23/+20
| | | | | | | | | | | | Around 10 cycles faster for luma. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * rv40dsp: implement prescaled versions for biweight.Christophe GISQUET2012-04-102-49/+51
| | | | | | | | | | | | | | | | | | | | Quite often, the original weights are multiple of 512. By prescaling them by 1/512 when they are computed (once per frame), no intermediate shifting is needed, and no prescaling on each call either. The x86 code already used that trick. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-04-052-12/+10
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: h264: Factorize declaration of mb_sizes array. vsrc_buffer: when no frame is available, return an error instead of segfaulting. configure: add dl to frei0r extralibs. dsputil x86: use SSE float instruction instead of SSE2 integer equivalent dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype vp8dsp x86: perform rounding shift with a single instruction fate: add BMP tests. swscale: handle complete dimensions for monoblack/white. aacenc: Mark deinterleave_input_samples argument as const. vf_unsharp: Mark readonly variable as const. h264: fix 4:2:2 PCM-macroblocks decoding Conflicts: configure libavcodec/h264.h libavcodec/x86/dsputil_mmx.c libavfilter/vf_unsharp.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil x86: use SSE float instruction instead of SSE2 integer equivalentChristophe GISQUET2012-04-042-2/+2
| | | | | | | | | | | | All the more required since the users are pure SSE functions. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * dsputil x86: remove deprecated parameter from scalarproduct_int16 prototypeChristophe GISQUET2012-04-041-2/+2
| | | | | | | | Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * vp8dsp x86: perform rounding shift with a single instructionChristophe GISQUET2012-04-041-10/+8
| | | | | | | | Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | dsputil_mmx: fix scalarproduct prototypesMichael Niedermayer2012-04-011-2/+2
| | | | | | | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'Michael Niedermayer2012-03-292-63/+77
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * qatar/master: asf: only set index_read if the index contained entries. cabac: add overread protection to BRANCHLESS_GET_CABAC(). cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC(). cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE(). cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC(). h264: add overread protection to get_cabac_bypass_sign_x86(). h264: reindent get_cabac_bypass_sign_x86(). h264: use struct offsets in get_cabac_bypass_sign_x86(). h264: fix overreads in cabac reader. wmall: fix seeking. lagarith: fix buffer overreads. dvdec: drop unnecessary dv_tablegen.h #include build: fix doc generation errors in parallel builds Replace memset(0) by zero initializations. faandct: Remove FAAN_POSTSCALE define and related code. dvenc: print allowed profiles if the video doesn't conform to any of them. avcodec_encode_{audio,video}: only reallocate output packet when it has non-zero size. FATE: add a test for vp8 with changing frame size. fate: add kgv1 fate test. oggdec: calculate correct timestamps in Ogg/FLAC Conflicts: libavcodec/4xm.c libavcodec/cook.c libavcodec/dvdata.c libavcodec/dvdsubdec.c libavcodec/lagarith.c libavcodec/lagarithrac.c libavcodec/utils.c tests/fate/video.mak Merged-by: Michael Niedermayer <michaelni@gmx.at>