path: root/libavcodec/x86
Commit message (Author, Age, Files, Lines)
* Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-22, 1 file, -1/+1)
|\
| |   * qatar/master:
| |     ARM: allow runtime masking of CPU features
| |     dsputil: remove unused functions
| |     mov: Treat keyframe indexes as 1-origin if starting at non-zero.
| |     mov: Take stps entries into consideration also about key_off.
| |     Remove lowres video decoding
| |
| |   Conflicts:
| |     ffmpeg.c
| |     ffplay.c
| |     libavcodec/arm/vp8dsp_init_arm.c
| |     libavcodec/libopenjpegdec.c
| |     libavcodec/mjpegdec.c
| |     libavcodec/mpegvideo.c
| |     libavcodec/utils.c
| |     libavformat/mov.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Remove lowres video decoding (Mans Rullgard, 2012-04-21, 1 file, -1/+1)
| |   This feature is complex, of questionable utility, and slows down normal decoding.
| |
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-21, 1 file, -7/+0)
|\|
| |   * qatar/master:
| |     avcodec: remove AVCodecContext.dsp_mask
| |     avconv: fix a segfault when default encoder for a format doesn't exist.
| |     utvideo: general cosmetics
| |     aac: Handle HE-AACv2 when sniffing a channel order.
| |     movenc: Support high sample rates in isomedia formats by setting the sample rate field in stsd to 0.
| |     xxan: Remove write-only variable in xan_decode_frame_type0().
| |     ivi_common: Initialize a variable at declaration in ff_ivi_decode_blocks().
| |
| |   Conflicts:
| |     ffmpeg.c
| |     libavcodec/utvideo.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * avcodec: remove AVCodecContext.dsp_mask (Mans Rullgard, 2012-04-21, 1 file, -7/+0)
| |   This removes all references to AVCodecContext.dsp_mask and marks it for eviction at the next version bump. It has been superseded by av_set_cpu_flag_mask() which, unlike this field, works everywhere.
| |
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
* | Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)" (Michael Niedermayer, 2012-04-21, 2 files, -124/+24)
| |   This broke compilation on darwin, revert until a better solution is found.
| |
| |   This reverts commit a812b599b504b39a8021827da89d5e23fb361cc9.
* | h264: assembly version of get_cabac for x86_64 with PIC (v4) (Roland Scheidegger, 2012-04-21, 2 files, -24/+124)
| |   This adds a hand-optimized assembly version for get_cabac much like the existing one, but it works if the table offsets are RIP-relative. Compared to the non-RIP-relative version this adds 2 lea instructions and it needs one extra register.
| |
| |   There is a surprisingly large performance improvement over the C version (more so than the generated assembly seems to suggest) just in get_cabac: I measured roughly 40% faster for get_cabac on a K8. However, overall the difference is not that big: I measured roughly 5% on a test clip on a K8 and a Core2.
| |
| |   Hopefully it still compiles on x86 32bit...
| |
| |   v2: incorporated feedback from Loren Merritt to avoid rip-relative movs for every table, and got rid of unnecessary @GOTPCREL.
| |   v3: apply similar fixes to the decode_significance functions, and use same macro arguments for non-pic case.
| |   v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect the C code to be faster otherwise since both cmov and sbb suck hard on a Prescott, and one can't even construct the mask with a 64bit shift as that's just as terrible - it's quite difficult to find usable instructions on that chip...). This is tested to work, but not on a P4; in theory it _should_ be fast there.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-17, 1 file, -4/+4)
|\|
| |   * qatar/master:
| |     indeo3: add parens around some macro arguments
| |     h264: use proper PROLOGUE statement for a function using 8 registers.
| |     doc: Update sample Vim config with suitable (function) indentation settings.
| |     dv: Merge dvquant.h into dvdata.c where all other DV tables reside.
| |     dv: Move static tables only used in one place to where they are used.
| |     graphparser: set next to NULL on an entry extracted from inputs list
| |     doc/filters: update documentation.
| |     avconv: flush decoders immediately after an EOF.
| |     avconv: send EOF to vsrc_buffer.
| |     avconv: reindent.
| |
| |   Conflicts:
| |     doc/filters.texi
| |     ffmpeg.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: use proper PROLOGUE statement for a function using 8 registers. (Ronald S. Bultje, 2012-04-16, 1 file, -4/+4)
| |   Fixes crashes when using biweight on win64.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-14, 1 file, -1/+1)
|\|
| |   * qatar/master:
| |     vsrc_buffer: fix check from 7ae7c41.
| |     libxvid: Reorder functions to avoid forward declarations; make functions static.
| |     libxvid: drop some pointless dead code
| |     wmal: vertical alignment cosmetics
| |     wmal: Warn about missing bitstream splicing feature and ask for sample.
| |     wmal: Skip seekable_frame_in_packet.
| |     wmal: Drop unused variable num_possible_block_size.
| |     avfiltergraph: make the AVFilterInOut alloc/free API public
| |     graphparser: allow specifying sws flags in the graph description.
| |     graphparser: fix the order of connecting unlabeled links.
| |     graphparser: add avfilter_graph_parse2().
| |     vsrc_buffer: allow using a NULL buffer to signal EOF.
| |     swscale: handle last pixel if lines have an odd width.
| |     qdm2: fix a dubious pointer cast
| |     WMAL: Do not try to read rawpcm coefficients if bits is invalid
| |     mov: Fix detecting there is no sync sample.
| |     tiffdec: K&R cosmetics
| |     avf: has_duration does not check the global one
| |     dsputil: fix optimized emu_edge function on Win64.
| |
| |   Conflicts:
| |     doc/APIchanges
| |     libavcodec/libxvid_rc.c
| |     libavcodec/libxvidff.c
| |     libavcodec/tiff.c
| |     libavcodec/wmalosslessdec.c
| |     libavfilter/avfiltergraph.h
| |     libavfilter/graphparser.c
| |     libavfilter/version.h
| |     libavfilter/vsrc_buffer.c
| |     libswscale/output.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: fix optimized emu_edge function on Win64. (Ronald S. Bultje, 2012-04-13, 1 file, -1/+1)
| |   Recent register allocation changes (x86inc.asm update) changed the register order and thus the opcodes for the inner loops. One of them became >128 bytes, which confuses other parts of this function where it jumps to fixed-offset positions to extend the edge by fixed amounts. A simple register change fixes this.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-13, 2 files, -6/+8)
|\|
| |   * qatar/master:
| |     swscale: K&R formatting cosmetics (part II)
| |     tiffdec: Add a malloc check and refactor another.
| |     faxcompr: Check malloc results and unify return path
| |     configure: escape colons in values written to config.fate
| |     ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSE
| |     matroska: Fix leaking memory allocated for laces.
| |     pthread: Fix crash due to fctx->delaying not being cleared.
| |     vp3: Assert on invalid filter_limit values.
| |     h264: fix 10bit biweight functions after recent x86inc.asm fixes.
| |     ffv1: Fix size mismatch in encode_line.
| |     movenc: Remove a dead initialization
| |     git-howto: Explain how to avoid Windows line endings in git checkouts.
| |     build: Move all arch OBJS declarations into arch subdirectory Makefiles.
| |
| |   Conflicts:
| |     configure
| |     libavcodec/vp3.c
| |     libavformat/matroskadec.c
| |     libavutil/Makefile
| |     libswscale/Makefile
| |     libswscale/swscale.c
| |     libswscale/swscale_internal.h
| |     libswscale/utils.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * ac3dsp: call femms/emms at the end of float_to_fixed24() for 3DNow and SSE (Justin Ruggles, 2012-04-12, 1 file, -2/+4)
| |   Fixes ac3-encode and eac3-encode FATE test failures with SSE2 disabled.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * h264: fix 10bit biweight functions after recent x86inc.asm fixes. (Ronald S. Bultje, 2012-04-12, 1 file, -4/+4)
| |   This should have been updated in the x86inc.asm update, but was accidentally forgotten.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-13, 12 files, -228/+204)
|\|
| |   * qatar/master:
| |     libxvid: remove disabled code
| |     qdm2: make a table static const
| |     qdm2: simplify bitstream reader setup for some subpacket types
| |     qdm2: use get_bits_left()
| |     build: Consistently handle conditional compilation for all optimization OBJS.
| |     avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics
| |     msrle: convert MS RLE decoding function to bytestream2.
| |     x86inc improvements for 64-bit
| |
| |   Conflicts:
| |     common.mak
| |     libavcodec/avpacket.c
| |     libavcodec/bfi.c
| |     libavcodec/msrledec.c
| |     libavcodec/qdm2.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * build: Consistently handle conditional compilation for all optimization OBJS. (Diego Biurrun, 2012-04-12, 1 file, -3/+2)
| |
| * x86inc improvements for 64-bit (Henrik Gramner, 2012-04-11, 10 files, -221/+198)
| |   Add support for all x86-64 registers
| |   Prefer caller-saved register over callee-saved on WIN64
| |   Support up to 15 function arguments
| |
| |   Also (by Ronald S. Bultje)
| |   Fix up our asm to work with new x86inc.asm.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |   Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-10, 2 files, -69/+68)
|\|
| |   * qatar/master: (22 commits)
| |     rv40dsp x86: use only one register, for both increment and loop counter
| |     rv40dsp: implement prescaled versions for biweight.
| |     avconv: use default channel layouts when they are unknown
| |     avconv: parse channel layout string
| |     nutdec: K&R formatting cosmetics
| |     vda: Signal 4 byte NAL headers to the decoder regardless of what's in the extradata
| |     mem: Consistently return NULL for av_malloc(0)
| |     vf_overlay: implement poll_frame()
| |     vf_scale: support named constants for sws flags.
| |     lavc doxy: add all installed headers to doxy groups.
| |     lavc doxy: add avfft to the main lavc group.
| |     lavc doxy: add remaining avcodec.h functions to a misc doxygen group.
| |     lavc doxy: add AVPicture functions to a doxy group.
| |     lavc doxy: add resampling functions to a doxy group.
| |     lavc doxy: replace \ with /
| |     lavc doxy: add encoding functions to a doxy group.
| |     lavc doxy: add decoding functions to a doxy group.
| |     lavc doxy: fix formatting of AV_PKT_DATA_{PARAM_CHANGE,H263_MB_INFO}
| |     lavc doxy: add AVPacket-related stuff to a separate doxy group.
| |     lavc doxy: add core functions/definitions to a doxy group.
| |     ...
| |
| |   Conflicts:
| |     ffmpeg.c
| |     libavcodec/avcodec.h
| |     libavcodec/vda.c
| |     libavcodec/x86/rv40dsp.asm
| |     libavfilter/vf_scale.c
| |     libavformat/nutdec.c
| |     libavutil/mem.c
| |     tests/ref/acodec/pcm_s24daud
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * rv40dsp x86: use only one register, for both increment and loop counter (Christophe GISQUET, 2012-04-10, 1 file, -23/+20)
| |   Around 10 cycles faster for luma.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * rv40dsp: implement prescaled versions for biweight. (Christophe GISQUET, 2012-04-10, 2 files, -49/+51)
| |   Quite often, the original weights are multiples of 512. By prescaling them by 1/512 when they are computed (once per frame), no intermediate shifting is needed, and no prescaling on each call either. The x86 code already used that trick.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
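The arithmetic behind the prescaling trick can be sketched in C. This is only an illustration under stated assumptions: the function names are hypothetical, and the rounding constant (0x10) and final shift (5) are assumed for the example rather than taken from this log. The point it shows is that when a weight is an exact multiple of 512, shifting the product right by 9 gives the same value as multiplying by the weight already divided by 512.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-call variant: each product is shifted right by 9
 * (i.e. divided by 512) before the two terms are combined. */
static uint8_t biweight_shifted(int w1, int w2, uint8_t a, uint8_t b)
{
    return (uint8_t)((((w2 * a) >> 9) + ((w1 * b) >> 9) + 0x10) >> 5);
}

/* Hypothetical prescaled variant: w1 and w2 are assumed to have been
 * divided by 512 already (once per frame), so no intermediate shift
 * is needed here. */
static uint8_t biweight_prescaled(int w1, int w2, uint8_t a, uint8_t b)
{
    return (uint8_t)((w2 * a + w1 * b + 0x10) >> 5);
}

/* Returns 1 when both weights are exact multiples of 512, the case in
 * which prescaling loses no precision. */
static int can_prescale(int w1, int w2)
{
    return (w1 % 512 == 0) && (w2 % 512 == 0);
}

/* Checks that both variants agree for weights 512*k1 and 512*k2 with
 * k1 + k2 == 32 (an assumed normalisation that keeps results in 8 bits). */
static void prescale_self_check(void)
{
    for (int k1 = 0; k1 <= 32; k1++) {
        int k2 = 32 - k1;
        assert(can_prescale(512 * k1, 512 * k2));
        for (int a = 0; a < 256; a++)
            for (int b = 0; b < 256; b++)
                assert(biweight_shifted(512 * k1, 512 * k2, (uint8_t)a, (uint8_t)b) ==
                       biweight_prescaled(k1, k2, (uint8_t)a, (uint8_t)b));
    }
}
```

A caller would test `can_prescale()` once when the weights are computed and then pick the matching variant for every block in the frame.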
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-04-05, 2 files, -12/+10)
|\|
| |   * qatar/master:
| |     h264: Factorize declaration of mb_sizes array.
| |     vsrc_buffer: when no frame is available, return an error instead of segfaulting.
| |     configure: add dl to frei0r extralibs.
| |     dsputil x86: use SSE float instruction instead of SSE2 integer equivalent
| |     dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype
| |     vp8dsp x86: perform rounding shift with a single instruction
| |     fate: add BMP tests.
| |     swscale: handle complete dimensions for monoblack/white.
| |     aacenc: Mark deinterleave_input_samples argument as const.
| |     vf_unsharp: Mark readonly variable as const.
| |     h264: fix 4:2:2 PCM-macroblocks decoding
| |
| |   Conflicts:
| |     configure
| |     libavcodec/h264.h
| |     libavcodec/x86/dsputil_mmx.c
| |     libavfilter/vf_unsharp.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil x86: use SSE float instruction instead of SSE2 integer equivalent (Christophe GISQUET, 2012-04-04, 2 files, -2/+2)
| |   All the more required since the users are pure SSE functions.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype (Christophe GISQUET, 2012-04-04, 1 file, -2/+2)
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| * vp8dsp x86: perform rounding shift with a single instruction (Christophe GISQUET, 2012-04-04, 1 file, -10/+8)
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | dsputil_mmx: fix scalarproduct prototypes (Michael Niedermayer, 2012-04-01, 1 file, -2/+2)
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-29, 2 files, -63/+77)
|\|
| |   * qatar/master:
| |     asf: only set index_read if the index contained entries.
| |     cabac: add overread protection to BRANCHLESS_GET_CABAC().
| |     cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC().
| |     cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE().
| |     cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC().
| |     h264: add overread protection to get_cabac_bypass_sign_x86().
| |     h264: reindent get_cabac_bypass_sign_x86().
| |     h264: use struct offsets in get_cabac_bypass_sign_x86().
| |     h264: fix overreads in cabac reader.
| |     wmall: fix seeking.
| |     lagarith: fix buffer overreads.
| |     dvdec: drop unnecessary dv_tablegen.h #include
| |     build: fix doc generation errors in parallel builds
| |     Replace memset(0) by zero initializations.
| |     faandct: Remove FAAN_POSTSCALE define and related code.
| |     dvenc: print allowed profiles if the video doesn't conform to any of them.
| |     avcodec_encode_{audio,video}: only reallocate output packet when it has non-zero size.
| |     FATE: add a test for vp8 with changing frame size.
| |     fate: add kgv1 fate test.
| |     oggdec: calculate correct timestamps in Ogg/FLAC
| |
| |   Conflicts:
| |     libavcodec/4xm.c
| |     libavcodec/cook.c
| |     libavcodec/dvdata.c
| |     libavcodec/dvdsubdec.c
| |     libavcodec/lagarith.c
| |     libavcodec/lagarithrac.c
| |     libavcodec/utils.c
| |     tests/fate/video.mak
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * cabac: add overread protection to BRANCHLESS_GET_CABAC(). (Ronald S. Bultje, 2012-03-28, 2 files, -11/+22)
| |   Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
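The log does not show how the overread protection was implemented in the assembly. As a generic illustration only, a bounds-checked refill in C might look like the sketch below; the `ByteReader` type, its field names, and `refill16()` are hypothetical and are not the CABAC reader's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; the field names are illustrative only. */
typedef struct ByteReader {
    const uint8_t *pos;  /* next byte to consume */
    const uint8_t *end;  /* one past the last valid byte */
} ByteReader;

/* Fetches two bytes in big-endian order. Once fewer than two bytes
 * remain, the pointer is no longer advanced and 0 is returned, so the
 * reader never dereferences memory past `end`. */
static unsigned refill16(ByteReader *r)
{
    assert(r != NULL);
    if (r->end - r->pos >= 2) {
        unsigned v = ((unsigned)r->pos[0] << 8) | r->pos[1];
        r->pos += 2;
        return v;
    }
    return 0;
}
```

Returning zeros at the end of input is one possible policy, assumed here for simplicity; a decoder could equally report an error to its caller.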
| * cabac: increment jump locations by one in callers of BRANCHLESS_GET_CABAC(). (Ronald S. Bultje, 2012-03-28, 1 file, -12/+12)
| |
| * cabac: remove unused argument from BRANCHLESS_GET_CABAC_UPDATE(). (Ronald S. Bultje, 2012-03-28, 1 file, -3/+3)
| |
| * cabac: use struct+offset instead of memory operand in BRANCHLESS_GET_CABAC(). (Ronald S. Bultje, 2012-03-28, 2 files, -20/+22)
| |
| * h264: add overread protection to get_cabac_bypass_sign_x86(). (Ronald S. Bultje, 2012-03-28, 1 file, -3/+5)
| |
| * h264: reindent get_cabac_bypass_sign_x86(). (Ronald S. Bultje, 2012-03-28, 1 file, -22/+22)
| |
| * h264: use struct offsets in get_cabac_bypass_sign_x86(). (Ronald S. Bultje, 2012-03-28, 1 file, -8/+11)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-26, 1 file, -54/+46)
|\|
| |   * qatar/master:
| |     build: ppc: drop stray leftover backslash
| |     build: Only clean the architecture subdirectory we build for.
| |     build: drop some unnecessary dependencies from the H.264 parser
| |     build: prettyprinting cosmetics
| |     libavutil: Remove pointless rational test program.
| |     libavutil: Remove broken and pointless lzo test program.
| |     lavf doxy: expand AVStream.codec doxy.
| |     lavf doxy: improve AVStream.time_base doxy.
| |     lavf doxy: add some basic documentation about reading from the demuxer.
| |     lavf doxy: document passing options to demuxers.
| |     lavf doxy: clarify that an AVPacket contains encoded data.
| |     mpegtsenc: allow user triggered PES packet flushing
| |     APIchanges: mark the place where 0.7 was cut.
| |     APIchanges: mark the place where 0.8 was cut.
| |     APIchanges: fill in missing dates and hashes.
| |     smacker: convert palette and header reading to bytestream2.
| |     alac: convert extradata reading to bytestream2.
| |
| |   Conflicts:
| |     doc/APIchanges
| |     libavcodec/smacker.c
| |     libavcodec/x86/Makefile
| |     libavfilter/Makefile
| |     libavutil/Makefile
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * build: prettyprinting cosmetics (Diego Biurrun, 2012-03-26, 1 file, -47/+40)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-25, 2 files, -1988/+2295)
|\|
| |   * qatar/master:
| |     x86: dsputil: prettyprint gcc inline asm
| |     x86: K&R prettyprinting cosmetics for dsputil_mmx.c
| |     x86: conditionally compile H.264 QPEL optimizations
| |     dsputil_mmx: Surround QPEL macros by "do { } while (0);" blocks.
| |     Ignore generated files below doc/.
| |     dpcm: convert to bytestream2.
| |     interplayvideo: convert to bytestream2.
| |     movenc: Merge if statements
| |     h264: fix memleak in error path.
| |     pthread: Immediately release all frames in ff_thread_flush()
| |     h264: Add check for invalid chroma_format_idc
| |     utvideo: port header reading to bytestream2.
| |
| |   Conflicts:
| |     .gitignore
| |     configure
| |     libavcodec/h264_ps.c
| |     libavcodec/interplayvideo.c
| |     libavcodec/pthread.c
| |     libavcodec/x86/dsputil_mmx.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: dsputil: prettyprint gcc inline asm (Diego Biurrun, 2012-03-25, 1 file, -1295/+1310)
| |
| * x86: K&R prettyprinting cosmetics for dsputil_mmx.c (Diego Biurrun, 2012-03-25, 1 file, -773/+1049)
| |
| * x86: conditionally compile H.264 QPEL optimizations (Diego Biurrun, 2012-03-25, 2 files, -6/+14)
| |
| * dsputil_mmx: Surround QPEL macros by "do { } while (0);" blocks. (Diego Biurrun, 2012-03-25, 1 file, -4/+12)
| |   This makes them safe to use in non-fully braced if-blocks and similar.
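The general C idiom can be shown with a small sketch. The macros below are hypothetical and unrelated to the actual QPEL macros; in this sketch the wrapper is defined without a trailing semicolon so that it also composes with `else`. Without the wrapper, a multi-statement macro under an unbraced `if` only has its first statement guarded.

```c
#include <assert.h>

/* Hypothetical two-statement macro WITHOUT a wrapper: after expansion,
 * only the first statement is governed by an unbraced `if`. */
#define BUMP_BOTH_UNSAFE(a, b) (a)++; (b)++

/* Same macro wrapped in do { } while (0): it expands to one statement. */
#define BUMP_BOTH_SAFE(a, b) do { (a)++; (b)++; } while (0)

static int count_unsafe(int cond)
{
    int a = 0, b = 0;
    if (cond)
        BUMP_BOTH_UNSAFE(a, b);   /* (b)++ runs even when cond is false */
    return a + b;
}

static int count_safe(int cond)
{
    int a = 0, b = 0;
    if (cond)
        BUMP_BOTH_SAFE(a, b);
    else
        a = 10;                   /* if/else still parses as intended */
    return a + b;
}

/* Small self-check of the behaviour described in the comments above. */
static void check_macros(void)
{
    assert(count_unsafe(0) == 1); /* the bug: b was incremented anyway */
    assert(count_unsafe(1) == 2);
    assert(count_safe(0) == 10);
    assert(count_safe(1) == 2);
}
```

Compilers may warn about the unsafe variant (for example about misleading indentation), which is itself a hint that the wrapper is missing.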
* | Fix linking without yasm. (Carl Eugen Hoyos, 2012-03-24, 1 file, -4/+6)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-24, 1 file, -0/+1)
|\|
| |   * qatar/master:
| |     rv34: error out on size changes with frame threading
| |     aacsbr: Add a debug check to sbr_mapping.
| |     aac: Reset some state variables when turning SBR off
| |     aac: Reset PS parameters on header decode failure.
| |     fate: add wmalossless test.
| |     aacsbr: handle m_max values smaller than 4.
| |
| |   Conflicts:
| |     libavcodec/aacsbr.c
| |     tests/fate/lossless-audio.mak
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * aacsbr: handle m_max values smaller than 4. (Ronald S. Bultje, 2012-03-23, 1 file, -0/+1)
| |   Prevents a signflip in the counter, and a subsequent crash because of overreads/overwrites.
| |
| |   Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
| |   CC: libav-stable@libav.org
* | VC1: restore optimizations broken in 9a1ced32. (Reimar Döffinger, 2012-03-23, 1 file, -6/+11)
| |   They were moved into code under HAVE_YASM and most of them even into completely disabled code with no reason given for that in the commit message.
| |
| |   Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* | Replace SSE2 instruction in scalarproduct_float_sse() by SSE equivalent. (ami_stuff, 2012-03-22, 1 file, -1/+1)
| |   Fixes an AAC decoding issue with the sample from ticket #213 on machines with SSE but without SSE2.
| |
| |   Based on 89411a by Reimar.
* | Replace SSE2 instruction by SSE equivalent. (Reimar Döffinger, 2012-03-21, 1 file, -1/+1)
| |   This is even potentially faster in this use-case. Should fix AAC SBR decoding on machines with SSE but not SSE2, fixing track issue #1041.
| |
| |   Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* | dsp: fix diff_bytes_mmx() with small width (Michael Niedermayer, 2012-03-17, 1 file, -0/+1)
| |   Fixes Ticket1068
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | dsputil: mark source of diff_bytes as const. (Michael Niedermayer, 2012-03-15, 1 file, -1/+1)
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | dirac: mark some variables const. (Michael Niedermayer, 2012-03-15, 1 file, -3/+3)
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Move struc FFTContext below SECTION_RODATA (Nico Weber, 2012-03-12, 1 file, -2/+2)
| |   Yasm creates an implicit unaligned text section if "struc" is used outside of any section: http://tortall.lighthouseapp.com/projects/78676-yasm/tickets/247
| |   Since yasm only honors the "align" annotation on the first declaration of a section, this implicit text section causes all text section alignments to be ignored.
| |
| |   Also fixes a yasm warning about it ignoring alignment.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2012-03-11, 1 file, -423/+377)
|\|
| |   * qatar/master:
| |     cook: expand dither_tab[], and make sure indexes into it don't overflow.
| |     xxan: reindent xan_unpack_luma().
| |     xxan: protect against chroma LUT overreads.
| |     xxan: convert to bytestream2 API.
| |     xxan: don't read before start of buffer in av_memcpy_backptr().
| |     vp8: convert mbedge loopfilter x86 assembly to use named arguments.
| |     vp8: convert inner loopfilter x86 assembly to use named arguments.
| |
| |   Conflicts:
| |     libavcodec/xxan.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>