path: root/libavcodec/x86/h264_idct.asm
Commit message  Author  Age  Files  Lines
* h264_idct: enable unmacro on newer NASM versions  Rostislav Pehlivanov  2018-02-12  1  -1/+5
|   Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* Merge commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb'  James Almer  2017-10-21  1  -24/+28
|\
| |   * commit 'e9bb77fb1012cba1951a82136df7071f71bce8fb':
| |     x86: h264: Simplify DEQUANT macro with cpuflags
| |
| |   Merged-by: James Almer <jamrial@gmail.com>
| * x86: h264: Simplify DEQUANT macro with cpuflags  Diego Biurrun  2017-03-14  1  -24/+28
| |
* | avcodec/h264: add sse2 versions of previous idct functions  James Darnley  2017-05-15  1  -2/+9
| |   Kaby Lake Pentium:
| |   - ff_h264_idct_add_8_sse2: ~1.18x faster than mmxext
| |   - ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext
* | avcodec/h264: add avx 8-bit h264_idct_dc_add  James Darnley  2017-05-15  1  -0/+20
| |   Haswell:
| |   - 1.02x faster (405±0.7 vs. 397±0.8 decicycles) compared with mmxext
| |
| |   Skylake-U:
| |   - 1.06x faster (498±1.8 vs. 470±1.3 decicycles) compared with mmxext
* | avcodec/h264: add avx 8-bit h264_idct_add  James Darnley  2017-05-15  1  -1/+32
| |   Haswell:
| |   - 1.11x faster (522±0.4 vs. 469±1.8 decicycles) compared with mmxext
| |
| |   Skylake-U:
| |   - 1.21x faster (671±5.5 vs. 555±1.4 decicycles) compared with mmxext
* | avcodec/h264: use some 3 operand forms  James Darnley  2017-05-15  1  -12/+9
| |
* | avcodec/h264: change RETs into REP_RETs where appropriate  James Darnley  2017-05-15  1  -6/+6
| |
* | avcodec/h264: mmx 4:2:2 idct add8 function  James Darnley  2016-11-30  1  -0/+32
| |   2.87 times faster (1830 vs. 638 cycles)
* | Merge commit 'f1a9eee41c4b5ea35db9ff0088ce4e6f1e187f2c'  Clément Bœsch  2016-07-09  1  -0/+21
|\|
| |   * commit 'f1a9eee41c4b5ea35db9ff0088ce4e6f1e187f2c':
| |     x86: Add missing movsxd for the int stride parameter
| |
| |   Merged-by: Clément Bœsch <u@pkh.me>
| * x86: Add missing movsxd for the int stride parameter  Martin Storsjö  2016-06-17  1  -0/+21
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '55519926ef855c671d084ccc151056de9e3d3a77'  Michael Niedermayer  2014-03-14  1  -42/+43
|\|
| |   * commit '55519926ef855c671d084ccc151056de9e3d3a77':
| |     x86: Make function prototype comments in assembly code consistent
| |
| |   Conflicts:
| |     libavcodec/x86/sbrdsp.asm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: Make function prototype comments in assembly code consistent  Diego Biurrun  2014-03-13  1  -42/+43
| |   This helps grepping for functions, among other things.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2013-10-08  1  -32/+42
|\|
| |   * qatar/master:
| |     x86: h264_idct: Update comments to match 8/10-bit depth optimization split
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: h264_idct: Update comments to match 8/10-bit depth optimization split  Diego Biurrun  2013-10-07  1  -32/+42
| |
* | Merge commit '0b45269c2d732d15afa2de9c475d85fcf5561ac4'  Michael Niedermayer  2013-08-22  1  -1/+0
|\|
| |   * commit '0b45269c2d732d15afa2de9c475d85fcf5561ac4':
| |     x86: h264_idct: Remove incorrect comment
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: h264_idct: Remove incorrect comment  Diego Biurrun  2013-08-21  1  -1/+0
| |
* | Merge commit '62844c3fd66940c7747e9b2bb7804e265319f43f'  Michael Niedermayer  2013-04-11  1  -4/+4
|\|
| |   * commit '62844c3fd66940c7747e9b2bb7804e265319f43f':
| |     h264: Integrate clear_blocks calls with IDCT
| |
| |   Conflicts:
| |     libavcodec/arm/h264idct_neon.S
| |     libavcodec/h264idct_template.c
| |     libavcodec/x86/h264_idct.asm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: Integrate clear_blocks calls with IDCT  Ronald S. Bultje  2013-04-10  1  -25/+83
| |   The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700 to
| |   672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
| |   (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
| |   tested (cathedral), i.e. almost 30 cycles per mb faster.
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | h264: integrate clear_blocks calls with IDCT.  Ronald S. Bultje  2013-02-19  1  -25/+83
| |   The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700 to
| |   672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
| |   (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
| |   tested (cathedral), i.e. almost 30 cycles per mb faster.
| |
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f'  Michael Niedermayer  2013-01-23  1  -13/+13
|\|
| |   * commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f':
| |     Drop DCTELEM typedef
| |
| |   Conflicts:
| |     libavcodec/alpha/dsputil_alpha.h
| |     libavcodec/alpha/motion_est_alpha.c
| |     libavcodec/arm/dsputil_init_armv6.c
| |     libavcodec/bfin/dsputil_bfin.h
| |     libavcodec/bfin/pixels_bfin.S
| |     libavcodec/cavs.c
| |     libavcodec/cavsdec.c
| |     libavcodec/dct-test.c
| |     libavcodec/dnxhdenc.c
| |     libavcodec/dsputil.c
| |     libavcodec/dsputil.h
| |     libavcodec/dsputil_template.c
| |     libavcodec/eamad.c
| |     libavcodec/h264_cavlc.c
| |     libavcodec/h264idct_template.c
| |     libavcodec/mpeg12.c
| |     libavcodec/mpegvideo.c
| |     libavcodec/mpegvideo.h
| |     libavcodec/mpegvideo_enc.c
| |     libavcodec/ppc/dsputil_altivec.c
| |     libavcodec/proresdsp.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Drop DCTELEM typedef  Diego Biurrun  2013-01-22  1  -13/+13
| |   It does not help as an abstraction and adds dsputil dependencies.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | Merge commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1'  Michael Niedermayer  2012-11-28  1  -69/+70
|\|
| |   * commit 'af7d13ee4a4bf8d708f9b0598abb8f6e22b76de1':
| |     asink_nullsink: plug a memory leak.
| |     x86: h264_idct: port to cpuflags
| |     x86: cpu: Drop unused HAVE_RWEFLAGS condition
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: h264_idct: port to cpuflags  Diego Biurrun  2012-11-28  1  -69/+70
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-11-14  1  -22/+24
|\|
| |   * qatar/master:
| |     x86: mmx2 ---> mmxext in asm constructs
| |
| |   Conflicts:
| |     libavcodec/x86/h264_chromamc_10bit.asm
| |     libavcodec/x86/h264_deblock.asm
| |     libavcodec/x86/h264dsp_init.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: mmx2 ---> mmxext in asm constructs  Diego Biurrun  2012-11-14  1  -22/+24
| |
* | Merge commit 'be923ed659016350592acb9b3346f706f8170ac5'  Michael Niedermayer  2012-10-31  1  -20/+20
|\|
| |   * commit 'be923ed659016350592acb9b3346f706f8170ac5':
| |     x86: fmtconvert: port to cpuflags
| |     x86: MMX2 ---> MMXEXT in macro names
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: MMX2 ---> MMXEXT in macro names  Diego Biurrun  2012-10-31  1  -20/+20
| |
| * x86: yasm: Use complete source path for macro helper %includes  Diego Biurrun  2012-10-31  1  -1/+1
| |   This is more consistent with the way we handle C #includes and it
| |   simplifies the build system.
* | Merge commit '6860b4081d046558c44b1b42f22022ea341a2a73'  Michael Niedermayer  2012-10-31  1  -1/+0
|\|
| |   * commit '6860b4081d046558c44b1b42f22022ea341a2a73':
| |     x86: include x86inc.asm in x86util.asm
| |     cng: Reindent some incorrectly indented lines
| |     cngdec: Allow flushing the decoder
| |     cngdec: Make the dbov variable have the right unit
| |     cngdec: Fix the memset size to cover the full array
| |     cngdec: Update the LPC coefficients after averaging the reflection coefficients
| |     configure: fix print_config() with broke awks
| |
| |   Conflicts:
| |     libavcodec/x86/ac3dsp.asm
| |     libavcodec/x86/dct32.asm
| |     libavcodec/x86/deinterlace.asm
| |     libavcodec/x86/dsputil.asm
| |     libavcodec/x86/dsputilenc.asm
| |     libavcodec/x86/fft.asm
| |     libavcodec/x86/fmtconvert.asm
| |     libavcodec/x86/h264_chromamc.asm
| |     libavcodec/x86/h264_deblock.asm
| |     libavcodec/x86/h264_deblock_10bit.asm
| |     libavcodec/x86/h264_idct.asm
| |     libavcodec/x86/h264_idct_10bit.asm
| |     libavcodec/x86/h264_intrapred.asm
| |     libavcodec/x86/h264_intrapred_10bit.asm
| |     libavcodec/x86/h264_weight.asm
| |     libavcodec/x86/vc1dsp.asm
| |     libavcodec/x86/vp3dsp.asm
| |     libavcodec/x86/vp56dsp.asm
| |     libavcodec/x86/vp8dsp.asm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: include x86inc.asm in x86util.asm  Diego Biurrun  2012-10-31  1  -1/+0
| |   This is necessary to allow refactoring some x86util macros with cpuflags.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-08-07  1  -29/+29
|\|
| |   * qatar/master:
| |     x86: fix build with nasm 2.08
| |     x86: use nop cpu directives only if supported
| |     x86: fix rNmp macros with nasm
| |     build: add trailing / to yasm/nasm -I flags
| |     x86: use 32-bit source registers with movd instruction
| |     x86: add colons after labels
| |
| |   Conflicts:
| |     Makefile
| |     libavutil/x86/x86inc.asm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: add colons after labels  Mans Rullgard  2012-08-07  1  -29/+29
| |   nasm prints a warning if the colon is missing.
| |
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-08-05  1  -4/+4
|\|
| |   * qatar/master:
| |     x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2
| |     rational: add av_inv_q() returning the inverse of an AVRational
| |     dpx: Make start offset unsigned
| |     lavfi: properly signal out-of-memory error in ff_filter_samples
| |     cosmetics: Fix a few switched periods and linebreaks
| |     zerocodec: Fix memleak in decode_frame
| |     zerocodec: Cosmetics
| |
| |   Conflicts:
| |     ffmpeg.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86: h264_idct: Rename x264_add8x4_idct_sse2 --> h264_add8x4_idct_sse2  Diego Biurrun  2012-08-05  1  -4/+4
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-04-13  1  -83/+73
|\|
| |   * qatar/master:
| |     libxvid: remove disabled code
| |     qdm2: make a table static const
| |     qdm2: simplify bitstream reader setup for some subpacket types
| |     qdm2: use get_bits_left()
| |     build: Consistently handle conditional compilation for all optimization OBJS.
| |     avpacket, bfi, bgmc, rawenc: K&R prettyprinting cosmetics
| |     msrle: convert MS RLE decoding function to bytestream2.
| |     x86inc improvements for 64-bit
| |
| |   Conflicts:
| |     common.mak
| |     libavcodec/avpacket.c
| |     libavcodec/bfi.c
| |     libavcodec/msrledec.c
| |     libavcodec/qdm2.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * x86inc improvements for 64-bit  Henrik Gramner  2012-04-11  1  -83/+73
| |   Add support for all x86-64 registers
| |   Prefer caller-saved register over callee-saved on WIN64
| |   Support up to 15 function arguments
| |
| |   Also (by Ronald S. Bultje)
| |   Fix up our asm to work with new x86inc.asm.
| |
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
| |   Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-02-09  1  -0/+3
|\|
| |   * qatar/master:
| |     pixdesc: mark pseudopaletted formats with a special flag.
| |     avconv: switch to avcodec_encode_video2().
| |     libx264: implement encode2().
| |     libx264: split extradata writing out of encode_nals().
| |     lavc: add avcodec_encode_video2() that encodes from an AVFrame -> AVPacket
| |     cmdutils: update copyright year to 2012.
| |     swscale: sign-extend integer function argument to qword on x86-64.
| |     x86inc: support yasm -f win64 flag also.
| |     h264: manually save/restore XMM registers for functions using INIT_MMX.
| |     x86inc: allow manual use of WIN64_SPILL_XMM.
| |     aacdec: Use correct speaker order for 7.1.
| |     aacdec: Remove incorrect comment.
| |     aacdec: Simplify output configuration.
| |     Remove Sun medialib glue code.
| |     dsputil: set STRIDE_ALIGN to 16 for x86 also.
| |     pngdsp: swap argument inversion.
| |
| |   Conflicts:
| |     cmdutils.c
| |     configure
| |     doc/APIchanges
| |     ffmpeg.c
| |     libavcodec/aacdec.c
| |     libavcodec/dsputil.h
| |     libavcodec/libx264.c
| |     libavcodec/mlib/dsputil_mlib.c
| |     libavcodec/utils.c
| |     libavfilter/vf_scale.c
| |     libavutil/avutil.h
| |     libswscale/mlib/yuv2rgb_mlib.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * h264: manually save/restore XMM registers for functions using INIT_MMX.  Ronald S. Bultje  2012-02-08  1  -0/+3
| |   On Win64, these registers are callee-save, so not saving/restoring
| |   them correctly is a violation of ABI and can lead to crashes or
| |   corrupt data.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2012-01-28  1  -30/+30
|\|
| |   * qatar/master: (71 commits)
| |     movenc: Allow writing to a non-seekable output if using empty moov
| |     movenc: Support adding isml (smooth streaming live) metadata
| |     libavcodec: Don't crash in avcodec_encode_audio if time_base isn't set
| |     sunrast: Document the different Sun Raster file format types.
| |     sunrast: Add a check for experimental type.
| |     libspeexenc: use AVSampleFormat instead of deprecated/removed SampleFormat
| |     lavf: remove disabled FF_API_SET_PTS_INFO cruft
| |     lavf: remove disabled FF_API_OLD_INTERRUPT_CB cruft
| |     lavf: remove disabled FF_API_REORDER_PRIVATE cruft
| |     lavf: remove disabled FF_API_SEEK_PUBLIC cruft
| |     lavf: remove disabled FF_API_STREAM_COPY cruft
| |     lavf: remove disabled FF_API_PRELOAD cruft
| |     lavf: remove disabled FF_API_NEW_STREAM cruft
| |     lavf: remove disabled FF_API_RTSP_URL_OPTIONS cruft
| |     lavf: remove disabled FF_API_MUXRATE cruft
| |     lavf: remove disabled FF_API_FILESIZE cruft
| |     lavf: remove disabled FF_API_TIMESTAMP cruft
| |     lavf: remove disabled FF_API_LOOP_OUTPUT cruft
| |     lavf: remove disabled FF_API_LOOP_INPUT cruft
| |     lavf: remove disabled FF_API_AVSTREAM_QUALITY cruft
| |     ...
| |
| |   Conflicts:
| |     doc/APIchanges
| |     libavcodec/8bps.c
| |     libavcodec/avcodec.h
| |     libavcodec/libx264.c
| |     libavcodec/mjpegbdec.c
| |     libavcodec/options.c
| |     libavcodec/sunrast.c
| |     libavcodec/utils.c
| |     libavcodec/version.h
| |     libavcodec/x86/h264_deblock.asm
| |     libavdevice/libdc1394.c
| |     libavdevice/v4l2.c
| |     libavformat/avformat.h
| |     libavformat/avio.c
| |     libavformat/avio.h
| |     libavformat/aviobuf.c
| |     libavformat/dv.c
| |     libavformat/mov.c
| |     libavformat/utils.c
| |     libavformat/version.h
| |     libavformat/wtv.c
| |     libavutil/Makefile
| |     libavutil/file.c
| |     libswscale/x86/input.asm
| |     libswscale/x86/swscale_mmx.c
| |     libswscale/x86/swscale_template.c
| |     tests/ref/lavf/ffm
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * config.asm: change %ifdef directives to %if directives.  Ronald S. Bultje  2012-01-27  1  -30/+30
| |   This allows combining multiple conditionals in a single statement.
* | Move x264asm to libavutil.  Kieran Kunhya  2011-10-19  1  -2/+2
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2011-08-15  1  -2/+2
|\|
| |   * qatar/master:
| |     Fix NASM include directive
| |     dsputil_mmx: Honor HAVE_AMD3DNOW
| |     lavf,lavd: remove all usage of AVFormatParameters from demuxers.
| |     jack: add 'channels' private option.
| |     VC-1: fix reading of custom PAR.
| |     Remove redundant and dubious video codec detection by its extradata
| |     mpeg12: remove repeat-field code disabled since May 2002
| |     patch checklist: suggest fate instead of regression tests
| |     Turn on resampling on sudden size change instead of bailing out during recode.
| |     avtools: reinitialise filter chain when input video stream changes dimensions
| |
| |   Conflicts:
| |     Makefile
| |     avconv.c
| |     doc/developer.texi
| |     ffplay.c
| |     libavcodec/x86/dsputil_mmx.c
| |     libavdevice/libdc1394.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Fix NASM include directive  Dave Yeo  2011-08-15  1  -2/+2
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | Merge commit 'b2c087871dafc7d030b2d48457ddff597dfd4925'  Michael Niedermayer  2011-08-13  1  -2/+2
|\|
| |   * commit 'b2c087871dafc7d030b2d48457ddff597dfd4925':
| |     Move x86util.asm from libavcodec/ to libavutil/.
| |     Move x86inc.asm to libavutil/.
| |     APIchanges: note error_recognition in lavf
| |     lavf: add support for error_recognition, use it in avidec, and bump minor API version
| |     avconv: change semantics of -map
| |     avconv: get rid of new* options.
| |     cmdutils: allow precisely specifying a stream for AVOptions.
| |     configure: add missing CFLAGS to fix building on the HURD
| |     libx264: Include hint for possible values for configuring libx264
| |     cmdutils: allow ':'-separated modifiers in option names.
| |     avconv: make -map_metadata work consistently with the other options
| |     avconv: remove deprecated options.
| |     avconv: make -map_chapters accept only the input file index.
| |     Make a copy of ffmpeg under a new name -- avconv.
| |     ffmpeg: add a warning stating that the program is deprecated.
| |     Add weighted motion compensation for RV40 B-frames
| |     RV3/4: calculate B-frame motion weights once per frame
| |     Move RV3/4-specific DSP functions into their own context
| |     mjpeg: propagate decode errors from ff_mjpeg_decode_sos and ff_mjpeg_decode_dqt
| |     h264: notice memory allocation failure
| |
| |   Conflicts:
| |     .gitignore
| |     Makefile
| |     cmdutils.c
| |     configure
| |     doc/ffplay.texi
| |     doc/ffprobe.texi
| |     doc/ffserver.texi
| |     libavcodec/libx264.c
| |     libavformat/avformat.h
| |     libavformat/avidec.c
| |     libavformat/version.h
| |     tests/lavf-regression.sh
| |     tests/lavfi-regression.sh
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Move x86util.asm from libavcodec/ to libavutil/.  Ronald S. Bultje  2011-08-12  1  -1/+1
| |   This allows using it in swscale also.
| * Move x86inc.asm to libavutil/.  Ronald S. Bultje  2011-08-12  1  -1/+1
| |   This allows using it in libswscale/ also.
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2011-07-30  1  -14/+14
|\|
| |   * qatar/master:
| |     mxfdec: Include FF_INPUT_BUFFER_PADDING_SIZE when allocating extradata.
| |     H.264: tweak some other x86 asm for Atom
| |     probe: Fix insane flow control.
| |     mpegts: remove invalid error check
| |     s302m: use nondeprecated audio sample format API
| |     lavc: use designated initialisers for all codecs.
| |     x86: cabac: add operand size suffixes missing from 6c32576
| |
| |   Conflicts:
| |     libavcodec/ac3enc_float.c
| |     libavcodec/flacenc.c
| |     libavcodec/frwu.c
| |     libavcodec/pictordec.c
| |     libavcodec/qtrleenc.c
| |     libavcodec/v210enc.c
| |     libavcodec/wmv2dec.c
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * H.264: tweak some other x86 asm for Atom  Jason Garrett-Glaser  2011-07-29  1  -14/+14
| |
* | Merge remote-tracking branch 'qatar/master'  Michael Niedermayer  2011-06-15  1  -17/+27
|\|
| |   * qatar/master:
| |     swscale: remove misplaced comment.
| |     ffmpeg: fix streaming to ffserver.
| |     swscale: split out RGB48 output functions from yuv2packed[12X]_c().
| |     build: move vpath directives to main Makefile
| |     swscale: fix JPEG-range YUV scaling artifacts.
| |     build: move ALLFFLIBS to a more logical place
| |     ARM: factor some repetitive code into macros
| |     Fix SVQ3 after adding 4:4:4 H.264 support
| |     H.264: fix CODEC_FLAG_GRAY
| |     4:4:4 H.264 decoding support
| |     ac3enc: fix allocation of floating point samples.
| |
| |   Conflicts:
| |     ffmpeg.c
| |     libavcodec/dsputil_template.c
| |     libavcodec/h264.c
| |     libavcodec/mpegvideo.c
| |     libavcodec/snow.c
| |     libswscale/swscale.c
| |     libswscale/swscale_internal.h
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>