Commit message (Author, Age, Files, Lines)
* avcodec/interplayvideo: use correct context when checking for enough bytes (Paul B Mahol, 2017-06-28, 1 file, -1/+1)
  Fixes #6502.
  Signed-off-by: Paul B Mahol <[email protected]>
* avcodec/x86: use new x86-64 functions for -idct simple (James Darnley, 2017-06-28, 1 file, -2/+4)
  They now match according to FATE, barring any further bugs with untested parts
* avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions (James Darnley, 2017-06-28, 4 files, -0/+126)
  Includes add/put functions
  Rounding contributed by Ronald S. Bultje
* avcodec/x86: allow future 8-bit simple idct to have "DC only hack" (James Darnley, 2017-06-28, 1 file, -0/+40)
  Created by Ronald S. Bultje
* avcodec/interplayvideo: return void (Paul B Mahol, 2017-06-28, 1 file, -1/+1)
  Signed-off-by: Paul B Mahol <[email protected]>
* avcodec/interplayvideo: fix dead-lock (Paul B Mahol, 2017-06-28, 1 file, -1/+3)
  Fixes #6499.
  Signed-off-by: Paul B Mahol <[email protected]>
* avcodec/interplayvideo: use int16_t instead of short (Paul B Mahol, 2017-06-28, 1 file, -16/+16)
  Signed-off-by: Paul B Mahol <[email protected]>
* avcodec/interplayvideo: check that video_size is >0 (Paul B Mahol, 2017-06-28, 1 file, -0/+2)
  Fixes #6498.
  Signed-off-by: Paul B Mahol <[email protected]>
* hevc: Add support for alternative transfer characteristics SEI (Vittorio Giovara, 2017-06-28, 3 files, -0/+22)
  The use of this SEI is for backward compatibility in HLG HDR systems:
  older devices that cannot interpret the "arib-std-b67" transfer will get
  the compatible transfer (usually bt709 or bt2020) from the VUI, while
  newer devices that can interpret HDR will read the SEI and use its value
  instead.
  Signed-off-by: Vittorio Giovara <[email protected]>
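
The fallback behaviour described in the HLG commit message above can be illustrated with a small C sketch. The struct, the function name, and the `understands_sei` flag are hypothetical and are not taken from the patch; the code points mentioned in the comments (1 for bt709, 18 for arib-std-b67) are given only for orientation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two places a transfer value can come from. */
struct transfer_info {
    int vui_transfer; /* value signalled in the VUI, e.g. 1 (bt709) */
    int sei_present;  /* nonzero if the alternative transfer SEI was parsed */
    int sei_transfer; /* value carried by the SEI, e.g. 18 (arib-std-b67) */
};

/* Pick the transfer a device would use: the SEI value if the device
 * understands the SEI and one was present, otherwise the VUI value. */
static int effective_transfer(const struct transfer_info *t, int understands_sei)
{
    assert(t != NULL);
    if (understands_sei && t->sei_present)
        return t->sei_transfer;
    return t->vui_transfer;
}
```

An older device would call this with `understands_sei` set to 0 and always get the VUI value, which is the backward-compatible path the message describes.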
* avcodec/utvideodec: Factor multiply out of inner loop (Michael Niedermayer, 2017-06-28, 1 file, -2/+4)
  0.5% faster loop
  Reviewed-by: Paul B Mahol <[email protected]>
  Reviewed-by: Steven Liu <[email protected]>
  Signed-off-by: Michael Niedermayer <[email protected]>
* avcodec/utvideodec: bswap directly without memcpy (Michael Niedermayer, 2017-06-28, 1 file, -6/+2)
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: Michael Niedermayer <[email protected]>
* avcodec/utvideodec: enable unchecked bitreader (Michael Niedermayer, 2017-06-28, 1 file, -0/+2)
  inner reader loop becomes 16% faster
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: Michael Niedermayer <[email protected]>
* avcodec/utvideodec: hardcode vlc bits (Michael Niedermayer, 2017-06-28, 1 file, -5/+5)
  2.5% faster vlc decoding
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: Michael Niedermayer <[email protected]>
* avcodec/utvideodec: Move bitstream end check out of inner loop (Michael Niedermayer, 2017-06-28, 1 file, -10/+12)
  This is not needed when the buffer is large enough for the worst case of a line
  2% faster vlc reading
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: Michael Niedermayer <[email protected]>
* lavc/aacpsdsp: use ptrdiff_t for stride in hybrid_analysis (Clément Bœsch, 2017-06-28, 8 files, -10/+13)
* lavc/aarch64: add a few SIMD functions for AAC PS (Clément Bœsch, 2017-06-28, 5 files, -0/+202)
  ☭ tests/checkasm/checkasm --bench --test=aacpsdsp
  checkasm: using random seed 3318985180
  MMX implied by specified flags
  MMX implied by specified flags
  NEON:
   - aacpsdsp.add_squares        [OK]
   - aacpsdsp.mul_pair_single    [OK]
   - aacpsdsp.hybrid_analysis    [OK]
   - aacpsdsp.stereo_interpolate [OK]
  checkasm: all 5 tests passed
  nop: 10.0
  ps_add_squares_c: 63221.2
  ps_add_squares_neon: 22311.7
  ps_hybrid_analysis_c: 2466.6
  ps_hybrid_analysis_neon: 1521.9
  ps_mul_pair_single_c: 68592.0
  ps_mul_pair_single_neon: 17426.6
  ps_stereo_interpolate_c: 72344.3
  ps_stereo_interpolate_neon: 72308.8
  ps_stereo_interpolate_ipdopd_c: 117415.2
  ps_stereo_interpolate_ipdopd_neon: 113386.3
* lavc/aacpsdsp: fix a few spaces (cosmetics) (Clément Bœsch, 2017-06-28, 1 file, -4/+5)
* checkasm: add AAC PS tests (Clément Bœsch, 2017-06-28, 5 files, -1/+168)
  This includes various fixes and improvements from James Almer.
  Signed-off-by: James Almer <[email protected]>
* lavc/arm: fix lack of precision in ff_ps_stereo_interpolate_neon (Clément Bœsch, 2017-06-28, 1 file, -4/+5)
  The code originally pre-multiplied the steps by 2, causing the running
  sum of the h factors to drift away due to the lack of precision. It
  quickly causes an inaccuracy > 0.01.
  I tried diverse approaches such as multiplying by 2.0 (instead of adding
  the value itself) without success.
  I'm unable to bench the impact of this change, feel free to compare.
  This commit fixes the incoming aacpsdsp tests.
  Following is an alternative simplified function (matching the incoming
  AArch64 code) that may be used:
  function ff_ps_stereo_interpolate_neon, export=1
          vld1.32         {q0}, [r2]
          vld1.32         {q1}, [r3]
          ldr             r12, [sp]
          vmov.f32        q8, q0
          vmov.f32        q9, q1
          vzip.32         q8, q0
          vzip.32         q9, q1
  1:
          vld1.32         {d4}, [r0,:64]
          vld1.32         {d6}, [r1,:64]
          vadd.f32        q8, q8, q9
          vadd.f32        q0, q0, q1
          vmov.f32        d5, d4
          vmov.f32        d7, d6
          vmul.f32        q2, q2, q8
          vmla.f32        q2, q3, q0
          vst1.32         {d4}, [r0,:64]!
          vst1.32         {d5}, [r1,:64]!
          subs            r12, r12, #1
          bgt             1b
          bx              lr
  endfunc
* x86/vf_blend: use ABS2 macro (James Almer, 2017-06-27, 1 file, -6/+3)
* avcodec/x86/vp9dsp_init_16bpp: Fix linking to missing ff_vp9_ipred_dr_32x32_16_avx2() on 32bit (Michael Niedermayer, 2017-06-28, 1 file, -0/+2)
  Signed-off-by: Michael Niedermayer <[email protected]>
* hwcontext_d3d11va: use correct license header (Hendrik Leppkes, 2017-06-28, 2 files, -8/+8)
* libswresample/swresample: remove obsolete code (Michael Niedermayer, 2017-06-27, 1 file, -4/+0)
  Signed-off-by: Michael Niedermayer <[email protected]>
* avcodec/hevcdec: do basic validity check on delta_chroma_weight and offset (Michael Niedermayer, 2017-06-27, 1 file, -2/+17)
  Fixes: runtime error: signed integer overflow: 2147483520 + 128 cannot be represented in type 'int'
  Fixes: 2385/clusterfuzz-testcase-minimized-6594333576790016
  Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
  Signed-off-by: Michael Niedermayer <[email protected]>
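
The overflow reported above (2147483520 + 128) is the kind of error a range check before the addition prevents. The following is a minimal sketch, assuming a 32-bit `int`; the helper names are hypothetical, and the range limits are left as parameters rather than taken from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if a + b is representable in int, without computing a + b. */
static int add_fits_int(int a, int b)
{
    if (b > 0)
        return a <= INT_MAX - b;
    return a >= INT_MIN - b;
}

/* Returns 1 if v lies in the closed range [lo, hi]. */
static int in_range(int v, int lo, int hi)
{
    return v >= lo && v <= hi;
}

/* Adds two values that the caller has already validated. */
static int checked_add(int a, int b)
{
    assert(add_fits_int(a, b));
    return a + b;
}
```

A parser would typically reject the bitstream when `in_range()` fails on a freshly read syntax element, so that later arithmetic such as `checked_add()` never sees an out-of-range operand.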
* avcodec/vp9: add 64-bit ipred_dr_32x32_16 avx2 implementation (Ilia Valiakhmetov, 2017-06-27, 2 files, -3/+106)
  vp9_diag_downright_32x32_12bpp_c: 429.7
  vp9_diag_downright_32x32_12bpp_sse2: 158.9
  vp9_diag_downright_32x32_12bpp_ssse3: 144.6
  vp9_diag_downright_32x32_12bpp_avx: 141.0
  vp9_diag_downright_32x32_12bpp_avx2: 73.8
  Almost 50% faster than avx implementation
  Signed-off-by: Ronald S. Bultje <[email protected]>
* x86/vf_blend: optimize difference and negation functions (James Almer, 2017-06-27, 1 file, -16/+24)
  Process more pixels per loop.
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: James Almer <[email protected]>
* x86/vf_blend: add sse and ssse3 extremity functions (James Almer, 2017-06-27, 3 files, -0/+30)
  Reviewed-by: Paul B Mahol <[email protected]>
  Signed-off-by: James Almer <[email protected]>
* hwframe: Allow hwaccel frame allocators to align surface sizes (Anton Khirnov, 2017-06-27, 1 file, -2/+6)
  Hardware accelerated decoding generally uses AVHWFramesContext for pool
  allocation of hardware surfaces. These are set up to allocate surfaces
  aligned to hardware and hwaccel API requirements. Due to the
  architecture, av_hwframe_get_buffer() will return AVFrames with the
  dimensions set to the aligned sizes.
  This causes some decoders (like hevc) to return these aligned sizes as
  the final frame size, instead of cropping them to the video's actual
  dimensions. To make sure this doesn't happen, crop the frame to the size
  the decoder expects when ff_get_buffer() is called.
  Merges Libav commit 3fdf50f9e864c88da2139cf066832944de81acaa.
  Signed-off-by: Luca Barbato <[email protected]>
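
The relationship between aligned surface sizes and the size the decoder expects can be sketched as follows. This is an illustration only: `struct frame_size`, `align_up()` and `crop_to_expected()` are hypothetical names, and the alignment of 16 used in the usage note is an assumed value, not one stated in the commit.

```c
#include <assert.h>

/* Hypothetical pair of frame dimensions in pixels. */
struct frame_size {
    int width;
    int height;
};

/* Round value up to the next multiple of alignment. */
static int align_up(int value, int alignment)
{
    assert(value >= 0 && alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Shrink the allocated (aligned) size back to what the decoder expects.
 * Dimensions are never enlarged beyond what was allocated. */
static struct frame_size crop_to_expected(struct frame_size allocated,
                                          struct frame_size expected)
{
    struct frame_size out = allocated;
    if (expected.width < out.width)
        out.width = expected.width;
    if (expected.height < out.height)
        out.height = expected.height;
    return out;
}
```

With an assumed alignment of 16, a 1920x1080 stream would be allocated as 1920x1088, and cropping restores 1920x1080 before the frame reaches the caller.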
* dxva: verbose-log decoder GUID list (wm4, 2017-06-27, 1 file, -0/+47)
  Helpful for debugging.
  Merges Libav commit 068eaa534e7ddb2155e2830818d5c3f1069c68d8.
  Signed-off-by: Luca Barbato <[email protected]>
* hwcontext_d3d11va: add option to enable debug mode (wm4, 2017-06-27, 3 files, -0/+38)
  Basically copied from VLC (LGPL):
  http://git.videolan.org/?p=vlc.git;a=blob;f=modules/video_output/win32/direct3d11.c;h=e9fcb83dcabfe778f26e63d19f218caf06a7c3ae;hb=HEAD#l1482
  http://git.videolan.org/?p=vlc.git;a=blob;f=modules/codec/avcodec/d3d11va.c;h=85e7d25caebc059a9770da2ef4bb8fe90816d76d;hb=HEAD#l599
  Merges Libav commit cfc9e7c94eafa33e7f109099664ec4fb57ac5ca3.
  Signed-off-by: Luca Barbato <[email protected]>
* dxva: support DXGI_FORMAT_420_OPAQUE decoding (wm4, 2017-06-27, 3 files, -4/+25)
  Some devices (some phones, apparently) will support only this opaque
  format. Of course this won't work with CLI, because copying data
  directly is not supported.
  Automatic frame allocation (setting AVCodecContext.hw_device_ctx) does
  not support this mode, even if it's the only supported mode. But since
  opaque surfaces are generally less useful, that's probably ok.
  Merges Libav commit 5030e3856c2126fb829edb828f5aae011d178eb4.
  Signed-off-by: Luca Barbato <[email protected]>
* hwcontext_d3d11va: allocate staging texture lazily (wm4, 2017-06-27, 1 file, -12/+34)
  Makes dealing with formats that can not be used for staging textures
  easier (DXGI_FORMAT_420_OPAQUE). It also saves memory if the staging
  texture is never needed, so this is a good thing.
  Merges Libav commit 98d73e4174333b37d961b79e1182be5a02156c02.
  Signed-off-by: Luca Barbato <[email protected]>
* hwcontext_d3d11va: fix crash on frames_init failure (wm4, 2017-06-27, 1 file, -0/+2)
  It appears in this case, frames_uninit is called twice (once by
  av_hwframe_ctx_init(), and again by unreffing the frames ctx ref).
  Merges Libav commit 086321c612185469ebb85a1887527c8915b488b7.
  Signed-off-by: Luca Barbato <[email protected]>
* dxva: fix some warnings (wm4, 2017-06-27, 1 file, -9/+7)
  Some existed since forever, some are new.
  The cast in get_surface() is silly, but unless we change the av_log
  function signature, or all callers of ff_dxva2_get_surface_index(), it's
  needed to remove the const warning.
  Merges Libav commit 752ddb45569ffe278393cd853b70f18ae017219e.
  Signed-off-by: Luca Barbato <[email protected]>
* dxva: add declarative profile checks (wm4, 2017-06-27, 1 file, -20/+44)
  Make supported codec profiles part of each dxva_modes entry. Every DXVA2
  mode is representative for a codec with a subset of supported profiles,
  so reflecting that in dxva_modes seems appropriate.
  In practice, this will more strictly check MPEG2 profiles, will stop
  relying on the surface format checks for selecting the correct HEVC
  profile, and remove the verbose messages for mismatching H264/HEVC
  profiles. Instead of the latter, it will now print the more nebulous
  "No decoder device for codec found" verbose message.
  This also respects AV_HWACCEL_FLAG_ALLOW_PROFILE_MISMATCH.
  Move the Main10 HEVC entry before the normal one to make this work
  better.
  Originally inspired by VLC's code.
  Merges Libav commit 70e5e7c02232d1e51c362d5572c567a9e2a8dcc2.
  Signed-off-by: Luca Barbato <[email protected]>
* d3d11va: Link directly to dxgi.dll and d3d11.dll functions if LoadLibrary is unavailable (Martin Storsjö, 2017-06-27, 2 files, -33/+51)
  When targeting the UWP API subset, the LoadLibrary function is not
  available (and the fallback, LoadPackagedLibrary, can't be used to load
  system DLLs). In these cases, link directly to the functions in the DLLs
  instead of trying to load them dynamically at runtime.
  Merges Libav commit fd1ffa1f10e940165035ccb79d4a6523da196062.
  Signed-off-by: Martin Storsjö <[email protected]>
* dxva: add support for new dxva2 and d3d11 hwaccel APIs (wm4, 2017-06-27, 22 files, -476/+885)
  This also adds support to avconv (which is trivial due to the new
  hwaccel API being generic enough).
  The new decoder setup code in dxva2.c is significantly based on work by
  Steve Lhomme <[email protected]>, but with heavy changes/rewrites.
  Merges Libav commit f9e7a2f95a7194a8736cc1416a03a1a0155a3e9f.
  Also adds untested VP9 support.
  The check for DXVA2 COBJs is removed. Just update your MinGW to
  something newer than a 5 year old release.
  Signed-off-by: Diego Biurrun <[email protected]>
* dxva: move d3d11 locking/unlocking to functions (wm4, 2017-06-27, 1 file, -18/+28)
  I want to make it non-mandatory to set a mutex in the D3D11 device
  context, and replacing it with user callbacks seems like the best
  solution. This is preparation for it. Also makes the code slightly more
  readable.
  Merges Libav commit 831cfe10b40414915fe7b6088158421fe02e2b2d.
  Signed-off-by: Diego Biurrun <[email protected]>
* dxva: preparations for new hwaccel API (wm4, 2017-06-27, 7 files, -48/+60)
  The actual hwaccel code will need to access an internal context instead
  of avctx->hwaccel_context, so add a new DXVA_CONTEXT() macro, that will
  dispatch between the "old" external and the new internal context.
  Also, the new API requires a new D3D11 pixfmt, so all places which check
  for the pixfmt need to be adjusted. Introduce a ff_dxva2_is_d3d11()
  function, which does the check.
  Merges Libav commit 4dec101acc393fbfe9a8ce0237b9efbae3f20139.
  Adds changes to vp9 over the Libav patch.
  Signed-off-by: Diego Biurrun <[email protected]>
* lavc: set avctx->hwaccel before init (wm4, 2017-06-27, 1 file, -2/+2)
  So a hwaccel can access avctx->hwaccel in init for whatever reason. This
  is for the new d3d hwaccel API. We could create separate entrypoints for
  each of the 3 hwaccel types (dxva2, d3d11va, new d3d11va), but this
  seems nicer.
  Merges Libav commit bd747b9226414007f0207fa201976af7217e3b77.
  Signed-off-by: Diego Biurrun <[email protected]>
* lavu: add new D3D11 pixfmt and hwcontext (wm4, 2017-06-27, 10 files, -2/+680)
  To be used with the new d3d11 hwaccel decode API.
  With the new hwaccel API, we don't want surfaces to depend on the
  decoder (other than the required dimension and format). The old D3D11VA
  pixfmt uses ID3D11VideoDecoderOutputView pointers, which include the
  decoder configuration, and thus is incompatible with the new hwaccel
  API. This patch introduces AV_PIX_FMT_D3D11, which uses ID3D11Texture2D
  and an index. It's simpler and compatible with the new hwaccel API.
  The introduced hwcontext supports only the new pixfmt.
  Frame upload code untested.
  Significantly based on work by Steve Lhomme <[email protected]>, but with
  heavy changes/rewrites.
  Merges Libav commit fff90422d181744cd75dbf011687ee7095f02875.
  Signed-off-by: Diego Biurrun <[email protected]>
* x86inc: don't use read-only data sections on COFF targets (James Almer, 2017-06-27, 1 file, -0/+2)
  Yasm:
  src/libavfilter/x86/af_volume.asm:24: warning: Standard COFF does not support read-only data sections
  src/libavfilter/x86/af_volume.asm:24: warning: Unrecognized qualifier `align'
  Nasm:
  src/libavfilter/x86/af_volume.asm:24: error: standard COFF does not support section alignment specification
  src/libavutil/x86/x86inc.asm:92: ... from macro `SECTION_RODATA' defined here
  Tested-by: Clément Bœsch <[email protected]>
  Signed-off-by: James Almer <[email protected]>
* avcodec/interplayvideo: properly check if there is enough bytes left (Paul B Mahol, 2017-06-27, 1 file, -0/+10)
  Signed-off-by: Paul B Mahol <[email protected]>
* Interplay MVE: Changelog entry for changes (Hein-Pieter van Braam, 2017-06-27, 1 file, -0/+1)
  Signed-off-by: Hein-Pieter van Braam <[email protected]>
* Interplay MVE: Implement frame format 0x10 (Hein-Pieter van Braam, 2017-06-27, 2 files, -23/+227)
  This implements the 0x10 frame format for Interplay MVE movies. The
  format is a variation on the 0x06 format with some changes. In addition
  to the decoding map there's also a skip map. This skip map is used to
  determine what 8x8 blocks can change in a particular frame.
  This format expects to be able to copy an 8x8 block from before the last
  time it was changed. This can be an arbitrary time in the past. To
  implement this, the decoder allocates two additional AVFrames where the
  actual decoding happens. At the end of frame decoding, changed blocks
  are copied to a finished frame based on the skip map.
  The skip map's encoding is a little convoluted, I'll refer to the code
  for details.
  Values in the decoding map are the same as in format 0x06.
  Signed-off-by: Hein-Pieter van Braam <[email protected]>
* Interplay MVE: Implement frame format 0x06 (Hein-Pieter van Braam, 2017-06-27, 2 files, -14/+126)
  This implements the 0x06 frame format for Interplay MVE movies. The
  format is relatively simple. The video data consists of two parts:
    16 bits per 8x8 block movement data
    a number of 8x8 blocks of pixel data
  For each 8x8 block of pixel data the movement data is consulted. There
  are 3 possible meanings of the movement data:
    * zero : copy the 8x8 block from the pixel data
    * negative : copy the 8x8 block from the previous frame from an offset
      determined by the actual value of the entry -0xC000.
    * positive : copy the 8x8 block from the current frame from an offset
      determined by the actual value of the entry -0x4000
  Decoding happens in two passes: in the first pass only new pixel data is
  copied; during the second pass data is copied from the previous and
  current frames.
  The codec expects the frame being decoded to still hold the data from
  2 frames ago when decoding starts.
  Signed-off-by: Hein-Pieter van Braam <[email protected]>
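
The three meanings of a movement-data entry in the 0x06 format, as described in the commit message above, can be sketched in C. This is not the decoder's code: the type and function names are hypothetical, and the sketch assumes the bias (0xC000 or 0x4000) is subtracted from the entry read as an unsigned 16-bit value. How the resulting offset maps to block coordinates is not stated in the message and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; where an 8x8 block's pixels come from. */
enum block_source {
    SRC_PIXEL_DATA,     /* entry is zero */
    SRC_PREVIOUS_FRAME, /* entry is negative as a signed 16-bit value */
    SRC_CURRENT_FRAME   /* entry is positive as a signed 16-bit value */
};

struct block_move {
    enum block_source source;
    int offset; /* entry with its bias removed; 0 for pixel data */
};

/* Classify one 16-bit movement entry, given as its raw unsigned value. */
static void classify_move(uint16_t raw, struct block_move *out)
{
    assert(out != NULL);
    if (raw == 0) {
        out->source = SRC_PIXEL_DATA;
        out->offset = 0;
    } else if (raw & 0x8000) { /* sign bit set: negative entry */
        out->source = SRC_PREVIOUS_FRAME;
        out->offset = (int)raw - 0xC000;
    } else {
        out->source = SRC_CURRENT_FRAME;
        out->offset = (int)raw - 0x4000;
    }
}
```

In a two-pass decoder along the lines the message describes, the first pass would handle only `SRC_PIXEL_DATA` entries and the second pass the two copy cases.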
* Interplay MVE: Refactor IP packet format (Hein-Pieter van Braam, 2017-06-27, 2 files, -28/+51)
  Interplay MVE can contain up to three different frame formats. They
  require different streams of information to render a frame. This patch
  changes the IP packet format to prepare for the extra frame formats.
  Signed-off-by: Hein-Pieter van Braam <[email protected]>
* Interplay MVE: Implement MVE SEND_BUFFER operation (Hein-Pieter van Braam, 2017-06-27, 2 files, -11/+20)
  Interplay MVE movies have a SEND_BUFFER operation. Only after this
  command does the current decoding buffer get displayed. This is required
  for the other frame formats. They are fixed-size and can't always encode
  a full frame's worth of pixel data.
  This code prevents half-finished frames from being emitted.
  Signed-off-by: Hein-Pieter van Braam <[email protected]>
* avcodec/proresenc_kostya: add 4444XQ profile (Paul B Mahol, 2017-06-27, 2 files, -3/+16)
  Signed-off-by: Paul B Mahol <[email protected]>
* avcodec/ffv1enc: compute the max number of slices and limit by that (Michael Niedermayer, 2017-06-27, 1 file, -3/+4)
  Signed-off-by: Michael Niedermayer <[email protected]>