| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
| |
This will allow to relax the dependency on mpegvideoenc
for several codecs.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this commit, the input in get_pixels and get_pixels_unaligned
has been treated inconsistenly:
- The generic code treated 9, 10, 12 and 14 bits as 16bit input
(these bits correspond to what FFmpeg's dsputils supported),
everything with <= 8 bits as 8 bit and everything else as 8 bit
when used via AVDCT (which exposes these functions and purports
to support up to 14 bits).
- AARCH64, ARM, PPC and RISC-V, x86 ignore this AVDCT special case.
- RISC-V also ignored the restriction to 9, 10, 12 and 14 for its
16bit check and treated everything > 8 bits as 16bit.
- The mmi MIPS code treats everything as 8 bit when used via
AVDCT (this is certainly broken); otherwise it checks for <= 8 bits.
The msa MIPS code behaves like the generic code.
This commit changes this to treat 9..16 bits as 16 bit input,
everything else as 8 bit (the former because it makes sense,
the latter to preserve the behaviour for external users*).
*: The only internal user of AVDCT (the spp filter) always
uses 8, 9 or 10 bits.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For all encoders and all decoders except MPEG-4 the unquantize
functions to use don't change at all and therefore needn't be
kept in the context. So discard them after setting them;
for MPEG-4, the functions get assigned on a per-frame basis.
Decoders not using any unquantize functions (H.261, MPEG-1/2)
as well as decoders that only call ff_mpv_reconstruct_mb()
through error resilience (RV30/40, the VC-1 family) don't have
the remaining pointers set at all.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
| |
This is in preparation for only keeping the actually used
unquantize functions in MpegEncContext.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many of the fields of MpegEncContext (which is also used by decoders)
are actually only used by encoders. Therefore this commit adds
a new encoder-only structure and moves all of the encoder-only
fields to it except for those which require more explicit
synchronisation between the main slice context and the other
slice contexts. This synchronisation is currently mainly provided
by ff_update_thread_context() which simply copies most of
the main slice context over the other slice contexts. Fields
which are moved to the new MPVEncContext no longer participate
in this (which is desired, because it is horrible and for the
fields b) below wasteful) which means that some fields can only
be moved when explicit synchronisation code is added in later commits.
More explicitly, this commit moves the following fields:
a) Fields not copied by ff_update_duplicate_context():
dct_error_sum and dct_count; the former does not need synchronisation,
the latter is synchronised in merge_context_after_encode().
b) Fields which do not change after initialisation (these fields
could also be put into MPVMainEncContext at the cost of
an indirection to access them): lambda_table, adaptive_quant,
{luma,chroma}_elim_threshold, new_pic, fdsp, mpvencdsp, pdsp,
{p,b_forw,b_back,b_bidir_forw,b_bidir_back,b_direct,b_field}_mv_table,
[pb]_field_select_table, mb_{type,var,mean}, mc_mb_var, {min,max}_qcoeff,
{inter,intra}_quant_bias, ac_esc_length, the *_vlc_length fields,
the q_{intra,inter,chroma_intra}_matrix{,16}, dct_offset, mb_info,
mjpeg_ctx, rtp_mode, rtp_payload_size, encode_mb, all function
pointers, mpv_flags, quantizer_noise_shaping,
frame_reconstruction_bitfield, error_rate and intra_penalty.
c) Fields which are already (re)set explicitly: The PutBitContexts
pb, tex_pb, pb2; dquant, skipdct, encoding_error, the statistics
fields {mv,i_tex,p_tex,misc,last}_bits and i_count; last_mv_dir,
esc_pos (reset when writing the header).
d) Fields which are only used by encoders not supporting slice
threading for which synchronisation doesn't matter: esc3_level_length
and the remaining mb_info fields.
e) coded_score: This field is only really used when FF_MPV_FLAG_CBP_RD
is set (which implies trellis) and even then it is only used for
non-intra blocks. For these blocks dct_quantize_trellis_c() either
sets coded_score[n] or returns a last_non_zero value of -1
in which case coded_score will be reset in encode_mb_internal().
Therefore no old values are ever used.
The MotionEstContext has not been moved yet.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces the amount the horizontal filters read beyond the filter
width to a consistent 1 pixel. The data is not used so this is usually
not noticeable. It becomes a problem when the application allocates
frame buffers only for the aligned picture size and the end of it is at
a page boundary. This happens for picture sizes which are a multiple of
the page size like 1280x640. The frame buffer allocation is based on
its most likely done via mmap + MAP_ANONYMOUS so start and end of the
buffer are page aligned and the previous and next page are not
necessarily mapped.
This mirrors the aarch64 change.
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
binutils 2.43 has stricter validation for labels[1] and results in errors
when building ffmpeg for armv5:
src/libavcodec/arm/mlpdsp_armv5te.S:232: Error: junk at end of line, first unrecognized character is `0'
Remove the leading zero in the "01" label to resolve this error.
[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c607d6428d6c81e1e7e7a994b
Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This diff removes 4 unused ARMv7 NEON fixed-point DSP functions.
The function were originally moved here by 4958f35a2 (Dec 2013).
After 9e05421db (Jan 2021), as part of the refactor of the AC3
DSP to consistently use 32-bit sample format in the encoder, these
functions were removed from the DSP function table, but the ARMv7
implementations were kept.
Signed-off-by: Geoff Hill <geoff@geoffhill.org>
|
|
|
|
|
|
| |
Forgotten in 25e1986e6816fabf7f6065d090994f8ad050e540.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
| |
Also move AV_CHECK_OFFSET to its only user, namely
lavc/arm/mpegvideo_arm.c and rename it to CHECK_OFFSET.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
| |
Should simplify asm implementations, and prevent UB on at least win64.
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They were replaced by TX from libavutil; the tremendous work
to get to this point (both creating TX as well as porting
the users of the components removed in this commit) was
completely performed by Lynne alone.
Removing the subsystems from configure may break some command lines,
because the --disable-fft etc. options are no longer recognized.
Co-authored-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9413bdc381112711a7beb6d6b8d8fc4b4ff6ca01.
That commit broke the fate HEVC tests - unfortunately I only
tested checkasm for that patch, and that function is still
lacking checkasm coverage.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
| |
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
| |
Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the
arm32 and aarch64 changes.
|
|
|
|
|
|
| |
Forgotten in fd98594a8831ce037a495b6d7e090bd8f81e83a1.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
| |
This hopefully should fix building with older toolchains, hopefully
fixing the fate failures on
http://fate.ffmpeg.org/history.cgi?slot=armel5tej-qemu-debian-gcc4.4.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
| |
Unused since d74a8cb7e42f703be5796eeb485f06af710ae8ca.
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
... for a difference between pointers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling FFmpeg with GCC-9, some very random segfaults were
observed in code which had previously called down into the SBC encoder
NEON assembly routines. This was caused by these functions clobbering
some of the vfp callee saved registers (d8 - d15 aka q4 - q7). GCC was
using these registers to save local variables, but after these
functions returned, they would contain garbage.
Fix by reallocating the registers in the two affected functions in
the following way:
ff_sbc_analyze_4_neon: q2-q5 => q8-q11, then q1-q4 => q8-q11
ff_sbc_analyze_8_neon: q2-q9 => q8-q15
The reason for using these replacements is to keep closely related
sets of registers consecutively numbered which hopefully makes the
code more easy to follow. Since this commit only reallocates
registers, it should have no performance impact.
Signed-off-by: James Cowgill <jcowgill@debian.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
| |
Reviewed-by: Peter Ross <pross@xvid.org>
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
| |
These were missed when h264_chroma_mc_func was changed in
e4a94d8b36c48d95a7d412c40d7b558422ff659c.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
| |
This was missed in db54426975e124e98e5130ad01316cb7afd60630.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
| |
This has been broken since the start, and it was only discovered
when I started testing my replacement for the FFT.
Disable it, since there's no point in fixing slower code that's about
to be removed anyway.
The vfp version is not affected.
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
| |
Forgotten in e609cfd697f8eed7325591f767585041719807d1.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
| |
Also use a vpx prefix for them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
|
|
| |
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
vc1dsp.vc1_unescape_buffer_c: 918624.7
vc1dsp.vc1_unescape_buffer_neon: 142958.0
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases fall somewhere in between.
Therefore, taking the average of the best and worst case timings is
probably a conservative estimate of the degree by which the NEON code
improves performance.
vc1dsp.vc1_h_loop_filter4_bestcase_c: 19.0
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 48.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 144.7
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 76.2
vc1dsp.vc1_h_loop_filter8_bestcase_c: 41.0
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 75.0
vc1dsp.vc1_h_loop_filter8_worstcase_c: 294.0
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 102.7
vc1dsp.vc1_h_loop_filter16_bestcase_c: 54.7
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 130.0
vc1dsp.vc1_h_loop_filter16_worstcase_c: 569.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 186.7
vc1dsp.vc1_v_loop_filter4_bestcase_c: 20.2
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 47.2
vc1dsp.vc1_v_loop_filter4_worstcase_c: 164.2
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 68.5
vc1dsp.vc1_v_loop_filter8_bestcase_c: 43.5
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 55.2
vc1dsp.vc1_v_loop_filter8_worstcase_c: 316.2
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 72.7
vc1dsp.vc1_v_loop_filter16_bestcase_c: 62.2
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 103.7
vc1dsp.vc1_v_loop_filter16_worstcase_c: 646.5
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 110.7
Signed-off-by: Ben Avison <bavison@riscosopen.org>
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
| |
This avoids unnecessary rebuilds of most source files if only the
list of enabled components has changed, but not the other properties
of the build, set in config.h.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
| |
This reverts commit 2589060b92eeeb944c6e2b50e38412c0c5fabcf4 which was
originally to fix the FATE test. The real cause of the test breakage was
fixed in 22b7c37275c611b5417722d8941844028aed7f25.
Signed-off-by: J. Dekker <jdek@itanimul.li>
|
|
|
|
|
|
|
|
|
|
|
| |
The assembly is written assuming that the width is a multiple of 8.
However the real issue is the functions were errorneously assigned to
the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as
samples which trigger the functions for these widths have not been found
in the wild. This relies on the mappings in ff_hevc_pel_weight[].
Signed-off-by: J. Dekker <jdek@itanimul.li>
|
|
|
|
|
|
|
| |
Don't use the loaded registers directly, avoiding stalls on in
order cores. Use vrhadd.u8 with q registers where easily possible.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This unbreaks the fate-checkasm-hevc_pel test on arm targets.
The assembly assumed that the width passed to the DSP functions is
a multiple of 8, while the checkasm test used other widths too.
This wasn't noticed before, because the hevc_pel checkasm tests
(that were added in 9c513edb7999a35ddcc6e3a8d984a96c8fb492a3 in
January) weren't run as part of fate until in
b492cacffd36ad4cb251ba1f13ac398318ee639a in August.
As this hasn't been an issue in practice with actual full decoding
tests, it seems like the actual decoder doesn't call these functions
with such widths. Therefore, we could alternatively fix the test
to only test things that the real decoder does, and this modification
could be reverted.
Signed-off-by: Martin Storsjö <martin@martin.st>
|
|
|
|
|
|
| |
Fixes many -Warray-parameter warnings from GCC 11.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
|
|
|
|
| |
Deprecated in commits 7fc329e2dd6226dfecaa4a1d7adf353bf2773726
and 31f6a4b4b83aca1d73f3cfc99ce2b39331970bf3.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: James Almer <jamrial@gmail.com>
|
|
|
|
|
|
|
|
| |
Some files currently rely on libavutil/cpu.h to include it for them;
yet said file won't use include it any more after the currently
deprecated functions are removed, so include attributes.h directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
|
|
|
|
| |
Signed-off-by: James Almer <jamrial@gmail.com>
|