path: root/libavcodec/arm
Commit message (Author, Age; Files, Lines)
* all: fix typos found by codespell (Timo Rothenpieler, 17 hours; 2 files, -2/+2)
|
* all: fix whitespace/new-line issues (Timo Rothenpieler, 17 hours; 1 file, -1/+0)
|
* configure: Factor mpegvideoencdsp out of mpegvideoenc (Andreas Rheinhardt, 2025-06-21; 1 file, -2/+2)
| This will allow relaxing the dependency on mpegvideoenc for several codecs.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/pixblockdsp: Improve 8 vs 16 bit check (Andreas Rheinhardt, 2025-05-31; 1 file, -2/+0)
| Before this commit, the input in get_pixels and get_pixels_unaligned has been treated inconsistently:
| - The generic code treated 9, 10, 12 and 14 bits as 16bit input (these bits correspond to what FFmpeg's dsputils supported), everything with <= 8 bits as 8 bit and everything else as 8 bit when used via AVDCT (which exposes these functions and purports to support up to 14 bits).
| - AARCH64, ARM, PPC, RISC-V and x86 ignore this AVDCT special case.
| - RISC-V also ignored the restriction to 9, 10, 12 and 14 for its 16bit check and treated everything > 8 bits as 16bit.
| - The mmi MIPS code treats everything as 8 bit when used via AVDCT (this is certainly broken); otherwise it checks for <= 8 bits. The msa MIPS code behaves like the generic code.
|
| This commit changes this to treat 9..16 bits as 16 bit input, everything else as 8 bit (the former because it makes sense, the latter to preserve the behaviour for external users*).
|
| *: The only internal user of AVDCT (the spp filter) always uses 8, 9 or 10 bits.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
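The rule stated in the pixblockdsp commit message above (9..16 bits are treated as 16 bit input, everything else as 8 bit) can be sketched as a single predicate. This is a minimal illustration only: the helper name `uses_16bit_input` and its parameter are hypothetical and are not taken from FFmpeg's source.

```c
#include <stdbool.h>

/* Hypothetical helper, not FFmpeg's actual code: classifies the input
 * following the rule in the commit message. Depths of 9..16 bits count
 * as 16 bit input; every other value falls back to 8 bit. */
static bool uses_16bit_input(int bits_per_raw_sample)
{
    return bits_per_raw_sample > 8 && bits_per_raw_sample <= 16;
}
```

Under this sketch, out-of-range values such as 0 or 17 take the 8 bit path, which matches the stated goal of preserving the previous behaviour for external users.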
* avcodec/mpegvideo: Only keep the actually used unquantize funcs (Andreas Rheinhardt, 2025-03-26; 3 files, -5/+5)
| For all encoders and all decoders except MPEG-4 the unquantize functions to use don't change at all and therefore needn't be kept in the context. So discard them after setting them; for MPEG-4, the functions get assigned on a per-frame basis.
|
| Decoders not using any unquantize functions (H.261, MPEG-1/2) as well as decoders that only call ff_mpv_reconstruct_mb() through error resilience (RV30/40, the VC-1 family) don't have the remaining pointers set at all.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/mpegvideo: Move unquantize functions into a file of their own (Andreas Rheinhardt, 2025-03-26; 1 file, -0/+1)
| This is in preparation for only keeping the actually used unquantize functions in MpegEncContext.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/mpegvideoenc: Add MPVEncContext (Andreas Rheinhardt, 2025-03-26; 1 file, -6/+5)
| Many of the fields of MpegEncContext (which is also used by decoders) are actually only used by encoders. Therefore this commit adds a new encoder-only structure and moves all of the encoder-only fields to it except for those which require more explicit synchronisation between the main slice context and the other slice contexts. This synchronisation is currently mainly provided by ff_update_thread_context() which simply copies most of the main slice context over the other slice contexts. Fields which are moved to the new MPVEncContext no longer participate in this (which is desired, because it is horrible and for the fields b) below wasteful) which means that some fields can only be moved when explicit synchronisation code is added in later commits.
|
| More explicitly, this commit moves the following fields:
| a) Fields not copied by ff_update_duplicate_context(): dct_error_sum and dct_count; the former does not need synchronisation, the latter is synchronised in merge_context_after_encode().
| b) Fields which do not change after initialisation (these fields could also be put into MPVMainEncContext at the cost of an indirection to access them): lambda_table, adaptive_quant, {luma,chroma}_elim_threshold, new_pic, fdsp, mpvencdsp, pdsp, {p,b_forw,b_back,b_bidir_forw,b_bidir_back,b_direct,b_field}_mv_table, [pb]_field_select_table, mb_{type,var,mean}, mc_mb_var, {min,max}_qcoeff, {inter,intra}_quant_bias, ac_esc_length, the *_vlc_length fields, the q_{intra,inter,chroma_intra}_matrix{,16}, dct_offset, mb_info, mjpeg_ctx, rtp_mode, rtp_payload_size, encode_mb, all function pointers, mpv_flags, quantizer_noise_shaping, frame_reconstruction_bitfield, error_rate and intra_penalty.
| c) Fields which are already (re)set explicitly: The PutBitContexts pb, tex_pb, pb2; dquant, skipdct, encoding_error, the statistics fields {mv,i_tex,p_tex,misc,last}_bits and i_count; last_mv_dir, esc_pos (reset when writing the header).
| d) Fields which are only used by encoders not supporting slice threading for which synchronisation doesn't matter: esc3_level_length and the remaining mb_info fields.
| e) coded_score: This field is only really used when FF_MPV_FLAG_CBP_RD is set (which implies trellis) and even then it is only used for non-intra blocks. For these blocks dct_quantize_trellis_c() either sets coded_score[n] or returns a last_non_zero value of -1 in which case coded_score will be reset in encode_mb_internal(). Therefore no old values are ever used.
|
| The MotionEstContext has not been moved yet.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* arm: vp9mc: Load only 12 pixels in the 4 pixel wide horizontal filter (Janne Grunau, 2025-01-03; 1 file, -3/+10)
| This reduces the amount the horizontal filters read beyond the filter width to a consistent 1 pixel. The data is not used so this is usually not noticeable. It becomes a problem when the application allocates frame buffers only for the aligned picture size and the end of it is at a page boundary. This happens for picture sizes which are a multiple of the page size like 1280x640. The frame buffer allocation is most likely done via mmap + MAP_ANONYMOUS, so the start and end of the buffer are page aligned and the previous and next page are not necessarily mapped.
|
| This mirrors the aarch64 change.
|
| Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
| Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
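The page-boundary condition described in the vp9mc commit above can be checked with simple arithmetic. The sketch below assumes an 8-bit plane whose stride equals its width and a 4096-byte page; none of these assumptions come from the commit itself, and the helper name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if a plane of width * height bytes that
 * starts on a page boundary also ends exactly on one, so that any read
 * past the last pixel touches the next (possibly unmapped) page. */
static int plane_ends_on_page_boundary(size_t width, size_t height,
                                       size_t page_size)
{
    return (width * height) % page_size == 0;
}
```

With these assumptions, 1280x640 gives 819200 bytes, which is exactly 200 pages of 4096 bytes, so the plane ends on a page boundary. A 1920x1080 plane (2073600 bytes) does not.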
* avcodec/mpegvideoencdsp: convert stride parameters from int to ptrdiff_t (Ramiro Polla, 2024-09-01; 1 file, -2/+2)
|
* libavcodec/arm/mlpdsp_armv5te: fix label format to work with binutils 2.43 (Ross Burton, 2024-08-13; 1 file, -3/+3)
| binutils 2.43 has stricter validation for labels[1] and results in errors when building ffmpeg for armv5:
|
|   src/libavcodec/arm/mlpdsp_armv5te.S:232: Error: junk at end of line, first unrecognized character is `0'
|
| Remove the leading zero in the "01" label to resolve this error.
|
| [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=226749d5a6ff0d5c607d6428d6c81e1e7e7a994b
|
| Signed-off-by: Ross Burton <ross.burton@arm.com>
| Signed-off-by: Martin Storsjö <martin@martin.st>
* lavc/hevc*: move to hevc/ subdir (Anton Khirnov, 2024-06-04; 3 files, -3/+3)
|
* avcodec/ac3: Remove unused fixed-point ARMv7 DSP (Geoff Hill, 2024-04-18; 1 file, -63/+0)
| This diff removes 4 unused ARMv7 NEON fixed-point DSP functions.
|
| The functions were originally moved here by 4958f35a2 (Dec 2013). After 9e05421db (Jan 2021), as part of the refactor of the AC3 DSP to consistently use 32-bit sample format in the encoder, these functions were removed from the DSP function table, but the ARMv7 implementations were kept.
|
| Signed-off-by: Geoff Hill <geoff@geoffhill.org>
* avcodec/arm/vp8: Don't discard const (Andreas Rheinhardt, 2024-03-31; 1 file, -1/+2)
| Forgotten in 25e1986e6816fabf7f6065d090994f8ad050e540.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/arm/mpegvideo_arm: Use static_assert to check offsets (Andreas Rheinhardt, 2024-03-31; 1 file, -8/+12)
| Also move AV_CHECK_OFFSET to its only user, namely lavc/arm/mpegvideo_arm.c and rename it to CHECK_OFFSET.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
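A compile-time offset check of the kind this commit title describes could look roughly like the following. Only the macro name CHECK_OFFSET comes from the commit message; its argument list, the struct, the field names and the offset constants here are all hypothetical and do not reproduce the real MpegEncContext layout or the values the assembly uses.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical context standing in for a struct that hand-written
 * assembly accesses through hard-coded byte offsets. */
typedef struct ExampleContext {
    int32_t y_dc_scale;
    int32_t c_dc_scale;
    int32_t ac_pred;
} ExampleContext;

/* Hypothetical offsets that the assembly side would hard-code. */
#define EXAMPLE_Y_DC_SCALE 0
#define EXAMPLE_C_DC_SCALE 4
#define EXAMPLE_AC_PRED    8

/* Assumed shape of the macro: fail the build if the C layout and the
 * assembly constant ever disagree. */
#define CHECK_OFFSET(s, m, o) \
    static_assert(offsetof(s, m) == (o), \
                  "offset of " #m " in " #s " does not match the assembly constant")

CHECK_OFFSET(ExampleContext, y_dc_scale, EXAMPLE_Y_DC_SCALE);
CHECK_OFFSET(ExampleContext, c_dc_scale, EXAMPLE_C_DC_SCALE);
CHECK_OFFSET(ExampleContext, ac_pred,    EXAMPLE_AC_PRED);
```

The benefit of static_assert over a runtime check is that a mismatch stops compilation with a readable message instead of surfacing later as corrupted state in the assembly routines.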
* avcodec/ac3dsp: make len a size_t in float_to_fixed24 (James Almer, 2023-11-22; 1 file, -1/+1)
| Should simplify asm implementations, and prevent UB on at least win64.
|
| Signed-off-by: James Almer <jamrial@gmail.com>
* avcodec: Remove DCT, FFT, MDCT and RDFT (Andreas Rheinhardt, 2023-10-01; 9 files, -1812/+0)
| They were replaced by TX from libavutil; the tremendous work to get to this point (both creating TX as well as porting the users of the components removed in this commit) was completely performed by Lynne alone.
|
| Removing the subsystems from configure may break some command lines, because the --disable-fft etc. options are no longer recognized.
|
| Co-authored-by: Lynne <dev@lynne.ee>
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* Revert "avcodec/arm/hevc: remove duplicate mov of deblock neon" (Martin Storsjö, 2023-03-23; 1 file, -0/+2)
| This reverts commit 9413bdc381112711a7beb6d6b8d8fc4b4ff6ca01.
|
| That commit broke the fate HEVC tests - unfortunately I only tested checkasm for that patch, and that function is still lacking checkasm coverage.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/arm/hevc: remove duplicate mov of deblock neon (xufuji456, 2023-03-22; 1 file, -2/+0)
| Signed-off-by: Martin Storsjö <martin@martin.st>
* codec/arm/hevcdsp_idct_neon: remove duplicate mov (xufuji456, 2023-03-07; 1 file, -8/+2)
| Signed-off-by: Martin Storsjö <martin@martin.st>
* libavcodec/hevc: remove duplicate semicolon in hevcdsp_init_neon (xufuji456, 2023-02-28; 1 file, -2/+2)
| Signed-off-by: Martin Storsjö <martin@martin.st>
* libavcodec/hevc: reuse scale_store on idct32x32_neon (xufuji456, 2023-02-28; 1 file, -16/+2)
| Signed-off-by: Martin Storsjö <martin@martin.st>
* dca_core: convert to lavu/tx (Lynne, 2022-11-06; 3 files, -8/+14)
| Thanks to Martin Storsjö <martin@martin.st> for fixing and testing the arm32 and aarch64 changes.
* avcodec/ac3dsp: Remove unused parameter (Andreas Rheinhardt, 2022-09-29; 1 file, -1/+1)
| Forgotten in fd98594a8831ce037a495b6d7e090bd8f81e83a1.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* arm: vc1dsp: Canonicalize the syntax for aligned NEON loads/stores (Martin Storsjö, 2022-09-29; 1 file, -20/+20)
| This should hopefully fix building with older toolchains, and with that the fate failures on http://fate.ffmpeg.org/history.cgi?slot=armel5tej-qemu-debian-gcc4.4.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/fmtconvert: Remove unused AVCodecContext parameter (Andreas Rheinhardt, 2022-09-21; 1 file, -2/+1)
| Unused since d74a8cb7e42f703be5796eeb485f06af710ae8ca.
|
| Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* lavc/vorbisdsp: use ptrdiff_t rather than intptr_t (Rémi Denis-Courmont, 2022-09-19; 1 file, -1/+1)
| ... for a difference between pointers.
* avcodec/arm/sbcenc: avoid callee preserved vfp registers (James Cowgill, 2022-09-13; 1 file, -110/+110)
| When compiling FFmpeg with GCC-9, some very random segfaults were observed in code which had previously called down into the SBC encoder NEON assembly routines. This was caused by these functions clobbering some of the vfp callee saved registers (d8 - d15 aka q4 - q7). GCC was using these registers to save local variables, but after these functions returned, they would contain garbage.
|
| Fix by reallocating the registers in the two affected functions in the following way:
|   ff_sbc_analyze_4_neon: q2-q5 => q8-q11, then q1-q4 => q8-q11
|   ff_sbc_analyze_8_neon: q2-q9 => q8-q15
|
| The reason for using these replacements is to keep closely related sets of registers consecutively numbered which hopefully makes the code easier to follow. Since this commit only reallocates registers, it should have no performance impact.
|
| Signed-off-by: James Cowgill <jcowgill@debian.org>
| Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/vp8dsp: Constify src in vp8_mc_func (Andreas Rheinhardt, 2022-09-11; 2 files, -4/+4)
| Reviewed-by: Peter Ross <pross@xvid.org>
| Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* arm: rv40dsp: Change stride parameters to ptrdiff_t (Martin Storsjö, 2022-09-02; 1 file, -4/+4)
| These were missed when h264_chroma_mc_func was changed in e4a94d8b36c48d95a7d412c40d7b558422ff659c.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: vc1sdp: Change stride parameters to ptrdiff_t (Martin Storsjö, 2022-09-02; 1 file, -6/+6)
| This was missed in db54426975e124e98e5130ad01316cb7afd60630.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* arm/fft: disable NEON optimizations for 131072pt transforms (Lynne, 2022-08-29; 1 file, -2/+4)
| This has been broken since the start, and it was only discovered when I started testing my replacement for the FFT. Disable it, since there's no point in fixing slower code that's about to be removed anyway. The vfp version is not affected.
* avcodec/flacdsp: Split encoder-only parts into a ctx of its own (Andreas Rheinhardt, 2022-08-05; 2 files, -6/+3)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/flacdsp: Remove unused function parameter (Andreas Rheinhardt, 2022-08-05; 1 file, -2/+1)
| Forgotten in e609cfd697f8eed7325591f767585041719807d1.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/h264chroma: Constify src in h264_chroma_mc_func (Andreas Rheinhardt, 2022-08-05; 4 files, -16/+16)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/hevcdsp: Constify src pointers (Andreas Rheinhardt, 2022-08-05; 1 file, -39/+39)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/mpegvideoencdsp: Allow pointers to const where possible (Andreas Rheinhardt, 2022-07-31; 1 file, -2/+2)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/me_cmp: Constify me_cmp_func buffer parameters (Andreas Rheinhardt, 2022-07-31; 1 file, -5/+5)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/videodsp: Constify buf in VideoDSPContext.prefetch (Andreas Rheinhardt, 2022-07-31; 1 file, -1/+1)
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/vp56: Move VP5-9 range coder functions to a header of their own (Andreas Rheinhardt, 2022-07-28; 3 files, -12/+12)
| Also use a vpx prefix for them.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec/vc1: Arm 32-bit NEON unescape fast path (Ben Avison, 2022-04-01; 2 files, -0/+179)
| checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.
|
|   vc1dsp.vc1_unescape_buffer_c: 918624.7
|   vc1dsp.vc1_unescape_buffer_neon: 142958.0
|
| Signed-off-by: Ben Avison <bavison@riscosopen.org>
| Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths (Ben Avison, 2022-04-01; 2 files, -0/+657)
| checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C version can still outperform the NEON version in specific cases. The balance between different code paths is stream-dependent, but in practice the best case happens about 5% of the time, the worst case happens about 40% of the time, and the complexity of the remaining cases fall somewhere in between. Therefore, taking the average of the best and worst case timings is probably a conservative estimate of the degree by which the NEON code improves performance.
|
|   vc1dsp.vc1_h_loop_filter4_bestcase_c: 19.0
|   vc1dsp.vc1_h_loop_filter4_bestcase_neon: 48.5
|   vc1dsp.vc1_h_loop_filter4_worstcase_c: 144.7
|   vc1dsp.vc1_h_loop_filter4_worstcase_neon: 76.2
|   vc1dsp.vc1_h_loop_filter8_bestcase_c: 41.0
|   vc1dsp.vc1_h_loop_filter8_bestcase_neon: 75.0
|   vc1dsp.vc1_h_loop_filter8_worstcase_c: 294.0
|   vc1dsp.vc1_h_loop_filter8_worstcase_neon: 102.7
|   vc1dsp.vc1_h_loop_filter16_bestcase_c: 54.7
|   vc1dsp.vc1_h_loop_filter16_bestcase_neon: 130.0
|   vc1dsp.vc1_h_loop_filter16_worstcase_c: 569.7
|   vc1dsp.vc1_h_loop_filter16_worstcase_neon: 186.7
|   vc1dsp.vc1_v_loop_filter4_bestcase_c: 20.2
|   vc1dsp.vc1_v_loop_filter4_bestcase_neon: 47.2
|   vc1dsp.vc1_v_loop_filter4_worstcase_c: 164.2
|   vc1dsp.vc1_v_loop_filter4_worstcase_neon: 68.5
|   vc1dsp.vc1_v_loop_filter8_bestcase_c: 43.5
|   vc1dsp.vc1_v_loop_filter8_bestcase_neon: 55.2
|   vc1dsp.vc1_v_loop_filter8_worstcase_c: 316.2
|   vc1dsp.vc1_v_loop_filter8_worstcase_neon: 72.7
|   vc1dsp.vc1_v_loop_filter16_bestcase_c: 62.2
|   vc1dsp.vc1_v_loop_filter16_bestcase_neon: 103.7
|   vc1dsp.vc1_v_loop_filter16_worstcase_c: 646.5
|   vc1dsp.vc1_v_loop_filter16_worstcase_neon: 110.7
|
| Signed-off-by: Ben Avison <bavison@riscosopen.org>
| Signed-off-by: Martin Storsjö <martin@martin.st>
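The estimate suggested in the deblocking commit message, averaging the best and worst case timings for each implementation, can be written out as a small calculation. The function below is an illustrative sketch with a hypothetical name; it is not part of checkasm or FFmpeg.

```c
/* Hypothetical helper: ratio of the averaged C timing to the averaged
 * NEON timing, using the best/worst case figures from the benchmark
 * list. A result above 1.0 means the NEON code is faster on average. */
static double average_case_speedup(double c_best, double c_worst,
                                   double neon_best, double neon_worst)
{
    double c_avg    = (c_best + c_worst) / 2.0;
    double neon_avg = (neon_best + neon_worst) / 2.0;
    return c_avg / neon_avg;
}
```

For vc1_h_loop_filter4, for example, the inputs would be 19.0, 144.7, 48.5 and 76.2, which gives a ratio of roughly 1.3 even though the C code wins the best case outright.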
* configure: Use a separate config_components.h header for $ALL_COMPONENTS (Martin Storsjö, 2022-03-16; 2 files, -0/+3)
| This avoids unnecessary rebuilds of most source files if only the list of enabled components has changed, but not the other properties of the build, set in config.h.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* Revert "arm: hevc_qpel: Fix the assembly to work with non-multiple of 8 widths" (J. Dekker, 2022-01-04; 1 file, -9/+9)
| This reverts commit 2589060b92eeeb944c6e2b50e38412c0c5fabcf4 which was originally to fix the FATE test. The real cause of the test breakage was fixed in 22b7c37275c611b5417722d8941844028aed7f25.
|
| Signed-off-by: J. Dekker <jdek@itanimul.li>
* lavc/arm: dont assign hevc_qpel functions for non-multiple of 8 widths (J. Dekker, 2022-01-04; 1 file, -1/+2)
| The assembly is written assuming that the width is a multiple of 8. However, the real issue is that the functions were erroneously assigned to the 2, 4, 6 & 12 widths. This behaviour never broke the decoder as samples which trigger the functions for these widths have not been found in the wild.
|
| This relies on the mappings in ff_hevc_pel_weight[].
|
| Signed-off-by: J. Dekker <jdek@itanimul.li>
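The fix described in the hevc_qpel commit above amounts to assigning the optimized functions only for widths that are a multiple of 8. A minimal sketch of that idea follows; the width table, function types and names are hypothetical stand-ins chosen for illustration and do not reproduce ff_hevc_pel_weight[] or the real init code.

```c
#define EXAMPLE_NUM_WIDTHS 10

/* Hypothetical list of block widths, one per function table slot. */
static const int example_widths[EXAMPLE_NUM_WIDTHS] = {
    2, 4, 6, 8, 12, 16, 24, 32, 48, 64
};

typedef void (*example_qpel_fn)(void);

static void example_qpel_c(void)    { /* generic fallback */ }
static void example_qpel_neon(void) { /* assumes width % 8 == 0 */ }

/* Fill the table with the generic version, then override only those
 * slots whose width satisfies the assumption made by the assembly. */
static void example_init(example_qpel_fn table[EXAMPLE_NUM_WIDTHS])
{
    for (int i = 0; i < EXAMPLE_NUM_WIDTHS; i++) {
        table[i] = example_qpel_c;
        if (example_widths[i] % 8 == 0)
            table[i] = example_qpel_neon;
    }
}
```

In this sketch the slots for widths 2, 4, 6 and 12 keep the generic function, which mirrors the widths the commit message names as erroneously assigned.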
* arm/aarch64: Improve scheduling in the avg form of h264_qpel (Martin Storsjö, 2021-10-18; 1 file, -33/+24)
| Don't use the loaded registers directly, avoiding stalls on in-order cores. Use vrhadd.u8 with q registers where easily possible.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* arm: hevc_qpel: Fix the assembly to work with non-multiple of 8 widths (Martin Storsjö, 2021-08-25; 1 file, -9/+9)
| This unbreaks the fate-checkasm-hevc_pel test on arm targets.
|
| The assembly assumed that the width passed to the DSP functions is a multiple of 8, while the checkasm test used other widths too.
|
| This wasn't noticed before, because the hevc_pel checkasm tests (that were added in 9c513edb7999a35ddcc6e3a8d984a96c8fb492a3 in January) weren't run as part of fate until b492cacffd36ad4cb251ba1f13ac398318ee639a in August.
|
| As this hasn't been an issue in practice with actual full decoding tests, it seems like the actual decoder doesn't call these functions with such widths. Therefore, we could alternatively fix the test to only test things that the real decoder does, and this modification could be reverted.
|
| Signed-off-by: Martin Storsjö <martin@martin.st>
* avcodec/h264dsp, h264idct: Fix lengths of array parameters (Andreas Rheinhardt, 2021-08-08; 1 file, -4/+4)
| Fixes many -Warray-parameter warnings from GCC 11.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
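GCC 11's -Warray-parameter fires when a declaration and a later redeclaration or definition of the same function spell an array parameter with different bounds, for instance `block[64]` in a header and `block[16]` in the definition. The sketch below shows the consistent form; the function name and its 4x4 behaviour are hypothetical and are not taken from the h264dsp code.

```c
#include <stdint.h>

/* Declaration, as it might appear in a header. The bound on block
 * matches the definition below, so -Warray-parameter stays quiet. */
void example_idct_add4x4(uint8_t *dst, const int16_t block[16], int stride);

/* Definition: adds a 4x4 block of residuals to dst, clipping to 0..255. */
void example_idct_add4x4(uint8_t *dst, const int16_t block[16], int stride)
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            int v = dst[y * stride + x] + block[y * 4 + x];
            if (v < 0)
                v = 0;
            else if (v > 255)
                v = 255;
            dst[y * stride + x] = (uint8_t)v;
        }
    }
}
```

The array bound in a parameter does not change the generated code, since the parameter is still a pointer, but keeping the bounds identical documents the expected buffer size and lets the compiler flag inconsistent prototypes.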
* avcodec: Remove deprecated old encode/decode APIs (Andreas Rheinhardt, 2021-04-27; 1 file, -35/+0)
| Deprecated in commits 7fc329e2dd6226dfecaa4a1d7adf353bf2773726 and 31f6a4b4b83aca1d73f3cfc99ce2b39331970bf3.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
| Signed-off-by: James Almer <jamrial@gmail.com>
* Include attributes.h directly (Andreas Rheinhardt, 2021-04-19; 4 files, -0/+4)
| Some files currently rely on libavutil/cpu.h to include it for them; yet said file won't include it any more after the currently deprecated functions are removed, so include attributes.h directly.
|
| Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* avcodec: add missing FF_API_OLD_ENCDEC wrappers to xmm clobber functions (James Almer, 2021-02-26; 1 file, -15/+17)
| Signed-off-by: James Almer <jamrial@gmail.com>