path: root/tests/checkasm/checkasm.c
Commit message | Author | Age | Files | Lines
* tests/checkasm: add check for vf_colordetect | Niklas Haas | 14 days | 1 | -0/+3
* tests/checkasm: add test for vf_blackdetect | Niklas Haas | 2025-07-18 | 1 | -0/+3
* tests/checkasm: add scene_sad checkasm test | Niklas Haas | 2025-07-17 | 1 | -0/+3
* configure: Factor mpegvideoencdsp out of mpegvideoenc | Andreas Rheinhardt | 2025-06-21 | 1 | -1/+1
    This will allow relaxing the dependency on mpegvideoenc for several codecs.
    Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
* checkasm: add vvc_sao | Shaun Loo | 2025-05-14 | 1 | -0/+1
    This is a part of Google Summer of Code 2023
    AVX2:
    - vvc_sao.sao_band [OK]
    - vvc_sao.sao_edge [OK]
    checkasm: all 54 tests passed
    vvc_sao_band_8_8_c: 157.4 ( 1.00x)   vvc_sao_band_8_8_avx2: 30.7 ( 5.12x)
    vvc_sao_band_8_10_c: 119.4 ( 1.00x)   vvc_sao_band_8_10_avx2: 29.2 ( 4.09x)
    vvc_sao_band_8_12_c: 144.6 ( 1.00x)   vvc_sao_band_8_12_avx2: 30.0 ( 4.82x)
    vvc_sao_band_16_8_c: 446.5 ( 1.00x)   vvc_sao_band_16_8_avx2: 103.3 ( 4.32x)
    vvc_sao_band_16_10_c: 399.2 ( 1.00x)   vvc_sao_band_16_10_avx2: 64.3 ( 6.21x)
    vvc_sao_band_16_12_c: 472.9 ( 1.00x)   vvc_sao_band_16_12_avx2: 56.5 ( 8.37x)
    vvc_sao_band_32_8_c: 2430.9 ( 1.00x)   vvc_sao_band_32_8_avx2: 203.3 (11.96x)
    vvc_sao_band_32_10_c: 1405.7 ( 1.00x)   vvc_sao_band_32_10_avx2: 208.5 ( 6.74x)
    vvc_sao_band_32_12_c: 2054.3 ( 1.00x)   vvc_sao_band_32_12_avx2: 213.0 ( 9.64x)
    vvc_sao_band_48_8_c: 3835.4 ( 1.00x)   vvc_sao_band_48_8_avx2: 604.2 ( 6.35x)
    vvc_sao_band_48_10_c: 3624.6 ( 1.00x)   vvc_sao_band_48_10_avx2: 468.8 ( 7.73x)
    vvc_sao_band_48_12_c: 3752.4 ( 1.00x)   vvc_sao_band_48_12_avx2: 477.5 ( 7.86x)
    vvc_sao_band_64_8_c: 6061.1 ( 1.00x)   vvc_sao_band_64_8_avx2: 803.9 ( 7.54x)
    vvc_sao_band_64_10_c: 6142.5 ( 1.00x)   vvc_sao_band_64_10_avx2: 827.3 ( 7.43x)
    vvc_sao_band_64_12_c: 6106.6 ( 1.00x)   vvc_sao_band_64_12_avx2: 839.9 ( 7.27x)
    vvc_sao_band_80_8_c: 9478.0 ( 1.00x)   vvc_sao_band_80_8_avx2: 1516.7 ( 6.25x)
    vvc_sao_band_80_10_c: 10300.5 ( 1.00x)   vvc_sao_band_80_10_avx2: 1298.7 ( 7.93x)
    vvc_sao_band_80_12_c: 8941.1 ( 1.00x)   vvc_sao_band_80_12_avx2: 1315.3 ( 6.80x)
    vvc_sao_band_96_8_c: 13351.5 ( 1.00x)   vvc_sao_band_96_8_avx2: 1815.4 ( 7.35x)
    vvc_sao_band_96_10_c: 13197.5 ( 1.00x)   vvc_sao_band_96_10_avx2: 1872.4 ( 7.05x)
    vvc_sao_band_96_12_c: 11969.0 ( 1.00x)   vvc_sao_band_96_12_avx2: 1895.8 ( 6.31x)
    vvc_sao_band_112_8_c: 19936.9 ( 1.00x)   vvc_sao_band_112_8_avx2: 2802.3 ( 7.11x)
    vvc_sao_band_112_10_c: 19534.9 ( 1.00x)   vvc_sao_band_112_10_avx2: 2635.0 ( 7.41x)
    vvc_sao_band_112_12_c: 16520.6 ( 1.00x)   vvc_sao_band_112_12_avx2: 2591.8 ( 6.37x)
    vvc_sao_band_128_8_c: 25967.5 ( 1.00x)   vvc_sao_band_128_8_avx2: 3155.3 ( 8.23x)
    vvc_sao_band_128_10_c: 24002.6 ( 1.00x)   vvc_sao_band_128_10_avx2: 3374.6 ( 7.11x)
    vvc_sao_band_128_12_c: 20829.4 ( 1.00x)   vvc_sao_band_128_12_avx2: 3377.0 ( 6.17x)
    vvc_sao_edge_8_8_c: 174.6 ( 1.00x)   vvc_sao_edge_8_8_avx2: 37.0 ( 4.72x)
    vvc_sao_edge_8_10_c: 174.4 ( 1.00x)   vvc_sao_edge_8_10_avx2: 58.5 ( 2.98x)
    vvc_sao_edge_8_12_c: 171.1 ( 1.00x)   vvc_sao_edge_8_12_avx2: 58.5 ( 2.93x)
    vvc_sao_edge_16_8_c: 677.7 ( 1.00x)   vvc_sao_edge_16_8_avx2: 72.2 ( 9.39x)
    vvc_sao_edge_16_10_c: 724.8 ( 1.00x)   vvc_sao_edge_16_10_avx2: 106.4 ( 6.81x)
    vvc_sao_edge_16_12_c: 647.0 ( 1.00x)   vvc_sao_edge_16_12_avx2: 106.6 ( 6.07x)
    vvc_sao_edge_32_8_c: 3001.8 ( 1.00x)   vvc_sao_edge_32_8_avx2: 157.6 (19.04x)
    vvc_sao_edge_32_10_c: 3071.1 ( 1.00x)   vvc_sao_edge_32_10_avx2: 404.2 ( 7.60x)
    vvc_sao_edge_32_12_c: 2698.6 ( 1.00x)   vvc_sao_edge_32_12_avx2: 398.8 ( 6.77x)
    vvc_sao_edge_48_8_c: 6557.7 ( 1.00x)   vvc_sao_edge_48_8_avx2: 380.1 (17.25x)
    vvc_sao_edge_48_10_c: 6319.9 ( 1.00x)   vvc_sao_edge_48_10_avx2: 896.3 ( 7.05x)
    vvc_sao_edge_48_12_c: 6306.4 ( 1.00x)   vvc_sao_edge_48_12_avx2: 885.5 ( 7.12x)
    vvc_sao_edge_64_8_c: 11510.7 ( 1.00x)   vvc_sao_edge_64_8_avx2: 504.1 (22.84x)
    vvc_sao_edge_64_10_c: 10917.4 ( 1.00x)   vvc_sao_edge_64_10_avx2: 1608.3 ( 6.79x)
    vvc_sao_edge_64_12_c: 11499.8 ( 1.00x)   vvc_sao_edge_64_12_avx2: 1586.4 ( 7.25x)
    vvc_sao_edge_80_8_c: 18193.2 ( 1.00x)   vvc_sao_edge_80_8_avx2: 930.2 (19.56x)
    vvc_sao_edge_80_10_c: 17984.3 ( 1.00x)   vvc_sao_edge_80_10_avx2: 2420.9 ( 7.43x)
    vvc_sao_edge_80_12_c: 18289.4 ( 1.00x)   vvc_sao_edge_80_12_avx2: 2412.1 ( 7.58x)
    vvc_sao_edge_96_8_c: 26361.8 ( 1.00x)   vvc_sao_edge_96_8_avx2: 1118.4 (23.57x)
    vvc_sao_edge_96_10_c: 26162.2 ( 1.00x)   vvc_sao_edge_96_10_avx2: 3666.9 ( 7.13x)
    vvc_sao_edge_96_12_c: 25926.6 ( 1.00x)   vvc_sao_edge_96_12_avx2: 3433.9 ( 7.55x)
    vvc_sao_edge_112_8_c: 36562.9 ( 1.00x)   vvc_sao_edge_112_8_avx2: 1741.0 (21.00x)
    vvc_sao_edge_112_10_c: 38126.4 ( 1.00x)   vvc_sao_edge_112_10_avx2: 5153.3 ( 7.40x)
    vvc_sao_edge_112_12_c: 36345.7 ( 1.00x)   vvc_sao_edge_112_12_avx2: 4684.9 ( 7.76x)
    vvc_sao_edge_128_8_c: 46379.8 ( 1.00x)   vvc_sao_edge_128_8_avx2: 2012.4 (23.05x)
    vvc_sao_edge_128_10_c: 47029.5 ( 1.00x)   vvc_sao_edge_128_10_avx2: 6162.2 ( 7.63x)
    vvc_sao_edge_128_12_c: 49647.3 ( 1.00x)   vvc_sao_edge_128_12_avx2: 6127.1 ( 8.10x)
    Co-authored-by: Nuo Mi <nuomi2021@gmail.com>
* lavc/apv: AVX2 transquant for x86-64 | Mark Thompson | 2025-04-27 | 1 | -0/+3
    Typical checkasm result on Alder Lake:
    decode_transquant_8_c: 464.2 ( 1.00x)
    decode_transquant_8_avx2: 86.2 ( 5.38x)
    decode_transquant_10_c: 481.6 ( 1.00x)
    decode_transquant_10_avx2: 83.5 ( 5.77x)
* checkasm: add tests for AES | Rodger Combs | 2025-04-06 | 1 | -0/+1
    Signed-off-by: James Almer <jamrial@gmail.com>
* tests/checkasm/checkasm.c: Assert that aligned_w/h do not overflow | Michael Niedermayer | 2025-04-03 | 1 | -2/+5
    Reviewed-by: Martin Storsjö <martin@martin.st>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* checkasm: Implement helpers for defining and checking padded rects | Martin Storsjö | 2025-04-01 | 1 | -19/+70
    This backports similar functionality from dav1d, from commits 35d1d011fda4a92bcaf42d30ed137583b27d7f6d and d130da9c315d5a1d3968d278bbee2238ad9051e7.
    This allows detecting writes out of bounds, on all 4 sides of the intended destination rectangle.
    The bounds checking also can optionally allow small overwrites (up to a specified alignment), while still checking for larger overwrites past the intended allowed region.
    Signed-off-by: Martin Storsjö <martin@martin.st>
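The padded-rect idea described in this commit can be illustrated with a minimal sketch. It is not the code from checkasm or dav1d: the names PaddedRect, padded_rect_init and padded_rect_check, the PAD width and the CANARY byte are all hypothetical, and the sketch handles 8-bit pixels only. The buffer is filled with a canary value, the function under test writes into the inner rectangle, and afterwards every byte outside the rectangle (optionally widened to an alignment on the right) must still hold the canary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAD    8     /* guard band on every side, in pixels (assumed value) */
#define CANARY 0xAA  /* byte the guard band is filled with (assumed value) */
#define MAX_DIM 64   /* largest rectangle this sketch supports */

/* Hypothetical padded buffer: a w x h rectangle surrounded by PAD
 * canary pixels on all four sides. */
typedef struct PaddedRect {
    uint8_t buf[(MAX_DIM + 2 * PAD) * (MAX_DIM + 2 * PAD)];
    int w, h;
    ptrdiff_t stride;
} PaddedRect;

/* Fill the whole allocation with the canary and return a pointer to the
 * top-left pixel of the inner rectangle. */
static uint8_t *padded_rect_init(PaddedRect *r, int w, int h)
{
    assert(w > 0 && w <= MAX_DIM && h > 0 && h <= MAX_DIM);
    r->w = w;
    r->h = h;
    r->stride = w + 2 * PAD;
    memset(r->buf, CANARY, sizeof(r->buf));
    return r->buf + PAD * r->stride + PAD;
}

/* Return 1 if every byte outside the rectangle still holds the canary.
 * Writes past the right edge are tolerated up to the next multiple of
 * align_w; pass 1 to allow no overwrite at all. */
static int padded_rect_check(const PaddedRect *r, int align_w)
{
    assert(align_w > 0);
    int allowed_w = (r->w + align_w - 1) / align_w * align_w;

    for (int y = 0; y < r->h + 2 * PAD; y++) {
        for (int x = 0; x < r->stride; x++) {
            int inside_y = y >= PAD && y < PAD + r->h;
            int inside_x = x >= PAD && x < PAD + allowed_w;
            if (inside_y && inside_x)
                continue;
            if (r->buf[y * r->stride + x] != CANARY)
                return 0;
        }
    }
    return 1;
}
```

A canary check of this kind cannot see an out-of-bounds write that happens to store the canary value itself, so real test harnesses usually pick the fill pattern with that in mind.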
* checkasm: Make checkasm_fail_func return whether we should print verbosely | Martin Storsjö | 2025-04-01 | 1 | -4/+5
    This makes it easier to implement custom error printouts in tests.
    This is a port of dav1d's commit 13a7d78655f8747c2cd01e8a48d44dcc7f60a8e5 into ffmpeg's checkasm.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm: Print benchmarks of C-only functions | Martin Storsjö | 2024-12-11 | 1 | -25/+23
    This corresponds to commit 9278a14cf406f8edb5052c42b83750112bf5b515 in dav1d.
    Omitting the C-only functions doesn't speed up benchmarking anyway (as those have to be benchmarked before we know if we have any corresponding assembly functions), and being able to benchmark those functions without corresponding assembly can be valuable in a number of cases.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* tests/checkasm: Simplify logic for WASI signal handling | Zhao Zhili | 2024-12-06 | 1 | -8/+2
    Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
    Reviewed-by: Martin Storsjö <martin@martin.st>
* tests/checkasm: Add partial support for wasm | Zhao Zhili | 2024-12-04 | 1 | -0/+8
    WASI is missing signal and siglongjmp support. This patch works around the build error and adds the simd128 flag.
    Please note that many tests use large arrays on the stack, so you need to increase the stack size when building checkasm, e.g., --extra-ldflags='-Wl,-z,stack-size=10485760'
    Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
* checkasm: add RISC-V vector width to arch info | Rémi Denis-Courmont | 2024-11-17 | 1 | -0/+6
* checkasm/diracdsp: test add_dirac_obmc | Kyosuke Kawakami | 2024-11-15 | 1 | -0/+3
    Signed-off-by: Kyosuke Kawakami <kawakami150708@gmail.com>
    Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* checkasm: Print the SVE vector length at startup | Martin Storsjö | 2024-09-27 | 1 | -1/+11
    Signed-off-by: Martin Storsjö <martin@martin.st>
* aarch64: Add CPU feature flags for SVE and SVE2 | Martin Storsjö | 2024-09-27 | 1 | -0/+2
    Add code for detecting the feature on Linux and Windows.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm/riscv: print official extension names | Rémi Denis-Courmont | 2024-09-04 | 1 | -6/+6
* checkasm: add wildcompares for test & functions | J. Dekker | 2024-08-28 | 1 | -5/+21
    Added:
    --test=<pattern>     Filter tests by glob style pattern.
    --bench[=<pattern>]  Run benchmark and optionally filter functions by glob style pattern.
    Example:
    $ ./tests/checkasm/checkasm --bench=yuva*
    [...]
    yuva420p_bgr24_8_c: 34.5 ( 1.00x)
    yuva420p_bgr24_8_ssse3: 31.1 ( 1.11x)
    yuva420p_bgr24_128_c: 310.6 ( 1.00x)
    yuva420p_bgr24_128_ssse3: 178.1 ( 1.74x)
    yuva420p_bgr24_1080_c: 2509.6 ( 1.00x)
    yuva420p_bgr24_1080_ssse3: 1471.5 ( 1.71x)
    yuva420p_bgr24_1920_c: 4462.6 ( 1.00x)
    yuva420p_bgr24_1920_ssse3: 2331.1 ( 1.91x)
    [...]
    Ported from dav1d.
    Signed-off-by: J. Dekker <jdek@itanimul.li>
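For readers unfamiliar with glob-style filtering, the following is a minimal sketch of how such a pattern can be matched against a test or function name. It is not the matcher that checkasm or dav1d ships: the name glob_match is hypothetical, and the supported syntax ('*' for any run of characters, '?' for exactly one) is an assumption about what "glob style" covers here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if str matches pattern, where '*' matches
 * any run of characters (including none) and '?' matches exactly one. */
static int glob_match(const char *pattern, const char *str)
{
    const char *star  = NULL;  /* pattern position just after the last '*' */
    const char *retry = NULL;  /* str position to resume from on mismatch */

    assert(pattern != NULL && str != NULL);
    while (*str) {
        if (*pattern == '*') {
            /* Remember the star and first try matching it against nothing. */
            star  = ++pattern;
            retry = str;
        } else if (*pattern == '?' || *pattern == *str) {
            pattern++;
            str++;
        } else if (star) {
            /* Mismatch: let the last '*' absorb one more character. */
            pattern = star;
            str     = ++retry;
        } else {
            return 0;
        }
    }
    /* Trailing stars match the empty remainder. */
    while (*pattern == '*')
        pattern++;
    return *pattern == '\0';
}
```

With this sketch, the pattern yuva* from the example above accepts yuva420p_bgr24_8_c and rejects yuv420p. When passing such a pattern on a shell command line, quoting it avoids accidental filename expansion.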
* checkasm: improve print format | J. Dekker | 2024-08-28 | 1 | -5/+48
    Port dav1d's checkasm output format to FFmpeg's checkasm, includes relative speedups and aligns results.
    Signed-off-by: J. Dekker <jdek@itanimul.li>
* checkasm: print only results to stdout | J. Dekker | 2024-08-28 | 1 | -3/+3
    Signed-off-by: J. Dekker <jdek@itanimul.li>
* checkasm: add csv/tsv bench output | J. Dekker | 2024-08-28 | 1 | -2/+20
    When collecting performance information from checkasm it is common to parse the output for use in graphs to compare vs different architectures.
    Signed-off-by: J. Dekker <jdek@itanimul.li>
* checkasm/mpegvideoencdsp: add pix_sum, pix_norm1, and draw_edges | Ramiro Polla | 2024-08-26 | 1 | -0/+3
* checkasm/yuv2yuv: add tests for semiplanar unscaled converters | Ramiro Polla | 2024-08-26 | 1 | -0/+1
* lavu/riscv: drop probing for zba CPU capability | Rémi Denis-Courmont | 2024-08-05 | 1 | -1/+0
* lavc/riscv: drop probing for F & D extensions | Rémi Denis-Courmont | 2024-08-01 | 1 | -2/+0
    F and D extensions are included in all RISC-V application profiles ever made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be selected at compilation time.
    Currently, there are no consumers for these two flags. If there is ever a need to reintroduce F- or D-specific optimisations, we can always use __riscv_f or __riscv_d compiler predefined macros respectively.
* lavu/riscv: add CPU flag for B bit manipulations | Rémi Denis-Courmont | 2024-07-25 | 1 | -0/+1
    The B extension was finally ratified in May 2024, encompassing:
    - Zba (addresses),
    - Zbb (basics) and
    - Zbs (single bits).
    It does not include Zbc (base-2 polynomials).
* checkasm: add tests for yuv2rgb | Ramiro Polla | 2024-06-28 | 1 | -0/+1
* tests/checkasm: Remove check on linux perf fd in uninit | Zhao Zhili | 2024-06-18 | 1 | -2/+1
    The check should be >= 0, not > 0. The check itself is redundant since uninit is only called after init succeeds.
    Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
* checkasm: add tests for {lum,chr}ConvertRange | Ramiro Polla | 2024-06-16 | 1 | -0/+1
* checkasm: disable unaligned access emulation | Rémi Denis-Courmont | 2024-06-07 | 1 | -0/+6
    The OS may silently fix (emulate) unaligned hardware access exceptions. This is extremely slow and code should be fixed not to rely on unaligned access on affected hardware.
    Accordingly this requests that the OS disable emulation and instead throw Bus error, which will be caught by checkasm's signal handler.
    This has no effects if the hardware supports unaligned access in hardware, since no exceptions are generated. prctl() will fail safe in that case.
* checkasm: add linear least square tests | Rémi Denis-Courmont | 2024-06-01 | 1 | -0/+1
* checkasm: add h263dsp.{h,v}_loop_filter | Rémi Denis-Courmont | 2024-05-27 | 1 | -0/+3
* checkasm/riscv: test misaligned before V | Rémi Denis-Courmont | 2024-05-24 | 1 | -1/+1
    Otherwise V functions mask scalar misaligned ones.
* checkasm: print bench runs when benchmarking | Lynne | 2024-05-21 | 1 | -0/+3
    Helps make sense of the possible noise in the results.
* checkasm: add runs argument to adjust during bench | J. Dekker | 2024-05-21 | 1 | -1/+15
    Some timers on certain device and test combinations can produce noisy results, affecting the reliability of performance measurements. One notable example of this is the Canaan K230 RISC-V development board.
    An option to adjust the number of samples by an exponent (--runs) has been added, allowing developers to increase the sample count for more reliable results.
    Signed-off-by: J. Dekker <jdek@itanimul.li>
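As a rough illustration of "adjusting the number of samples by an exponent", the sketch below turns an option argument into a run count. The function name, the base of two and the accepted range of 1 to 30 are assumptions made for this example; the commit message itself only says that the value is an exponent.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser for a --runs style argument: interprets arg as an
 * exponent and returns 2^arg samples, or 0 if arg is not a whole number
 * in the range 1..30 (base and range are assumptions of this sketch). */
static unsigned runs_from_exponent(const char *arg)
{
    char *end;
    long exponent;

    assert(arg != NULL);
    errno = 0;
    exponent = strtol(arg, &end, 10);
    if (errno || end == arg || *end != '\0' || exponent < 1 || exponent > 30)
        return 0;
    return 1U << exponent;
}
```

Under these assumptions an argument of 10 gives 1024 samples, and each increment of the exponent doubles the benchmarking time in exchange for less noisy averages.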
* lavu/riscv: CPU flag for fast misaligned accesses | Rémi Denis-Courmont | 2024-05-14 | 1 | -0/+1
* tests/checkasm: add checkasm_check_vvc_alf and check_alf_filter | Wu Jianhua | 2024-05-14 | 1 | -1/+2
    Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
* riscv: add Zvbb vector bit manipulation extension | Rémi Denis-Courmont | 2024-05-11 | 1 | -0/+1
* checkasm: add test for fdct | Ramiro Polla | 2024-05-11 | 1 | -0/+3
    Reviewed-by: Martin Storsjö <martin@martin.st>
    Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
* checkasm/rv40dsp: add chroma_mc test | sunyuechi | 2024-05-03 | 1 | -0/+3
    This is similar to h264.
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
* tests/checkasm: add exclude_guest for non-x86 linux perf | J. Dekker | 2024-04-10 | 1 | -0/+3
    The exclude_guest option only has an effect on x86. Omitting 'exclude_guest' defaults to zero which implies that you can count guest events should you run one. Some non-x86 kernels just ignore it, while others (e.g. the Asahi Linux kernels) require the user to explicitly set the option to 1, i.e. the only behaviour that makes sense when counting guest events isn't supported.
    Signed-off-by: J. Dekker <jdek@itanimul.li>
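The exclude_guest field discussed here belongs to the Linux struct perf_event_attr. The sketch below shows where it sits when preparing a cycle counter; the helper name and the choice of the other fields are assumptions for illustration and are not copied from checkasm.c. For simplicity the sketch sets exclude_guest unconditionally, whereas the commit title restricts the change to non-x86 builds.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/* Hypothetical helper: prepare attr for counting user-space CPU cycles.
 * The result would be passed to the perf_event_open system call. */
static void init_cycle_counter_attr(struct perf_event_attr *attr)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->type           = PERF_TYPE_HARDWARE;
    attr->size           = sizeof(*attr);
    attr->config         = PERF_COUNT_HW_CPU_CYCLES;
    attr->disabled       = 1;  /* start stopped; enable around the measured call */
    attr->exclude_kernel = 1;
    attr->exclude_hv     = 1;
    attr->exclude_guest  = 1;  /* stated explicitly: some kernels reject the default of 0 */
}
```

Zeroing the structure first matters because any field left at 0 takes its default meaning, which is exactly how exclude_guest ended up unset before this change.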
* checkasm/rv34dsp: add rv34_inv_transform_dc test | sunyuechi | 2024-02-17 | 1 | -0/+3
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
* tests/checkasm: add checkasm_check_vvc_mc | Wu Jianhua | 2024-02-01 | 1 | -0/+3
    Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
* checkasm: Test whether the native FFmpeg timers work | Martin Storsjö | 2024-01-15 | 1 | -1/+11
    On some platforms (in particular, ARM/AArch64), the implementation of AV_READ_TIME() may use a privileged instruction - in such cases, benchmarking just fails with a SIGILL.
    Instead of crashing, try executing AV_READ_TIME() once within a region with the signal handler active, to allow gracefully informing the user about the issue.
    This matches the dav1d checkasm commit 95a192549a448b70d9542e840c4e34b60d09b093.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm/svqenc: add ssd_int8_vs_int16 test | sunyuechi | 2024-01-15 | 1 | -0/+3
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
* checkasm: Generalize crash handling | Martin Storsjö | 2024-01-11 | 1 | -12/+84
    This replaces the riscv specific handling from 7212466e735aa187d82f51dadbce957fe3da77f0 (which essentially is reverted), with a different implementation of the same (plus a bit more), based on the corresponding feature in dav1d's checkasm, supporting both Unix and Windows.
    See in particular the dav1d commits 0b6ee30eab2400e4f85b735ad29a68a842c34e21, 0421f787ea592fd2cc74c887f20b8dc31393788b, 8501a4b20135f93a4c3b426468e2240e872949c5 and d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.
    The overall approach compared to the existing implementation for riscv is the same; set up a signal handler, store the state with sigsetjmp, jump out of the crashing function with siglongjmp.
    The main difference is in what happens when the signal handler is invoked. In the previous implementation, it would resume from right before calling the crashing function, and then skip that call based on the setjmp return value. In the imported implementation from dav1d, we return to right before the check_func() call, which will skip testing the current function (as the pointer is the same as it was before).
    Other differences are:
    - Support for other signal handling mechanisms (Windows AddVectoredExceptionHandler)
    - Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp on Windows with SEH
    - Only catching signals once per function - if more than one signal is delivered before signal handling is reenabled, any signal is handled as it would without our handler
    - Not using an arch specific signal handler written in assembly
    Signed-off-by: Martin Storsjö <martin@martin.st>
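The common approach named in this commit (install a signal handler, save state with sigsetjmp, leave the crashing function with siglongjmp) can be sketched for POSIX systems as follows. This is an illustration only: run_guarded, crash_handler and the two demo functions are hypothetical names, the Windows path is not shown, and unlike the imported implementation this sketch returns into a small wrapper rather than to a point before check_func().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf crash_jmp;                 /* state saved before the call */
static volatile sig_atomic_t catch_armed;    /* catch only one signal per call */
static volatile sig_atomic_t caught_signal;  /* signal number seen by the handler */

static void crash_handler(int sig)
{
    if (catch_armed) {
        catch_armed   = 0;
        caught_signal = sig;
        siglongjmp(crash_jmp, 1);
    }
    /* Not armed: restore the default action and deliver the signal again. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Hypothetical wrapper: runs fn() and returns 0 if it completed, or the
 * number of the signal that interrupted it. */
static int run_guarded(void (*fn)(void))
{
    static const int sigs[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE };
    struct sigaction sa;

    assert(fn != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    for (size_t i = 0; i < sizeof(sigs) / sizeof(sigs[0]); i++)
        sigaction(sigs[i], &sa, NULL);

    caught_signal = 0;
    /* Second argument 1: save the signal mask so siglongjmp restores it. */
    if (sigsetjmp(crash_jmp, 1) == 0) {
        catch_armed = 1;
        fn();
        catch_armed = 0;
    }
    return caught_signal;
}

/* Demo functions: one simulates a crash, the other returns normally. */
static void crashing_function(void) { raise(SIGILL); }
static void working_function(void)  { }
```

Calling run_guarded(crashing_function) reports SIGILL instead of terminating the process, while run_guarded(working_function) reports 0. The catch_armed flag mirrors the "only catching signals once per function" point above: a second signal arriving while the flag is clear is handled as if no handler were installed.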
* checkasm/takdsp: add decorrelate_ls test | sunyuechi | 2023-12-21 | 1 | -0/+3
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
* checkasm: Remove unnecessary const on scalar parameters | Martin Storsjö | 2023-12-21 | 1 | -2/+2
    The ffmpeg coding style doesn't usually use const on scalar parameters (or on the pointer values - as opposed to the type that is pointed to, where it has a semantic meaning), contrary to the dav1d coding style (where this was imported from).
    This avoids warnings about differences in the type signatures between declaration and definition of this function, with older versions of MSVC.
    The issue was observed with one version of MSVC 2017, 19.16.27024.1, with warnings like these:
    src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration
    The warning itself is bogus as the const here is harmless, and newer versions of MSVC no longer warn about this.
    Signed-off-by: Martin Storsjö <martin@martin.st>
* checkasm: test for abs_pow34 | sunyuechi | 2023-12-11 | 1 | -0/+3
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>