diff options
author | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> | 2022-09-08 03:45:12 +0200 |
---|---|---|
committer | Andreas Rheinhardt <andreas.rheinhardt@outlook.com> | 2022-09-19 23:40:41 +0200 |
commit | 4d7a1a4619c9ed89c5f6dbf8a0ae85b9a8aa056b (patch) | |
tree | 8336dd86959c1d92fad306572ad016a1f9bf6edd /libavcodec/h263dsp.h | |
parent | 187cd278325ae2fc6d533141e7d6c557b669932a (diff) | |
download | ffmpeg-4d7a1a4619c9ed89c5f6dbf8a0ae85b9a8aa056b.tar.gz |
swscale/input: Avoid calls to av_pix_fmt_desc_get()
Up until now, libswscale/input.c used a macro to read
an input pixel which involved a call to av_pix_fmt_desc_get()
to find out whether the input pixel format is BE or LE
despite this being known at compile-time (there are templates
per pixfmt). Even worse, these calls are made in a loop,
so that e.g. there are six calls to av_pix_fmt_desc_get()
for every pair of UV pixel processed in
rgb64ToUV_half_c_template().
This commit modifies these macros to ensure that isBE()
is evaluated at compile-time. This saved 9743B of .text
for me (GCC 11.2, -O3). For a simple RGB64LE->YUV420P
transformation like
ffmpeg -f lavfi -i haldclutsrc,format=rgba64le -pix_fmt yuv420p \
-threads 1 -t 1:00 -f null -
the amount of decicycles spent in rgb64LEToUV_half_c
(which is created via the template mentioned above)
decreases from 19751 to 5341; for RGBA64BE the number
went down from 11945 to 5393. For shared builds (where
the call to av_pix_fmt_desc_get() is indirect) the old numbers
are 15230 for RGBA64BE and 27502 for RGBA64LE, whereas
the numbers with this patch are indistinguishable from
the numbers from a static build.
Also make the macros that are touched conform to the
usual convention of using uppercase names while just at it.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Diffstat (limited to 'libavcodec/h263dsp.h')
0 files changed, 0 insertions, 0 deletions