author: Mans Rullgard <mans@mansr.com> 2012-07-01 20:21:10 +0100
committer: Mans Rullgard <mans@mansr.com> 2012-07-03 16:10:54 +0100
commit: 06eb4f0885746b7e8a652d7b6026abf834e04b73
tree: e2fd9c485995e64266f63efbab338dd3473b5b6c
parent: 09a445ce3426081da1254af62cb185180099e74e
configure: do not disable av_always_inline with --enable-small

Currently, --enable-small turns av_always_inline into plain inline, which is more or less ignored by the compiler. While the intent of this is probably to reduce code size by avoiding some inlining, it has more far-reaching effects.

We use av_always_inline in two situations:

1. The body of a function is smaller than the call overhead. Instances of these are abundant in libavutil, the bswap.h functions being good examples.

2. The function is a template relying on constant propagation through inlined calls for sane code generation. These are often found in motion compensation code.

Both of these types of functions should be inlined even if targeting small code size.

Although GCC has heuristics for detecting the first of these types, it is not always reliable, especially when the function uses inline assembler, which is often the reason for having those functions in the first place, so making it explicit is generally a good idea.

The size increase from inlining template-type functions is usually much smaller than it seems due to different branches being mutually exclusive between the different invocations. The dead branches can, however, only be removed after inlining and constant propagation have been performed, which means the initial cost estimate for inlining these is much higher than is actually the case, resulting in GCC often making bad choices if left to its own devices.

Furthermore, the GCC inliner limits how much it allows a function to grow due to automatic inlining of calls, and this appears to not take call overhead into account. When nested inlining is used, the limit may be hit before the innermost level is reached. In some cases, this has prevented inlining of type 1 functions as defined above, resulting in significant performance loss.

Signed-off-by: Mans Rullgard <mans@mansr.com>
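
The two situations described in the message can be illustrated with a small sketch. This is not FFmpeg code: my_always_inline is a hypothetical stand-in for av_always_inline (assuming a GCC-compatible compiler), and my_bswap32, avg_pixels, and the two wrappers are made-up names chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_always_inline; assumes GCC or Clang.
 * Other compilers fall back to plain inline in this sketch. */
#if defined(__GNUC__)
#    define my_always_inline __attribute__((always_inline)) inline
#else
#    define my_always_inline inline
#endif

/* Type 1: the body is smaller than the overhead of a call. */
static my_always_inline uint32_t my_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Type 2: a template. Each caller passes a constant rnd, so after
 * inlining and constant propagation only one branch survives. */
static my_always_inline void avg_pixels(uint8_t *dst, const uint8_t *a,
                                        const uint8_t *b, int n,
                                        const int rnd)
{
    for (int i = 0; i < n; i++) {
        if (rnd)
            dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
        else
            dst[i] = (uint8_t)((a[i] + b[i]) >> 1);
    }
}

/* Concrete instances of the template; the branches are mutually
 * exclusive between the two invocations. */
static void avg_pixels_rnd(uint8_t *dst, const uint8_t *a,
                           const uint8_t *b, int n)
{
    avg_pixels(dst, a, b, n, 1);
}

static void avg_pixels_no_rnd(uint8_t *dst, const uint8_t *a,
                              const uint8_t *b, int n)
{
    avg_pixels(dst, a, b, n, 0);
}
```

If my_always_inline were reduced to plain inline, the compiler would be free to leave avg_pixels as a real function, in which case the test on rnd would stay inside the loop instead of being folded away in each wrapper.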
-rwxr-xr-x configure 9
1 file changed, 0 insertions, 9 deletions
diff --git a/configure b/configure
index a7ac43fa07..3a1b4361b8 100755
--- a/configure
+++ b/configure
@@ -3403,15 +3403,6 @@ EOF
 test -n "$malloc_prefix" &&
     echo "#define MALLOC_PREFIX $malloc_prefix" >>$TMPH
 
-if enabled small || disabled optimizations; then
-    echo "#undef av_always_inline" >> $TMPH
-    if enabled small; then
-        echo "#define av_always_inline inline" >> $TMPH
-    else
-        echo "#define av_always_inline av_unused" >> $TMPH
-    fi
-fi
-
 if enabled yasm; then
     append config_files $TMPASM
     printf '' >$TMPASM