author | Peter Meerwald <pmeerw@pmeerw.net> | 2015-02-20 01:35:34 +0100 |
---|---|---|
committer | Luca Barbato <lu_zero@gentoo.org> | 2015-02-21 00:54:40 +0100 |
commit | 76ce9bd8e26dcb3652240a1072840ff4011d7cdc (patch) | |
tree | 5e358c0303e49f28efe6ace0c90ba5561ac3073c | |
parent | bf07d813f6c88b5a76980f321cf7272d799c4216 (diff) | |
download | ffmpeg-76ce9bd8e26dcb3652240a1072840ff4011d7cdc.tar.gz |
libavutil: Add ARM av_clip_intp2_arm
Add ARM code implementing av_clip_intp2 using the ssat instruction.
On Cortex-A8, av_clip_intp2_arm() is faster than both av_clip_intp2_c() and
the generic av_clip() (about -19%).
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
-rw-r--r-- | libavutil/arm/intmath.h | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/libavutil/arm/intmath.h b/libavutil/arm/intmath.h
index 56fcdb3ec0..2b15ba0593 100644
--- a/libavutil/arm/intmath.h
+++ b/libavutil/arm/intmath.h
@@ -62,6 +62,14 @@ static av_always_inline av_const int av_clip_int16_arm(int a)
     return x;
 }
 
+#define av_clip_intp2 av_clip_intp2_arm
+static av_always_inline av_const int av_clip_intp2_arm(int a, int p)
+{
+    unsigned x;
+    __asm__ ("ssat %0, %2, %1" : "=r"(x) : "r"(a), "i"(p+1));
+    return x;
+}
+
 #define av_clip_uintp2 av_clip_uintp2_arm
 static av_always_inline av_const unsigned av_clip_uintp2_arm(int a, int p)
 {
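
For readers without ARM hardware, the effect of the added function can be sketched in portable C. The ssat instruction saturates a value to a signed range of the given bit width, so passing p+1 as the width clamps the input to [-(2^p), 2^p - 1]. The helper below is a hypothetical reference, not part of the patch and not the actual av_clip_intp2_c(); the name clip_intp2_ref and the assumption 0 <= p <= 30 are illustrative only. Note also that the "i" constraint in the patch means p has to resolve to a compile-time constant once the function is inlined, a restriction this portable sketch does not have.

```c
#include <assert.h>

/* Hypothetical reference helper (not part of the patch): clamp a to the
 * signed (p + 1)-bit range [-(2^p), 2^p - 1]. Assumes 0 <= p <= 30. */
int clip_intp2_ref(int a, int p)
{
    const int max = (int)((1u << p) - 1u); /* 2^p - 1 */
    const int min = -max - 1;              /* -(2^p)  */

    if (a < min)
        return min;
    if (a > max)
        return max;
    return a;
}

/* A few spot checks for p = 7, i.e. the int8_t range [-128, 127]. */
void clip_intp2_ref_selfcheck(void)
{
    assert(clip_intp2_ref(200, 7) == 127);
    assert(clip_intp2_ref(-200, 7) == -128);
    assert(clip_intp2_ref(5, 7) == 5);
}
```

On an ARM build, a comparison of av_clip_intp2_arm(a, p) against such a reference over a range of inputs would be one way to sanity-check the inline assembly.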