celp: optimise ff_celp_lp_synthesis_filter()

Adding instead of subtracting the products in the loop allows the
compiler to generate more efficient multiply-accumulate instructions
when 16-bit multiply-subtract is not available.  ARM has only
multiply-accumulate for 16-bit operands.  In general, if only one
variant exists, it is usually accumulate rather than subtract.

In the same spirit, using the dedicated saturation function enables
use of any special optimised versions of this.

Signed-off-by: Mans Rullgard <mans@mansr.com>
author: Mans Rullgard <mans@mansr.com> 2012-08-11 04:18:53 +0100
committer: Mans Rullgard <mans@mansr.com> 2012-08-13 01:03:25 +0100
commit: fddc5b9bea39968ed1f45c667869428865de7626 (patch)
tree: 417219a2ad12b4bfe0ee0f9618ad42f4c8db0711 /libavcodec
parent: 6c4975eaafd7f8f91e81ad8d6be744a434241fd3 (diff)
download: ffmpeg-fddc5b9bea39968ed1f45c667869428865de7626.tar.gz
1 files changed, 7 insertions, 8 deletions
diff --git a/libavcodec/celp_filters.c b/libavcodec/celp_filters.c
index 4e5bcda79a..d764d19219 100644
--- a/libavcodec/celp_filters.c
+++ b/libavcodec/celp_filters.c
@@ -63,17 +63,16 @@ int ff_celp_lp_synthesis_filter(int16_t *out, const int16_t *filter_coeffs,
     int i,n;
 
     for (n = 0; n < buffer_length; n++) {
-        int sum = rounder;
+        int sum = -rounder, sum1;
         for (i = 1; i <= filter_length; i++)
-            sum -= filter_coeffs[i-1] * out[n-i];
+            sum += filter_coeffs[i-1] * out[n-i];
 
-        sum = ((sum >> 12) + in[n]) >> shift;
+        sum1 = ((-sum >> 12) + in[n]) >> shift;
+        sum  = av_clip_int16(sum1);
+
+        if (stop_on_overflow && sum != sum1)
+            return 1;
 
-        if (sum + 0x8000 > 0xFFFFU) {
-            if (stop_on_overflow)
-                return 1;
-            sum = (sum >> 31) ^ 32767;
-        }
         out[n] = sum;
     }
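
The rewrite relies on a sign identity: starting from -rounder and adding the products yields exactly the negation of starting from rounder and subtracting them, so negating once before the shift restores the old value. The sketch below places both forms side by side so they can be compared on the same input. It is an illustration only, not the libavcodec source: the names lp_synth_sub, lp_synth_add and clip_int16 are hypothetical, clip_int16 is a local stand-in for av_clip_int16(), and the parameter list is assumed from the variables used in the hunk. As in the patched code, the caller must provide filter_length samples of history before out, and right shifts of negative values are assumed to be arithmetic.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_clip_int16(): saturate to the int16_t range. */
int clip_int16(int a)
{
    if (a < INT16_MIN) return INT16_MIN;
    if (a > INT16_MAX) return INT16_MAX;
    return a;
}

/* Form before the patch: start at +rounder and subtract each product. */
int lp_synth_sub(int16_t *out, const int16_t *filter_coeffs,
                 const int16_t *in, int buffer_length, int filter_length,
                 int stop_on_overflow, int shift, int rounder)
{
    int i, n;

    for (n = 0; n < buffer_length; n++) {
        int sum = rounder;
        for (i = 1; i <= filter_length; i++)
            sum -= filter_coeffs[i-1] * out[n-i];

        sum = ((sum >> 12) + in[n]) >> shift;

        /* Unsigned compare is true when sum lies outside [-32768, 32767]. */
        if (sum + 0x8000 > 0xFFFFU) {
            if (stop_on_overflow)
                return 1;
            /* Gives 32767 for positive sum, -32768 for negative sum. */
            sum = (sum >> 31) ^ 32767;
        }
        out[n] = sum;
    }
    return 0;
}

/* Form after the patch: start at -rounder, add each product, negate once. */
int lp_synth_add(int16_t *out, const int16_t *filter_coeffs,
                 const int16_t *in, int buffer_length, int filter_length,
                 int stop_on_overflow, int shift, int rounder)
{
    int i, n;

    for (n = 0; n < buffer_length; n++) {
        int sum = -rounder, sum1;
        for (i = 1; i <= filter_length; i++)
            sum += filter_coeffs[i-1] * out[n-i];

        /* -sum equals the pre-patch accumulator value. */
        sum1 = ((-sum >> 12) + in[n]) >> shift;
        sum  = clip_int16(sum1);

        /* Clipping changed the value, so the result overflowed 16 bits. */
        if (stop_on_overflow && sum != sum1)
            return 1;

        out[n] = sum;
    }
    return 0;
}
```

Calling both functions on identical buffers (each with zeroed history in front of out) should give the same return value and the same output samples, including the saturated ones. The inner loop of the second form is a plain multiply-accumulate, which is the shape the commit message says compilers can map to a single instruction on ARM.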