author | Shiyou Yin <yinshiyou-hf@loongson.cn> | 2019-01-21 18:10:24 +0800 |
---|---|---|
committer | Michael Niedermayer <michael@niedermayer.cc> | 2019-01-22 00:46:36 +0100 |
commit | 6d1916481137a9608d95fc3aa8f3d874b7aaea3c (patch) | |
tree | 04914f3e1ea9a920e5d1d262c063e4502fa1b663 /libavutil/mips | |
parent | 8133921ad2d80b8b2fe4e0bf6e0abd40afe3e235 (diff) | |
download | ffmpeg-6d1916481137a9608d95fc3aa8f3d874b7aaea3c.tar.gz |
avcodec/mips: [loongson] optimize put_hevc_qpel_hv_8 with mmi.
Optimize put_hevc_qpel_hv_8 with mmi for width=4/8/12/16/24/32/48/64.
This optimization improves HEVC decoding performance by 11% (1.81x to 2.01x, tested on Loongson 3A3000).
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Diffstat (limited to 'libavutil/mips')
-rw-r--r-- | libavutil/mips/mmiutils.h | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/libavutil/mips/mmiutils.h b/libavutil/mips/mmiutils.h
index 76b1199ab9..5fc1908ee3 100644
--- a/libavutil/mips/mmiutils.h
+++ b/libavutil/mips/mmiutils.h
@@ -251,6 +251,15 @@
 );
 
 /**
+ * brief: Transpose 2X2 word packaged data.
+ * fr_i0, fr_i1: src
+ * fr_o0, fr_o1: dst
+ */
+#define TRANSPOSE_2W(fr_i0, fr_i1, fr_o0, fr_o1) \
+ "punpcklwd "#fr_o0", "#fr_i0", "#fr_i1" \n\t" \
+ "punpckhwd "#fr_o1", "#fr_i0", "#fr_i1" \n\t"
+
+/**
  * brief: Transpose 4X4 half word packaged data.
  * fr_i0, fr_i1, fr_i2, fr_i3: src & dst
  * fr_t0, fr_t1, fr_t2, fr_t3: temporary register
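For readers without Loongson hardware, the effect of TRANSPOSE_2W can be sketched in portable C. This is only a scalar model, not code from the patch. It assumes the usual interleave semantics for the two instructions: punpcklwd places the low 32-bit word of each source into the destination (first source in bits 31..0), and punpckhwd does the same with the high words. The `model_*` names are hypothetical and do not exist in FFmpeg.

```c
#include <stdint.h>

/* Hypothetical scalar model of punpcklwd (assumed semantics):
 * result bits 31..0  = low word of a,
 * result bits 63..32 = low word of b. */
static uint64_t model_punpcklwd(uint64_t a, uint64_t b)
{
    return (a & 0xffffffffULL) | (b << 32);
}

/* Hypothetical scalar model of punpckhwd (assumed semantics):
 * result bits 31..0  = high word of a,
 * result bits 63..32 = high word of b. */
static uint64_t model_punpckhwd(uint64_t a, uint64_t b)
{
    return (a >> 32) | (b & 0xffffffff00000000ULL);
}

/* Model of TRANSPOSE_2W: treat i0 and i1 as the two rows of a 2x2
 * matrix of 32-bit words; o0 and o1 receive its two columns. */
static void model_transpose_2w(uint64_t i0, uint64_t i1,
                               uint64_t *o0, uint64_t *o1)
{
    *o0 = model_punpcklwd(i0, i1);
    *o1 = model_punpckhwd(i0, i1);
}
```

Under this model, rows {a0, a1} and {b0, b1} (low word listed first) become {a0, b0} and {a1, b1}. Because the macro writes to separate destination registers, the sources are left intact, unlike the 4X4 half-word transpose, which works in place with temporaries.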