| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Simplify macro TRANSPOSE_4H in mmiutils.h and add TRANSPOSE_8B as a common macro.
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimize vp8 loop filter with mmi, four functions optimized:
1. ff_vp8_h_loop_filter8uv_mmi.
2. ff_vp8_v_loop_filter8uv_mmi.
3. ff_vp8_h_loop_filter16_mmi.
4. ff_vp8_v_loop_filter16_mmi.
Vp8 decoding speed improved about 50%(from 73fps to 110fps, Tested on loongson 3A3000).
Signed-off-by: Shiyou Yin <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Performance of mpeg4 decoding improved about 23%(from 128fps to 158fps, tested on loongson 3A3000).
Reoptimized following functions with mmi.
1. ff_simple_idct_put_8_mmi
2. ff_simple_idct_add_8_mmi
3. ff_simple_idct_8_mmi
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Use immediate unsigned saturation for clip to max saving one vector register.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Preload data in band filter 0-8 for better pipeline parallelization.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Load the specific destination bytes instead of MSA load and pack.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
Removed memset call and improved performance.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Reduced msa load-store code.
Removed inline asm of GP load-store for 64 bit.
Updated variable names in GP load-store macros for naming consistency.
Corrected macro descriptions.
Signed-off-by: Kaustubh Raste <[email protected]>
Reviewed-by: Manojkumar Bhosale <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
1.mmiutils.h defined MMI_ load/store macros for loongson2e/2f/3a
2.mmiutils.h defined some mmi assembly macors
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
for mips r6
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
| |
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Understanding the mips32r6 and mips64r6 ISAs in the configure script is
not enough. In order to have full support for MIPS R6 in FFmpeg we need
to be able to build it, and for that we need to make sure we don't use
incompatible assembler code which makes the build fail. Ifdefing the
offending code is sufficient to fix the problem.
Signed-off-by: Vicente Olivert Riera <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
| |
|
|
|
|
|
|
| |
Signed-off-by: Shivraj Patil <[email protected]>
Reviewed-by: "Ronald S. Bultje" <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for idctdsp functions in new file idctdsp_msa.c and simple_idct_msa.c
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for me_cmp functions in new file me_cmp_msa.c
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideoencdsp functions in new file mpegvideoencdsp_msa.c
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for mpegvideo functions in new file mpegvideo_msa.c
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for pixblock functions in new file pixblockdsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for hpel functions in new file hpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for qpel functions in new file qpeldsp_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC qpel functions in new file h264qpel_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Added const to local static array.
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC idct functions in new file h264idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC intra prediction functions in new file h264pred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
s patch adds MSA (MIPS-SIMD-Arch) optimizations for AVC chroma mc functions in new file h264chroma_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC intra predition functions in new file hevcpred_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
sao functions
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC loop filter and sao functions in new file hevc_lpf_sao_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
In this patch, in comparision with previous patch, duplicated c functions are removed.
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC idct functions in new file hevc_idct_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uni mc epel functions.
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC uniw mc functions (qpel as well as epel) in new file hevc_mc_uniw_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This patch adds MSA (MIPS-SIMD-Arch) optimizations for HEVC bi mc functions (qpel as well as epel) in new file hevc_mc_bi_msa.c
Adds new generic macros (needed for this patch) in libavutil/mips/generic_macros_msa.h
Adds HEVC specific macros (needed for this patch) in libavcodec/mips/hevc_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This patch moves HEVC code of uni mc cases to new file hevc_mc_uni_msa.c.
(There are total 5 sub-modules of HEVC mc functions, if we add all these modules in one single file, its size would be huge (~750k) & difficult to maintain, so splitting it in multiple files)
This patch also adds new HEVC header file libavcodec/mips/hevc_macros_msa.h
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch includes restructuring of existing macros and addition of more generic macros.
This change was necessary to avoid repeated review comments in remaining patches which we were about to submit.
Also this patch reduces number of code lines due to maximum use of generic macros, allows better code alignment & readability etc.
These modifications in commonly used .libavutil/mips/generic_macros_msa.h. impacts the already accepted code, hence re-submitting it in 2/4,3/4 & 4/4.
Overall, this patch set is just upgrading the code with styling changes and will bring it in sync with MIPS-SIMD optimized latest codebase at our end.
Signed-off-by: Shivraj Patil <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
horizontal and uni vertical mc functions
Signed-off-by: Shivraj Patil <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
| |
weight/biweight functions
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
functions
Incorporated review comment.
Removed "__" from volatile.
Signed-off-by: Shivraj Patil <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
vertical mc functions
Signed-off-by: Shivraj Patil <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
| |
supported)
Signed-off-by: Shivraj Patil <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit dfa920807494 ("mips/float_dsp: fix a bug in vector_fmul_window_mips")
fixed vector_fmul_window_mips by unrolling the loop only 4 times, but also
removed the outer C loop and replaced it with assembly branches and pointer
arithmetic. When submitting my 64-bit porting patch I missed this new
assembly which also needed porting.
This patch fixes a bus error in the fate-float-dsp test when run on 64-bit
mips.
Signed-off-by: James Cowgill <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Unfortunately android < api 21 (lollipop) doesn't have the sgidefs.h header,
the easiest way around this is to just use the preprocessor definitions from
gcc / clang.
Signed-off-by: James Cowgill <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
| |
Signed-off-by: James Cowgill <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This mainly consists of replacing all the pointer arithmatic 'addiu'
instructions with PTR_ADDIU which will handle the differences in pointer
sizes when compiled on 64 bit mips systems.
The header asmdefs.h contains the PTR_ macros which expend to the correct mips
instructions to manipulate registers containing pointers.
Signed-off-by: James Cowgill <[email protected]>
Reviewed-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Loop was unrolled eight times although in heder there is assumption
that len is multiple of 4.
This is fixed, and assembly code is rewritten to be more optimal and
to simplify clobber list.
Signed-off-by: Nedeljko Babic <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Mirjana Vulin <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Mirjana Vulin <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|
|
|
|
|
| |
Reviewed-by: "Babic, Nedeljko" <[email protected]>
Signed-off-by: Michael Niedermayer <[email protected]>
|