| Commit message | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Felipe Contreras <[email protected]>
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
| |
Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add
down from 83 to 30 cycles.
squash: rv34 idct rearrange partial register loads
|
|
|
|
| |
Implement 1-pass inverse transform and reconstruction for inter blocks.
|
|
|
|
|
|
|
|
| |
The alignment directive must obviously precede the label.
This was never noticed in ARM mode since the location is
already aligned there.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Due to apparent bugs in the GNU assembler and/or linker, relocations
can be incorrectly processed if the alignment of a Thumb instruction
is changed in the output file compared to the input object.
This fixes crashes in h264 decoding with Thumb enabled. No effect in
ARM mode since everything is 4-byte aligned there.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
Signed-off-by: Janne Grunau <[email protected]>
|
|
|
|
|
| |
30-50% faster than the C implementation, 0.5% overall speedup on
bourne.rmvb.
|
|
|
|
|
|
|
|
|
|
|
| |
Perform dequantization while decoding coefficients instead of performing it
on the entire coefficients buffer.
Since quantized coefficients are very sparse, this usually causes a small
speedup. Speedup of around 1% on Panda board compared to the NEON code
removed here. Global speedup is probably around 3%.
Signed-off-by: Kostya Shishkov <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
| |
External symbol references need prefixes on some systems.
This should fix build errors on Darwin.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
| |
Old gcc versions have trouble compiling this function, and
no simple, targeted test is possible.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
| |
Based on patch by Janne Grunau.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
| |
This allows sharing code with the rv40 version of these functions.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
| |
- Replace 'ip' with 'r12'.
- Use correct size designators for vld1/vst1.
- Whitespace fixes.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
| |
This makes whitespace and register names consistent with
the style used in more recent code.
Signed-off-by: Mans Rullgard <[email protected]>
|
| |
|
|
|
|
|
|
| |
Although this adds a few lines, the macro calls are less convoluted.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function omitted.
2-5% faster overall on Cortex-A8.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
| |
The 'function' macro already includes the appropriate
directives.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents build errors when compiler and assembler default
targets differ. Ideally each file would declare the highest
level it requires. This is however not easily possible as it
complicates assembling pre-armv6t2 code in Thumb-2 mode.
HAVE_NEON is used as indicator for ARMv7-A since no other
symbol exists for this and NEON is only available in this
variant.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Filenames are brittle across renames and add no useful information.
|
| |
|
|
|
|
| |
Neon parts by Mans Rullgard <[email protected]>.
|
|
|
|
|
| |
Signed-off-by: Diego Biurrun <[email protected]>
Signed-off-by: Ronald S. Bultje <[email protected]>
|
|
|
|
|
|
|
|
| |
The inline asm added in bf5d46d uses the 'y' modifier which
is only supported from gcc 4.5. This check allows building
with older compilers.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Deprecate CODEC_FLAG2_INTRA_VLC.
|
|
|
|
|
|
| |
This fixes building for ARM11 without Thumb2.
Signed-off-by: Martin Storsjö <[email protected]>
|
|
|
|
|
|
|
| |
This allows using the same arm assembler offsets for both EABI
and the mach-o ABI.
Signed-off-by: Martin Storsjö <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
| |
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
| |
Some versions of the GNU assembler do not handle 64-bit
immediate operands containing arithmetic. Writing the
value out in full works correctly.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This function is called with only 8-byte alignment from
imdct for size 16. The fft4 function is not called for
the larger FFT or MDCT sizes, so this has no impact on
typical uses.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
|
|
| |
PLD is present in ARMv5TE and later, which is checked for separately.
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|
|
|
|
| |
Signed-off-by: Mans Rullgard <[email protected]>
|