path: root/libavcodec/arm/mdct_vfp.S
Commit message [Author, Age, Files, Lines]
* armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6) [Ben Avison, 2014-07-13, 1 file, -2/+144]
|   The previous implementation targeted DTS Coherent Acoustics, which only
|   requires mdct_bits == 6. This relatively small size lent itself to
|   unrolling the loops a small number of times, and encoding offsets
|   calculated at assembly time within the load/store instructions of each
|   iteration.
|
|   In the more general case (codecs such as AAC and AC3) much larger arrays
|   are used - mdct_bits == [8, 9, 11]. The old method does not scale for
|   these cases, so more integer registers are used with non-unrolled
|   versions of the loops (and with some stack spillage). The postrotation
|   filter loop is still unrolled by a factor of 2 to permit the
|   double-buffering of some VFP registers to facilitate overlap of
|   neighbouring iterations.
|
|   I benchmarked the result by measuring the number of gperftools samples
|   that hit anywhere in the AAC decoder (starting from aac_decode_frame())
|   or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
|   example AAC stream:
|
|                       Before          After
|                       Mean    StdDev  Mean    StdDev  Confidence  Change
|   aac_decode_frame    2368.1  35.8    2117.2  35.3    100.0%      +11.8%
|   ff_imdct_half_*     457.5   22.4    251.2   16.2    100.0%      +82.1%
|
|   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
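The transform sizes named in the commit message above can be made concrete with a small sketch. It assumes the usual relationship n = 1 << mdct_bits, and it assumes the n/2, n/4 and n/8 quantities play the roles that the C reference ff_imdct_half_c is understood to give them. The struct and function names are hypothetical and are not part of FFmpeg.

```c
#include <stdio.h>

/* Hypothetical helper type: sizes derived from mdct_bits.
 * Assumption: n = 1 << mdct_bits, with n/2, n/4 and n/8 used as the
 * output length, pre-rotation trip count and post-rotation trip count. */
struct imdct_sizes {
    int n;   /* full transform length */
    int n2;  /* samples produced by the half IMDCT */
    int n4;  /* complex points; pre-rotation loop iterations */
    int n8;  /* post-rotation loop iterations (assumed two points each) */
};

/* Derive all sizes from mdct_bits with shifts, as a C reference would. */
static struct imdct_sizes imdct_sizes_for(int mdct_bits)
{
    struct imdct_sizes s;
    s.n  = 1 << mdct_bits;
    s.n2 = s.n >> 1;
    s.n4 = s.n >> 2;
    s.n8 = s.n >> 3;
    return s;
}

/* Print the sizes for the mdct_bits values named in the commit message. */
static void print_imdct_sizes(void)
{
    static const int bits[] = { 6, 8, 9, 11 };
    size_t i;

    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        struct imdct_sizes s = imdct_sizes_for(bits[i]);
        printf("mdct_bits=%2d  n=%4d  n/2=%4d  n/4=%3d  n/8=%3d\n",
               bits[i], s.n, s.n2, s.n4, s.n8);
    }
}
```

Under these assumptions mdct_bits == 6 gives n/4 = 16, which is consistent with the fft16 routine added earlier in this log, while mdct_bits == 11 gives n/4 = 512. That growth is one way to see why offsets fixed at assembly time in a fully unrolled loop stop being practical for the larger sizes.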
* arm: Mangle external symbols properly in new vfp assembly files [Martin Storsjö, 2013-07-22, 1 file, -2/+2]
|   Reviewed-by: Kostya Shishkov
|   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* Merge commit '8b9eba664edaddf9a304d3acbf0388b5c520781d' [Michael Niedermayer, 2013-07-22, 1 file, -3/+2]
|\
| |   * commit '8b9eba664edaddf9a304d3acbf0388b5c520781d':
| |     arm: Add VFP-accelerated version of fft16
| |
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Add VFP-accelerated version of fft16 [Martin Storsjö, 2013-07-22, 1 file, -3/+2]
| |                    Before          After
| |                    Mean    StdDev  Mean    StdDev  Change
| |   This function    1389.3  4.2     967.8   35.1    +43.6%
| |   Overall          15577.5 83.2    15400.0 336.4   +1.2%
| |
| |   Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'b63bb251ea6d6ba23295294e37a92625c0192206' [Michael Niedermayer, 2013-07-22, 1 file, -4/+4]
|/
|   * commit 'b63bb251ea6d6ba23295294e37a92625c0192206':
|     arm: Add VFP-accelerated version of imdct_half
|
|   Merged-by: Michael Niedermayer <michaelni@gmx.at>
* arm: Add VFP-accelerated version of imdct_half [Martin Storsjö, 2013-07-22, 1 file, -0/+206]
                     Before          After
                     Mean    StdDev  Mean    StdDev  Change
    This function    2653.0  28.5    1108.8  51.4    +139.3%
    Overall          17049.5 408.2   15973.0 223.2   +6.7%

    Signed-off-by: Martin Storsjö <martin@martin.st>
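
The commit messages do not say how the Change column in the benchmark tables above is derived. The published figures are consistent with (before mean / after mean - 1) * 100, and the sketch below checks that reading against the printed means. The formula is an inference from the numbers, not something the commits state, and the function names are hypothetical.

```c
#include <stdio.h>

/* Assumed definition of "Change": relative speed-up in percent,
 * (before / after - 1) * 100. Inferred from the tables, not stated
 * in the commit messages. */
static double change_percent(double before_mean, double after_mean)
{
    return (before_mean / after_mean - 1.0) * 100.0;
}

/* Recompute the change for every row, using the means as printed in the log. */
static void print_change_table(void)
{
    static const struct {
        const char *name;
        double before, after;
    } rows[] = {
        { "aac_decode_frame (2014)",   2368.1,  2117.2 },
        { "ff_imdct_half_* (2014)",     457.5,   251.2 },
        { "fft16: this function",      1389.3,   967.8 },
        { "fft16: overall",           15577.5, 15400.0 },
        { "imdct_half: this function", 2653.0,  1108.8 },
        { "imdct_half: overall",      17049.5, 15973.0 },
    };
    size_t i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%-28s %+.2f%%\n", rows[i].name,
               change_percent(rows[i].before, rows[i].after));
}
```

Calling print_change_table() from a main function reproduces the published percentages to within about 0.1 percentage point. Any small difference presumably comes from the means in the log having been rounded to one decimal place.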