diff options
author | Andreas Rheinhardt <andreas.rheinhardt@gmail.com> | 2020-11-09 05:11:31 +0100 |
---|---|---|
committer | Andreas Rheinhardt <andreas.rheinhardt@gmail.com> | 2020-12-08 17:51:47 +0100 |
commit | 240a25f94fb71a38d0ab9034681e4276f08e1750 (patch) | |
tree | 81d238187317c5e2112d1f626bccdff4d2f3bd2f /COPYING.GPLv3 | |
parent | c638d1d126fa1462bdcd30883061be9dd6f29dcd (diff) | |
download | ffmpeg-240a25f94fb71a38d0ab9034681e4276f08e1750.tar.gz |
avcodec/magicyuv: Optimize creating Huffman tables
MagicYUV transmits its Huffman trees by providing the length of the code
corresponding to each symbol; then the decoder has to assemble the table
in such a way that (i) longer codes are to the left of the tree and (ii)
for codes of the same length the symbols are ascending from left to right.
Up until now the decoder did this as follows: It counted the number of
codes of each length and derived the first code of a given length via
(ii). Then the array of lengths is traversed a second time to create
the codes; there is one running counter for each length to do so. This
process creates a default symbol table (that is omitted).
This commit changes this as follows: Everything is indexed by the
position in the tree (with codes to the left first); given (i), we can
calculate the ranges occupied by the codes of each length; and with (ii)
we can derive the actual symbols of each code; the running counters for
each length are now used for the symbols and not for the codes.
Doing so allows us to switch to ff_init_vlc_from_lengths(); this has the
advantage that the codes table needs only be traversed once and that the
codes need not be sorted any more (right now, the codes that are so long
that they will be put into subtables need to be sorted so that codes
that end up in the same subtable are contiguous).
For a sample produced by our encoder (natural content, 4000 frames,
YUV420p, ten iterations, GCC 9.3) this decreased the amount of
decicycles for each call to build_huffman() from 1336049 to 1309401.
Notice that our encoder restricts the code lengths to 12 and our decoder
only uses subtables when the code is longer than 12 bits, so the sorting
that can be avoided does not happen at the moment. If one reduces the
decoder's tables to nine bits, the performance improvement becomes more
apparent: The amount of decicycles for build_huffman() decreased from
1165210 to 654055.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Diffstat (limited to 'COPYING.GPLv3')
0 files changed, 0 insertions, 0 deletions