aboutsummaryrefslogtreecommitdiffstats
path: root/doc/filters.texi
diff options
context:
space:
mode:
authorAndreas Rheinhardt <andreas.rheinhardt@gmail.com>2020-10-20 08:31:17 +0200
committerAndreas Rheinhardt <andreas.rheinhardt@gmail.com>2020-10-26 07:15:37 +0100
commitc69392a54c161fc679a31cecc4ab8fbe3a46c84b (patch)
tree266fc3d5eb89ed4f770c33b84f3c2bb4d840f4da /doc/filters.texi
parentbc90145fa3b6e8d26dafe057252a17cdd3a9a095 (diff)
downloadffmpeg-c69392a54c161fc679a31cecc4ab8fbe3a46c84b.tar.gz
avcodec/vp3: Make parsing Theora Huffman tables more spec-compliant
Theora allows to use custom Huffman tables which are coded in the bitstream as a tree: Whether the next node is a leaf or not is coded in a bit; each node itself contains a five bit token. Each tree can contain at most 32 leafs; typically they contain exactly 32 with the 32 symbols forming a permutation of 0..31. Yet the standard does not impose either of these requirements. It explicitly allows less than 32 leafs and multiple codes with the same token. But our decoder used an algorithm that required the codes->token mapping to be injective and that also presumed that there be at least two leafs: Instead of using an array for codes, tokens and code lengths, the decoder only had arrays for codes and code lengths. The code and length for a given token were stored in entry[token]. As no symbols table was used when initializing the VLC, the default one applied and therefore the entry[token] got the symbol token (if the length of said entry is >0). Yet if multiple codes had the same token, the codes and lengths from the later token would overwrite the earlier codes and lengths. Furthermore, less than 32 leafs could also lead to problems: Namely if this was not the first time Huffman tables have been parsed in which case the array is not zeroed initially so that old entries could make the new table invalid. libtheora seems to always use 32 leafs and no duplicate tokens; I am not aware of any existing valid files that do not. This is fixed by using a codes, symbols and lengths array when initializing the VLC. In order to reduce the amount of stuff kept in the context only the symbols and lengths (which both fit into an uint8_t) are kept in the context; the codes are derived from the lengths immediately before creating the tables. There is now only one thing left which is not spec-compliant: Trees with only one node (which has length zero) are not supported by ff_init_vlc_sparse() yet. Reviewed-by: Peter Ross <pross@xvid.org> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Diffstat (limited to 'doc/filters.texi')
0 files changed, 0 insertions, 0 deletions