author: thegeorg <thegeorg@yandex-team.ru> 2022-05-07 11:12:39 +0300
committer: thegeorg <thegeorg@yandex-team.ru> 2022-05-07 11:12:39 +0300
commit: a0c9cd069ff45244eeb9caa2fcadf1e9bc4a840d
tree: c9a178299041ce75a8b6d3edb644521c0f13c092 /contrib/libs/lzma/TODO
parent: d47cdd245c4b49c8e273c80a2e2b29ee5b2071c3
Improve layout of contrib/libs/lzma
* Rename contrib/libs/xz to contrib/libs/lzma (yamaker project is updated accordingly)
* Move xz/liblzma/ya.make to top level
* Update provides.pbtxt and PEERDIRs as necessary

ref:1bb07a87b6adb738965483167fa64e7ad5da1e2b
Diffstat (limited to 'contrib/libs/lzma/TODO')
-rw-r--r-- contrib/libs/lzma/TODO 111
1 files changed, 111 insertions, 0 deletions
diff --git a/contrib/libs/lzma/TODO b/contrib/libs/lzma/TODO
new file mode 100644
index 0000000000..45ba433a58
--- /dev/null
+++ b/contrib/libs/lzma/TODO
@@ -0,0 +1,111 @@
+
+XZ Utils To-Do List
+===================
+
+Known bugs
+----------
+
+ The test suite is too incomplete.
+
+ If the memory usage limit is less than about 13 MiB, xz is unable to
+ automatically scale down the compression settings enough even though
+ it would be possible by switching from BT2/BT3/BT4 match finder to
+ HC3/HC4.
+
+ XZ Utils compresses some files significantly worse than LZMA Utils.
+ This is due to the faster compression presets used by XZ Utils, and
+ can often be worked around by using "xz --extreme". With some files
+ --extreme isn't enough though: this is most likely with files that
+ compress extremely well, where going from a compression ratio of 0.003
+ to 0.004 means a big relative increase in the compressed file size.
+
+ xz doesn't quote unprintable characters when it displays file names
+ given on the command line.
+
+ tuklib_exit() doesn't block signals => EINTR is possible.
+
+ SIGTSTP is not handled. If xz is stopped, the estimated remaining
+ time and calculated (de)compression speed won't make sense in the
+ progress indicator (xz --verbose).
+
+ If liblzma has created threads and fork() gets called, liblzma
+ code will break in the child process unless it calls exec() and
+ doesn't touch liblzma.
+
+
+Missing features
+----------------
+
+ Add support for storing metadata in .xz files. A preliminary
+ idea is to create a new Stream type for metadata. When both
+ metadata and data are wanted in the same .xz file, two or more
+ Streams would be concatenated.
+
+ The state stored in lzma_stream should be cloneable, which would
+ be mostly useful when using a preset dictionary in LZMA2, but
+ it may have other uses too. Compare to deflateCopy() in zlib.
+
+ Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and
+ other streams that don't have an end of payload marker.
+
+ Adjust dictionary size when the input file size is known.
+ Maybe do this only if an option is given.
+
+ xz doesn't support copying extended attributes, access control
+ lists etc. from source to target file.
+
+ Multithreaded compression:
+ - Reduce memory usage of the current method.
+ - Implement threaded match finders.
+ - Implement pigz-style threading in LZMA2.
+
+ Multithreaded decompression
+
+ Buffer-to-buffer coding could use less RAM (especially when
+ decompressing LZMA1 or LZMA2).
+
+ I/O library is not implemented (similar to gzopen() in zlib).
+ It will be a separate library that supports uncompressed, .gz,
+ .bz2, .lzma, and .xz files.
+
+ Support changing lzma_options_lzma.mode with lzma_filters_update().
+
+ Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at
+ Block and Stream boundaries.
+
+ lzma_strerror() to convert lzma_ret to human readable form?
+ This is tricky, because the same error codes are used with
+ slightly different meanings, and this cannot be fixed anymore.
+
+ Make it possible to adjust LZMA2 options in the middle of a Block
+ so that the encoding speed vs. compression ratio can be optimized
+ when the compressed data is streamed over network.
+
+ Improved BCJ filters. The current filters are small but they aren't
+ so great when compressing binary packages that contain various file
+ types. Specifically, they make things worse if there are static
+ libraries or Linux kernel modules. The filtering could also be
+ more effective (without getting overly complex), for example,
+ streamable variant BCJ2 from 7-Zip could be implemented.
+
+ Filter that autodetects specific data types in the input stream
+ and applies appropriate filters for the correct parts of the input.
+ Perhaps combine this with the BCJ filter improvement point above.
+
+ Long-range LZ77 method as a separate filter or as a new LZMA2
+ match finder.
+
+
+Documentation
+-------------
+
+ More tutorial programs are needed for liblzma.
+
+ Document the LZMA1 and LZMA2 algorithms.
+
+
+Miscellaneous
+-------------
+
+ Try to get the media type for .xz registered at IANA.
+
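
Note on the "Adjust dictionary size when the input file size is known" item in the TODO above: the idea can be sketched with Python's standard `lzma` module, which wraps liblzma. This is only an illustration under stated assumptions, not how xz does or would implement it. The helper names `pick_dict_size` and `compress_with_sized_dict` are hypothetical, and the choice to round up to a power of two and cap at the preset 6 dictionary size (8 MiB) is an assumption made for the example.

```python
import lzma

DICT_SIZE_MIN = 4 * 1024            # smallest LZMA2 dictionary liblzma accepts
DICT_SIZE_DEFAULT = 8 * 1024 ** 2   # dictionary size used by preset 6


def pick_dict_size(input_size, upper=DICT_SIZE_DEFAULT):
    """Hypothetical helper: smallest power of two >= input_size,
    clamped to the range [DICT_SIZE_MIN, upper]."""
    size = DICT_SIZE_MIN
    while size < input_size and size < upper:
        size *= 2
    return min(size, upper)


def compress_with_sized_dict(data):
    """Compress to .xz with an LZMA2 dictionary sized to the input."""
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "preset": 6,                              # start from preset 6 settings
        "dict_size": pick_dict_size(len(data)),   # override only the dictionary
    }]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


if __name__ == "__main__":
    payload = b"XZ Utils To-Do List\n" * 500
    packed = compress_with_sized_dict(payload)
    # The round trip must reproduce the input exactly.
    assert lzma.decompress(packed) == payload
    print(len(payload), "->", len(packed), "bytes, dict_size =",
          pick_dict_size(len(payload)))
```

A dictionary larger than the input gives the encoder nothing extra to match against, so shrinking it should not hurt the ratio for small files, while the decoder needs less memory because the required dictionary size is recorded in the compressed data.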