mirror of https://github.com/Cisco-Talos/clamav
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
960 lines
38 KiB
960 lines
38 KiB
2023-02-03 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* configure.ac: do AC_CHECK_SIZEOF([off_t]) test only after
|
|
AC_SYS_LARGEFILE, because the latter can alter the size of off_t.
|
|
|
|
* cabd_extract(): file->offset and file->length are unsigned ints,
|
|
both of them and their sum are checked to be <= CAB_LENGTHMAX. But
|
|
recent code stuffs file->length into an off_t and checks that instead.
|
|
On 32-bit architectures, if file->length > 2GiB then the off_t is
|
|
negative, evading the check. Ultimately this causes the decompression
|
|
functions to return MSPACK_ERR_ARGS as they already guard against
|
|
being asked to decompress a negative number of bytes.
|
|
|
|
2023-02-01 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* readbits.h, readhuff.h, cabd.c, kwajd.c, lzxd.c, mszipd.c, qtmd.c:
|
|
ensure bit operations (including intermediary ones) are considered
|
|
as unsigned int, so UBSan is happy.
|
|
|
|
2023-01-31 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* chmd.c: replace READ_ENCINT() macro with stricter read_encint()
|
|
function that reads no more than 63 or 31 bits so ENCINTs can never
|
|
be negative.
|
|
|
|
I'd prefer to use unsigned types, but off_t is used for file offsets
|
|
and lengths to match the environment's file I/O, so changing it is
|
|
tricky and would change the current public API.
|
|
|
|
Additionally, UBSan complains about shifting a 1 into a signed
|
|
type's MSB. https://www.cs.utah.edu/~regehr/papers/tosem15.pdf
|
|
notes that this is legal in ANSI C and "fairly benign (and well-
|
|
defined until C99)", but C99 made it undefined for no good reason.
|
|
I don't agree with this, but I don't want someone else using a C99
|
|
compiler to end up miscompiling the code.
|
|
|
|
* chmd_read_headers(): the CHM's internally declared file length is
|
|
compared against its actual file length and a warning is printed if
|
|
they don't match.
|
|
|
|
* chmd_extract(): files in the uncompressed section will print a
|
|
warning if their declared length goes beyond the declared end of the
|
|
CHM file. This may not match the actual CHM file length. You will
|
|
still get seek or read errors if a file's offset or length go beyond
|
|
the actual CHM file length.
|
|
|
|
Files in the compressed section will now cause a decrunch error if
|
|
their declared offset goes beyond the uncompressed length of the
|
|
section. If their offset is OK but their declared length goes beyond
|
|
the end, they will print a warning and then decompress as much as
|
|
possible before causing an error.
|
|
|
|
2023-01-02 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* kwajd_extract(): KWAJ compression method #2 is the QBasic variant
|
|
of the SZDD compression algorithm. Thanks to Jason Summers for finding
|
|
this and providing examples.
|
|
|
|
2021-07-20 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* lzxd_decompress(): simplified the code that decodes match_offset.
|
|
Thanks to Jasper St. Pierre for prompting me to look at it.
|
|
|
|
2020-12-30 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cabd_read_string(): libmspack no longer rejects CAB files with
|
|
empty previnfo/nextinfo strings. Thanks to Simon Tatham for the
|
|
patch, and for noting that WiX v4 currently generates such files.
|
|
|
|
2020-08-10 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* lzxd_decompress(): merged the code for decoding aligned and
|
|
verbatim blocks, also verified there is no significant performance
|
|
penalty.
|
|
|
|
2020-08-07 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* read_sys_file(): in a CHM file, the ControlData and ResetTable
|
|
files are loaded entirely into memory, regardless of file size.
|
|
This is not in the spirit of letting users control memory usage.
|
|
|
|
ControlData previously had to be at least 28 bytes (in case a new,
|
|
larger version of the file ever appeared), but is now rejected
|
|
if not exactly 28 bytes.
|
|
|
|
ResetTable can theoretically be huge; the longest LZX stream of
|
|
16 exabytes could have a 4 petabyte ResetTable. Practically, the
|
|
largest seen in the wild is 46 kilobytes (PHP manuals). I picked
|
|
an arbitrary upper limit of 1MB; please get in contact if you
|
|
know of any CHM files in the wild that are largest than this.
|
|
|
|
Thanks to seviezhou on Github for reporting this.
|
|
|
|
2020-04-13 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* system.h: clear up libmspack's large file support.
|
|
|
|
To support large files, do this:
|
|
|
|
1. add any defines that your compiler needs to enable large file
|
|
support. It may be supported by default.
|
|
2. Define HAVE_FSEEKO if fseeko() and ftello() are available.
|
|
3. Define SIZEOF_OFF_T to the value of sizeof(off_t); it must be a
|
|
literal value because sizeof() can't be used in preprocessor tests.
|
|
|
|
libmspack uses the off_t datatype for all file offsets. If off_t is
|
|
less than 64 bits, libmspack will return an error when processing
|
|
CHM files with offsets beyond 2GB, and won't search for CAB headers
|
|
beyond 2GB into a file. In both cases, it prints a warning message
|
|
that the library doesn't support large files.
|
|
|
|
2020-04-13 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* macros.h: new header for the D(), LD/LU and EndGet???() macros.
|
|
Use this instead of system.h.
|
|
|
|
* system.h: if MSPACK_NO_DEFAULT_SYSTEM is defined, define
|
|
inline versions of the only standard C functions used in
|
|
mspack (strlen, memcmp, memset), so that no standard C library
|
|
functions are needed at all.
|
|
|
|
2020-01-08 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* lzxd_decompress(): do not apply the E8 transformation on the
|
|
32769th LZX frame! Thanks to Cezary Sliwa for discovering this
|
|
bug and providing an example cab file (which is
|
|
http://download.windowsupdate.com/d/msdownload/update/driver/
|
|
drvs/2019/11/016c7f3e-809d-4720-893b-
|
|
e0d74f10c39d_35e12507628e8dc8ae5fb3332835f4253d2dab23.cab)
|
|
|
|
* cabd_compare: use EXPAND.EXE instead of EXTRACT.EXE when
|
|
testing files in a directory called 'expand'. The example
|
|
cab file above is extracted wrongly by EXTRACT.EXE, but
|
|
correctly by EXPAND.EXE because they take different approaches
|
|
to E8 transformations:
|
|
|
|
- EXTRACT.EXE writes "E8E8E8E8E8E8' to the last 6 bytes of
|
|
frame, looks for E8 bytes up to the last 6 bytes, then restores
|
|
the last 6 bytes, leaving partial transforms of 1-3 bytes if
|
|
E8 byte is found near the end of the frame
|
|
|
|
- EXPAND.EXE looks for E8 bytes up to the last 10 bytes of a
|
|
frame, therefore the last 6 bytes are never altered and all
|
|
transforms are 4 bytes
|
|
|
|
2019-02-18 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* chmd_read_headers(): a CHM file name beginning "::" but shorter
|
|
than 33 bytes will lead to reading past the freshly-allocated name
|
|
buffer - checks for specific control filenames didn't take length
|
|
into account. Thanks to ADLab of Venustech for the report and
|
|
proof of concept.
|
|
|
|
2019-02-18 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* chmd_read_headers(): CHM files can declare their chunks are any
|
|
size up to 4GB, and libmspack will attempt to allocate that to
|
|
read the file.
|
|
|
|
This is not a security issue; libmspack doesn't promise how much
|
|
memory it'll use to unpack files. You can set your own limits by
|
|
returning NULL in a custom mspack_system.alloc() implementation.
|
|
|
|
However, it would be good to validate chunk size further. With no
|
|
official specification, only empirical data is available. All files
|
|
created by hhc.exe have a chunk size of 4096 bytes, and this is
|
|
matched by all the files I've found in the wild, except for one
|
|
which has a chunk size of 8192 bytes, which was created by someone
|
|
developing a CHM file creator 15 years ago, and they appear to
|
|
have abandoned it, so it seems 4096 is a de-facto standard.
|
|
|
|
I've changed the "chunk size is not a power of two" warning to
|
|
"chunk size is not 4096", and now only allow chunk sizes between
|
|
22 and 8192 bytes. If you have CHM files with a larger chunk size,
|
|
please send them to me and I'll increase this upper limit.
|
|
|
|
Thanks to ADLab of Venustech for the report.
|
|
|
|
2019-02-18 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* oabd.c: replaced one-shot copying of uncompressed blocks (which
|
|
requires allocating a buffer of the size declared in the header,
|
|
which can be 4GB) with a fixed-size buffer. The buffer size is
|
|
user-controllable with the new msoab_decompressor::set_param()
|
|
method (check you have version 2 of the OAB decompressor), and
|
|
also controls the input buffer used for OAB's LZX decompression.
|
|
|
|
Reminder: compression formats can dictate how much memory is
|
|
needed to decompress them. If memory usage is a security concern
|
|
to you, write a custom mspack_system.alloc() that returns NULL
|
|
if "too much" memory is requested. Do not rely on libmspack adding
|
|
special heuristics to know not to request "too much".
|
|
|
|
Thanks to ADLab of Venustech for the report.
|
|
|
|
2018-11-03 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* configure.ac, doc/Makefile.in, doc/Doxyfile.in: remove these
|
|
template files and replace with static files. You can still build
|
|
the documentation with make -C doc
|
|
|
|
2018-11-03 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* Makefile.am, src: move the "useful" programs in src/ to examples/
|
|
and don't auto-install them. Even though they're useful, they are
|
|
intended as examples and aren't productised (no command-line
|
|
options, no man pages, etc.) -- if you disagree, feel free to
|
|
send in a patch
|
|
|
|
2018-11-01 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cabd_extract(): would not do decompression for random-access
|
|
offsets if the folder type was LZX. This is a fairly major bug,
|
|
and affects any decompression where you skip directly to a file,
|
|
or decompress data out-of-order. Thanks to austin987 for alerting
|
|
me to this.
|
|
|
|
This bug was introduced by the recent 'salvage mode' patch. Even
|
|
though I'd reviewed all the differences in clamav's copy of
|
|
libmspack and said "wtf" to this particular change, I didn't
|
|
notice it was still in the resulting patch I merged. Mea culpa :)
|
|
|
|
* test/cabd_test.c: now has a regression test to cover this
|
|
|
|
2018-10-31 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* Makefile.am, test/*_test.c: use the automake test-suite system
|
|
with the test-suite programs (cabd_test, chmd_test, kwajd_test).
|
|
This also fixes a longstanding bugbear that these programs don't
|
|
access their test files using an absolute path. Now this is passed
|
|
to them and you can run them from any directory. Thanks to Richard
|
|
Jones for requesting this.
|
|
|
|
2018-10-31 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* configure.ac: require at least automake 1.11, use AM_SILENT_RULES
|
|
unconditionally
|
|
|
|
2018-10-30 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* configure.ac: remove obsolescent C library tests. AC_HEADER_STDC is
|
|
removed, and so are most checks for standard C headers. libmspack now
|
|
makes these assumptions:
|
|
- <ctype.h> <limits.h> <stdlib.h> <string.h> exist
|
|
- <ctype.h> defines tolower()
|
|
- <string.h> defines memset(), memcmp(), strlen()
|
|
- if towlower() exists, it's defined in <wctype.h>
|
|
|
|
2018-10-22 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cabd.c: remove the only use of assert()
|
|
|
|
2018-10-20 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* src/chmextract.c: add anti "../" and leading slash protection to
|
|
chmextract. I'm not pleased about this. All the sample code provided
|
|
with libmspack is meant to be simple examples of library use, not
|
|
"productised" binaries. Making the "useful" code samples install
|
|
as binaries was a mistake. They were never intended to protect you
|
|
from unpacking archive files with relative/absolute paths, and I
|
|
would prefer that they never will be.
|
|
|
|
2018-10-17 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cab.h: Make the CAB block input buffer one byte larger, to allow
|
|
a maximum-allowed-size input block and the special extra byte added
|
|
after the block by cabd_sys_read_block to help Quantum alignment.
|
|
Thanks to Henri Salo for reporting this.
|
|
|
|
2018-10-17 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* chmd_read_headers(): again reject files with blank filenames, this
|
|
time because their 1st or 2nd byte is null, not because their length
|
|
is zero. Thanks again to Hanno Böck for finding the issue.
|
|
|
|
2018-10-16 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* Makefile.am: using automake _DEPENDENCIES for chmd_test appears to
|
|
override the default dependencies (e.g. sources), so libchmd.la was no
|
|
longer considered a dependency of chmd_test. This breaks parallel
|
|
builds like "make -j4". Added libchmd.la explicitly to dependencies.
|
|
Thanks to Thomas Deutschmann for reporting this.
|
|
|
|
2018-10-16 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cabd.c: add new parameter, MSCABD_PARAM_SALVAGE, which makes CAB file
|
|
reading and extraction more lenient, to allow damaged or mangled CABs
|
|
to be extracted. When enabled:
|
|
- cabd->open() won't reject cabinets with files that have invalid
|
|
folder indices or filenames. These files will simply be skipped
|
|
- cabd->extract() won't reject files with invalid lengths, but will
|
|
limit them to the maximum possible
|
|
- block output sizes over 32768 bytes won't be rejected
|
|
- invalid data block checksums won't be rejected
|
|
|
|
It's still possible for corrupted files to fail extraction, but more
|
|
data can be extracted before they do.
|
|
|
|
This new parameter doesn't affect the existing MSCABD_PARAM_FIXMSZIP
|
|
parameter, which ignores MSZIP decompression failures. You can enable
|
|
both at once.
|
|
|
|
Thanks to Micah Snyder from ClamAV for working with me to get this
|
|
feature into libmspack. This also helps ClamAV move towards using a
|
|
vanilla copy of libmspack without needing their own patchset.
|
|
|
|
2018-08-13 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* mspack.h: clarify that mspack_system.free() should allow NULL. If your
|
|
mspack_system implementation doesn't, it would already have crashed, as
|
|
there are several places where libmspack calls sys->free(NULL). This
|
|
change makes it official, and amends a few "if (x) sys->free(x)" cases
|
|
to the simpler "sys->free(x)" to make it clearer.
|
|
|
|
2018-08-09 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* Makefile.am: the test file cve-2015-4467-reset-interval-zero.chm is
|
|
detected by ClamAV as BC.Legacy.Exploit.CVE_2012_1458-1 "infected".
|
|
My hosting deletes anything that ClamAV calls "infected", so has been
|
|
continually deleting the official libmspack 0.7alpha release.
|
|
|
|
CVE-2012-1458 is the same issue as CVE-2015-4467: both libmspack, and
|
|
ClamAV using libmspack, could get a division-by-zero crash when the LZX
|
|
reset interval was zero. This was fixed years ago, but ClamAV still has
|
|
it as a signature, which today prevents me from releasing libmspack.
|
|
|
|
BC.Legacy.Exploit.CVE_2012_1458-1 is a bytecode signature, so I can't
|
|
see the exact trigger conditions, but I can see that it looks for the
|
|
"LZXC" signature of the LZX control file, so I've changed this to
|
|
"lzxc" and added a step in the Makefile to change it back to LZXC, so
|
|
I can release libmspack whether or not ClamAV keeps the signature.
|
|
|
|
2018-04-26 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* read_chunk(): the test that chunk numbers are in bounds was off
|
|
by one, so read_chunk() returned a pointer taken from outside
|
|
allocated memory that usually crashes libmspack when accessed.
|
|
Thanks to Hanno Böck for finding the issue and providing a sample.
|
|
|
|
* chmd_read_headers(): reject files with blank filenames. Thanks
|
|
again to Hanno Böck for finding the issue and providing a sample file.
|
|
|
|
2018-02-06 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* chmd.c: fixed an off-by-one error in the TOLOWER() macro, reported
|
|
by Dmitry Glavatskikh. Thanks Dmitry!
|
|
|
|
2017-11-26 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* kwajd_read_headers(): fix up the logic of reading the filename and
|
|
extension headers to avoid a one or two byte overwrite. Thanks to
|
|
Jakub Wilk for finding the issue.
|
|
|
|
* test/kwajd_test.c: add tests for KWAJ filename.ext handling
|
|
|
|
2017-10-16 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* test/cabd_test.c: update the short string tests to expect not only
|
|
MSPACK_ERR_DATAFORMAT but also MSPACK_ERR_READ, because of the recent
|
|
change to cabd_read_string(). Thanks to maitreyee43 for spotting this.
|
|
|
|
* test/msdecompile_md5: update the setup instructions for this script,
|
|
and also change the script so it works with current Wine. Again, thanks
|
|
to maitreyee43 for trying to use it and finding it not working.
|
|
|
|
2017-08-13 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* src/chmextract.c: support MinGW one-arg mkdir(). Thanks to AntumDeluge
|
|
for reporting this.
|
|
|
|
2017-08-13 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* read_spaninfo(): a CHM file can have no ResetTable and have a
|
|
negative length in SpanInfo, which then feeds a negative output length
|
|
to lzxd_init(), which then sets frame_size to a value of your choosing,
|
|
the lower 32 bits of output length, larger than LZX_FRAME_SIZE. If the
|
|
first LZX block is uncompressed, this writes data beyond the end of the
|
|
window. This issue was raised by ClamAV as CVE-2017-6419. Thanks to
|
|
Sebastian Andrzej Siewior for finding this by chance!
|
|
|
|
* lzxd_init(), lzxd_set_output_length(), mszipd_init(): due to the issue
|
|
mentioned above, these functions now reject negative lengths
|
|
|
|
2017-08-05 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* cabd_read_string(): add missing error check on result of read().
|
|
If an mspack_system implementation returns an error, it's interpreted
|
|
as a huge positive integer, which leads to reading past the end of the
|
|
stack-based buffer. Thanks to Sebastian Andrzej Siewior for explaining
|
|
the problem. This issue was raised by ClamAV as CVE-2017-11423
|
|
|
|
2016-04-20 Stuart Caie <kyzer@cabextract.org.uk>
|
|
|
|
* configure.ac: change my email address to kyzer@cabextract.org.uk
|
|
|
|
2015-05-10 Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_read_string(): correct rejection of empty strings. Thanks to
|
|
Hanno Böck for finding the issue and providing a sample file.
|
|
|
|
2015-05-10 Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am: Add subdir-objects option as suggested by autoreconf.
|
|
|
|
* configure.ac: Add AM_PROG_AR as suggested by autoreconf.
|
|
|
|
2015-01-29 Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.h: if C99 inttypes.h exists, use its PRI{d,u}{32,64} macros.
|
|
Thanks to Johnathan Kollasch for the suggestion.
|
|
|
|
2015-01-18 Stuart Caie <kyzer@4u.net>
|
|
|
|
* lzxd_decompress(): the byte-alignment code for reading uncompressed
|
|
block headers presumed it could wind i_ptr back 2 bytes, but this
|
|
hasn't been true since READ_BYTES was allowed to read bytes straddling
|
|
two blocks, leaving just 1 byte in the read buffer. Thanks to Jakub
|
|
Wilk for finding the issue and providing a sample file.
|
|
|
|
* inflate(): off-by-one error. Distance codes are 0-29, not 0-30.
|
|
Thanks to Jakub Wilk again.
|
|
|
|
* chmd_read_headers(), search_chunk(): another fix for checking pointer
|
|
is within a chunk, thanks again to Jakub Wilk.
|
|
|
|
2015-01-17 Stuart Caie <kyzer@4u.net>
|
|
|
|
* GET_UTF8_CHAR(): Remove 5/6-byte encoding support and check decoded
|
|
chars are no more than U+10FFFF.
|
|
|
|
* chmd_init_decomp(): A reset interval of 0 is invalid. Thanks to
|
|
Jakub Wilk for finding the issue and providing a sample and patch.
|
|
|
|
2015-01-15 Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd_read_headers(): add a bounds check to prevent over-reading data,
|
|
which caused a segfault on 32-bit architectures. Thanks to Jakub Wilk.
|
|
|
|
* search_chunk(): change the order of pointer arithmetic operations to
|
|
avoid overflow during bounds checks, which lead to segfaults on 32-bit
|
|
architectures. Again, thanks to Jakub Wilk for finding this issue,
|
|
providing sample files and a patch.
|
|
|
|
2015-01-08 Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_extract(): No longer uses broken state data if extracting from
|
|
folder 1, 2, 1 and setting up folder 2 fails. This prevents a jump to
|
|
null and thus segfault. Thanks to Jakub Wilk again.
|
|
|
|
* cabd_read_string: reject empty strings. They are not found in any
|
|
valid CAB files. Thanks to Hanno Böck for sending me an example.
|
|
|
|
2015-01-05 Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_can_merge_folders(): disallow folder merging if the combined
|
|
folder would have more than 65535 data blocks.
|
|
|
|
* cabd_decompress(): disallow files if their offset, length or
|
|
offset+length is more than 65535*32768, the maximum size of any
|
|
folder. Thanks to Jakub Wilk for identifying the problem and providing
|
|
a sample file.
|
|
|
|
2014-04-20 Stuart Caie <kyzer@4u.net>
|
|
|
|
* readhuff.h: fixed the table overflow check, which allowed one more
|
|
code after capacity had been reached, resulting in a read of
|
|
uninitialized data inside the decoding table. Thanks to Denis Kroshin
|
|
for identifying the problem and providing a sample file.
|
|
|
|
2013-05-27 Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/oabx.c: added new example command for unpacking OAB files.
|
|
|
|
2013-05-17 Stuart Caie <kyzer@4u.net>
|
|
|
|
* mspack.h: Support for decompressing a new file format, the Exchange
|
|
Offline Address Book (OAB). Thanks to David Woodhouse for writing
|
|
the implementation. I've bumped the version to 0.4alpha in celebration.
|
|
|
|
2012-04-15 Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd_read_headers(): More thorough validation of CHM header values.
|
|
Thanks to Sergei Trofimovich for finding sample files.
|
|
|
|
* read_reset_table(): Better test for overflow. Thanks again to
|
|
Sergei Trofimovich for generating a good example.
|
|
|
|
* test/chminfo.c: this test program reads the reset table by itself
|
|
and was also susceptible to the same overflow problems.
|
|
|
|
2012-03-16 Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am, configure.ac: make the GCC warning flags conditional
|
|
on using the GCC compiler. Thanks to Dagobert Michelsen for letting
|
|
me know.
|
|
|
|
2011-11-25 Stuart Caie <kyzer@4u.net>
|
|
|
|
* lzxd_decompress(): Prevent matches that go beyond the start
|
|
of the LZX stream. Thanks to Sergei Trofimovich for testing
|
|
with valgrind and finding a corrupt sample file that exercises
|
|
this scenario.
|
|
|
|
2011-11-23 Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd_fast_find(): add a simple check against infinite PMGL
|
|
loops. Thanks to Sergei Trofimovich for finding sample files.
|
|
Multi-step PMGL/PMGI infinite loops remain possible.
|
|
|
|
2011-06-17 Stuart Caie <kyzer@4u.net>
|
|
|
|
* read_reset_table(): wasn't reading the right offset for getting
|
|
the LZX uncompressed length. Thanks to Sergei Trofimovich for
|
|
finding the bug.
|
|
|
|
2011-05-31 Stuart Caie <kyzer@4u.net>
|
|
|
|
* kwajd.c, mszipd.c: KWAJ type 4 files (MSZIP) are now supported.
|
|
Thanks to Clive Turvey for sending me the format details.
|
|
|
|
* doc/szdd_kwaj_format.html: Updated documentation to cover
|
|
KWAJ's MSZIP compression.
|
|
|
|
2011-05-11 Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_find(): rethought how large vs small file support is
|
|
handled, as users were getting "library not compiled to support
|
|
large files" message on some small files. Now checks for actual
|
|
off_t overflow, rather than trying to preempt it.
|
|
|
|
2011-05-10: Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd.c: implemented fast_find()
|
|
|
|
* test/chmx.c: removed the multiple extraction orders, now it just
|
|
extracts in the fastest order
|
|
|
|
* test/chmd_order.c: new program added to test that different
|
|
extraction orders don't affect the results of extraction
|
|
|
|
* test/chmd_find.c: new program to test that fast_find() works.
|
|
Either supply your own filename to find, or it will try finding
|
|
every file in the CHM.
|
|
|
|
* configure.ac: because CHM fast find requires case-insensitive
|
|
comparisons, tolower() or towlower() are used where possible.
|
|
These functions and their headers are checked for.
|
|
|
|
* mspack.h: exposed struct mschmd_sec_mscompressed's spaninfo
|
|
and struct mschmd_header's first_pmgl, last_pmgl and chunk_cache
|
|
to the world. Check that the CHM decoder version is v2 or higher
|
|
before using them.
|
|
|
|
* system.c: set CHM decoder version to v2
|
|
|
|
2011-04-27: Stuart Caie <kyzer@4u.net>
|
|
|
|
* many files: Made C++ compilers much happier with libmspack.
|
|
Changed char * to const char * where possible.
|
|
|
|
* mspack.h: Changed user-supplied char * to const char *.
|
|
Unless you've written your own mspack_system implementation,
|
|
you will likely be unaffected.
|
|
If you have written your own mspack_system implementation:
|
|
1: change open() so it takes a const char *filename
|
|
2: change message() so it takes a const char *format
|
|
If you cast your function into the mspack_system struct,
|
|
you can change the cast instead of the function.
|
|
|
|
2011-04-27: Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am: changed CFLAGS from "-Wsign-compare -Wconversion
|
|
-pedantic" to "-W -Wno-unused". This enables more warnings, and
|
|
disables these specific warnings which are now a hindrance.
|
|
|
|
2011-04-27: Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/cabrip.c, test/chminfo.c: used macros from system.h for
|
|
printing offsets and reading 64-bit values, rather than
|
|
reinvent the wheel.
|
|
|
|
* cabd_can_merge_folders(): declare variables at the start of
|
|
a block so older C compilers won't choke.
|
|
|
|
* cabd_find(): avoid compiler complaints about non-initialised
|
|
variables. We know they'll get initialised before use, but the
|
|
compiler can't reverse a state machine to draw the same conclusion.
|
|
|
|
2011-04-26: Stuart Caie <kyzer@4u.net>
|
|
|
|
* configure.ac, mspack/system.h: Added a configure test to get
|
|
the size of off_t. If off_t is 8 bytes or more, we presume this
|
|
system has large file support. This fixes LFS detection for Fedora
|
|
x86_64 and Darwin/Mac OS X, neither of which declare FILESIZEBITS in
|
|
<limits.h>. It's not against the POSIX standard to do this: "A
|
|
definition of [FILESIZEBITS] shall be omitted from the <limits.h>
|
|
header on specific implementations where the corresponding value is
|
|
equal to or greater than the stated minimum, but where the value can
|
|
vary depending on the file to which it is applied."
|
|
(http://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html)
|
|
Thanks to Edward Sheldrake for the patch.
|
|
|
|
2011-04-26: Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd.c: all 64-bit integer reads are now consolidated into
|
|
the read_off64() function
|
|
|
|
* chmd_read_headers(): this function has been made resilient
|
|
against accessing memory past the end of a chunk. Thanks to
|
|
Sergei Trofimovich for sending me examples and analysis.
|
|
|
|
* chmd_init_decomp(): this function now reads the SpanInfo file
|
|
if the ResetTable file isn't available, it also checks that each
|
|
system file it needs is large enough before accessing it, and
|
|
some of its code has been split into several new functions:
|
|
find_sys_file(), read_reset_table() and read_spaninfo()
|
|
|
|
2011-04-26: Stuart Caie <kyzer@4u.net>
|
|
|
|
* mspack.h, chmd.c: now reads the SpanInfo system file if the
|
|
ResetTable file isn't available. This adds a new spaninfo pointer
|
|
into struct mschmd_sec_mscompressed
|
|
|
|
2011-04-26: Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/chminfo.c: more sanity checks for corrupted CHM files where
|
|
entries go past the end of a PMGL/PMGI chunk, thanks to
|
|
Sergei Trofimovich for sending me examples and analysis.
|
|
|
|
2011-04-25: Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_merge(): Drew D'Addesio showed me spanning cabinets which
|
|
don't have all the CFFILE entries they should, but otherwise have
|
|
all necessary data for extraction. Changed the merging folders
|
|
test to be less strict; if folders don't exactly match, warn which
|
|
files are missing, but allow merging if at least one necessary
|
|
file is present.
|
|
|
|
2010-09-24: Stuart Caie <kyzer@4u.net>
|
|
|
|
* readhuff.h: Don't let build_decode_table() allow empty trees.
|
|
It's meant to be special case just for the LZX length tree, so
|
|
move that logic out to the LZX code. Thanks to Danny Kroshin for
|
|
discovering the bug.
|
|
|
|
* lzxd.c: Allow empty length trees, but not other trees. If
|
|
the length tree is empty, fail if asked to decode a length symbol.
|
|
Again, thanks to Danny Kroshin for discovering the bug.
|
|
|
|
2010-09-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am: Set EXTRA_DIST so it doesn't include .svn
|
|
directories in the distribution, but does include docs.
|
|
|
|
2010-09-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am, configure.ac: Use modern auto* practises; turn on
|
|
automake silent rules where possible, use "m4" directory for libtool
|
|
macros, use LT_INIT instead of AC_PROG_LIBTOOL and use AM_CPPFLAGS
|
|
instead of INCLUDES. Thanks to Sergei Trofimovich for the patch.
|
|
|
|
2010-09-15: Stuart Caie <kyzer@4u.net>
|
|
|
|
* many files: Made the code compile with C++
|
|
- Renamed all 'this' variables/parameters to 'self'
|
|
- Added casts to all memory allocations.
|
|
- Added extern "C" to header files with extern declarations.
|
|
- Made system.c include system.h.
|
|
- Changed the K&R-style headers to ANSI-style headers in md5.c
|
|
|
|
2010-08-04: Stuart Caie <kyzer@4u.net>
|
|
|
|
* many files: removed unnecessary <unistd.h> include
|
|
|
|
2010-07-19: Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_md5.c, chmd_md5.c: Replace writing files to disk then
|
|
MD5summing them, with an MD5summer built into mspack_system.
|
|
Much, much faster results.
|
|
|
|
* qtmd_decompress(): Robert Riebisch pointed out a Quantum
|
|
data integrity check that could never be tripped, because
|
|
frame_todo is unsigned, so it will never be decremented
|
|
below zero. Replaced the check with one that assumes that
|
|
decrementing past zero wraps frame_todo round to a number
|
|
more than its maximum value (QTM_FRAME_SIZE).
|
|
|
|
2010-07-18: Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd.c: Special logic to pass cabd_sys_read() errors back
|
|
to cabd_extract() wasn't compatible with the decompressor
|
|
logic of returning the same error repeatedly once unpacking
|
|
fails. This meant that if decompressing failed because of
|
|
a read error, then the next file in the same folder would
|
|
come back as "no error", but the decompressed wouldn't have
|
|
even attempted to decompress the file. Added a new state
|
|
variable, read_error, with the same lifespan as a decompressor,
|
|
to pass the underlying reason for MSPACK_ERR_READ errors back.
|
|
|
|
* mszipd.c: improve MS-ZIP recovery by saving all the bytes
|
|
decoded prior to a block failing. This requires remembering
|
|
how far we got through the block, so the code has been made
|
|
slightly slower (about 0.003 seconds slower per gigabyte
|
|
unpacked) by removing the local variable window_posn
|
|
and keeping it in the state structure instead.
|
|
|
|
2010-07-16: Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am: strange interactions. When -std=c99 is used,
|
|
my Ubuntu's <stdio.h> (libc6-dev 2.11.1-0ubuntu7.2) does NOT
|
|
define fseeko() unless _LARGEFILE_SOURCE is also defined. But
|
|
configure always uses -std=gnu99, not -std=c99, so its test
|
|
determines _LARGEFILE_SOURCE isn't needed but HAVE_FSEEKO is
|
|
true. The implicit fseeko definition has a 32-bit rather than
|
|
64-bit offset, which means the mode parameter is interpreted
|
|
as part of the offset, and the mode is taken from the stack,
|
|
which is generally 0 (SEEK_SET). This breaks all SEEK_CURs.
|
|
The code works fine when -std=c99 is not set, so just remove
|
|
it for the time being.
|
|
|
|
2010-07-12: Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.c: Reject reading/writing a negative number of bytes.
|
|
|
|
* chmd.c: allow zero-length files to be seen. Previously they were
|
|
skipped because they were mistaken for directory entries.
|
|
|
|
2010-07-08: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd.c: Larry Frieson found an important bug in the Quantum
|
|
decoder. Window wraps flush all unwritten data to disk.
|
|
However, sometimes less data is needed, which makes
|
|
out_bytes negative, which is then passed to write(). Some
|
|
write() implementations treat negative sizes it as a large
|
|
positive integer and segfault trying to write the buffer.
|
|
|
|
* Makefile.am, test/*.c: fixed automake file so that the
|
|
package passes a "make distcheck".
|
|
|
|
2010-07-07: Stuart Caie <kyzer@4u.net>
|
|
|
|
* doc/szdd_kwaj_format.html: explain SZDD/KWAJ file format.
|
|
|
|
* lzssd.c: fixed SZDD decompression bugs.
|
|
|
|
* test/chmd_compare: Add scripts for comparing chmd_md5 against
|
|
Microsoft's own code.
|
|
|
|
* test/chmd_md5.c: remove the need to decompress everything
|
|
twice, as this is already in chmx.c if needed.
|
|
|
|
2010-07-06: Stuart Caie <kyzer@4u.net>
|
|
|
|
* many files: added SZDD and KWAJ decompression support.
|
|
|
|
2010-06-18: Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.h: expanded the test for 64-bit largefile support so
|
|
it also works on 64-bit native operating systems where you
|
|
don't have to define _FILE_OFFSET_BITS.
|
|
|
|
2010-06-17: Stuart Caie <kyzer@4u.net>
|
|
|
|
* libmspack.pc.in: Added pkg-config support. Thanks to
|
|
Patrice Dumas for the patch.
|
|
|
|
2010-06-14: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd.c, lzxd.c, mszipd.c: created new headers, readbits.h and
|
|
readhuff.h, which bundle up the bit-reading and huffman-reading
|
|
code found in the MSZIP, LZX and Quantum decoders.
|
|
|
|
2010-06-11: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd_static_init(): Removed function in favour of static const
|
|
tables, same rationale as for lzxd_static_init().
|
|
|
|
* qtmd_read_input(), zipd_read_input(): After testing against my
|
|
set of CABs from the wild, I've found both these functions _need_
|
|
an extra EOF flag, like lzxd_read_input() has. So I've added
|
|
it. This means CABs get decoded properly AND there's no reading
|
|
fictional bytes.
|
|
|
|
2010-06-03: Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/cabd_md5.c: updated this so it has better output and
|
|
doesn't need to be in the same directory as the files for multi-
|
|
part sets.
|
|
|
|
2010-05-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd_read_input(), zipd_read_input(): Both these functions are
|
|
essentially copies of lzxd_read_input(), but that has a feature
|
|
they don't have - an extra EOF flag. So if EOF is
|
|
encountered (sys->read() returns 0 bytes), these don't pass on the
|
|
error. Their respective bit-reading functions that called them
|
|
then go on to access at least one byte of the input buffer, which
|
|
doesn't exist as sys->read() returned 0. Thanks to Michael
|
|
Vidrevich for spotting this and providing a test case.
|
|
|
|
2010-05-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.h: It turns out no configure.ac tests are needed to
|
|
decide between __func__ and __FUNCTION__, so I put the standard
|
|
one (__func__) back into the D() macro, along with some
|
|
special-case ifdefs for old versions of GCC.
|
|
|
|
* lzxd_static_init(): Removed function in favour of static const
|
|
tables. Jorge Lodos thinks it causes multithreading problems, I
|
|
disagree. However, there are speed benefits to declaring the
|
|
tables as static const.
|
|
|
|
* cabd_init_decomp(): Fixed code which never runs but would write
|
|
to a null pointer if it could. Changed it to an assert() as it
|
|
will only trip if someone rewrites the internals of cabd.c. Thanks
|
|
to Jorge Lodos for finding it.
|
|
|
|
* inflate(): Fixed an off-by-one error: if the LITERAL table
|
|
emitted code 286, this would read one byte past the end of
|
|
lit_extrabits[]. Thanks to Jorge Lodos for finding it.
|
|
|
|
2010-05-06: Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/cabrip.c, test/chminfo.c: add fseeko() support
|
|
|
|
2009-06-01: Stuart Caie <kyzer@4u.net>
|
|
|
|
* README: clarify the extended license terms
|
|
|
|
* doc, Makefile.am: make the doxygen makefile work when using
|
|
an alternate build directory
|
|
|
|
2006-09-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.h: I had a choice of adding more to configure.ac to
|
|
test for __func__ and __FUNCTION__, or just removing __FUNCTION__
|
|
from the D() macro. I chose the latter.
|
|
|
|
* Makefile.am: Now the --enable-debug in configure will actually
|
|
apply -DDEBUG to the sources.
|
|
|
|
2006-09-20: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd_decompress(): Fixed a major bug in the QTM decoder, as
|
|
reported by Tomasz Kojm last year. Removed the restriction on
|
|
window sizes as a result. Correctly decodes the XLVIEW cabinets.
|
|
|
|
2006-08-31: Stuart Caie <kyzer@4u.net>
|
|
|
|
* lzxd_decompress(): Two major bugs fixed. Firstly, the R0/R1/R2
|
|
local variables weren't set to 1 after lzxd_reset_state().
|
|
Secondly, the LZX decompression stream can sometimes become
|
|
odd-aligned (after an uncompressed block) and the next 16 bit
|
|
fetch needs to be split across two input buffers, ENSURE_BITS()
|
|
didn't cover this case. Many thanks to Igor Glucksmann for
|
|
discovering both these bugs.
|
|
|
|
2005-06-30: Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd_search(): fixed problems with searching files > 4GB for
|
|
cabinets.
|
|
|
|
2005-06-23: Stuart Caie <kyzer@4u.net>
|
|
|
|
* qtmd_init(): The QTM decoder is broken for QTM streams with a
|
|
window size less than the frame size. Until this is fixed, fail
|
|
to initialise QTM window sizes less than 15. Thanks to Tomasz Kojm
|
|
for finding the bug.
|
|
|
|
2005-03-22: Stuart Caie <kyzer@4u.net>
|
|
|
|
* system.h: now undefs "read", as the latest glibc defines read()
|
|
as a macro which messes everything up. Thanks to Ville Skyttä for
|
|
the update.
|
|
|
|
2005-03-14: Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/multifh.c: write an mspack_system implementation that can
|
|
handle normal disk files, open file handles, open file descriptors
|
|
and raw memory all at the same time.
|
|
|
|
2005-02-24: Stuart Caie <kyzer@4u.net>
|
|
|
|
* chmd_read_headers(): avoid infinite loop when chmhs1_ChunkSize is
|
|
zero. Thanks to Serge Semashko for the research and discovery.
|
|
|
|
2005-02-18: Stuart Caie <kyzer@4u.net>
|
|
|
|
* mspack.h: renamed the "interface" parameter of mspack_version() to
|
|
"entity", as interface is a reserved word in C++. Thanks to Yuriy Z
|
|
for the discovery.
|
|
|
|
2004-12-09: Stuart Caie <kyzer@4u.net>
|
|
|
|
* lzss.h, szdd.h, szddd.h: more work on the SZDD/LZSS design.
|
|
|
|
2004-06-12: Stuart Caie <kyzer@4u.net>
|
|
|
|
* lzxd_static_init(): removed write to lzxd_extra_bits[52], thanks
|
|
to Nigel Horne from the ClamAV project.
|
|
|
|
2004-04-23: Stuart Caie <kyzer@4u.net>
|
|
|
|
* mspack.h: changed 'this' parameters to 'self' to allow compiling in
|
|
C++ compilers, thanks to Michal Cihar for the suggestion.
|
|
|
|
* mspack.h, system.h, mspack.def, winbuild.sh: integrated some changes
|
|
from Petr Blahos to let libmspack build as a Win32 DLL.
|
|
|
|
* chmd_fast_find(): added the first part of this code, and comments
|
|
sufficient to finish it :)
|
|
|
|
2004-04-08 Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/chminfo.c: added a program for dumping useful data from CHM
|
|
files, e.g. index entries and reset tables. I wrote this a while ago
|
|
for investigating a corrupt cabinet, but I never committed it.
|
|
|
|
2004-03-26 Stuart Caie <kyzer@4u.net>
|
|
|
|
* test/cabd_memory.c: added a new test example which shows an
|
|
mspack_system implementation that reads and writes from memory only,
|
|
no file I/O. Even the source code has a little cab file embedded in it.
|
|
|
|
2004-03-10 Stuart Caie <kyzer@4u.net>
|
|
|
|
* cabd.c: updated the location of the CAB SDK.
|
|
|
|
* cabd.c: changed a couple of MSPACK_ERR_READ errors not based on
|
|
read() failures into MSPACK_ERR_DATAFORMAT errors.
|
|
|
|
* mszipd_decompress(): repair mode now aborts after writing a
|
|
repaired block if the error was a hard error (e.g. read error, out
|
|
of blocks, etc)
|
|
|
|
2004-03-08 Stuart Caie <kyzer@4u.net>
|
|
|
|
* Makefile.am: now builds and installs a versioned library.
|
|
|
|
* mszipd.c: completed a new MS-ZIP and inflate implementation.
|
|
|
|
* system.c: added mspack_version() and committed to a versioned
|
|
ABI for the library.
|
|
|
|
* cabd.c: made mszip repair functionality work correctly.
|
|
|
|
* cabd.c: now identifies invalid block headers
|
|
|
|
* doc/: API documentation is now included with the library, not
|
|
just on the web.
|
|
|
|
* chmd.c: fixed error messages and 64-bit debug output.
|
|
|
|
* chmd.c: now also catches NULL files in section 1.
|
|
|
|
* test/chmx.c: now acts more like cabextract.
|
|
|
|
2003-08-29 Stuart Caie <kyzer@4u.net>
|
|
|
|
* ChangeLog: started keeping a ChangeLog :)
|
|
|