As of ClamAV 0.105, libjson-c is required.
There is also no option to disable libjson-c support.
This commit removes the dead code associated with the old build
option.
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2
... and these names do not convey their purpose.
This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2
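
As a rough illustration of what a size-capped wrapper looks like, here is a
minimal sketch. The name `sketch_max_malloc` and the 1 GiB value of
`SKETCH_MAX_ALLOCATION` are assumptions made up for this example; they are not
the actual libclamav implementation or its limit.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cap chosen for this sketch; not ClamAV's real limit. */
#define SKETCH_MAX_ALLOCATION ((size_t)1024 * 1024 * 1024)

/*
 * Allocate `size` bytes, but refuse zero-byte requests and anything above
 * the cap. Intended for sizes derived from untrusted input, where a crafted
 * file could claim to need an absurd amount of memory.
 */
static void *sketch_max_malloc(size_t size)
{
    if (size == 0 || size > SKETCH_MAX_ALLOCATION) {
        fprintf(stderr, "sketch_max_malloc: refusing to allocate %zu bytes\n", size);
        return NULL;
    }
    return malloc(size);
}
```

A caller would pass a length read from a file header to a wrapper like this
and treat a NULL return as "malformed or too large", while fixed-size
allocations can go straight to malloc.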
The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-frees.
So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2
These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
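
The "do not free on a zero-byte realloc" behavior described above could look
roughly like the sketch below. The name `sketch_safer_realloc` is hypothetical
and the body is an assumption about the general technique, not a copy of the
libclamav functions.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Resize `ptr` to `size` bytes with no upper limit.
 * A zero size is rejected up front and never forwarded to realloc(), so the
 * original allocation is left untouched and still owned by the caller.
 * Returns the new pointer, or NULL on failure (ptr remains valid).
 */
static void *sketch_safer_realloc(void *ptr, size_t size)
{
    if (size == 0) {
        return NULL;
    }
    /* Standard realloc() leaves `ptr` valid when it fails. */
    return realloc(ptr, size);
}
```

Because the caller always keeps ownership of the old pointer when NULL comes
back, a later free() of that pointer cannot turn into a double-free.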
There are a large number of allocations for fixed-size buffers using the
`cli_malloc` and `cli_calloc` calls, which check whether the requested size
is larger than our allocation threshold for allocations based on untrusted
input. These fixed-size allocations will *always* be below the threshold, so
the extra stack frame and check for these calls are a waste of CPU.
This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC
I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.
I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-frees.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocations for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.
There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
* Added loglevel parameter to logg()
* Fixed logg and mprintf internals with new loglevels
* Updated all logg calls to set loglevel
* Updated all mprintf calls to set loglevel
* Fixed hidden logg calls
* Ran clam-format
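
For readers unfamiliar with the pattern, a logging function that takes an
explicit level as its first parameter might be sketched as follows. All names
here (`sketch_loglevel_t`, `sketch_logg`, `sketch_min_level`) and the prefix
strings are invented for illustration and are not the real logg() signature.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical levels, ordered from least to most severe. */
typedef enum {
    SKETCH_LOG_DEBUG,
    SKETCH_LOG_INFO,
    SKETCH_LOG_WARNING,
    SKETCH_LOG_ERROR
} sketch_loglevel_t;

/* Messages below this level are dropped. */
static sketch_loglevel_t sketch_min_level = SKETCH_LOG_INFO;

/*
 * printf-style logger with the level passed explicitly instead of being
 * encoded in the message text. Returns the number of characters written
 * for the formatted message, or 0 if the message was filtered out.
 */
static int sketch_logg(sketch_loglevel_t level, const char *fmt, ...)
{
    static const char *const prefix[] = {"DEBUG: ", "", "WARNING: ", "ERROR: "};
    va_list args;
    int written;

    if (level < sketch_min_level) {
        return 0;
    }

    fputs(prefix[level], stderr);
    va_start(args, fmt);
    written = vfprintf(stderr, fmt, args);
    va_end(args);
    return written;
}
```

With the level as a parameter, every call site states its severity directly,
e.g. `sketch_logg(SKETCH_LOG_ERROR, "scan failed: %d\n", rc);`.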
A heap buffer over-read may occur in the OLE2 parser if the --gen-json
option is enabled (the CL_SCAN_GENERAL_COLLECT_METADATA scan option).
The issue occurs because a string input is not checked to verify if it
is empty (zero-byte length) prior to use.
We determined that this issue is not exploitable to cause a crash or to
do anything malicious. The over-read (strictly speaking, an under-read)
touches the 1 byte before a malloced buffer.
This commit adds checks to the function parameters in case the original
pointer itself is NULL, and to account for conversion of an empty string.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39673
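
The class of bug is easy to reproduce in miniature: any code that inspects
the last byte of a string reads one byte before the buffer when the length is
zero. The helper below is a hypothetical sketch of the guard, not the actual
OLE2 conversion code; `sketch_copy_trimmed` is an invented name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy `len` bytes of `in` into a new NUL-terminated string, dropping one
 * trailing NUL byte if present. Returns NULL for a NULL pointer, an empty
 * input, or allocation failure.
 */
static char *sketch_copy_trimmed(const char *in, size_t len)
{
    char *out;

    /* Without the len == 0 check, in[len - 1] below would read in[-1]. */
    if (in == NULL || len == 0) {
        return NULL;
    }

    if (in[len - 1] == '\0') {
        len--;
    }

    out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, in, len);
    out[len] = '\0';
    return out;
}
```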
- 192959 Resource leak - In cli_bcomp_compare_check: Leak of
memory or pointers to system resources. Several fail cases
could lead to `buffer` or `tmp_buffer` being leaked
- 192934 Resource leak - In cli_bcomp_normalize_buffer: Leak of
memory or pointers to system resources. `hex_buffer` leaked
under certain conditions
- 185977 Resource leak - In ole2_process_property: Leak of memory
or pointers to system resources. A fail case could lead to
`outstr` and `outstr2` being leaked
- 185941 Resource leak - In header_cb (clamsubmit): Leak of
memory or pointers to system resources. A fail case could lead
to `mem` being leaked
- 185925 Resource leak - In load_oneyara: Leak of memory or
pointers to system resources. Several fail cases could lead
to `newident` being leaked
- 185918 Resource leak - In parsehwp3_docsummary: Leak of memory
or pointers to system resources. Not actually a leak, but
caused by checking for a condition that can’t occur.
- 185915 Resource leak - In parsehwp3_docinfo: Leak of memory or
pointers to system resources. Not actually a leak, but caused
by checking for a condition that can’t occur.
- 147644 Resource leak - In tcpserver: Leak of memory or pointers
to system resources. A fail case could lead to `info` being leaked
- 147642 Resource leak - In onas_ht_add_hierarchy: Leak of memory
or pointers to system resources. Several fail cases could lead
to `hnode` or `elem` memory leaks
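
Most of the leaks above share one shape: an early return on a fail case that
skips a free(). One common way to close them, shown here as a generic sketch
rather than the actual patches, is a single cleanup label that every path
falls through. The function `sketch_compare_prefix` is hypothetical; only the
variable names echo the report.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Compare the first n bytes of a and b using two scratch buffers.
 * Returns 0 if they match, -1 on mismatch or any failure.
 * Both buffers are released on every path via the `done` label.
 */
static int sketch_compare_prefix(const char *a, const char *b, size_t n)
{
    int ret          = -1;
    char *buffer     = NULL;
    char *tmp_buffer = NULL;

    if (a == NULL || b == NULL || n == 0) {
        goto done;
    }

    buffer = malloc(n);
    if (buffer == NULL) {
        goto done;
    }

    tmp_buffer = malloc(n);
    if (tmp_buffer == NULL) {
        goto done; /* buffer is still freed below */
    }

    if (strlen(a) < n || strlen(b) < n) {
        goto done; /* fail case after both allocations: nothing leaks */
    }

    memcpy(buffer, a, n);
    memcpy(tmp_buffer, b, n);
    ret = (memcmp(buffer, tmp_buffer, n) == 0) ? 0 : -1;

done:
    free(tmp_buffer); /* free(NULL) is a no-op */
    free(buffer);
    return ret;
}
```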
Also relocated the codepage table from msdoc.h to entconv.h.
Also added new macros for codepages to reduce use of magic numbers when
referencing code pages elsewhere in libclamav.
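
To show the idea of named codepage macros, here is a small sketch. The macro
and function names are invented for this example and may not match what
entconv.h defines; the numeric values are the standard Windows code page
identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro names; values are Windows code page identifiers. */
#define SKETCH_CODEPAGE_UTF16_LE 1200
#define SKETCH_CODEPAGE_UTF16_BE 1201
#define SKETCH_CODEPAGE_US_ASCII 20127
#define SKETCH_CODEPAGE_UTF8 65001

struct sketch_codepage_entry {
    uint16_t codepage;
    const char *encoding;
};

/* Tiny stand-in for a codepage table. */
static const struct sketch_codepage_entry sketch_codepage_table[] = {
    {SKETCH_CODEPAGE_UTF16_LE, "UTF-16LE"},
    {SKETCH_CODEPAGE_UTF16_BE, "UTF-16BE"},
    {SKETCH_CODEPAGE_US_ASCII, "US-ASCII"},
    {SKETCH_CODEPAGE_UTF8, "UTF-8"},
};

/* Look up the encoding name for a codepage, or NULL if it is unknown. */
static const char *sketch_codepage_name(uint16_t codepage)
{
    size_t i;
    size_t count = sizeof(sketch_codepage_table) / sizeof(sketch_codepage_table[0]);

    for (i = 0; i < count; i++) {
        if (sketch_codepage_table[i].codepage == codepage) {
            return sketch_codepage_table[i].encoding;
        }
    }
    return NULL;
}
```

A comparison such as `codepage == SKETCH_CODEPAGE_UTF8` then reads more
clearly at the call site than `codepage == 65001`.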
A way is needed to record scanned file names for two purposes:
1. File names (and extensions) must be stored in the json metadata
properties recorded when using the --gen-json clamscan option. Future
work may use this to compare file extensions with detected file types.
2. File names are useful when interpreting tmp directory output when
using the --leave-temps option.
This commit enables file name retention for later use by storing file
names in the fmap header structure, if a file name exists.
To store the names in fmaps, an optional name argument has been added to
any internal scan APIs that create fmaps and every call to these APIs
has been modified to pass a file name or NULL if a file name is not
required. The zip and gpt parsers required some modification to record
file names. The NSIS and XAR parsers fail to collect file names at all
and will require future work to support file name extraction.
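
The optional-name approach can be pictured with the simplified sketch below.
`struct sketch_map` and its functions are hypothetical stand-ins and do not
reflect the real fmap structure or API; the point is only that the name is
copied when given and left NULL otherwise.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a map header that can carry a file name. */
struct sketch_map {
    const void *data;
    size_t len;
    char *name; /* owned copy of the file name, or NULL if none was given */
};

/* Create a map over `data`; `name` is optional and may be NULL. */
static struct sketch_map *sketch_map_open(const void *data, size_t len, const char *name)
{
    struct sketch_map *m = calloc(1, sizeof(*m));
    if (m == NULL) {
        return NULL;
    }
    m->data = data;
    m->len  = len;

    if (name != NULL) {
        size_t n = strlen(name) + 1;
        m->name  = malloc(n);
        if (m->name == NULL) {
            free(m);
            return NULL;
        }
        memcpy(m->name, name, n);
    }
    return m;
}

/* Release the map and its name copy; the mapped data is not owned. */
static void sketch_map_close(struct sketch_map *m)
{
    if (m != NULL) {
        free(m->name);
        free(m);
    }
}
```

Parsers that know the embedded file's name would pass it through, and those
that do not would pass NULL.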
Also:
- Added recursive extraction to the tmp directory when the
--leave-temps option is enabled. When not enabled, the tmp directory
structure remains flat so as to reduce the likelihood of exceeding
MAX_PATH. The current tmp directory is stored in the scan context.
- Made the cli_scanfile() internal API non-static and added it to
scanners.h so it would be accessible outside of scanners.c in order to
remove code duplication within libmspack.c.
- Added function comments to scanners.h and matcher.h
- Converted TDB-type macros and LSIG-type macros to enums for improved
type safety.
- Converted more return status variables from `int` to `cl_error_t` for
improved type safety, and corrected ooxml file typing functions so
they use `cli_file_t` exclusively rather than mixing types with
`cl_error_t`.
- Restructured the magic_scandesc() function to use gotos for error
handling and removed the early_ret_from_magicscan() macro and
magic_scandesc_cleanup() function. This makes the code easier to
read and makes it easier to add the recursive tmp directory cleanup to
magic_scandesc().
- Corrected zip, egg, rar filename extraction issues.
- Removed use of extra sub-directory layer for zip, egg, and rar file
extraction. For Zip, this also involved changing the extracted
filenames to be randomly generated rather than using the "zip.###"
file name scheme.