A heap buffer overflow could occur during resource cleanup if a
malloc fails when adding a regex pattern to the phishing suffix tree.
The solution is to increment suffix_cnt after cli_realloc succeeds.
The issue was identified using fault injection and is not a vulnerability.
Resolves: https://github.com/Cisco-Talos/clamav/issues/429
Add `sigtool --fuzzy-img` option to generate image fuzzy hash.
Also fix assorted warnings, mostly ensuring enough buffer space so format
strings aren't truncated.
For the dsig change: the returned string is allocated and is not const.
The caller will have to free it.
In 0.104 and prior, the function for adding a logical subsignature
was being used for NDB sigs, FTM, sigs, *and* Yara strings.
When cleaning up the logic for handling different types of logical
sig subsignatures, we accidentally broke support for regex strings in
yara rules.
This commit adds new logic for recording if the Yara string is a regex
string, by adding a regex subsig opt. Then in a new function for adding
different types of Yara strings, we check if it's a regex string or not
before adding as either a PCRE pattern or as an AC/BM pattern.
Resolves: https://github.com/Cisco-Talos/clamav/issues/494
Also add a basic test for yara regex rule.
Refactored the clamscan code that determines 'what to scan' in order
to clean up some very messy logic and also to get around a difference in
how vscode and clang-format handle formatting #ifdef blocks in the
middle of an else/if.
In addition to refactoring, there is a slight behavior improvement. With
this change, doing `clamscan blah -` will now scan `blah` and then also
scan `stdin`. You can even do `clamscan - blah` to now scan `stdin` and
then scan `blah`. Before, The `-` had to be the only "filename" argument
in order to scan from stdin.
In addition, added a bunch of extra empty lines or changing multi-line
function calls to single-line function calls in order to get around a
bug in clang-format with these two options do not playing nice together:
- AlignConsecutiveAssignments: true
- AlignAfterOpenBracket: true
AlignAfterOpenBracket is not taking account the spaces inserted by
AlignConsecutiveAssignments, so you end up with stuff like this:
```c
bleeblah = 1;
blah = function(arg1,
arg2,
arg3);
// ^--- these args 4-left from where they should be.
```
VSCode, meanwhile, somehow fixes this whitespace issue so code that is
correctly formatted by VSCode doesn't have this bug, meaning that:
1. The clang-format check in GH Actions fails.
2. We'd all have to stop using format-on-save in VSCode and accept the
bug if we wanted those GH Actions tests to pass.
Adding an empty line before variable assignments from multi-line
function calls evades the buggy behavior.
This commit should resolve the clang-format github action test failures,
for now.
EGG archives may have individually encrypted files, or may specify
encryption for the whole archive. For those that have individually
encrypted files, the clean-up code neglects to free the encrypt
structure, which holds 2 pointers (16 bytes on 64bit machines).
This commit adds that missing free.
Thank you Michał Dardas for reporting this issue.
The `have_clamjit` global is used in the unit tests but doesn't appear
to be exported when I was testing the external LLVM runtime support PR,
resulting in an undefined symbol issue. Converting this to a function
that returns 0 or 1 instead of a global variable resolved the issue.
Modified bytecode JIT code to support llvm versions 8, 9, 10, 11, 12.
Additionally updated FindLLVM.cmake to the current version, found at
https://github.com/ldc-developers/ldc/blob/master/cmake/Modules/FindLLVM.cmake,
as well as making modifications suggested by Micah Snyder to check VERSION_LESS
in CMakeLists.txt to avoid having modifications to FindLLVM.cmake.
Validate the length of the crt subject before memcpying it.
This resolves a possible multibyte heap buffer overread.
We determiend that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43828
The endianness conversion when reading the PE image import descriptor is
making the change in-place in the fmap.
On Windows, the fmap is read-only and so in Debug builds that's causing
a crash.
This uses a buffer on the stack and copies each image import descriptor
before doing the conversions and then processing each.
* Broke out the variants of error/result handling in `frs_error.rs`.
Made syntax slightly cleaner for `frs_call!`, explicitly moving
the output variables *out* of the function call so as not to make
the parameter order confusing.
* Wrapped the FuzzyHash map into a container rather than exposing
the HashMap directly. Simplifies casting, and allows it to feel
more like a class with methods.
* Fixed various clippy complaints regarding unsafe, etc.
* Rename `frs_error.rs` to `ffi_utils.rs` and migrated ffi-specific
features like the `validate_str_param!()` macro to this new module.
Extends the new frs_error module to provide variants of the
frs_result!() macro that accept a Result as input instead of calling a
function on your behalf. This enables us to use the macro in conditions
where we don't want to return on success, and want to do other things
before we return.
Use the new frs_error module to return errors to the C calling functions
rather than logging the error in Rust-land.
Notably, this enables us to store more meaningful error messages in the
JSON output if we fail to calculate the image fuzzy hash.
Add a new logical signature subsignature type for matching on images
with image fuzzy hashes.
Image fuzzy hash subsigantures follow this format:
fuzzy_img#<hash>#<dist>
In this initial implementation, the hamming distance (dist) is ignored
and only exact fuzzy hash matches will alert.
Fuzzy hash matching is only performed for supported image types.
Also: removed some excessive debug log messages on start-up.
Fixed an issue where the signature name (virname) is being allocated and
stored for every subsignature or even ever sub-pattern in an AC-pattern
(i.e. NDB sig or LDB subsig) containing a `{n-m}` or `*` wildcard.
This fix is only for LDB subsigs though. NDB signatures are still
allocaing one virname per sub-pattern.
This fix was required because I needed a place to store the virname with
fuzzy-hash subsignatures. Storing it in the fuzzy-hash subsig
metadatathe way AC-pattern, PCRE, and BComp subsigs were doing it
wouldn't work because it would cross the C-Rust FFI boundary and giving
pointers to Rust allocated stuff is dicey. Not to mention native Rust
strings are different thatn C strings. Anyways, the correct thing to do
was to store the virname with the actual logical signature.
TODO: Keep track of NDB signatures in the same way and store the virname
for NDB sigs there instead of in AC-patterns so that we can get rid of
the virname field in the AC-pattern struct.
Fix two locations where the stack-allocated arrays lack space for a null-
terminating byte and could overwrite the array in:
- dsig.c
- sigtool.c
The ClamAV team verified that these overflows are not a security issue.
The logic for parsing a logical subsignature isn't clearly identified
and has been, perhaps mistakenly or out of convenience, used to when
parsing NDB signatures in addition to LDB subsignatures. What this means
is that you can technically use a PCRE subsignature in an NDB file and
clam won't complain about it. It won't work however, because a PCRE
subsignature requires another matching subsignature to trigger it, but
it will parse. The same is likely true for byte-compare subsignatures.
This commit restructures that logic a bit so subsignature parsing has
its own function and is more organized.
I also renamed the functions a little bit and added lots of comments.
I fixed a few minor warnings relating to format string characters.
The change in str.c:cli_ldbtokenize is to prevent a buffer under-read if
you were to use the function on the start of a buffer, as is now down in
this commit.
Corrected buffer size check in the regex signature parsing code.
This resolves a possible 1-byte stack buffer overwrite (NULL byte).
We determined that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43869
The XLS parsing code for extracting images and XLM macros is alerting
with Heuristics.OLE2.ContainsMacros.XLM if any images are found if the
--alert-macros (AlertOLE2Macros) option is enabled.
This fix moves the alert logic before the XLM & image extraction logic
where we know at least one macro exists, but before we try to extract it
I've wrapped it with an "if (has_xlm)" condition.
Resolves:
- https://github.com/Cisco-Talos/clamav/issues/423
- https://bugzilla.clamav.net/show_bug.cgi?id=12844
The byte compare feature in logical signatures will cause the rule to
alert if it successfully matches regardless of the rest of the logical
signature.
An easy way to test this is with a logical signature that has two
bcomp subsignatures and requires both to match for the rule to alert.
In the following example, we have 4 signatures where
- the first will match both bcomp subsigs.
- the second will match neither.
- the last two match just one bcomp subsig.
In an --allmatch test, you'll find that the 3 of these match, with the
first one matching *twice*, once for each bcomp subsig.
test.ldb:
```
bcomp.both;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=123);4242;2(>>5#hb2#=255)
bcomp.neither;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=124);4242;2(>>5#hb2#=254)
bcomp.second;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=124);4242;2(>>5#hb2#=255)
bcomp.first;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=123);4242;2(>>5#hb2#=254)
```
test.sample:
```
AA = 7B; BB = FF
```
You can also try a similar test to compare the behavior with regular
ac-pattern-match subsigs with this lsig-test.ldb:
```
pattern.both;Engine:51-255,Target:0;0&1;4141;4242
pattern.neither;Engine:51-255,Target:0;0&1;4140;4241
pattern.second;Engine:51-255,Target:0;0&1;4140;4242
pattern.first;Engine:51-255,Target:0;0&1;4141;4241
```
This commit fixes the issue by incrementing the logical subsignature
count for each bcomp subsig match instead of appending an alert for
each bcomp match.
Also removed call to `lsig_sub_matched()` that didn't do anything.
* Added loglevel parameter to logg()
* Fix logg and mprintf internals with new loglevels
* Update all logg calls to set loglevel
* Update all mprintf calls to set loglevel
* Fix hidden logg calls
* Executed clam-format
Add support for compiling with external TomsFastMath library provided by
the system instead of compiling the vendored copy into libclamav.
The vendored source is still built directly into libclamav instead of as
a separate library the way libmspack is done.
The rationale is that:
A) it's more complicated to deal with possibly compiling as static or
dynamic, and also
B) libmspack and libunrar are compiled separately primarily because of
licensing concerns. TomsFastMath public domain, so that isn't a concern.
Resolves: https://bugzilla.clamav.net/show_bug.cgi?id=12562
Update the vendored TomsFastMath (TFM) library to v0.13.1.
Resolves: https://bugzilla.clamav.net/show_bug.cgi?id=11992
I removed compatibility macro's from when libTomMath was used.
This required removing a bunch of faux-error handling because
the fast-math equivalent functions return void, and cannot fail.
The previous version used had named the header "bignum_fast.h"
instead of "tfm.h" and had customizations in that header to enable
TFM_CHECK all the time, and also TFM_NO_ASM if __GNUC__ not defined
or if the system isn't 64bit architecture. This update uses tfm.h
as-is, and has CMake define TFM_CHECK and TFM_NO_ASM as needed.
I've kept bignum.h as an interface to including tfm.h so that in
the future we can more easily add support for system-installed
TomsFastMath instead of the vendored one, taking inspiration from
Debian's patch to support system-TomsFastMath.
See: https://salsa.debian.org/clamav-team/clamav/-/blob/unstable/debian/patches/add-support-for-system-tomsfastmath.patch
The logging functions use a callback to print log messages. Because the
callback could be anything provided by an application, it isn't safe to
log while holding a mutex.
This commit defers error reporting in cacheset_add() to prevent running
the callback while the mutex is held.
Resolves https://bugzilla.clamav.net/show_bug.cgi?id=12592
A null-dereference crash happens if the database directory contains:
a good `daily.cvd` symlink
a broken `daily.cld` symlink
or:
a broken `daily.cvd` symlink
a good `daily.cld` symlink
You're not supposed to have both a .cvd and .cld for the same file
at the same time. And making individual symlinks for the database
files is also unexpected.
That is to say that this is not an intended or supported use case
for the database directory. But it's a fairly simple bug to fix.
The issue is that a bad symlink is still added to the database load-
list after the access check fails. This commit skips daily databases
that fail the access check.
The PE section hash scanning code didn't implement the all-match check.
While this check isn't the ideal implementation for all-match mode...
(see the commit message for the previous commit)
...it's simple enough to add the all-match check here for now.
All-match mode is presently not working if you have multiple hash-based
signatures for the same file. That is, only the first signature will
alert before moving on to other parts of the scan.
Our current all-match implementation adds checked into every file parser
(often at multiple layers in the scanning process) to make sure that the
scan continued even if an alert is found, but only when in all-match mode.
This is clearly an error-prone approach.
Instead, I would like to change it so that every alert appends to an alert
list. When in all-match mode, the scan status would continue with "clean"
until the very end when the process is complete and the list of alerts is
evaluated.
Unfortunately, that's a much bigger job. For this specific bug, it is
simple enough to fix the broken logic that aborts the hash scanning
early. In the future, we should rework all of this logic throughout
libclamav, as described above.
In testing, I found that libclang.so/clang.dll is required by bindgen
and was not found on all of our test machines. To resolve this we will
only generate sys.rs bindings when CMake MAINTAINER_MODE option is "ON".
This is unfortunate that we have to commit generated source to version
control. But as a benefit it makes rust-analyzer happier when editing a
workspace that hasn't yet been compiled. And it makes it more reasonable
that the generated sys.rs file generated to the source directory and not
the build directory (something we hadn't resolved yet).
Use bindgen to generate Rust-bindings for some libclamav internal
functions and structures in an new "sys.rs" module.
"sys.rs" will be generated at build-time and unfortunately must be
dropped in the source/libclamav_rust/src directory rather than under the
build directory. As far as I know, Cargo/Rust provide no way to set an
include path to the build directory to separately place generated files.
TODO: Verify if this is true and move sys.rs to the build directory if
possible.
Using the new bindings with:
- the logging module.
- the cdiff module.
Also:
- Removed clamav_rust.h from .gitignore, because we generate it in the
build directory now.
- Removed the hand-written bindings from the cbindgen exclusions.
lib.rs has an annotation that prevents cbindgen from looking at sys.rs.
- Fixed a `u8` -> `c_char` type issue in cdiff in the cli_getdsig() call
parameters.
We received a high FP report rate for Heuristics.PNG.CVE-2010-1205.
This is a 11 year old CVE and is likely no longer relevant anyways.
Removing Heuristics.PNG.CVE-2010-1205 means there is no longer any point
decompressing the PNG image data in the PNG CVE checker. Removing image
data decompression in the CVE checker should improve PNG image scan
times.
The OOXML parser in libclamav may try to extract an entry that is
missing a file name. This results in an invalid 0x1 pointer dereference
in the ZIP parser that is likely to crash the scanning application.
This commit fixes the issue by requiring both the PartName (PN) *and*
the ContentType (CT) variables to be non-NULL or else the entry will be
skipped.
Thank you Laurent Delosieres for reporting this issue.
Rustify cdiff_apply() and clean up error handling:
- Restore [some] safety and clean up error handling.
- Use rust-crypto sha2 instead of OpenSSL's.
Fix signedness of cli_versig2() dsig parameter.
c_char may be an i8 or u8 depending on platform:
https://doc.rust-lang.org/src/std/os/raw/mod.rs.html#91-133
Rustify cmd_close():
- Consolidate DEL/XCHG records.
- Tidy up ADD handling.
- Various error handling cleanup, etc.
You should be able to disable the maxfilesize limit by setting it to
zero. When "disabled", ClamAV should defer to inherent limitations, which
at this time is INT_MAX - 2 bytes.
This works okay for ClamScan and ClamD because our option parser
converts max-filesize=0 to 4294967295 (4GB). But it is presently broken
for other applications using the libclamav C API, like this:
```c
cl_engine_set_num(engine, CL_ENGINE_MAX_FILESIZE, 0);
```
The limit checks added for cl_scanmap_callback and cl_scanfile_callback
in 0.103.4 and 0.104.1 broke this ability because we forgot to check if
the `maxfilesize > 0` before enforcing it.
This commit adds that guard so you can disable by setting to `0`.
While working on this, I also found that the `max_size` variables in our
libmspack scanner code are using an `off_t` type, which is a SIGNED integer
that may be 32bit width even on some 64bit platforms, or may be a 64bit
width. AND the default `max_size` when `maxfilesize == 0` was being set to
UINT_MAX (0xffffffff), aka `-1` when `off_t` is 32bits.
This commit addresses this related issue by:
- changing the `max_size` to use `uint64_t`, like our other limits.
- verifying that `maxfilesize > 0` before using it.
- checking that using `UINT32_MAX` as a backup will not exceed the
max-scansize in the same way that we do with the maxfilesize.
The max bytes supplied to strftime should be the length of result
string, including the terminating null byte.
Without the extra byte for the terminating null byte, the output is one
byte too long and results in undefined behavior:
If the length of the result string (including the terminating null
byte) would exceed max bytes, then strftime() returns 0, and the
contents of the array are undefined.
Also resolve alleged uninitialized memory use by initializing the
`digest` variable in `cli_md5buff()`. MSAN blames it for putting
uninitialized data in the `name_salt` global, though in debugging and in
review I can't find any evidence that it isn't initialized by the call
to `cl_hash_data()` in `cli_md5buff()`.
This MSAN complaint has been a blocker to enabling MSAN in OSS-Fuzz.
Since uClibc can be configured without support for backtrace, disable
the backtrace if we are building with a uClibc that was built without
backtrace.
This is a bit hacky, and would greatly benefit from a test in ./configure
instead, but does nicely as a quick fix for now.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
[Bernd: rebased for 0.103.0]
[Fabrice: retrieved from
https://git.buildroot.net/buildroot/tree/package/clamav/0002-mbox-do-not-use-backtrace-if-using-uClibc-without-ba.patch]
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Convert cli_dbgmsg to inline function to ensure ctx check for debug flag
is always run
Add copyright and licensing info
Fix valgrind uninitialized buffer issue in cliunzip.c
Windows build fix
The `realpath()` function on macOS will fail when scanning a symlink
that doesn't exist because it tries to determine the real path of the
thing the symlink points to and fails when that thing doesn't exist.
This behavior is different than on Linux or FreeBSD Unix.
I'm fixing this by opening the symlink with `O_SYMLINK` then getting
realpath of the link using `cli_get_filepath_from_filedesc()`, much
like we do on Windows.
Side note: I did try using macOS's _DARWIN_BETTER_REALPATH variant to
see if that resolved the issue, but it didn't seem to.
This resolves https://bugzilla.clamav.net/show_bug.cgi?id=12792
This commit also removes the problematic "access denied" clamd test from
the test suite. This commit broke that test on macOS because the error
message when clamdscan fails to `open()` the file to check the
realpath() is different than the error message when clamd tries to scan
it, but has access denied.
(It's like "Permission denied" instead of "Access denied")
It's not worth #ifdef'ing around macOS to get a test pass because this
test is already causing problems in 32-bit Docker environments where
access isn't actually denied to the user (!). Honestly, it's not worth
keeping the test that simply verifies ClamD's error message when denied
matches expectations.
I also switched to use the C_DARWIN macro instead of __APPLE__ because
I kno C_DARWIN is defined in clamav-config.h on macOS, and I found that
__APPLE__ wasn't defined when I tested.
Accidentally introduced an invalid format string character, which is
apparently just a warning. ¯\_(ツ)_/¯
But Coverity really didn't like it, and now that I know about it,
neither do I...
The zip parser may leak a string for a zip record filename if the
the record in the central directory is the last record.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31760
It also appears that there may be a related record filename leak if an
error occured when indexing the files in the central directory header,
but I don't have any test file for this but it was an obvious fix.