In the core `cli_magic_scan()` function, early termination of the scan
before hashing the file, such as a return from the prescan-callback,
could result in the hashed_size variable being uninitalized when calling
into `cache_add()`. That's okay because we now skip caching of the md5 hash
when it is NULL, hence "benign". This is just a minor code quality fix.
Changes include:
* Change include of system regex headers to internal
* Add cli prefix to regex functions
* Change cli_regcomp to cli_regcomp_real to work with the
others_common.c regex interface
* Optimize re_guts struct:
- Reordering fields allows the struct to fit within 16 bytes vs 20
bytes. This helps to fix a bug on legacy 64-bit systems where
there was a behaviour difference between 32 and 64 systems.
- see bb 474 for further details
* Fix out of memory condition
- see bb 849 for further details
- reported by Gianluigi Tiesi <sherpya*netfarm.it>
* Remove duplicate nomem check
* Avoid passing out-of-range values to isalnum
- reported by Nigel
* Avoid name collisions on AIX
* Fix compiler warnings
* Fix error path leak in regex/engine.c
* Fix regex when sizeof(void*) != sizeof(long) for 64bit Windows
- see bb 2232 for further Details
- reported by Martin Olsen
* Add error case safety checks and cleanups
* Add patch for 'possible' heap overflow
- see bb11264 for further details
- patch submitted by the Debian team
* Use clam internal allocation functions
* Replace bounds check asserts with if checks (asserts are compiled
out of production builds)
Contributors to the above include:
* Nigel Horne
* aCaB
* Török Edvin
* David Raynor
* Shawn Webb
* Steven Morgan
* Micah Snyder
* Mickey Sola
Updated using the openbsd github repo using the code in this directory:
https://github.com/openbsd/src/tree/master/lib/libc/regex
This build will not function without its child commit, which introduces
clam specific modifications. The two have been separated to make future
upgrades easier.
If realloc() is called with size == 0, realloc() will free the pointer
and return NULL. Unless you check for this and set the pointer to NULL,
the pointer may later be free'd again after the `done:` label.
This commit fixes it by using the new CLI_REALLOC macro. CLI_REALLOC
uses `cli_realloc()` that both limits the amoutn of memory that may be
allocated and also will return NULL if you try to set size == 0, WITHOUT
free'ing the memory.
Resolves: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=44040
The `fmap_readn()` function returns a size_t.
On error, it returns (size_t)-1. The SIS, SWF, and TNEF parsers were
storing the result as a signed int and then checking if < 0 for the
error case.
Added a CLI_ISCONTAINED_2_0_TO() macro, like the CLI_ISCONTAINED_0_TO()
macro for use when the big buffer offset == 0, to eliminate pointless
warnings.
Fix enum return type for functions in SWF parser.
Errors are enums, not ints! There's a difference.
Fix if-check in SWF parser where we relied on integer overflow for error
checking.
In the PE parser when reading up to 4KB following the entrypoint
there's is no call to verify if the read failed. Later it is assumed
that the read succeeded and that the data in the buffer is valid.
I believe the correct response is to bail out if the read failed.
I also fixed some warnings:
- The the max # of PE sections was effectively disabled by setting it
to the max size of a uint16_t, so the max-check was pointless.
- Some undocumented switch fall-throughs were throwing warnings as well.
- Unsigned integer subtraction results in a signed value, which was
throwing warnings when compared with another unsigned value. The
substraction for `peinfo->nsections - 1` won't be less than 0 though
because we've already verified that nsections != 0 so we can just cast
the result of the subtraction back to an unsigned value to silence the
warning safely.
According to the mspack documentation:
[The mspack read() callback function should] return the number of bytes
successfully read (this can be less than the number requested), zero to mark
the end of file, or less than zero to indicate an error.
The library does not "retry" reads and assumes short reads are due to
EOF, so you should avoid returning short reads because of transient errors.
Our implementation returns the number of bytes read, as required, but
fails to update the internal offset when reading from an fmap. This
means that if mspack does retry a partial read, it will return the same
bytes in perpetuity.
This commit updates the offset for partial reads.
It also appears as though our implementation would return more bytes
than read if a partial read occurs if using a file descriptor instead of
an fmap. I've corrected this to return the number of bytes read.
Thank you to Michał Dardas for reporting this issue.
The TIFF parser is missing offset checks when iterating through the
directory entries and could end up in an infinite loop.
Added additional error checking to TIFF parser to check for offsets not
being before the current offset.
Thanks to Michał Dardas for reporting this.
Fixed leak where a malloced pointer was overwritten, and separate
leak where a returned textbuf struct was not cleaned up.
Thanks to Michał Dardas for reporting this.
Since converting the hash variable from a stack array to a pointer, the
pointer may now be NULL if the file is truncated after the scan starts
but before the hash is calculated. This race condition would result in
a NULL pointer dereference and crash.
This commit adds additional NULL parameter checks.
Thanks to Alexander Patrakov and Antoine Gatineau for reporting this issue.
Resolves: https://github.com/Cisco-Talos/clamav/issues/440
* Windows: Fix utf8 filename support issue
The function used to verify the real path for the file being
scanned fails on some utf8 filenames, like:
file_with_ümlaut.eicar.com
The scan continues despite the failure, but the error reported from
clamd and warnings from clamscan are confusing because it looks like
the scan failed even if a verdict shows up later.
The fix here is to convert the path from utf8 to utf16 and then use
the CreateFileW() API to grab the file handle for the real path check.
Resolves: https://github.com/Cisco-Talos/clamav/issues/418
Resolves: CLAM-1782
* Windows: Fix utf8 libclamav logging issues
Print to the log using rust eprint() instead of fputs()
Rust's eprint() function supports full utf8, while fputs() gets
confused and prints stuff like:
file_with_├╝mlaut.eicar.com
instead of:
file_with_ümlaut.eicar.com
Using a fuzzy hash test for the clamav daemon JPEG attached to the XLS
document. Not yet testing PNG, because the fuzzy hash implementation
isn't properly hashing that file, yet.
This test is for a regression where malware detection wasn't properly
being tracked for OLE2 (XLS) image extraction / scanning.
The code to extract images from XLS documents isn't recording any alerts
that may be found when scanning those XLS documents. The logic to record
alerts was almost entirely done except left disconnected in one spot.
This fixes that broken link, propagating the alert all the way up.
During wdb load, it was possible to go beyond the bounds
of the pattern buffer due to two subsequent increment ops
with no bounds checking in between.
This issue was reported by external researchers and
they provided the fix as well.
Based on our own research, this is a defect but not a vulnerability.
Co-authored-by: Mickey Sola <micksola@cisco.com>
We don't allocate a copy of the signature name to store in the AC
pattern structure for logical signature patterns because it is already
stored in the logical signature structure. But oss-fuzz found that we're
freeing that virname in when an error happens even if it wasn't copied.
This fix checks the allocation before MPOOL_FREE.
Since virname is passed in, and only cloned under certain condtions,
check to see that it has actually been cloned before freeing it in any
cleanup code.
Resolves: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=45205
* clamonacc: fix unused variable compile-time warning
Remove unused variable 'ret' from onas_queue.c and get rid of the
following compile-time warning:
~/clamav/clamonacc/scan/onas_queue.c: In function ‘onas_scan_queue_th’:
~/clamav/clamonacc/scan/onas_queue.c:161:9: warning: unused variable ‘ret’ [-Wunused-variable]
161 | int ret;
| ^~~
* libclamav: fix unused variable compile-time warning
Remove unused variable 'err' from libclamav/png.c, and get rid of
the following compile-time warning:
~/clamav/libclamav/png.c: In function ‘cli_parsepng’:
~/clamav/libclamav/png.c:101:9: warning: unused variable ‘err’ [-Wunused-variable]
101 | int err = Z_OK;
| ^~~
The office art structure for OLE2 documents records the file name
length using a `uint8_t`, meaning the name may be up to 255 bytes in
length, not including the null terminating byte. If the length is
255 then the parser will write the null-terminating byte just after
the end of the name buffer on the stack.
This issue does not cause a crash and is not a vulnerability.
This fix extends the size of stack array to account for the null
terminator.
Thank you Michał Dardas for reporting this issue.
The check of pattern_len against FILEBUF is largely meaningless since
pattern is derived from a strchr() call against buffer (with length FILEBUFF).
This fix ensures that the relative size is checked against max buffer size
which prevents overwriting stack memory with a single null byte.
Resolves: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=45247
The fuzz targets that write a temp file currently use the same filename
every time. One of the users identified that if the tests are running
in parallel mode, many processes are accessing the same file.
This results in unstable input to the API being tested, where the file
may be overwritten as the function is being tested.
This commit fixes it by putting the fuzz process PID in the filenames
for the scanfile and dbload fuzz targets.
Resolves: https://github.com/Cisco-Talos/clamav/issues/432
Also fixed a CMake bug that built an extra fuzz target file that doesn't
serve any purpose.
Resolves: https://github.com/Cisco-Talos/clamav/issues/431
Commit f82492aef4 fixed a crash in Windows
debug builds but in so doing accidentally introduced a possible crash
when scanning PE files that lack import tables. The issue being that
the openssl hashing functions try to "finish" a hash that was never
started.
This commit fixes the issue by returning CL_BREAK instead of CL_SUCCESS
when the import table doesn't exist or RVA is invalid so that we can
differentiate between successfully calculating the hashes and
successfully skipping the hashing process.
A heap buffer overflow could occur during resource cleanup if a
malloc fails when adding a regex pattern to the phishing suffix tree.
The solution is to increment suffix_cnt after cli_realloc succeeds.
The issue was identified using fault injection and is not a vulnerability.
Resolves: https://github.com/Cisco-Talos/clamav/issues/429
Add `sigtool --fuzzy-img` option to generate image fuzzy hash.
Also fix assorted warnings, mostly ensuring enough buffer space so format
strings aren't truncated.
For the dsig change: the returned string is allocated and is not const.
The caller will have to free it.
In 0.104 and prior, the function for adding a logical subsignature
was being used for NDB sigs, FTM, sigs, *and* Yara strings.
When cleaning up the logic for handling different types of logical
sig subsignatures, we accidentally broke support for regex strings in
yara rules.
This commit adds new logic for recording if the Yara string is a regex
string, by adding a regex subsig opt. Then in a new function for adding
different types of Yara strings, we check if it's a regex string or not
before adding as either a PCRE pattern or as an AC/BM pattern.
Resolves: https://github.com/Cisco-Talos/clamav/issues/494
Also add a basic test for yara regex rule.
Refactored the clamscan code that determines 'what to scan' in order
to clean up some very messy logic and also to get around a difference in
how vscode and clang-format handle formatting #ifdef blocks in the
middle of an else/if.
In addition to refactoring, there is a slight behavior improvement. With
this change, doing `clamscan blah -` will now scan `blah` and then also
scan `stdin`. You can even do `clamscan - blah` to now scan `stdin` and
then scan `blah`. Before, The `-` had to be the only "filename" argument
in order to scan from stdin.
In addition, added a bunch of extra empty lines or changing multi-line
function calls to single-line function calls in order to get around a
bug in clang-format with these two options do not playing nice together:
- AlignConsecutiveAssignments: true
- AlignAfterOpenBracket: true
AlignAfterOpenBracket is not taking account the spaces inserted by
AlignConsecutiveAssignments, so you end up with stuff like this:
```c
bleeblah = 1;
blah = function(arg1,
arg2,
arg3);
// ^--- these args 4-left from where they should be.
```
VSCode, meanwhile, somehow fixes this whitespace issue so code that is
correctly formatted by VSCode doesn't have this bug, meaning that:
1. The clang-format check in GH Actions fails.
2. We'd all have to stop using format-on-save in VSCode and accept the
bug if we wanted those GH Actions tests to pass.
Adding an empty line before variable assignments from multi-line
function calls evades the buggy behavior.
This commit should resolve the clang-format github action test failures,
for now.
EGG archives may have individually encrypted files, or may specify
encryption for the whole archive. For those that have individually
encrypted files, the clean-up code neglects to free the encrypt
structure, which holds 2 pointers (16 bytes on 64bit machines).
This commit adds that missing free.
Thank you Michał Dardas for reporting this issue.
Also add release notes from 0.103 and 0.104 patch versions published
during the development of 0.105.
Also Update llvm support details in the INSTALL.md file.
CMake has a pretty sweet alternative to pkg-config or to the older
`package-config` script that you'd run with `--libs`, `--ldflags`, etc.
CMake can (optionally) install a YourPackageConfig.cmake alongside your
libs under `<prefix>/lib/cmake/<pkg>/<pkg>Config.cmake`.
If you build something with -D CMAKE_FIND_PACKAGE_PREFER_CONFIG=TRUE,
then it will find that and use that to bring real CMake targets into
your build system for your dependencies, guaranteeing that you get all
the right include paths, library paths, ldflags, etc.
See: https://cmake.org/cmake/help/latest/variable/CMAKE_FIND_PACKAGE_PREFER_CONFIG.html
This works great for us for Curl and LLVM... not so much for libxml2.
Ideally we'd figure out what's wrong and support libxml2's package
config.cmake file, but for now this workaround lets us ignore libxml2
and continu to use this feature for the other libs, like LLVM.
I observed undefined symbol errors when linking the bytecode_runtime
object with the check_* test executables on Windows when using LLVM
built from source. I'm not sure why, exactly, but these symbols should
all be present in the ClamAV::libclamav library that we're linking with,
so I don't know why we link with the object library targets in addition
to the libclamav shared/or/static library target.
This commit removes linking with those extra object library targets.
The `have_clamjit` global is used in the unit tests but doesn't appear
to be exported when I was testing the external LLVM runtime support PR,
resulting in an undefined symbol issue. Converting this to a function
that returns 0 or 1 instead of a global variable resolved the issue.
We must pass the LLVM library dependencies to the libclamav_rust
build.rs script so it links the libclamav_rust unit test executable with
LLVM.
Also:
- We can remove the libtinfo dependency that was hardcoded for the LLVM
3.6 support, and must remove it for the build to work on Alpine, macOS.
- Also, increased the libcheck default timeout from 60s to 300s after
experiencing a failure while testing this.
- Also made one of the valgrind suppressions more generic to account for
inline optimization differences observed in testing this.
Modified bytecode JIT code to support llvm versions 8, 9, 10, 11, 12.
Additionally updated FindLLVM.cmake to the current version, found at
https://github.com/ldc-developers/ldc/blob/master/cmake/Modules/FindLLVM.cmake,
as well as making modifications suggested by Micah Snyder to check VERSION_LESS
in CMakeLists.txt to avoid having modifications to FindLLVM.cmake.
Validate the length of the crt subject before memcpying it.
This resolves a possible multibyte heap buffer overread.
We determiend that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43828
The endianness conversion when reading the PE image import descriptor is
making the change in-place in the fmap.
On Windows, the fmap is read-only and so in Debug builds that's causing
a crash.
This uses a buffer on the stack and copies each image import descriptor
before doing the conversions and then processing each.