I observed undefined symbol errors when linking the bytecode_runtime
object with the check_* test executables on Windows when using LLVM
built from source. I'm not sure why, exactly, but these symbols should
all be present in the ClamAV::libclamav library that we're linking with,
so I don't know why we link with the object library targets in addition
to the libclamav shared/or/static library target.
This commit removes linking with those extra object library targets.
The `have_clamjit` global is used in the unit tests but doesn't appear
to be exported when I was testing the external LLVM runtime support PR,
resulting in an undefined symbol issue. Converting this to a function
that returns 0 or 1 instead of a global variable resolved the issue.
We must pass the LLVM library dependencies to the libclamav_rust
build.rs script so it links the libclamav_rust unit test executable with
LLVM.
Also:
- We can remove the libtinfo dependency that was hardcoded for the LLVM
3.6 support, and must remove it for the build to work on Alpine, macOS.
- Also, increased the libcheck default timeout from 60s to 300s after
experiencing a failure while testing this.
- Also made one of the valgrind suppressions more generic to account for
inline optimization differences observed in testing this.
Modified bytecode JIT code to support llvm versions 8, 9, 10, 11, 12.
Additionally updated FindLLVM.cmake to the current version, found at
https://github.com/ldc-developers/ldc/blob/master/cmake/Modules/FindLLVM.cmake,
as well as making modifications suggested by Micah Snyder to check VERSION_LESS
in CMakeLists.txt to avoid having modifications to FindLLVM.cmake.
Validate the length of the crt subject before memcpying it.
This resolves a possible multibyte heap buffer overread.
We determiend that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43828
The endianness conversion when reading the PE image import descriptor is
making the change in-place in the fmap.
On Windows, the fmap is read-only and so in Debug builds that's causing
a crash.
This uses a buffer on the stack and copies each image import descriptor
before doing the conversions and then processing each.
* Broke out the variants of error/result handling in `frs_error.rs`.
Made syntax slightly cleaner for `frs_call!`, explicitly moving
the output variables *out* of the function call so as not to make
the parameter order confusing.
* Wrapped the FuzzyHash map into a container rather than exposing
the HashMap directly. Simplifies casting, and allows it to feel
more like a class with methods.
* Fixed various clippy complaints regarding unsafe, etc.
* Rename `frs_error.rs` to `ffi_utils.rs` and migrated ffi-specific
features like the `validate_str_param!()` macro to this new module.
Extends the new frs_error module to provide variants of the
frs_result!() macro that accept a Result as input instead of calling a
function on your behalf. This enables us to use the macro in conditions
where we don't want to return on success, and want to do other things
before we return.
Use the new frs_error module to return errors to the C calling functions
rather than logging the error in Rust-land.
Notably, this enables us to store more meaningful error messages in the
JSON output if we fail to calculate the image fuzzy hash.
The new frs_error module makes it so you can painlessly transfer Rust
errors to-and-from C.
On the Rust-side, on failure it returns an error struct to the C caller.
The C caller detects the failure by checking the return value.
On failure, the caller may use an additional frs_error Rust API to get a
formatted error message.
Two variants are provided:
- One for functions that return a boolean and have an out-parameter for
function output.
- One for functions that return a pointer, where NULL means failure and
non-NULL contains the successful function output.
We've observed a number of false positives with iconv, pthread, libxml2,
etc regarding memory allocation on init that is never cleaned up but
isn't a leak. Now we're seeing a whole tonne more from Rust libraries
with wild stack traces that couldn't be easily suppress.
So the solution is to disable any alerts from possible-leaks and only
show definite-leaks.
Updates to image fuzzy hash algorithm to mimick the phash() function
from the Python imagehash package as closely as possible.
The primary difference is that the DCT is now a 2D DCT-2 instead of
a standard DCT-2 on a flattened array of the image data.
We also needed to:
- drop the alpha channel for images with transparency.
- use the grayscale coefficients used by Python's Pillow library.
- round rather than truncate when converting from f32 -> u8.
- use a median instead of a mean for finding the average frequency in
the low-frequency vector. Previously I had though we were
implementing phash_simple(), which used mean. Sorry Mickey!
- multiply the Rust DCT-2 results * 2, to match DCT results from the
Python package.
Note:
The image-rs crate (v0.24) doesn't currently support using custom
RGB->Luma constants and doesn't round when converting Luma f32 back to
Luma u8 values.
So, this commit includes a custom greyscale() function just for rgb8
images so we can match the greyscale algorithm used in the Python
Pillow package.
See also: https://github.com/image-rs/image/issues/1554
The change in the code that converts the hash bit-vector to a Vec<u8>,
removing map_while(), was done to support older Rust versions,
such as the one currently in Alpine 3.15.0.
Add a new logical signature subsignature type for matching on images
with image fuzzy hashes.
Image fuzzy hash subsigantures follow this format:
fuzzy_img#<hash>#<dist>
In this initial implementation, the hamming distance (dist) is ignored
and only exact fuzzy hash matches will alert.
Fuzzy hash matching is only performed for supported image types.
Also: removed some excessive debug log messages on start-up.
Fixed an issue where the signature name (virname) is being allocated and
stored for every subsignature or even ever sub-pattern in an AC-pattern
(i.e. NDB sig or LDB subsig) containing a `{n-m}` or `*` wildcard.
This fix is only for LDB subsigs though. NDB signatures are still
allocaing one virname per sub-pattern.
This fix was required because I needed a place to store the virname with
fuzzy-hash subsignatures. Storing it in the fuzzy-hash subsig
metadatathe way AC-pattern, PCRE, and BComp subsigs were doing it
wouldn't work because it would cross the C-Rust FFI boundary and giving
pointers to Rust allocated stuff is dicey. Not to mention native Rust
strings are different thatn C strings. Anyways, the correct thing to do
was to store the virname with the actual logical signature.
TODO: Keep track of NDB signatures in the same way and store the virname
for NDB sigs there instead of in AC-patterns so that we can get rid of
the virname field in the AC-pattern struct.
Fix two locations where the stack-allocated arrays lack space for a null-
terminating byte and could overwrite the array in:
- dsig.c
- sigtool.c
The ClamAV team verified that these overflows are not a security issue.
The current method of trying to determine the target triple based
the architecture, operating system, etc. is difficult to get right
and maintain. In particular, I found that some installations like the
Alpine package will use "alpine" in the name of the triple instead of
"unknown".
E.g. "aarch64-alpine-linux-musl" instead of "aarch64-unknown-linux-musl".
This makes it nearly impossible to figure out the exact target name
based on target system specifications.
I believe it will be better to use the default target 99% of the time
and require users that which to cross-compile to use a new CMake
variable `-D RUST_TARGET_TRIPLE=<target>` to choose the target.
This is basically how it works for C/C++ anyways.
The clamonacc-ddd thread (clamonacc/inotif/inotif.c) has currently no
handling for inotify events where the passed watch descriptor is lower
than zero (e.g.: "-1"). Although not actually mentioned in the inotify
documentation, this case has actually been observed:
[...]
ClamInotif: inotify event wd:-1, mask:1073742080, cookie:0, length:16, child:<CHILD-PATH>
Clamonacc: onas_clamonacc_exit(), signal 11
Assumption: This probably occurs when a inotify event is generated for
the creation of a directory including subdirectories. The clamonacc-ddd
thread then discovers all children of the directory and is generating
inotify watch descriptors for all children. A subsequent inotify event
is generated for a subdirectory, but an inotify watch descriptor already
exists from the previous discovery.
In any case, this leads to an out-of-bounds array access on the internal
data structure "wdlt" of clamonacc while trying to look up the "path"
from there. This in turn can lead to a SIGSEGV of the clamonacc-ddd
thread.
The clamonacc-ddd thread (clamonacc/inotif/inotif.c) has currently no
explicit handling for the special inotify kernel events IN_UNMOUNT,
IN_Q_OVERFLOW and IN_IGNORED. It treats the inotify_event structs in
the same way as regular inotify events, although for those special
events the "len" and "name" fields a value of zero respectively NULL.
This can lead to a SIGSEGV of the clamonacc-ddd thread when calling
"strlen(name)" on a NULL value.
This commit adds just a quick'n'dirty workaround for the SIGSEGV and
some logging output. It DOES NOT fix the more overall issue, that a
IN_Q_OVERFLOW inotify event also leads to an out-of-sync situation
between the monitored filesystem/directory and the internal data
structures "ddd_ht" and "wdlt" of clamonacc.
The clamonacc-ddd thread (clamonacc/inotif/inotif.c) currently and in-
discriminately stops processing of inotify watch descriptors in the
function "onas_ddd_unwatch_hierarchy" upon any error returned from the
call to "inotify_rm_watch". This in turn also prevents further updates
and cleanup of the internal data structures "ddd_ht" and "wdlt", which
leads to a invalid value of the "path" variable.
This causes a leak of inotify watch descriptors in the case of e.g. a
move of a directory containing a hierarchy of watched subdirectories.
See the inotify watch descriptor ("wd:") values in the section "Example
and reproducer" below.
This commit adds a workaround for this issue by specifically ignoring
the "ENOENT" errno in case the call to "inotify_rm_watch" results in
a non-zero return code. This allows the cleanup to properly proceed
for all children.
Example and reproducer:
- Starting point:
Log:
ClamInotif: created watch descriptor (wd:1) for path: /clamonacc/monitored
[...]
ClamInotif: created watch descriptor (wd:129637) for path: /clamonacc/monitored/[...]
- Initial directory and subdirectory creation:
mkdir -p /clamonacc/monitored/.test_ff/{1,2,3,4,5}
Log:
ClamInotif: inotify event wd:1, mask:1073742080, cookie:0, length:16, child:.test_ff
ClamInotif: inotify event path: /clamonacc/monitored
ClamInotif: CREATE - adding /clamonacc/monitored/.test_ff to /clamonacc/monitored with wd:1
ClamInotif: created watch descriptor (wd:129638) for path: /clamonacc/monitored/.test_ff
ClamInotif: created watch descriptor (wd:129639) for path: /clamonacc/monitored/.test_ff/3
ClamInotif: created watch descriptor (wd:129640) for path: /clamonacc/monitored/.test_ff/4
ClamInotif: created watch descriptor (wd:129641) for path: /clamonacc/monitored/.test_ff/1
ClamInotif: created watch descriptor (wd:129642) for path: /clamonacc/monitored/.test_ff/5
ClamInotif: created watch descriptor (wd:129643) for path: /clamonacc/monitored/.test_ff/2
- Move / rename of the parent directory:
mv -i /clamonacc/monitored/.test_ff/ /clamonacc/monitored/test_ff
Log:
ClamInotif: inotify event wd:1, mask:1073741888, cookie:12094045, length:16, child:.test_ff
ClamInotif: inotify event path: /clamonacc/monitored
ClamInotif: MOVED_FROM - removing /clamonacc/monitored/.test_ff from /clamonacc/monitored with wd:1
ERROR: ClamInotif: error removing watch descriptor (wd:129638) for path: /clamonacc/monitored/.test_ff, No such file or directory
ClamInotif: inotify event wd:1, mask:1073741952, cookie:12094045, length:16, child:test_ff
ClamInotif: inotify event path: /clamonacc/monitored
ClamInotif: MOVED_TO - adding /clamonacc/monitored/test_ff to /clamonacc/monitored with wd:1
ClamInotif: created watch descriptor (wd:129644) for path: /clamonacc/monitored/test_ff
ClamInotif: created watch descriptor (wd:129639) for path: /clamonacc/monitored/test_ff/3
ClamInotif: created watch descriptor (wd:129640) for path: /clamonacc/monitored/test_ff/4
ClamInotif: created watch descriptor (wd:129641) for path: /clamonacc/monitored/test_ff/1
ClamInotif: created watch descriptor (wd:129642) for path: /clamonacc/monitored/test_ff/5
ClamInotif: created watch descriptor (wd:129643) for path: /clamonacc/monitored/test_ff/2
- Removal of the parent directory unter the new name:
rm -r /clamonacc/monitored/test_ff
Log:
ClamInotif: inotify event wd:129644, mask:1073742336, cookie:0, length:16, child:3
ClamInotif: inotify event path: /clamonacc/monitored/test_ff
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff/3 from /clamonacc/monitored/test_ff with wd:129644
ClamInotif: removed watch descriptor (wd:129639) for path: /clamonacc/monitored/test_ff/3
ClamInotif: inotify event wd:129644, mask:1073742336, cookie:0, length:16, child:4
ClamInotif: inotify event path: /clamonacc/monitored/test_ff
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff/4 from /clamonacc/monitored/test_ff with wd:129644
ClamInotif: removed watch descriptor (wd:129640) for path: /clamonacc/monitored/test_ff/4
ClamInotif: inotify event wd:129644, mask:1073742336, cookie:0, length:16, child:1
ClamInotif: inotify event path: /clamonacc/monitored/test_ff
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff/1 from /clamonacc/monitored/test_ff with wd:129644
ClamInotif: removed watch descriptor (wd:129641) for path: /clamonacc/monitored/test_ff/1
ClamInotif: inotify event wd:129644, mask:1073742336, cookie:0, length:16, child:5
ClamInotif: inotify event path: /clamonacc/monitored/test_ff
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff/5 from /clamonacc/monitored/test_ff with wd:129644
ClamInotif: removed watch descriptor (wd:129642) for path: /clamonacc/monitored/test_ff/5
ClamInotif: inotify event wd:129644, mask:1073742336, cookie:0, length:16, child:2
ClamInotif: inotify event path: /clamonacc/monitored/test_ff
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff/2 from /clamonacc/monitored/test_ff with wd:129644
ClamInotif: removed watch descriptor (wd:129643) for path: /clamonacc/monitored/test_ff/2
ClamInotif: inotify event wd:1, mask:1073742336, cookie:0, length:16, child:test_ff
ClamInotif: inotify event path: /clamonacc/monitored
ClamInotif: DELETE - removing /clamonacc/monitored/test_ff from /clamonacc/monitored with wd:1
ClamInotif: removed watch descriptor (wd:129644) for path: /clamonacc/monitored/test_ff
This is also not strictly necessary, but i found it rather helpful to
be able to identify and distinguish the individual clamonacc threads in
the usual Linux system tools ("ps", "top", "gdb", etc.).
Setting the thread name is already done for the threads started by the
bundled c-thread-pool library ("thread-pool-N"). Only the clamonacc
threads all identify themselfes as "clamonacc".
This commit now sets the following - shortened (due to the 16 character
limit) - thread names:
- "clamonacc-ddd" for the ddd/inotify thread.
- "clamonacc-sq" for the scan queue thread.
- the main clamonacc thread currently keeps the name "clamonacc" in-
herited from the main process.
This is not strictly necessary, but i found it rather helpful to know
what is going on during the debugging process and while trying to under-
stand how clamonacc actually works.
Explicit setting of pthread_sigmask in the individual threads
should not be necessary at all(?), since it is already done in
the main process/thread and should be inherited from there. But
since the sigfillset/sigdelset lines are already there, at least
they should be consistent with regard to the undefined behaviour
caused by ignoring SIGFPE, SIGILL, SIGSEGV, or SIGBUS signals.
Improvements to the logg() API appear to have corrected a bug wherein
debug/warning/error messages were writing to stdout instead of stderr.
Now that they're going to stderr correctly, this fixes the test to match.
Don't crop logs so we can see the whole valgrind output in large logs.
Generate suppression rules, in case they're needed for slight variations
on existing suppressions or sometimes new suppressions.
Increase valgrind's stack size for initial thread because I observed some
unusual errors in custom tests on Debian 10 like this:
7: [INFO]: ==9911== Process terminating with default action of signal 15 (SIGTERM)
7: [INFO]: ==9911== at 0x401A4FB: __open_nocancel (open64_nocancel.c:45)
7: [INFO]: ==9911== by 0x4006F65: open_verify.constprop.12 (dl-load.c:1728)
7: [INFO]: ==9911== by 0x4007737: open_path (dl-load.c:2057)
7: [INFO]: ==9911== by 0x4008C17: _dl_map_object (dl-load.c:2297)
7: [INFO]: ==9911== by 0x400D291: openaux (dl-deps.c:64)
7: [INFO]: ==9911== by 0x401956A: _dl_catch_exception (dl-error-skeleton.c:196)
7: [INFO]: ==9911== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
7: [INFO]: ==9911== by 0x4003AFA: dl_main (rtld.c:1733)
7: [INFO]: ==9911== by 0x401864F: _dl_sysdep_start (dl-sysdep.c:253)
7: [INFO]: ==9911== by 0x4002117: _dl_start_final (rtld.c:415)
7: [INFO]: ==9911== by 0x4002117: _dl_start (rtld.c:522)
7: [INFO]: ==9911== by 0x4001097: ??? (in /lib/x86_64-linux-gnu/ld-2.28.so)
7: [INFO]: ==9911== by 0x1: ???
7: [INFO]: ==9911== Jump to the invalid address stated on the next line
7: [INFO]: ==9911== at 0x1036: ???
7: [INFO]: ==9911== by 0x4006F65: open_verify.constprop.12 (dl-load.c:1728)
7: [INFO]: ==9911== by 0x4007737: open_path (dl-load.c:2057)
7: [INFO]: ==9911== by 0x4008C17: _dl_map_object (dl-load.c:2297)
7: [INFO]: ==9911== by 0x400D291: openaux (dl-deps.c:64)
7: [INFO]: ==9911== by 0x401956A: _dl_catch_exception (dl-error-skeleton.c:196)
7: [INFO]: ==9911== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
7: [INFO]: ==9911== by 0x4003AFA: dl_main (rtld.c:1733)
7: [INFO]: ==9911== by 0x401864F: _dl_sysdep_start (dl-sysdep.c:253)
7: [INFO]: ==9911== by 0x4002117: _dl_start_final (rtld.c:415)
7: [INFO]: ==9911== by 0x4002117: _dl_start (rtld.c:522)
7: [INFO]: ==9911== by 0x4001097: ??? (in /lib/x86_64-linux-gnu/ld-2.28.so)
7: [INFO]: ==9911== by 0x1: ???
7: [INFO]: ==9911== Address 0x1036 is not stack'd, malloc'd or (recently) free'd
7: [INFO]: ==9911==
7: [INFO]: ==9911==
7: [INFO]: ==9911== Process terminating with default action of signal 11 (SIGSEGV)
7: [INFO]: ==9911== Bad permissions for mapped region at address 0x1036
7: [INFO]: ==9911== at 0x1036: ???
7: [INFO]: ==9911== by 0x4006F65: open_verify.constprop.12 (dl-load.c:1728)
7: [INFO]: ==9911== by 0x4007737: open_path (dl-load.c:2057)
7: [INFO]: ==9911== by 0x4008C17: _dl_map_object (dl-load.c:2297)
7: [INFO]: ==9911== by 0x400D291: openaux (dl-deps.c:64)
7: [INFO]: ==9911== by 0x401956A: _dl_catch_exception (dl-error-skeleton.c:196)
7: [INFO]: ==9911== by 0x400D605: _dl_map_object_deps (dl-deps.c:248)
7: [INFO]: ==9911== by 0x4003AFA: dl_main (rtld.c:1733)
7: [INFO]: ==9911== by 0x401864F: _dl_sysdep_start (dl-sysdep.c:253)
7: [INFO]: ==9911== by 0x4002117: _dl_start_final (rtld.c:415)
7: [INFO]: ==9911== by 0x4002117: _dl_start (rtld.c:522)
7: [INFO]: ==9911== by 0x4001097: ??? (in /lib/x86_64-linux-gnu/ld-2.28.so)
7: [INFO]: ==9911== by 0x1: ???
Per this conversation: https://bugs.kde.org/show_bug.cgi?id=359705
I'm testing to see if the issue is resolved by increasing valgrind's
main stack size.
https://valgrind.org/docs/manual/manual-core.html claims that the
default is the current `ulimit` value or 16MB, whichever is lower.
The logic for parsing a logical subsignature isn't clearly identified
and has been, perhaps mistakenly or out of convenience, used to when
parsing NDB signatures in addition to LDB subsignatures. What this means
is that you can technically use a PCRE subsignature in an NDB file and
clam won't complain about it. It won't work however, because a PCRE
subsignature requires another matching subsignature to trigger it, but
it will parse. The same is likely true for byte-compare subsignatures.
This commit restructures that logic a bit so subsignature parsing has
its own function and is more organized.
I also renamed the functions a little bit and added lots of comments.
I fixed a few minor warnings relating to format string characters.
The change in str.c:cli_ldbtokenize is to prevent a buffer under-read if
you were to use the function on the start of a buffer, as is now down in
this commit.
FreeBSD and OpenBSD package maintainers identified failures when
pkg-config .pc files not present for curses/ncurses.
Patch courtesy of Stuart Henderson.
Corrected buffer size check in the regex signature parsing code.
This resolves a possible 1-byte stack buffer overwrite (NULL byte).
We determined that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43869
Time is always a nasty thing and without proper timezone data it is even
harder. Set a default timezone (easily to be overridde by the user from
the cli with docker run --env) and as a consequence add the timezone
data package.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
Prevent healthcheck from initially failing as the container
is still starting up. ClamAV's database loading can take quite a time,
in which the healthcheck is constantly failing. While it is impossible to
have a default that works everywhere (slower systems vs fast ones). Using 6
minutes tries to find a compromise here but it by no means perfect.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
The XLS parsing code for extracting images and XLM macros is alerting
with Heuristics.OLE2.ContainsMacros.XLM if any images are found if the
--alert-macros (AlertOLE2Macros) option is enabled.
This fix moves the alert logic before the XLM & image extraction logic
where we know at least one macro exists, but before we try to extract it
I've wrapped it with an "if (has_xlm)" condition.
Resolves:
- https://github.com/Cisco-Talos/clamav/issues/423
- https://bugzilla.clamav.net/show_bug.cgi?id=12844