ClamAV's option parser, used for `freshclam.conf`, `clamd.conf`,
and `clamav-milter.conf`, has a maximum line length of 512 characters.
By request, this commit increases the line length to 1024 to accommodate
very long `DatabaseMirror` options when using access tokens in the URI.
Resolves: https://github.com/Cisco-Talos/clamav/issues/281
You should be able to disable the maxfilesize limit by setting it to
zero. When "disabled", ClamAV should defer to inherent limitations, which
at this time is INT_MAX - 2 bytes.
This works okay for ClamScan and ClamD because our option parser
converts max-filesize=0 to 4294967295 (4GB). But it is presently broken
for other applications using the libclamav C API, like this:
```c
cl_engine_set_num(engine, CL_ENGINE_MAX_FILESIZE, 0);
```
The limit checks added for cl_scanmap_callback and cl_scanfile_callback
in 0.103.4 and 0.104.1 broke this ability because we forgot to check that
`maxfilesize > 0` before enforcing the limit.
This commit adds that guard so the limit can once again be disabled by setting it to `0`.
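For illustration, here is a minimal sketch of the kind of guard this adds; the structure and names are simplified, not the exact libclamav code:
```c
#include <stdint.h>

/* Sketch only: enforce the file size limit only when it is non-zero,
 * i.e. only when the limit has not been disabled. */
static int filesize_limit_exceeded(uint64_t maxfilesize, uint64_t file_size)
{
    if (maxfilesize > 0 && file_size > maxfilesize) {
        return 1; /* limit enabled and exceeded */
    }
    return 0; /* limit disabled (0) or not exceeded */
}
```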
While working on this, I also found that the `max_size` variables in our
libmspack scanner code use the `off_t` type, which is a SIGNED integer
that may be 32 bits wide even on some 64-bit platforms, or may be 64 bits
wide. In addition, the default `max_size` used when `maxfilesize == 0` was
being set to UINT_MAX (0xffffffff), i.e. `-1` when `off_t` is 32 bits.
This commit addresses this related issue by:
- changing the `max_size` to use `uint64_t`, like our other limits.
- verifying that `maxfilesize > 0` before using it.
- checking that using `UINT32_MAX` as a fallback will not exceed the
max-scansize, in the same way that we do with the maxfilesize.
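A small sketch of the resulting logic, with the engine limits passed in as plain integers rather than the real context/engine structures:
```c
#include <stdint.h>

/* Sketch only: pick a 64-bit max_size; fall back to UINT32_MAX when
 * maxfilesize is disabled (0), and cap the result at maxscansize if
 * that limit is enabled. */
static uint64_t pick_max_size(uint64_t maxfilesize, uint64_t maxscansize)
{
    uint64_t max_size = (maxfilesize > 0) ? maxfilesize : UINT32_MAX;

    if (maxscansize > 0 && max_size > maxscansize) {
        max_size = maxscansize;
    }

    return max_size;
}
```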
The max bytes supplied to strftime should be the length of the result
string, including the terminating null byte.
Without the extra byte for the terminating null byte, the required output
is one byte longer than the supplied max, which results in failure and
undefined buffer contents:
If the length of the result string (including the terminating null
byte) would exceed max bytes, then strftime() returns 0, and the
contents of the array are undefined.
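Not the exact ClamAV call site, just a small standalone example showing the max argument sized to include the terminating null byte and the return value checked:
```c
#include <stdio.h>
#include <time.h>

int main(void)
{
    char timestr[26];
    time_t now    = time(NULL);
    struct tm *tm = localtime(&now);

    /* "max" must cover the whole result, including the terminating null
     * byte; sizeof(timestr) is the full buffer size, so this is safe. */
    if (strftime(timestr, sizeof(timestr), "%Y-%m-%d %H:%M:%S", tm) == 0) {
        /* 0 means the result (plus the null byte) would not fit. */
        fprintf(stderr, "strftime: buffer too small\n");
        return 1;
    }

    printf("%s\n", timestr);
    return 0;
}
```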
Also resolve alleged uninitialized memory use by initializing the
`digest` variable in `cli_md5buff()`. MSAN blames it for putting
uninitialized data in the `name_salt` global, though in debugging and in
review I can't find any evidence that it isn't initialized by the call
to `cl_hash_data()` in `cli_md5buff()`.
This MSAN complaint has been a blocker to enabling MSAN in OSS-Fuzz.
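The workaround amounts to something like the following sketch (not the exact cli_md5buff() code); the digest buffer is zeroed at declaration so MSAN can no longer consider it uninitialized:
```c
/* Sketch only: an MD5 digest is 16 bytes; initializing it here silences
 * the MSAN report even though cl_hash_data() should already fill it. */
unsigned char digest[16] = {0};
```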
Add a basic unit test for the new libclamav_rust `logging.rs` module.
This test simply initializes logging and then prints out a message with
each of the `log` macros.
Also set the Rust edition to 2018 because the default is the 2015
edition in which using external crates is very clunky.
For the Rust test support in CMake, this commit adds the ability to
cross-compile the Rust tests.
Rust tests must be built for the same LLVM triple (target platform) as
the rest of the project. In particular this is needed to build both
x64 and x86 packages on a 64bit Windows host.
For Alpine, we observed that the LLVM triple for the host platform tools
may be either:
- x86_64-unknown-linux-musl, or
- x86_64-alpine-linux-musl
To support it either way, we look up the host triple with `rustc -vV`
and use that if the musl libc exists. This is a bit hacky and
unfortunately means that we probably can't cross-compile to other
platforms when running on a musl libc host. There is probably room to
improve cross-compiling support further.
The Rust test programs must link with libclamav, libclammspack, and
possibly libclamunrar_iface and libclamunrar plus all of the library
dependencies for those libraries.
To do this, we pass the path of each library in environment variables
when building the libclamav_rust unit test program.
Within `libclamav_rust/build.rs`, we read those environment variables.
If set, we parse each into library path and name components to use
as directives for how to build the unit test program.
See: https://doc.rust-lang.org/cargo/reference/build-scripts.html
Our `build.rs` file ignores the library path environment variables if
they're not set. This is necessary when building the libclamav_rust
library itself, when libclamunrar isn't static, and when not linking with
a libiconv external to libc.
Rust test programs are built and executed in a subdirectory under:
<target>/<llvm triple>/<config>/deps
where "target" for libclamav_rust tests is set to <build>/unit_tests
For example:
clamav/build/unit_tests/x86_64-pc-windows-msvc/debug/deps/clamav_rust-7e1343f8a2bff1cc.exe
Since this program isn't co-located with the rest of the libraries
we also have to set environment variables so the test program can find and
load the shared libraries:
- Windows: PATH
- macOS: DYLD_LIBRARY_PATH
We already set LD_LIBRARY_PATH when not Windows for similar reasons.
Note: In build.rs, we iterate references to LIB_ENV_LINK & Co because
older Rust versions do not implement Iterator for [&str].
Add a top-level Cargo.toml.
Remove the vestigial libclamav_rust/Makefile.am.
Place Rust source under a libclamav_rust/src directory as is canonical
for Rust projects.
Since uClibc can be configured without support for backtrace, disable
the backtrace if we are building with a uClibc that was built without
backtrace.
This is a bit hacky, and would greatly benefit from a test in ./configure
instead, but does nicely as a quick fix for now.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
[Bernd: rebased for 0.103.0]
[Fabrice: retrieved from
https://git.buildroot.net/buildroot/tree/package/clamav/0002-mbox-do-not-use-backtrace-if-using-uClibc-without-ba.patch]
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Add support for Rust FreeBSD, OpenBSD targets
Add support for Rust on GNU aarch64.
Add support for Rust on Alpine (musl x86_64).
Note: Current trick of checking for musl libc.so doesn't work when
cross compiling. TODO: Find a better way to check if the target is
MUSL.
Add Rust toolchain to fix Dockerfile build.
Convert cli_dbgmsg to an inline function to ensure the ctx check for the
debug flag is always run
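Roughly, the pattern is the one sketched below; the real cli_dbgmsg lives in libclamav, and this standalone version only illustrates why an inline function (unlike some macro variants) always evaluates the debug-flag check:
```c
#include <stdarg.h>
#include <stdio.h>

static int cli_debug_flag = 0; /* stand-in for the real global flag */

/* As an inline function, the flag check is compiled into every caller
 * and always runs before any formatting work is done. */
static inline void example_dbgmsg(const char *fmt, ...)
{
    va_list args;

    if (!cli_debug_flag)
        return;

    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
}
```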
Add copyright and licensing info
Fix valgrind uninitialized buffer issue in cliunzip.c
Windows build fix
Vendoring crate dependencies is required for offline builds.
Some packaging systems require that builds can be performed offline.
This feature enables vendoring crates at configure time; the vendored
crates are then included in the CPack source packaging.
The `realpath()` function on macOS will fail when scanning a symlink
that doesn't exist because it tries to determine the real path of the
thing the symlink points to and fails when that thing doesn't exist.
This behavior is different than on Linux or FreeBSD Unix.
I'm fixing this by opening the symlink with `O_SYMLINK` then getting
realpath of the link using `cli_get_filepath_from_filedesc()`, much
like we do on Windows.
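On macOS the approach looks roughly like this sketch; the real code goes through `cli_get_filepath_from_filedesc()`, so the direct `fcntl(F_GETPATH)` call here is only illustrative:
```c
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch, macOS-only: open the symlink itself (O_SYMLINK) and ask the
 * kernel for the path of that descriptor, instead of calling realpath()
 * on a target that may not exist. */
static int symlink_realpath(const char *link_path, char *out, size_t out_size)
{
    char buf[PATH_MAX];
    int fd = open(link_path, O_RDONLY | O_SYMLINK);

    if (fd == -1)
        return -1;

    if (fcntl(fd, F_GETPATH, buf) == -1) {
        close(fd);
        return -1;
    }
    close(fd);

    snprintf(out, out_size, "%s", buf);
    return 0;
}
```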
Side note: I did try using macOS's _DARWIN_BETTER_REALPATH variant to
see if that resolved the issue, but it didn't seem to.
This resolves https://bugzilla.clamav.net/show_bug.cgi?id=12792
This commit also removes the problematic "access denied" clamd test from
the test suite. This commit broke that test on macOS because the error
message when clamdscan fails to `open()` the file to check the
realpath() is different than the error message when clamd tries to scan
it, but has access denied.
(It's like "Permission denied" instead of "Access denied")
It's not worth #ifdef'ing around macOS to get a test pass because this
test is already causing problems in 32-bit Docker environments where
access isn't actually denied to the user (!). Honestly, it's not worth
keeping the test that simply verifies ClamD's error message when denied
matches expectations.
I also switched to use the C_DARWIN macro instead of __APPLE__ because
I know C_DARWIN is defined in clamav-config.h on macOS, and I found that
__APPLE__ wasn't defined when I tested.
Accidentally introduced an invalid format string character, which is
apparently just a warning. ¯\_(ツ)_/¯
But Coverity really didn't like it, and now that I know about it,
neither do I...
The zip parser may leak a string for a zip record filename if the
record in the central directory is the last record.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31760
It also appears that there may be a related record filename leak if an
error occurred when indexing the files in the central directory header.
I don't have a test file for this, but it was an obvious fix.
The mbox.c:messageGetFilename() function returns a COPY of the filename,
which must be free()'d.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31775
Also fixed an issue where the email parser may fail to finish parsing
some multipart emails when --gen-json is enabled if messageGetJObj() or
cli_json_addowner() fail. These should be non-fatal failures; the rest
of the (broken) email should still be parsed.
The email parser (mbox.c & message.c) also has a lot of assertions
instead of using if()'s for error handling.
Because of the complexity of the email parser, it's unclear for many of
the assertions if they could be triggered based on user input like
scanning a malformed email. So to be safe, I've replaced all of the
assertions in this parser with error handling to fail gracefully.
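The conversions follow the general shape sketched below (placeholder names, not actual mbox.c code): the assumption that used to be asserted becomes an ordinary error path that fails the current parse gracefully.
```c
#include <stddef.h>

typedef enum { PARSE_OK, PARSE_ERROR } parse_status; /* illustrative */
typedef struct message message;                      /* illustrative */

static parse_status parse_part(const message *msg)
{
    /* Before: assert(msg != NULL); -- a malformed email violating the
     * assumption would abort the whole process. */

    /* After: treat the broken assumption as a recoverable error. */
    if (msg == NULL) {
        return PARSE_ERROR; /* fail gracefully; the caller carries on */
    }

    /* ... parse the message part ... */
    return PARSE_OK;
}
```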
Also fix formatting issues and use `stdbool.h`'s `true` instead of `TRUE`.
Extends the existing freshclam 403 test to check if a repeated freshclam
403 response outputs the cool-down message instead of the original 403
response message.
The ClamAV CDN may send a 403 (forbidden) response for database
downloads to networks that are explicitly blocked for one reason or
another, or for download requests from out-of-date freshclam clients or
clients other than freshclam and cvdupdate.
The volume of data serving 403 responses to clients that are blocked and
retry in a tight loop is considerable.
This commit seeks to remedy that for future freshclam versions by
extending freshclam's self-regulated cool-down for 429 (retry later)
responses to include 403 (forbidden) responses.
The cooldown period if a 403 is received will be 24 hours.
Rename Heuristics.Email.ExceedsMax alerts to start with
Heuristics.Limits.Exceeded.Email instead, so that all heuristic alerts
for exceeded scan limits have the same prefix.
The Email heuristics when scan limits are exceeded should only alert if
clamscan's `--alert-exceeds-max` option is enabled.
The ClamD options is: `AlertExceedsMax`
The libclamav option is: `CL_SCAN_HEURISTIC_EXCEEDS_MAX`
A heap buffer over-read may occur in the OLE2 parser if the --gen-json
option is enabled (the CL_SCAN_GENERAL_COLLECT_METADATA scan option).
The issue occurs because a string input is not checked to verify if it
is empty (zero-byte length) prior to use.
We determined that this issue is not exploitable to cause a crash or to
do anything malicious. The overflow (er... underflow?) is 1 byte before
a malloced buffer.
This commit adds checks to the function parameters in case the original
pointer itself is NULL, and to account for conversion of an empty string.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39673
The fixes to the fmap bounds for nested (duplicate) fmaps added recently
introduced a subtle arithmetic bug that was detected by OSS-Fuzz:
```c
scanat = m->nested_offset + *at % m->pgsz;
```
should have been:
```c
scanat = (m->nested_offset + *at) % m->pgsz;
```
Without the parenthesis, `scanat` could be > `m->pgsz`, which would
overflow in the subsequent `memchr()` call.
See:
- https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40452
- https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40455
This commit also tightens up some of the other bounds checks done with
`CLI_ISCONTAINED()` macro so the check limits the bounds to the nested
fmap and not the original map.
In addition, I've added a `CLI_ISCONTAINED_0_TO()` macro that removes
checks when the "bigger" buffer starts at offset 0. This should silence
a bunch of (benign) warnings and medium severity Coverity issues.
There is also a possible use of an uninitialized variable
(`old_hook_lsig_matches`) in `cli_magic_scan()`.
Finally, I also removed an unnecessary NULL-check on `filebase` in
`fmap_dup_to_file()` that Coverity was unhappy with.
CID 361074: fmap.c: Possible invalid dereference if status != success
and the new map was not yet allocated.
CID 361077: others.c: Structurally dead code revealed a bug in the
cli_recursion_stack_get_size() function.
CID 361080, 361078, 361083: sigtool.c: Inverted check for if engine
needs to be free'd, could leak the engine structure.
CID 361075: sigtool.c: Missed a `return -1` that should've been `goto
done;` and would leak the new_map buffer.
CID 361079: sigtool/vba.c: Checking if we should free the new_map on
failure only if ctx also needs to be free'd, which would leak the
new_map if ctx was not allocated yet.
The previous commit broke alerting when exceeding the recursion limit
because recursion tracking is so effective that by limiting the final
layer of recursion to a scan of the fmap, we prevented it from ever
hitting the recursion limit.
This commit removes that restriction where it only does an fmap scan
(aka "raw scan") of files that are at their limit so that we can
actually hit the recursion limit and alert as intended.
Also tidied up the cache_clean check so it checks the
`fmap->dont_cache_flag` at the right point (before caching) instead of
before setting the "CLEAN" verdict.
Note: The `cache_clean` variable appears to be used to record the clean
status so the `ret` variable can be re-used without losing the verdict.
This is of course only required because the verdict is stored in the
error enum. *cough*
Also fixed a couple typos.
The fmap module provides a mechanism for creating a mapping into an
existing map at an offset and length that's used when a file is found
within an uncompressed archive or when embedded files are found with
embedded file type recognition in scanraw(). This is the
"fmap_duplicate()" function. Duplicate fmaps just reference the original
fmap's 'data' or file handle/descriptor while allowing the caller to
treat it like a new map using offsets and lengths that don't account for
the original/actual file dimensions.
fmaps keep track of this with m->nested_offset & m->real_len, which
admittedly have confusing names. I found incorrect uses of these in a
handful of locations. Notably:
- In cli_magic_scan_nested_fmap_type().
The force-to-disk feature would have been checking incorrect sizes and
may have written incorrect offsets for duplicate fmaps.
- In XDP parser.
- A bunch of places from the previous commit when making dupe maps.
This commit fixes those and adds lots of documentation to the fmap.h API
to try to prevent confusion in the future.
nested_offset should never be referenced outside of fmap.c/h.
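To make the offset bookkeeping concrete, here is a simplified sketch (not the real fmap implementation; only the fields discussed above) of the translation every access into a duplicate fmap must perform:
```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for fmap_t with just the fields discussed above. */
typedef struct {
    const uint8_t *data;          /* the original/backing buffer */
    size_t         nested_offset; /* where this duplicate starts in the original */
    size_t         len;           /* length of this duplicate's view */
} example_fmap;

/* Callers use offsets relative to the duplicate, so every access must add
 * nested_offset before touching the backing buffer, and bounds checks must
 * be made against the duplicate's len, not the original map's length. */
static const uint8_t *example_need(const example_fmap *m, size_t at, size_t want)
{
    if (at > m->len || want > m->len - at)
        return NULL; /* out of bounds for this nested view */

    return m->data + m->nested_offset + at;
}
```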
The fmap_* functions for accessing or reading map data have two
implementations, mem_* or handle_*, depending on the data source.
I found issues with some of these so I made a unit test that covers each
of the functions I'm concerned about for both types of data sources and
for both original fmaps and nested/duplicate fmaps.
With the tests, I found and fixed issues in these fmap functions:
- handle_need_offstr(): must account for the nested_offset in dupe maps.
- handle_gets(): must account for nested_offset and use len & real_len
correctly.
- mem_need_offstr(): must account for nested_offset in dupe maps.
- mem_gets(): must account for nested_offset and use len & real_len
correctly.
Moved the CDBRANGE() macro out of the function definition for better
legibility.
Fixed a few warnings.
Scan recursion is the process of identifying files embedded in other
files and then scanning them, recursively.
Internally this process is more complex than it may sound because a file
may have multiple layers of types before finding a new "file".
At present we treat the recursion count in the scanning context as an
index into both our fmap list AND our container list. These two lists
are conceptually a part of the same thing and should be unified.
But what's concerning is that the "recursion level" isn't actually
incremented or decremented at the same time that we add a layer to the
fmap or container lists but instead is more touchy-feely, increasing
when we find a new "file".
To account for this shadiness, the size of the fmap and container lists
has always been a little longer than our "max scan recursion" limit so
we don't accidentally overflow the fmap or container arrays (!).
I've implemented a single recursion-stack as an array, similar to before,
which includes a pointer to each fmap at each layer, along with the size
and type. Push and pop functions add and remove layers whenever a new
fmap is added. A boolean argument when pushing indicates if the new layer
represents a new buffer or new file (descriptor). A new buffer will reset
the "nested fmap level" (described below).
This commit also provides a solution for an issue where we detect
embedded files more than once during scan recursion.
For illustration, imagine a tarball named foo.tar.gz with this structure:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz | GZ | 0 | 0 |
| └── foo.tar | TAR | 1 | 0 |
| ├── bar.zip | ZIP | 2 | 1 |
| │ └── hola.txt | ASCII | 3 | 0 |
| └── baz.exe | PE | 2 | 1 |
But suppose baz.exe embeds a ZIP archive and a 7Z archive, like this:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| baz.exe | PE | 0 | 0 |
| ├── sfx.zip | ZIP | 1 | 1 |
| │ └── hello.txt | ASCII | 2 | 0 |
| └── sfx.7z | 7Z | 1 | 1 |
| └── world.txt | ASCII | 2 | 0 |
(A) If we scan for embedded files at any layer, we may detect:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz | GZ | 0 | 0 |
| ├── foo.tar | TAR | 1 | 0 |
| │ ├── bar.zip | ZIP | 2 | 1 |
| │ │ └── hola.txt | ASCII | 3 | 0 |
| │ ├── baz.exe | PE | 2 | 1 |
| │ │ ├── sfx.zip | ZIP | 3 | 1 |
| │ │ │ └── hello.txt | ASCII | 4 | 0 |
| │ │ └── sfx.7z | 7Z | 3 | 1 |
| │ │ └── world.txt | ASCII | 4 | 0 |
| │ ├── sfx.zip | ZIP | 2 | 1 |
| │ │ └── hello.txt | ASCII | 3 | 0 |
| │ └── sfx.7z | 7Z | 2 | 1 |
| │ └── world.txt | ASCII | 3 | 0 |
| ├── sfx.zip | ZIP | 1 | 1 |
| └── sfx.7z | 7Z | 1 | 1 |
(A) is bad because it scans content more than once.
Note that for the GZ layer, it may detect the ZIP and 7Z if the
signature hits on the compressed data, which it might, though
extracting the ZIP and 7Z will likely fail.
The reason the above doesn't happen now is that we restrict embedded
type scans for a bunch of archive formats to include GZ and TAR.
(B) If we scan for embedded files at the foo.tar layer, we may detect:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz | GZ | 0 | 0 |
| └── foo.tar | TAR | 1 | 0 |
| ├── bar.zip | ZIP | 2 | 1 |
| │ └── hola.txt | ASCII | 3 | 0 |
| ├── baz.exe | PE | 2 | 1 |
| ├── sfx.zip | ZIP | 2 | 1 |
| │ └── hello.txt | ASCII | 3 | 0 |
| └── sfx.7z | 7Z | 2 | 1 |
| └── world.txt | ASCII | 3 | 0 |
(B) is almost right. But we can achieve it easily enough by only scanning
for embedded content in the current fmap when the "nested fmap level" is 0.
The upside is that it should safely detect all embedded content, even if
it may think the sfx.zip and sfx.7z are in foo.tar instead of in baz.exe.
The biggest risk I can think of affects ZIPs. SFXZIP detection
is identical to ZIP detection, which is why we don't allow SFXZIP to be
detected if inside of a ZIP. If we only allow embedded type scanning at
fmap-layer 0 in each buffer, this will fail to detect the embedded ZIP
if the bar.exe was not compressed in foo.zip and if non-compressed files
extracted from ZIPs aren't extracted as new buffers:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.zip | ZIP | 0 | 0 |
| └── bar.exe | PE | 1 | 1 |
| └── sfx.zip | ZIP | 2 | 2 |
Provided that we ensure all files extracted from zips are scanned in
new buffers, option (B) should be safe.
(C) If we scan for embedded files at the baz.exe layer, we may detect:
| description | type | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz | GZ | 0 | 0 |
| └── foo.tar | TAR | 1 | 0 |
| ├── bar.zip | ZIP | 2 | 1 |
| │ └── hola.txt | ASCII | 3 | 0 |
| └── baz.exe | PE | 2 | 1 |
| ├── sfx.zip | ZIP | 3 | 1 |
| │ └── hello.txt | ASCII | 4 | 0 |
| └── sfx.7z | 7Z | 3 | 1 |
| └── world.txt | ASCII | 4 | 0 |
(C) is right. But it's harder to achieve. For this example we can get it by
restricting 7ZSFX and ZIPSFX detection only when scanning an executable.
But that may mean losing detection of archives embedded elsewhere.
And we'd have to identify allowable container types for each possible
embedded type, which would be very difficult.
So this commit aims to solve the issue the (B)-way.
Note that in all situations, we still have to scan with file typing
enabled to determine if we need to reassign the current file type, such
as re-identifying a Bzip2 archive as a DMG that happens to be Bzip2-
compressed. Detection of DMG and a handful of other types relies on
finding data partway through or near the end of a file before
reassigning the entire file as the new type.
Other fixes and considerations in this commit:
- The utf16 HTML parser has weak error handling, particularly with respect
to creating a nested fmap for scanning the ascii decoded file.
This commit cleans up the error handling and wraps the nested scan with
the recursion-stack push()/pop() for correct recursion tracking.
Before this commit, each container layer had a flag to indicate if the
container layer is valid.
We need something similar so that the cli_recursion_stack_get_*()
functions ignore normalized layers. Details...
Imagine an LDB signature for HTML content that specifies a ZIP
container. If the signature actually alerts on the normalized HTML and
you don't ignore normalized layers for the container check, it will
appear as though the alert is in an HTML container rather than a ZIP
container.
This commit accomplishes this with a boolean you set in the scan context
before scanning a new layer. Then when the new fmap is created, it will
use that flag to set a similar flag for the layer. The context flag is
then reset so that anything after this doesn't have that flag.
The flag allows the new recursion_stack_get() function to ignore
normalized layers when iterating the stack to return a layer at a
requested index, negative or positive.
Scanning extracted/normalized javascript and VBA should also
use the 'layer is normalized' flag.
- This commit also fixes Heuristic.Broken.Executable alert for ELF files
to make sure that:
A) these only alert if cli_append_virus() returns CL_VIRUS (aka it
respects the FP check).
B) all broken-executable alerts for ELF only happen if the
SCAN_HEURISTIC_BROKEN option is enabled.
- This commit also cleans up the error handling in cli_magic_scan_dir().
This was needed so we could correctly apply the layer-is-normalized-flag
to all VBA macros extracted to a directory when scanning the directory.
- Also fix an issue where exceeding scan maximums wouldn't cause embedded
file detection scans to abort. Granted we don't actually want to abort
if max filesize or max recursion depth are exceeded... only if max
scansize, max files, and max scantime are exceeded.
Add 'abort_scan' flag to scan context, to protect against depending on
correct error propagation for fatal conditions. Instead, setting this
flag in the scan context should guarantee that a fatal condition deep in
scan recursion isn't lost, which would result in more stuff being scanned
instead of aborting. This shouldn't be necessary, but some status codes
like CL_ETIMEOUT never used to be fatal and it's easier to do this than
to verify every parser only returns CL_ETIMEOUT and other "fatal
status codes" in fatal conditions.
- Remove duplicate is_tar() prototype from filestypes.c and include
is_tar.h instead.
- Presently we create the fmap hash when creating the fmap.
This wastes a bit of CPU if the hash is never needed.
Now that we're creating fmaps for all embedded files discovered with
file type recognition scans, this is a much more frequent occurrence and
really slows things down.
This commit fixes the issue by only creating fmap hashes as needed
(see the sketch after this list).
This should not only resolve the performance impact of creating fmaps
for all embedded files, but should also improve performance in general.
- Add allmatch check to the zip parser after the central-header meta
match. That way we don't get multiple alerts with the same match except
in allmatch mode. Also cleaned up error handling in the zip parser a tiny bit.
- Fixes to ensure that the scan limits such as scansize, filesize,
recursion depth, # of embedded files, and scantime are always reported
if AlertExceedsMax (--alert-exceeds-max) is enabled.
- Fixed an issue where non-fatal alerts for exceeding scan maximums may
mask signature matches later on. I changed it so these alerts use the
"possibly unwanted" alert-type and thus only alert if no other alerts
were found or if all-match or heuristic-precedence are enabled.
- Added the "Heuristics.Limits.Exceeded.*" events to the JSON metadata
when the --gen-json feature is enabled. These will show up once under
"ParseErrors" the first time a limit is exceeded. In the present
implementation, only one limits-exceeded event will be added, so as to
prevent a malicious or malformed sample from filling the JSON buffer
with millions of events and using a tonne of RAM.
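As referenced in the fmap-hash bullet above, the lazy hashing boils down to something like this sketch (illustrative names, not the real fmap API):
```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const void   *data;
    size_t        len;
    unsigned char hash[16];
    bool          have_hash; /* computed on first use, then cached */
} example_map;

/* Compute the hash only the first time it is requested, then reuse it. */
static const unsigned char *example_get_hash(example_map *m)
{
    if (!m->have_hash) {
        /* placeholder for the real hashing call over m->data / m->len */
        memset(m->hash, 0, sizeof(m->hash));
        m->have_hash = true;
    }
    return m->hash;
}
```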
Yara rule files may contain multiple signatures. If one of the
signatures fails to load because of a parse error in the yara rule
condition, the rest of the rules still load. This is fine, but it seems
that something isn't properly cleaned up, so there end up being runtime
crashes when running the correctly loaded rules as a result.
Specifically, the crash occurs because of an assert() that expects the
operation stack to be empty and it is not. A simple fix is to print an
error or debug message instead of crashing. It's not the right fix, but
it at least prevents the crash.
Resolves: https://bugzilla.clamav.net/show_bug.cgi?id=12077
Also fixed a bunch of warnings in the yara module caused by comparing
different integer types.
Adds an equivalent functionality to ClamScan's --gen-json option to
ClamD.
Behavior for GenerateMetadataJson is the same as with --gen-json.
If Debug is enabled, it will print out the JSON after each scan.
If LeaveTemporaryFiles is enabled, it will drop a metadata.json file
in the scan temp directory, which of course may be customized using
the TemporaryDirectory option.
To build with code signing, the macOS build must have:
-G Xcode \
-D CLAMAV_SIGN_FILE=ON \
-D CODE_SIGN_IDENTITY="...your codesign ID..." \
-D DEVELOPMENT_TEAM_ID="...your team ID..." \
You can find the codesign ID using:
/usr/bin/env xcrun security find-identity -v -p codesigning
The team ID should also be listed in the identity description.
Also I changed the package name for APPLE to be "clamav" so it doesn't
put "ClamAV <version>" in the PKG PackageInfo like this:
com.cisco.ClamAV 0.104.0.libraries
Instead, it should just be something like:
com.cisco.clamav.libraries
Version is a separate field in that file and shouldn't be in the name.
At present the .msi installer is only installing documentation component
files and the vcredist files but fails to install clamav libraries,
programs, and dependencies.
It appears that explicitly installing the NEWS & README files under the
documentation component before calling "include(CPack)" was causing the
MSI installer to think it needed to install the documentation component
but nothing else.
This commit removes the component name, since we don't want to use
components in the Windows MSI installer anyways. This appears to resolve
the issue so that the MSI installer installs all the desired files.
When locale is UTF-8, check that signature pattern bytes are < 0x80
before using the isalpha() and toupper() functions since that can lead
to segfaults and/or unintended matches.
For example, take an LDB signature with a case-insensitive subsignature
containing byte 0xb5. The uint16_t value of pattern->pattern[i] is
0x10b5 since 0xb5 is OR'd with the CLI_MATCH_NOCASE (0x1000) flag.
Locale: C
isalpha((unsigned char) (0x10b5 & 0xff)): 0
toupper((unsigned char) (0x10b5 & 0xff)): b5
Locale: en_US.UTF-8
isalpha((unsigned char) (0x10b5 & 0xff)): 1
toupper((unsigned char) (0x10b5 & 0xff)): 39c
U+00B5 is the Micro Sign (also known as Mu)
U+03BC is the Greek Small Letter Mu
U+039C is the Greek Capital Letter Mu
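This small standalone program (not ClamAV code) demonstrates the locale-dependent behavior and the kind of ASCII guard added:
```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* 0x10b5 is the pattern byte 0xb5 OR'd with a CLI_MATCH_NOCASE-style
     * flag (0x1000); masking with 0xff recovers the byte. */
    unsigned int pattern = 0x10b5;
    unsigned char byte   = (unsigned char)(pattern & 0xff);

    setlocale(LC_CTYPE, "C");
    printf("C locale:     isalpha=%d toupper=%x\n", isalpha(byte), (unsigned)toupper(byte));

    /* Requires the en_US.UTF-8 locale to be installed to see the difference. */
    setlocale(LC_CTYPE, "en_US.UTF-8");
    printf("UTF-8 locale: isalpha=%d toupper=%x\n", isalpha(byte), (unsigned)toupper(byte));

    /* The fix: only treat the byte as a cased letter when it is plain ASCII. */
    if (byte < 0x80 && isalpha(byte)) {
        printf("ASCII letter: %c -> %c\n", byte, toupper(byte));
    } else {
        printf("byte 0x%02x is not plain ASCII; leave it untouched\n", byte);
    }

    return 0;
}
```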
Zero-byte CDIFFs are sometimes issued in place of real CDIFFs to force
freshclam to download a whole CVD because using a CDIFF would be less
efficient or otherwise problematic.
There is a bug where freshclam fails to detect if a downloaded CDIFF is
empty. This bug produces an ugly warning message and may require the
user to run freshclam up to 3x before they get over the empty-CVD hump
and are back to normal updates.
This commit resolves this bug by checking the size of the downloaded
CDIFF patch and returning an appropriate status code.
There is a bug where freshclam fails to detect if a downloaded CDIFF is
empty. In 0.103 this, combined with a CDN caching issue could result in
freshclam downloading a daily.cvd but failing to update, putting it in a
sort of infinite loop. In 0.104 this issue manifests slightly
differently, requiring freshclam to run up to 3x before you get over the
empty-CVD hump and are back to normal updates.
This commit updates an existing cdiff test with the zero-byte cdiff + an
out-of-date CVD to confirm the bug. The following commit will fix it.
The freshclam.dat file shouldn't be in the Docker images or else
everyone using the image will have the same UUID.
This commit deletes it after each update.
When running multiple parallel processes of "xor_testfile.py", there was a
race condition between checking for the existence of the directory and
creating it. Now this is handled as a dependency in CMake.