We were signing with the signing key + signing cert and verifying
with the intermediate cert + root cert. However, we should have been
signing with the signing key + signing cert + intermediate cert, and
verifying with just the root cert.
To fix this, I...
1. Provided new certs and new .sign files for the test files, generated
with the correct signing method.
2. Restructured the `unit_tests/input/signing` directory to highlight
which files are for signing and which are for verification.
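For illustration, here is a minimal sketch of the corrected flow using
the rust-openssl crate. The function names are illustrative, and this is
not necessarily the exact API the ClamAV code uses; the point is that
the intermediate cert travels inside the PKCS7 signature, so the
verifier only needs the root.

```rust
use openssl::error::ErrorStack;
use openssl::pkcs7::{Pkcs7, Pkcs7Flags};
use openssl::pkey::{PKey, Private};
use openssl::stack::Stack;
use openssl::x509::store::X509StoreBuilder;
use openssl::x509::X509;

// Sign with the signing key + signing cert + intermediate cert.
fn sign(data: &[u8], key: &PKey<Private>, signing_cert: &X509,
        intermediate: X509) -> Result<Pkcs7, ErrorStack> {
    let mut chain: Stack<X509> = Stack::new()?;
    chain.push(intermediate)?; // embed the intermediate in the signature
    Pkcs7::sign(signing_cert, key, &chain, data, Pkcs7Flags::DETACHED)
}

// Verify with just the root cert as the trust anchor.
fn verify(sig: &Pkcs7, data: &[u8], root: X509) -> Result<(), ErrorStack> {
    let mut store = X509StoreBuilder::new()?;
    store.add_cert(root)?;
    let signers: Stack<X509> = Stack::new()?;
    sig.verify(&signers, &store.build(), Some(data), None, Pkcs7Flags::DETACHED)
}
```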
There is a multi-arch build issue because I previously used `i8` to
represent a C character. I switched it to `c_char`, which should fix the
clamav-debian multi-arch Docker image build.
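A minimal sketch of the pitfall (the function below is hypothetical):

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// `c_char` is `i8` on x86_64 but `u8` on aarch64 and most arm targets,
// so a binding written against `*const i8` fails to compile when the
// multi-arch image builds for arm. `c_char` is correct on every target.
fn print_c_string(msg: *const c_char) {
    if msg.is_null() {
        return;
    }
    let text = unsafe { CStr::from_ptr(msg) }.to_string_lossy();
    println!("{}", text);
}
```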
It turns out we weren't failing out of signing when one of the provided
intermediate certificate paths was incorrect. Instead of using
`filter_map()`, I switched to iterating over the list to populate the
vector of intermediate certs, so a bad path now produces an error.
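Roughly, the change looks like this (a sketch with hypothetical names,
not the actual code):

```rust
use std::path::PathBuf;

// Before: `filter_map()` silently dropped any path that failed to load:
//   let certs: Vec<Vec<u8>> =
//       paths.iter().filter_map(|p| std::fs::read(p).ok()).collect();

// After: the first bad path fails the whole sign operation.
fn load_intermediates(paths: &[PathBuf]) -> Result<Vec<Vec<u8>>, std::io::Error> {
    let mut certs = Vec::with_capacity(paths.len());
    for path in paths {
        // `?` propagates the error instead of skipping the entry.
        certs.push(std::fs::read(path)?);
    }
    Ok(certs)
}
```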
Add X509 certificate-chain-based signing with PKCS7-PEM external
signatures distributed alongside CVDs in a custom .cvd.sign format.
This new signing and verification mechanism is primarily in support
of FIPS compliance.
Fixes: https://github.com/Cisco-Talos/clamav/issues/564
Add a Rust implementation for parsing, verifying, and unpacking CVD
files.
Now installs a 'certs' directory in the app config directory
(e.g. <prefix>/etc/certs). The install location is configurable.
The CMake option to configure the CVD certs directory is:
`-D CVD_CERTS_DIRECTORY=PATH`
New options to set an alternative CVD certs directory:
- Commandline for freshclam, clamd, clamscan, and sigtool is:
`--cvdcertsdir PATH`
- Env variable for freshclam, clamd, clamscan, and sigtool is:
`CVD_CERTS_DIR`
- Config option for freshclam and clamd is:
`CVDCertsDirectory PATH`
Sigtool:
- Add sign/verify commands.
- Also verify CDIFF external digital signatures when applying CDIFFs.
- Place commonly used commands at the top of --help string.
- Fix up manpage.
Freshclam:
- Will try to download .sign files to verify CVDs and CDIFFs.
- Fix an issue where building a CLD would only include the CFG file
when patching the daily database, and not when patching any other
database.
libclamav.so:
- Bump version to 13:0:1 (aka 12.1.0).
- Also remove libclamav.map versioning.
Resolves: https://github.com/Cisco-Talos/clamav/issues/1304
- Add two new APIs to the public clamav.h header:
```c
extern cl_error_t cl_cvdverify_ex(const char *file,
                                  const char *certs_directory);

extern cl_error_t cl_cvdunpack_ex(const char *file,
                                  const char *dir,
                                  bool dont_verify,
                                  const char *certs_directory);
```
The original `cl_cvdverify` and `cl_cvdunpack` are deprecated.
- Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`.
You may set this option with `cl_engine_set_str` and get it
with `cl_engine_get_str`, to override the compiled-in default
CVD certs directory.
libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0).
Add sigtool sign/verify tests and test certs.
Make it so `downloadFile()` doesn't log a warning if the server
doesn't have the .sign file.
Replace use of md5-based FP signatures in the unit tests with
sha256-based FP signatures because the md5 implementation used
by Python may be disabled in FIPS mode.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1411
CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates
to build against the same OpenSSL library as is used for the C build.
The Rust unit test application must also link directly with libcrypto
and libssl.
Fix some log messages with missing new lines.
Fix missing environment variable notes in --help messages and manpages.
Deconflict the CONFDIR/DATADIR/CERTSDIR variable names defined in
clamav-config.h.in for libclamav from the variables of the same name
used by the ClamAV applications' optparser.
The 'clamav-test' certs for the unit tests will live for 10 years.
The 'clamav-beta.crt' public cert will only live for 120 days and will
be replaced before the stable release with a production 'clamav.crt'.
If the 'hexsig' for an image fuzzy hash subsignature has invalid
unicode, it may cause a crash. The problem is that we fail to allocate
an error message in this case, so when we try to print that message we
dereference a NULL pointer.
This is not a security issue.
Fixes: https://issues.oss-fuzz.com/issues/376331488
The C-Rust FFI code is needlessly complex. Now that we are calling into
magic_scan from Rust, we can simply hand off the <style> block contents
to Rust code to handle extraction and scanning.
Primarily this commit fixes an issue with the size of the parameters
passed to cli_checklimits(). The parameters were "unsigned long", which
varies in size depending on platform.
I've switched them to uint64_t / u64.
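The underlying issue, sketched in Rust: the width of `unsigned long`
(`c_ulong`) is part of the platform's data model, not the type name.

```rust
use std::mem::size_of;
use std::os::raw::c_ulong;

fn main() {
    // LP64 (64-bit Linux/macOS): c_ulong is 8 bytes.
    // LLP64 (64-bit Windows):    c_ulong is 4 bytes.
    // u64 is 8 bytes everywhere, so it is safe in FFI prototypes.
    println!("c_ulong: {} bytes, u64: {} bytes",
             size_of::<c_ulong>(), size_of::<u64>());
}
```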
While working on this, I observed some concerning warnings on Windows,
and some less serious ones, primarily regarding inconsistencies with
`const` parameters.
Finally, in `scanmem.c`, there is a warning regarding use of `wchar_t *`
with `GetModuleFileNameEx()` instead of `GetModuleFileNameExW()`.
This made me realize this code assumes `UNICODE` is not defined, in
which case such macros resolve to the 'A' variants.
I have fixed it the best I can, although I'm still a little
uncomfortable with some of this code that uses `char` or `wchar_t`
instead of TCHAR.
I also removed the `if (GetModuleFileNameEx) {` conditional, because
this macro/function will always be defined. The original code was
checking a function pointer, so the check became a bug when the code
was integrated into ClamAV.
Regarding the changes to `rijndael.c`, I found that this module assumes
`unsigned long` is 32 bits wide, which is not guaranteed.
I have corrected it to use `uint32_t`.
The fmap structure has some stuff that differs in size in memory between
Linux and Windows, and between 32bit and 64bit architectures.
Notably, `time_t` appears to be defined by the Rust bindgen module as
`ulong`, which may be either 8 bytes or 4 bytes depending on the
architecture (thanks, C). To resolve this, we'll store the time as a
`uint64_t` instead.
The other problem in the fmap structure is that the Windows file and
map handles should always exist, even though they are only used on
Windows, so that the structure is sized consistently.
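A sketch of the idea on the Rust side (the struct and field names here
are hypothetical, not the real fmap layout):

```rust
// bindgen maps C `time_t` to `c_ulong` on some targets, and its size
// varies; fixed-width fields keep the layout identical everywhere.
#[repr(C)]
pub struct FmapSketch {
    // was: time_t (4 or 8 bytes depending on the target)
    pub mtime: u64,
    // The Windows file and map handles are always present, even in
    // non-Windows builds, so the structure size is consistent.
    pub windows_file_handle: u64,
    pub windows_map_handle: u64,
}
```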
In order to generate Rust bindings for C code, Rust's bindgen module
needs to know where to find all headers included by the API.
If they're all inside the project or inside the standard include path
(e.g. /usr/include and /usr/local/include) that's fine. But for third-
party C library headers from outside the standard include path, that's
a problem.
We didn't really notice this problem when generating on Unix systems
until we switched to use OpenSSL 3.1 and tested on systems that have
the OpenSSL 1.1.1 dev package installed.
The ability to find headers outside the project path is also needed to
generate bindings on Windows, if desired.
This commit solves the problem by passing include directories for the
ClamAV::libclamav CMake build target to the Rust build via the
CARGO_INCLUDE_DIRECTORIES environment variable.
Then, in the `libclamav_rust/build.rs` script, where we run bindgen,
we split that `;`-separated string into individual paths and add each
to the bindgen builder.
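In `build.rs`, that looks roughly like this (a sketch; the wrapper
header name is illustrative):

```rust
// libclamav_rust/build.rs (sketch)
fn main() {
    let mut builder = bindgen::Builder::default()
        .header("bindgen_wrapper.h"); // illustrative header name

    // CMake exports the ClamAV::libclamav include directories as a
    // `;`-separated list in this environment variable.
    if let Ok(dirs) = std::env::var("CARGO_INCLUDE_DIRECTORIES") {
        for dir in dirs.split(';').filter(|d| !d.is_empty()) {
            builder = builder.clang_arg(format!("-I{}", dir));
        }
    }

    let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
    builder
        .generate()
        .expect("failed to generate bindings")
        .write_to_file(out.join("bindings.rs"))
        .expect("failed to write bindings");
}
```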
Includes rudimentary support for getting slices from FMaps and for
interacting with libclamav's context structure.
For now, we will use a Cisco-Talos org fork of the onenote_parser crate
until the feature to read a OneNote section from a slice (instead of
from a file path) is added upstream.
HTML files with <style> blocks containing non-utf8 sequences are causing
warnings when processing them to extract base64 encoded images.
To resolve this, we can use the `to_string_lossy()` method, which may
allocate and sanitize a copy of the content if non-UTF-8 sequences are
encountered.
Resolves: https://github.com/Cisco-Talos/clamav/issues/1082
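A sketch of the approach, assuming the style contents arrive as a raw
byte buffer (`String::from_utf8_lossy` is the std equivalent for that
case):

```rust
use std::borrow::Cow;

// Borrows when the input is already valid UTF-8, and only allocates a
// sanitized copy (invalid sequences become U+FFFD) when it has to, so
// the common case stays allocation-free.
fn style_contents(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}
```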
I'm unsure why, but building with `cmake -D MAINTAINER_MODE=ON` is
failing right now. Updating to a newer version of bindgen appears to
resolve the issue.
I was able to update it by changing the version specified in
libclamav_rust/Cargo.toml, and then running `cargo update -p bindgen`.
Not that I expect anyone else to be running maintainer mode, but I did
also confirm using `cargo-msrv` that the minimum supported version of
Rust did not change as a result of this commit.
When processing UTF-8 HTML code, the image extraction logic may panic if
the string contains a multi-byte grapheme that includes a '(', ')',
whitespace, or one of the other characters used to split the text when
searching for the base64 image content.
The panic is because the `split_at()` method will panic if you try to
split in the middle of a unicode grapheme.
This commit fixes the issue by processing the HTML string one grapheme
at a time instead of one character (byte) at a time.
The `grapheme_indices()` method is used to get the correct position of
the start of each grapheme for splitting the string.
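The fix, sketched with the unicode-segmentation crate (the delimiter
set here is illustrative):

```rust
use unicode_segmentation::UnicodeSegmentation;

// `split_at(i)` panics unless `i` lies on a character boundary, so
// walking the string byte-by-byte can pick an invalid index. The
// indices yielded by `grapheme_indices()` are always valid.
fn find_delimiter(text: &str, delims: &[&str]) -> Option<usize> {
    for (idx, grapheme) in text.grapheme_indices(true) {
        if delims.contains(&grapheme) {
            return Some(idx); // a safe position for split_at()
        }
    }
    None
}

fn main() {
    let css = "url( data:image/png;base64,AAAA )";
    if let Some(i) = find_delimiter(css, &["(", ")", " "]) {
        let (before, rest) = css.split_at(i);
        println!("before: {:?}, rest: {:?}", before, rest);
    }
}
```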
The CLOSE command is failing to create a file when appending changes if
the file does not already exist. This prevents adding new files to a
database with a CDIFF and caused failures applying the test-3.cdiff file
in the freshclam feature tests.
Also improved the error message to show which command, specifically, is
failing (not just the line number).
Any cdiff or script using the UNLINK operation will fail to delete the
file claiming "No DB open for action UNLINK".
The UNLINK operation appears to be trying to delete a currently open
database, when in fact it should ensure no database is open before
deleting the local file given by the single "db_name" parameter.
I found that the media type in a `url(data:...)` value does not matter
to a browser. In addition, whitespace may be placed in a few locations
and the browser will ignore it.
This commit accounts for this, and updates the test accordingly.
This commit adds a feature to find, decode, and scan each image found
within HTML <style> tags where the image data is embedded as a base64
blob in `url()` function parameters.
In C, in the HTML normalization process, we extract the style tag
contents to a new buffer for processing. We then call into a new
feature in Rust code to find and decode each image (if there are
multiple).
Once extracted, the images are scanned as contained files of unknown
type, and file type identification will determine the actual type.
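A sketch of the decode step with the base64 crate (the function and
parsing strategy are illustrative, not the actual implementation):

```rust
use base64::{engine::general_purpose::STANDARD, Engine};

// Pull the payload out of one `url(data:...;base64,...)` occurrence.
// Browsers ignore the media type and tolerate embedded whitespace, so
// whitespace is stripped before decoding.
fn decode_css_image(css: &str) -> Option<Vec<u8>> {
    let start = css.find("base64,")? + "base64,".len();
    let end = start + css[start..].find(')')?;
    let payload: String = css[start..end]
        .chars()
        .filter(|c| !c.is_whitespace())
        .collect();
    STANDARD.decode(payload).ok()
}
```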
If the image decoders (e.g. jpeg, tiff, png) fail to load an image due
to a panic, the application will crash.
This commit attempts to catch those panics and handle the error.
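A sketch of the guard, assuming the image crate's loader (the error
handling is simplified):

```rust
use std::panic;

// A decoder panic unwinds out of `load_from_memory`; catching it here
// turns a crash into a per-image error so the scan can continue.
fn try_decode(bytes: &[u8]) -> Result<image::DynamicImage, String> {
    panic::catch_unwind(|| image::load_from_memory(bytes))
        .map_err(|_| String::from("image decoder panicked"))?
        .map_err(|e| e.to_string())
}
```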
When potentially unwanted indicators are found, they are not reported
right away, except in heuristic-precedence mode. At the end of the
scan, we then report all of the heuristic/PUA alerts. In my initial
overhaul of the allmatch mode, I introduced an issue where it is never
reporting more than 1 heuristic/PUA alert.
This commit fixes that by reporting all potentially-unwanted indicators
at the end. Note that instead of using `cli_virus_found_cb()` which
would report the top layer file descriptor as well in the callback,
I'm reporting `-1` for the file descriptor instead, because we don't
know when the alert was added. If it was at a deeper layer in the file,
we would no longer have access to it at the end of the scan.
Also removed excessive debug statements '... infected with ...' that use
`cli_get_last_virus()` because it doesn't really reflect all of the
matches in allmatch mode, because the match would've been reported
previously (so it is excessive), and because I don't like the word
'infected'. :)
Rework the append_virus mechanism to store evidence (strong indicators,
pua indicators, and eventually weak indicators) in vectors. When
appending a "virus", we will return CLEAN when in allmatch-mode, and
simply add the indicator to the appropriate vector.
Later, we can check whether there were any alerts by summing the
lengths of the strong and PUA indicator vectors.
This does away with storing the latest "virname" in the scan context.
Instead, we can query for the last indicator in the evidence, giving
priority to strong indicators.
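Conceptually (a simplified sketch, not the actual evidence API):

```rust
// Indicators are collected during the scan; the verdict and the
// "latest virname" are derived afterwards instead of being stored in
// the scan context.
#[derive(Default)]
struct Evidence {
    strong: Vec<String>,
    pua: Vec<String>,
}

impl Evidence {
    fn add_strong(&mut self, name: &str) {
        self.strong.push(name.to_string());
    }
    fn add_pua(&mut self, name: &str) {
        self.pua.push(name.to_string());
    }
    // Non-zero means the scan reports "virus found".
    fn alert_count(&self) -> usize {
        self.strong.len() + self.pua.len()
    }
    // Last indicator, giving priority to strong indicators.
    fn last_alert(&self) -> Option<&str> {
        self.strong.last().or_else(|| self.pua.last()).map(String::as_str)
    }
}
```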
When heuristic-precedence is enabled, add PUA indicators as Strong
instead of as PotentiallyUnwanted. This way, they will be treated
equally and reported in order in allmatch mode.
Also document the reason for disabling the cache when metadata JSON is
enabled.