This is just preliminary support for identifying an assortment of
different AI model files.
So far, this detects the following types:
- GGML GGUF (.gguf)
- ONNX AI (.onnx)
- TensorFlow Lite (.tflite)
Additional types to consider:
- SafeTensors (.safetensors)
- TensorFlow (.pb, .ckpt, .tfrecords)
- Keras (.keras)
- pickle (.pkl)
- numpy (.npy, .npz)
- coreml (.coreml)
- PyTorch (.pt, .pth, .bin, .mar, .pte, .pt2, .ptl)
Outside of being able to differentiate by file type, the scanner
will treat CL_TYPE_AI_MODEL the same as CL_TYPE_BINARY_DATA.
We're not adding parsers to further process these files, for now.
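As a rough illustration of file-type identification by leading bytes, here is a minimal sketch. It is not ClamAV's detection logic: the enum and function names are hypothetical, and it only covers the two formats with a fixed identifier near the start of the file (GGUF files begin with the ASCII bytes "GGUF"; TensorFlow Lite files are FlatBuffers carrying the identifier "TFL3" at offset 4). ONNX is left out because this sketch does not attempt to recognize protobuf-encoded data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags for this sketch; these are not CL_TYPE_* values. */
typedef enum {
    MODEL_UNKNOWN = 0,
    MODEL_GGUF,
    MODEL_TFLITE
} model_type_t;

/* Classify a buffer by its leading bytes.
 * - GGUF: ASCII "GGUF" at offset 0.
 * - TensorFlow Lite: FlatBuffers file identifier "TFL3" at offset 4. */
model_type_t sniff_model_type(const unsigned char *buf, size_t len)
{
    if (buf == NULL) {
        return MODEL_UNKNOWN;
    }
    if (len >= 4 && memcmp(buf, "GGUF", 4) == 0) {
        return MODEL_GGUF;
    }
    if (len >= 8 && memcmp(buf + 4, "TFL3", 4) == 0) {
        return MODEL_TFLITE;
    }
    return MODEL_UNKNOWN;
}
```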
Add X509 certificate chain based signing with PKCS7-PEM external
signatures distributed alongside CVDs in a custom .cvd.sign format.
This new signing and verification mechanism is primarily in support
of FIPS compliance.
Fixes: https://github.com/Cisco-Talos/clamav/issues/564
Add a Rust implementation for parsing, verifying, and unpacking CVD
files.
Now installs a 'certs' directory in the app config directory
(e.g. <prefix>/etc/certs). The install location is configurable.
The CMake option to configure the CVD certs directory is:
`-D CVD_CERTS_DIRECTORY=PATH`
New options to set an alternative CVD certs directory:
- Commandline for freshclam, clamd, clamscan, and sigtool is:
`--cvdcertsdir PATH`
- Env variable for freshclam, clamd, clamscan, and sigtool is:
`CVD_CERTS_DIR`
- Config option for freshclam and clamd is:
`CVDCertsDirectory PATH`
Sigtool:
- Add sign/verify commands.
- Also verify CDIFF external digital signatures when applying CDIFFs.
- Place commonly used commands at the top of --help string.
- Fix up manpage.
Freshclam:
- Will try to download .sign files to verify CVDs and CDIFFs.
- Fix an issue where making a CLD would only include the CFG file for
daily and not if patching any other database.
libclamav.so:
- Bump version to 13:0:1 (aka 12.1.0).
- Also remove libclamav.map versioning.
Resolves: https://github.com/Cisco-Talos/clamav/issues/1304
- Add two new APIs to the public clamav.h header:
```c
extern cl_error_t cl_cvdverify_ex(const char *file,
                                  const char *certs_directory);
extern cl_error_t cl_cvdunpack_ex(const char *file,
                                  const char *dir,
                                  bool dont_verify,
                                  const char *certs_directory);
```
The original `cl_cvdverify` and `cl_cvdunpack` are deprecated.
- Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`.
You may set this option with `cl_engine_set_str` and get it
with `cl_engine_get_str`, to override the compiled-in default
CVD certs directory.
libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0).
Add sigtool sign/verify tests and test certs.
Make it so downloadFile doesn't throw a warning if the server
doesn't have the .sign file.
Replace use of md5-based FP signatures in the unit tests with
sha256-based FP signatures because the md5 implementation used
by Python may be disabled in FIPS mode.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1411
CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates
to build against the same OpenSSL library as is used for the C build.
The Rust unit test application must also link directly with libcrypto
and libssl.
Fix some log messages with missing new lines.
Fix missing environment variable notes in --help messages and manpages.
Deconflict CONFDIR/DATADIR/CERTSDIR variable names that are defined in
clamav-config.h.in for libclamav from the variables with the same names
used in clamav applications that use the optparser.
The 'clamav-test' certs for the unit tests will live for 10 years.
The 'clamav-beta.crt' public cert will only live for 120 days and will
be replaced before the stable release with a production 'clamav.crt'.
Fixes:
- We need to look at the local headers if no central directory headers are
found. Restructured the main `cli_unzip()` function to allocate an empty
zip catalogue when we can't use a central directory at all.
- In `index_local_file_headers_within_bounds()`, we must decrement the
`coff` variable after adding the size of a file entry using
`parse_local_file_header()`, to account for the increment when it loops
around. If we don't, the next entry won't be at 'PK\x03\x04', it will be
at 'K\x03\x04'.
- Attempt to unzip when encrypted even if we don't have a valid password.
This may enable extraction for files where a header lies about encryption.
- The `fmap_need_off()` call to get the `compressed_data` pointer used the
wrong size, checking if there was enough data for a header instead of
for the compressed data that follows the header. I stumbled across this
older bug when testing extraction of a zip where the file entries are
tiny and I'd stripped off the central directory. As a result, there
wasn't enough data for a whole file header and my test failed.
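The `coff` adjustment described in the list above can be sketched as follows. This is a simplified illustration, not the code in `index_local_file_headers_within_bounds()`: `entry_size_at()` is a hypothetical stand-in for `parse_local_file_header()`, and it only reads the compressed size, name length, and extra-field length from the 30-byte fixed part of a zip local file header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LFH_FIXED_SIZE 30 /* fixed part of a zip local file header */

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical stand-in for parse_local_file_header(): returns the total
 * size of the entry at `off` (header + name + extra + compressed data),
 * or 0 if there is no complete entry there. Requires off <= len. */
static size_t entry_size_at(const uint8_t *buf, size_t len, size_t off)
{
    if (len - off < LFH_FIXED_SIZE) {
        return 0;
    }
    if (memcmp(buf + off, "PK\x03\x04", 4) != 0) {
        return 0;
    }
    uint64_t total = (uint64_t)LFH_FIXED_SIZE +
                     rd32(buf + off + 18) + /* compressed size */
                     rd16(buf + off + 26) + /* file name length */
                     rd16(buf + off + 28);  /* extra field length */
    return (total <= len - off) ? (size_t)total : 0;
}

/* Count local file headers by scanning the buffer byte by byte. */
size_t count_local_headers(const uint8_t *buf, size_t len)
{
    size_t count = 0;
    for (size_t coff = 0; coff < len; coff++) {
        size_t size = entry_size_at(buf, len, coff);
        if (size == 0) {
            continue;
        }
        count++;
        coff += size; /* now at the first byte after the entry */
        coff--;       /* compensate for the loop's coff++ */
    }
    return count;
}
```

Without the `coff--`, the scan would resume one byte into the next entry and miss its 'P'.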
Cleanup:
- Initialize status variables as CL_ERROR and only assign to CL_SUCCESS if
successful. This is to protect against future changes in case someone
accidentally goes-to-done without setting the status.
- Remove legacy use of CL_CLEAN. Not a functional change.
This is mostly a stylistic preference.
- Use calloc instead of malloc + memset in a couple places.
Make use of the new allocation macros with goto-done error handling.
- Some opinionated format changes such as shifting some longer function
arguments all to a new line so they're not so far to the right.
- Auto-format with clang-format.
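The status-variable convention from the cleanup list can be illustrated with a small sketch. The status enum and the function are hypothetical stand-ins (libclamav uses `cl_error_t` and its own allocation macros, which are not reproduced here); the point is only that the status starts as an error and is set to success in exactly one place.

```c
#include <stdlib.h>

/* Hypothetical status codes for this sketch; libclamav uses cl_error_t. */
typedef enum { ST_SUCCESS = 0, ST_ERROR = 1 } status_t;

/* Allocate a zero-filled buffer of `count` bytes and hand it to the caller. */
status_t make_zeroed_buffer(size_t count, unsigned char **out)
{
    status_t status    = ST_ERROR; /* pessimistic default */
    unsigned char *buf = NULL;

    if (out == NULL || count == 0) {
        goto done; /* status is already ST_ERROR */
    }

    buf = calloc(count, 1); /* calloc instead of malloc + memset */
    if (buf == NULL) {
        goto done;
    }

    *out   = buf;
    buf    = NULL; /* ownership transferred to the caller */
    status = ST_SUCCESS;

done:
    free(buf); /* no-op on success, cleanup on failure */
    return status;
}
```

Any early `goto done` added later reports failure by default, which is the protection the cleanup item describes.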
The ClamAV inflate64 module is based on zlib 1.2.3 source code with
significant changes to support extracting zip64 and to address some
code quality issues.
This commit adds a zlib v1.2.9 fix for possible undefined behavior:
6a043145ca
Thank you to TITAN Team for reporting this issue.
The bounds check for the loop iterating an OLE2 block during decryption
may have an integer underflow if `leftover + bytesToWrite` is less
than 16. That results in a significant buffer over-read and a segfault.
The fix is simply to do addition on the left side of the check instead
of subtraction on the right.
Fixes https://issues.oss-fuzz.com/issues/372544101
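The difference between the two forms of the check can be shown with a pair of predicates. This is a sketch under assumptions: the function names are hypothetical, and the variables are assumed to be 32-bit unsigned, which may not match the types in the OLE2 code.

```c
#include <stdint.h>

/* Subtraction on the right: when leftover + bytes_to_write < 16 the
 * unsigned result wraps to a huge value and the check passes. */
int in_bounds_subtract(uint32_t i, uint32_t leftover, uint32_t bytes_to_write)
{
    return i <= (leftover + bytes_to_write) - 16u;
}

/* Addition on the left: nothing is subtracted, so nothing can wrap
 * below zero. The sums are widened to 64 bits to rule out overflow. */
int in_bounds_add(uint32_t i, uint32_t leftover, uint32_t bytes_to_write)
{
    return (uint64_t)i + 16u <= (uint64_t)leftover + bytes_to_write;
}
```

With `leftover = 3` and `bytes_to_write = 5`, the first predicate accepts `i = 0` even though only 8 bytes are available, while the second rejects it.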
At install, the CMake build may fail if it detects the same library
dependency in two locations. This happened for us with the following
error:
CMake Error at libfreshclam/cmake_install.cmake:157 (file):
file Multiple conflicting paths found for libcrypto-3-x64.dll:
C:/Users/clamav_jenkins_svc.TALOS/clam_dependencies/x64/lib/libcrypto-3-x64.dll
C:/WINDOWS/system32/libcrypto-3-x64.dll
C:\WINDOWS\system32/libcrypto-3-x64.dll
Call Stack (most recent call first):
cmake_install.cmake:96 (include)
This happens when system provided DLL names match exactly with the ones
we provide. ClamAV wouldn't prefer that DLL at load time, because it
looks in the EXE directory first. But it does confuse the `file()`
command used to locate build dependencies.
The fix in this commit uses a regex to exclude all libraries found under
C:\Windows
Occasionally the MD5 hash for RSA-based digital signature
verification begins with zeros. A bug in how we convert the RSA
decoded plain text from a big number back to a hex string causes it
to write the number to the far left of the plain text buffer.
If the number is smaller than a hash, then zero-padding ends up on
the right when it should've been on the left.
Additional fix: BN_bn2bin() will write zero bytes if the bignum is 0.
So there is no point "error checking" the BN_bn2bin() call.
Thanks to Tom Judge for noticing these shenanigans.
Ref: https://github.com/openssl/openssl/issues/2101
Side note: BN_num_bytes() will also return 0 if the bignum is 0,
which is fine.
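The padding issue can be illustrated without OpenSSL. `BN_bn2bin()` emits only the significant big-endian bytes, so a value with leading zero bytes comes out shorter than the hash it represents. The helper below is a hypothetical sketch (not the code in this commit) that places such a short value at the right-hand end of a fixed-width field so the zero-padding lands on the left.

```c
#include <stddef.h>
#include <string.h>

/* Copy a big-endian number of `num_len` bytes into a field of `field_len`
 * bytes, right-aligned with leading zero bytes.
 * Returns 0 on success, -1 if the arguments are invalid or it doesn't fit. */
int store_right_aligned(const unsigned char *num, size_t num_len,
                        unsigned char *field, size_t field_len)
{
    if (field == NULL || num_len > field_len || (num == NULL && num_len > 0)) {
        return -1;
    }
    memset(field, 0, field_len);
    if (num_len > 0) {
        memcpy(field + (field_len - num_len), num, num_len);
    }
    return 0;
}
```

A zero-length input simply leaves the field all zeros, which mirrors the note that a zero bignum produces zero bytes.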
Store URLs found in HTML `<a>` and `<form>` tags during scan of HTML files
when recording scan metadata.
HTML URL recording will be ON by default, but is a part of the
generate-metadata-json feature.
The generate-metadata-json feature is OFF by default.
This introduces a new general scan option:
- libclamav: `CL_SCAN_GENERAL_STORE_HTML_URLS`.
- ClamD: `JsonStoreHTMLUrls`.
- ClamScan: `--json-store-html-urls`
Thank you Matt Jolly for the helpful comment on the pull request.
Add keys to the metadata.json file that inform the user that a scanned
ole2 file is encrypted. Information about the type of encryption is
provided when the information is available. This feature was co-authored
by Micah Snyder.
There is presently no limit for the max-recursion scan option.
Selecting a max-recursion limit that is too high will cause confusing
errors. E.g.:
/home/aragusa/install.alz/bin/clamscan -d clamav.hdb . --max-recursion=9999999999
LibClamAV Error: fmap_fd: Attempted to get fd for NULL fmap
/home/aragusa/issue/clamav.hdb: Can't allocate memory ERROR
LibClamAV Error: fmap_fd: Attempted to get fd for NULL fmap
/home/aragusa/issue/test.sh: Can't allocate memory ERROR
This commit prevents setting the max-recursion limit higher than 100.
The `find_length()` function in the PDF parser incorrectly assumes that
objects found are located in the main PDF file map, and fails to take
into account whether the objects were in fact found in extracted PDF
object streams. The resulting pointer is then invalid and may cause an
out of bounds read.
This issue was found by OSS-Fuzz.
This fix checks if the object is from an object stream, and then
calculates the pointer based on the start of the object stream instead
of based on the start of the PDF.
I've also added extra checks to verify the calculated pointer and object
size are within the stream (or PDF file map). I'm not entirely sure this
is necessary, but better safe than sorry.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=69617
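The idea behind the fix can be sketched as follows. The struct and function are hypothetical simplifications, not the PDF parser's real types: an object's offset is resolved against the object stream when it came from one, otherwise against the file map, and the result is bounds-checked either way.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a contiguous region of bytes. */
struct byte_range {
    const unsigned char *data;
    size_t size;
};

/* Resolve an object of `length` bytes at `offset`. If `objstm` is non-NULL
 * the object came from an extracted object stream and the offset is relative
 * to that stream; otherwise it is relative to the PDF file map.
 * Returns NULL if the object does not lie entirely within its base region. */
const unsigned char *resolve_object(const struct byte_range *file_map,
                                    const struct byte_range *objstm,
                                    size_t offset, size_t length)
{
    const struct byte_range *base = (objstm != NULL) ? objstm : file_map;

    if (base == NULL || base->data == NULL) {
        return NULL;
    }
    /* Written to avoid overflow in offset + length. */
    if (offset > base->size || length > base->size - offset) {
        return NULL;
    }
    return base->data + offset;
}
```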
Clamscan and ClamD will throw an error if you use the
'--fail-if-cvd-older-than=DAYS' / 'FailIfCvdOlderThan' option and
try to load any plaintext signature files.
That is, it throws an error when encountering plain signature files like
`.ign2`, `.ldb`, `.hdb`, etc.
This feature should only verify CVD / CLD files.
The feature (and bug) was introduced in ClamAV 1.1.0, here:
e4fe6654c1
With this change, the `cl_cvdgetage` checks will skip any file that is
not a CVD or CLD.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1174
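A minimal sketch of the kind of filter this implies is shown below. The helper name is hypothetical and the check is a plain case-sensitive extension match, which is an assumption rather than a description of how libclamav decides.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `filename` ends in ".cvd" or ".cld", else 0.
 * Only files that pass this test would have their age checked. */
int is_cvd_or_cld(const char *filename)
{
    const char *dot;

    if (filename == NULL) {
        return 0;
    }
    dot = strrchr(filename, '.');
    if (dot == NULL) {
        return 0;
    }
    return strcmp(dot, ".cvd") == 0 || strcmp(dot, ".cld") == 0;
}
```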
The clamscan test "assorted_test.py::TC::test_pe_cert_trust" is about to
fail because the "test.exe" test file was signed with a cert set to
expire after only 2 years, and it has been 23 months.
While attempting to generate a new one that will last 73000 days (200
years), I discovered that any signing certificate set to expire after
2038 will fail the trust-check because the `ca.not_after` variable is
a maxed-out `time_t`, incapable of expressing a higher number.
To fix this, I've upgraded the variables to `uint64_t`.
I also had to replace a bunch of generated signatures to match the new
"test.exe".
Finally, I noticed that "ca.not_before" was being set to token[8]
instead of token[9], which presumably means the "NotBefore" field for
Trusted and Revoked Certificates was non-functional, as it was treating
the "CertSign" boolean as the "NotBefore" value.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1300
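The arithmetic behind the 2038 limit can be checked with a short sketch. It assumes the expiry was effectively limited to a signed 32-bit range of seconds since the epoch; the function names and the issue timestamp used in the example are illustrative only.

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400u

/* Expiry time in seconds since the epoch, computed in 64 bits. */
uint64_t not_after_epoch(uint64_t issued, uint64_t valid_days)
{
    return issued + valid_days * SECONDS_PER_DAY;
}

/* Whether a timestamp is representable in a signed 32-bit counter,
 * which tops out at 2147483647 (January 2038). */
int fits_in_32bit_time(uint64_t t)
{
    return t <= (uint64_t)INT32_MAX;
}
```

For a certificate issued at 1700000000 (late 2023), 73000 days adds 6307200000 seconds, giving 8007200000, which is well past the 32-bit maximum but fits easily in a `uint64_t`.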
fmap_need_off_once() may return an unaligned pointer. This in turn
leads to an unaligned access during the load of the uint32_t variables,
leading to failures on architectures not supporting unaligned access.
This was reported to the Debian BTS as #1073128.
[bigeasy: Commit message, reworked the patch a bit].
Link: https://bugs.debian.org/1073128
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
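A common way to avoid this class of problem is to copy the bytes into a properly aligned local variable instead of dereferencing the pointer as a `uint32_t *`. The helper below is a generic sketch with a hypothetical name, not necessarily the approach taken in the patch; it reads the value in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address.
 * memcpy has no alignment requirement on its source, and compilers
 * typically reduce it to a plain load where that is safe. */
uint32_t read_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```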
If SCAN_COLLECT_METADATA is enabled, and caching is disabled, we zero-out
the hash after recording it.
This results in a non-NULL and invalid-hash that may be passed to
`cli_scan_fmap()` for the "raw mode" scan.
It's an uncommon code path, but would result in comparing hash-sigs with
a zeroed hash rather than the valid hash.
This bug could result in missed hash-based sig matches.
There is no reason to invalidate or zero-out the hash if we happen to
calculate it. We avoid the cache-lookup by checking the engine setting,
not by checking if we have a hash.
ClamAV initialization's rarload() function tries to load
libclamunrar_iface from the install path before checking under
LD_LIBRARY_PATH.
This means the unit tests will use the wrong unrar library if testing on
a system where ClamAV is already installed.
In the event there is an ABI break between versions, this will cause a
bunch of tests to fail.
This commit fixes the issue by checking for libclamunrar_iface under
LD_LIBRARY_PATH *first* before checking in the install lib directory.
Note in the previous version we were also checking LD_LIBRARY_PATH on
Windows, which is not a thing. I removed this.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1249
Also removed check for WARN_DLOPEN_FAIL define, which was not used, and
mistakenly set for the unrar library build target.
The C-Rust FFI code is needlessly complex. Now that we are calling into
magic_scan from Rust, we can simply hand off the <style> block contents
to Rust code to handle extraction and scanning.
Immediately store pointers as new pointer type rather than using
intermediate uint8_t pointer.
Also "unneed" some of the "needed" pointers as soon as we're able to
release them rather than holding on until the end of the UDF image.
Add assorted debug messages and code comments.
Make FileSetDescriptor optional as minor step towards supporting
ExtendedFileEntries.
Minor variable name changes for readability.
Use tag_identifier enum for variable type rather than uint16_t and
add "INVALID_DESCRIPTOR" (0) to enum and use it in the switch. This way
we're not comparing enums with ints.
Move GenericVolumeStructureDescriptor to udf.h.
As of ClamAV 0.105, libjson-c is required.
There is also no option to disable libjson-c support.
This commit removes the dead code associated with the old build
option.
As of ClamAV 0.105, libz is required.
There is also no option to disable zlib support.
This commit removes the dead code associated with the old build
option.
As of ClamAV 0.105, libbz2 is required.
There is also no option to disable bz2 support.
This commit removes the dead code associated with the old build
option.
As of ClamAV 0.105, libxml2 is required.
There is also no option to disable libxml2 support.
This commit removes the dead code associated with the old build
option.
As of ClamAV 0.105, PCRE2 is required. PCRE (1) is not an option, and
there is also no option to disable PCRE support.
This commit removes the dead code associated with those old build
options.
The in_iconv_u16() function resolves "alignment" issues where the length
of the input string is not a multiple of 4. The solution trims the extra
bytes off the input string. If the input string is less than 4 bytes in
total, then those extra bytes are put in a 4-byte array and are converted.
However, if the input string is longer, then those extra bytes are lost.
This fix saves the extra "unaligned" bytes in the 4-byte array and
converts them afterwards so we don't accidentally lose 1 to 2
characters.
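The splitting step can be sketched on its own, leaving out the actual conversion. The function below is a hypothetical illustration, not the body of in_iconv_u16(): it divides the input into a body whose length is a multiple of 4 and a tail of 0 to 3 bytes, and copies the tail into a zeroed 4-byte array so it can be converted after the body instead of being dropped.

```c
#include <stddef.h>
#include <string.h>

/* Split `len` input bytes into an aligned body and an unaligned tail.
 * Returns the body length (a multiple of 4). The tail bytes are copied
 * into `tail` (zero-padded) and their count is stored in `*tail_len`. */
size_t split_aligned(const unsigned char *in, size_t len,
                     unsigned char tail[4], size_t *tail_len)
{
    size_t body_len = len - (len % 4);

    *tail_len = len - body_len;
    memset(tail, 0, 4);
    if (*tail_len > 0) {
        memcpy(tail, in + body_len, *tail_len);
    }
    return body_len;
}
```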
The delharc crate used to add LZH archive support appears to add
a dependency on macOS CoreFoundation library.
The error is:
[ 78%] Linking C shared library libclamav.dylib
Undefined symbols for architecture x86_64:
"_CFRelease", referenced from:
iana_time_zone::platform::get_timezone_inner::hc7da204717a39974 in libclamav_rust.a(iana_time_zone-bc4762a47da73d72.iana_time_zone.1863eb20d202562a-cgu.0.rcgu.o)
...
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libclamav/libclamav.12.0.2.dylib] Error 1
We already link with CoreFoundation for libfreshclam and clamsubmit, so
this commit extends that to libclamav as well.