HTML files with <style> blocks containing non-UTF-8 sequences cause
warnings when they are processed to extract base64-encoded images.
To resolve this, we can use the to_string_lossy() method, which
allocates a sanitized copy of the content only if non-UTF-8 sequences
are encountered.
Resolves: https://github.com/Cisco-Talos/clamav/issues/1082
Some PDFs with an empty password can't be decrypted. Investigation
found that the problem is a strlen check used to prevent an overflow,
where the actual length of the allocated field should have been passed
down instead.
Specifically, the UE buffer may have NULL bytes in it, so a strlen
check will claim the field is shorter than it is, and later checks then
fail because the length is wrong.
While at it, I improved the code comments on the function that reads
dictionary key-value strings and switched a flag to use a bool rather
than an int.
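The difference between the two length checks can be illustrated with a minimal sketch. The names `copy_field_strlen`, `copy_field_len`, `make_field` and the 32-byte `UE_LEN` are assumptions made for illustration; they are not the actual ClamAV functions.

```c
#include <stddef.h>
#include <string.h>

#define UE_LEN 32 /* assumed size of the binary field, for illustration only */

/* Buggy variant: measures the field with strlen(), which stops at the
 * first zero byte even if more data follows it. */
size_t copy_field_strlen(unsigned char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len > dst_size) {
        return 0;
    }
    memcpy(dst, src, len);
    return len;
}

/* Fixed variant: the caller passes down the length recorded when the
 * field was read, so embedded zero bytes do not shorten it. */
size_t copy_field_len(unsigned char *dst, size_t dst_size,
                      const char *src, size_t src_len)
{
    if (src_len > dst_size) {
        return 0;
    }
    memcpy(dst, src, src_len);
    return src_len;
}

/* Builds a UE_LEN-byte field with a zero byte at offset 4, followed by a
 * terminator so that strlen() stays within the buffer. */
void make_field(char out[UE_LEN + 1])
{
    memset(out, 'A', UE_LEN);
    out[4]      = '\0';
    out[UE_LEN] = '\0';
}
```

With a field built by `make_field()`, the strlen-based copy returns 4 bytes while the length-based copy returns all 32, which is the kind of mismatch that makes later size checks fail.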
Prevent allocating more than 1GB regardless of what is requested.
RAR dictionary sizes may not be larger than 1GB, at least in the current
version.
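As a rough sketch of the cap, assuming the request is simply limited to 1GB rather than rejected (the message does not say which), a helper could look like this. `clamp_dict_alloc` and `MAX_DICT_ALLOC` are hypothetical names, not part of the UnRAR code.

```c
#include <stdint.h>

/* Hypothetical cap: 1 GB, the largest dictionary size mentioned above. */
#define MAX_DICT_ALLOC ((uint64_t)1 << 30)

/* Returns the number of bytes to allocate: the requested size,
 * limited to MAX_DICT_ALLOC regardless of what was asked for. */
uint64_t clamp_dict_alloc(uint64_t requested)
{
    return requested > MAX_DICT_ALLOC ? MAX_DICT_ALLOC : requested;
}
```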
This is a cherry-pick of commit 9b444e7e02
This is a cherry-pick of commit 24f225c21f
Modification to unrar codebase allowing skipping of files within
Solid archives when parsing in extraction mode, enabling us to skip
encrypted files while still scanning metadata and potentially
scanning unencrypted files later in the archive.
UnRAR logic replaces directory symlinks found within archive file entry
file paths with actual directories by deleting them after they're
extracted.
Unfortunately, this logic extends to deleting existing directories if you
set the `DestName` instead of the `DestPath` in this API:
rc = RARProcessFile(hArchive, RAR_EXTRACT, NULL, destFilePath);
In the future UnRAR may change to disable the `LinksToDirs()` feature
if using the `DestName` parameter. In the meantime, this commit
completely disables it for our use case.
The --alert-exceeds-max feature should alert for all files larger than
2GB because 2GB is the internal limit for individual files.
This isn't working correctly because the `goto done;` exit condition
after recording the exceeds-max heuristic skips over the logic that
reports the alert.
This fix moves the ">2GB" check up to the location where the
max-filesize engine option is set by clamd or clamscan.
If max-filesize > 2GB - 1 is requested, then max-filesize is set to
2GB - 1.
Additionally, a warning is printed if max-filesize > 2GB is requested
(with an exception for when it's maxed out by setting --max-filesize=0).
Resolves: https://github.com/Cisco-Talos/clamav/issues/1030
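The clamping rule described above can be sketched as follows. This is an illustration under assumptions, not the clamd/clamscan code: `effective_max_filesize` and its `warned` output are hypothetical, and a requested value of 0 is treated as "use the maximum" without a warning.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TWO_GB             ((uint64_t)1 << 31)
#define MAX_FILESIZE_LIMIT (TWO_GB - 1) /* internal per-file limit */

/* Returns the max-filesize value to store in the engine.
 * *warned is set when a warning was printed for the request. */
uint64_t effective_max_filesize(uint64_t requested, bool *warned)
{
    *warned = false;

    if (requested == 0) {
        /* --max-filesize=0: max out the limit, no warning. */
        return MAX_FILESIZE_LIMIT;
    }
    if (requested > TWO_GB) {
        fprintf(stderr, "WARNING: max-filesize > 2GB requested, using 2GB - 1\n");
        *warned = true;
    }
    if (requested > MAX_FILESIZE_LIMIT) {
        return MAX_FILESIZE_LIMIT;
    }
    return requested;
}
```

Doing the clamp where the option is set means the scan path only ever sees a limit of at most 2GB - 1, so the exceeds-max alert can go through the normal reporting logic.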
Developers of the FreeBSD base system are currently working to upgrade
its LLVM/Clang/LLDB/LLD to 17. As part of that work, they tried building
all ports in the FreeBSD ports collection to check whether they build
with LLVM/Clang/LLD 17. Some ports fail to build with it, and
unfortunately `security/clamav` is one of them. Its build fails with the
following link errors:
```
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'cli_cvdunpack' failed: symbol not defined
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'cli_dbgmsg_internal' failed: symbol not defined
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'init_domainlist' failed: symbol not defined
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'init_whitelist' failed: symbol not defined
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'cli_parse_add' failed: symbol not defined
ld: error: version script assignment of 'CLAMAV_PRIVATE' to symbol 'cli_bytecode_context_clear' failed: symbol not defined
cc: error: linker command failed with exit code 1 (use -v to see invocation)
```
Investigation of ClamAV's source code shows that `cli_cvdunpack` is a
static function, so it isn't visible to external consumers, and the
other symbols mentioned aren't defined anywhere. So fix the link error
by removing all of them from the linker version script.
json-c 0.17 defines the ssize_t type using a typedef on Windows.
We have been setting ssize_t for Windows to a different type using
a #define instead of a typedef.
We should have been using a typedef since it is a type.
However, we must also match the exact type they're setting it to or else
the compiler will baulk because the types are different.
Note: in C11, it's fine to typedef the same type more than once, so
long as you're defining it the same every time.
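A small sketch of the C11 rule, using a made-up type name `example_ssize_t` and an arbitrarily chosen underlying type so as not to imply what json-c or ClamAV actually use:

```c
#include <stddef.h>

/* First definition, as a third-party header might provide it. */
typedef long long example_ssize_t;

/* Second, identical definition, as our own header might provide it.
 * C11 accepts this because both typedefs name the same type. Changing
 * one of them to a different type would be a compile error. */
typedef long long example_ssize_t;

/* Helper so the type is actually used somewhere. */
size_t example_ssize_width(void)
{
    return sizeof(example_ssize_t);
}
```

A `#define` of the same name, by contrast, would textually rewrite the other header's typedef, which is why the typedef form is the safer one to share.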
Having the filename is useful for certain callbacks, and will likely be
more useful in the future if we can start comparing detected filetypes
with file extensions.
E.g. if filetype is just "binary" or "text" we may be able to do better
by trusting a ".js" extension to determine the type.
Or else if detected file type is "pe" but the extension is ".png" we may
want to say it's suspicious.
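To make the idea concrete, here is a hypothetical sketch of how a callback might compare a detected type with a file extension. None of these functions exist in libclamav; the type strings and the "javascript" result are placeholders, and the comparison is case-sensitive for brevity.

```c
#include <stdbool.h>
#include <string.h>

/* Returns the extension including the dot, or "" if there is none.
 * Simplification: a dot in a directory name is not handled. */
const char *file_extension(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    return dot ? dot : "";
}

/* A generic detection may be refined by trusting the extension. */
const char *refine_type(const char *detected, const char *filename)
{
    bool generic = strcmp(detected, "binary") == 0 ||
                   strcmp(detected, "text") == 0;
    if (generic && strcmp(file_extension(filename), ".js") == 0) {
        return "javascript";
    }
    return detected;
}

/* An executable hiding behind an image extension looks suspicious. */
bool looks_suspicious(const char *detected, const char *filename)
{
    return strcmp(detected, "pe") == 0 &&
           strcmp(file_extension(filename), ".png") == 0;
}
```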
Also adjusted the example callback program to disable the metadata
option. The CL_SCAN_GENERAL_COLLECT_METADATA option is no longer
required for the Zip parser to record filenames for embedded files, as
described in the previous commit.
This program can be used to demonstrate that it is behaving as desired.
Previously clamd would report "Cannot allocate memory" when a file
exceeded max file size. This commit corrects it to report
"Heuristics.Limits.Exceeded.MaxFileSize".
Fixes: https://github.com/Cisco-Talos/clamav/issues/670
Commit d1656ee increased the default scan limits. We updated them in the
clamd manpages but forgot to update them in the clamscan and
clamav-milter manpages.
Note that I'm also removing the "max: <4 GB" annotations because
commit 2962509 removed the artificial 4 GB limit on sizes. Most notably,
that change allows a larger max-scansize, which makes it easy to scan
very large archives comprising more than 4 GB worth of stuff.
There is no specific limit now on these size options.
When decompressing a zlib stream, it's possible to reach end of stream
before running out of available bytes. In the DMG parser, this may cause
an infinite loop.
This commit adds a check for the condition where the stream has ended
before running out of input.
Fixes: https://github.com/Cisco-Talos/clamav/issues/925
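The loop condition can be sketched without zlib by using a stand-in decompressor. Everything here is hypothetical: `mock_inflate` only imitates the relevant behavior (it consumes no input once the stream has ended) and is not the zlib API, and `max_iters` exists only so the unfixed variant terminates in a demo.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { STEP_OK, STEP_STREAM_END } step_result;

typedef struct {
    size_t avail_in;     /* input bytes not yet consumed */
    size_t stream_bytes; /* bytes that belong to the compressed stream */
} mock_stream;

/* Consumes up to 16 bytes of the stream per call. Once the stream has
 * ended it consumes nothing, leaving any trailing input untouched. */
step_result mock_inflate(mock_stream *s)
{
    size_t n;

    if (s->stream_bytes == 0) {
        return STEP_STREAM_END;
    }
    n = s->stream_bytes < 16 ? s->stream_bytes : 16;
    if (n > s->avail_in) {
        n = s->avail_in;
    }
    s->avail_in -= n;
    s->stream_bytes -= n;
    return s->stream_bytes == 0 ? STEP_STREAM_END : STEP_OK;
}

/* Loops while input remains. Without the end-of-stream check, trailing
 * input after the stream keeps the loop spinning forever (bounded here
 * by max_iters). Returns the number of iterations performed. */
size_t drain(mock_stream *s, bool check_stream_end, size_t max_iters)
{
    size_t iters = 0;

    while (s->avail_in > 0 && iters < max_iters) {
        step_result r = mock_inflate(s);
        iters++;
        if (check_stream_end && r == STEP_STREAM_END) {
            break; /* stream ended before the input ran out */
        }
    }
    return iters;
}
```

With 100 input bytes of which only 40 belong to the stream, the checked loop stops after three steps with 60 bytes left over, while the unchecked loop runs until the iteration guard stops it.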
I somehow forgot to save and commit the final error-handling check on
the new set_tls_client_certificate() function.
This change is needed to have Freshclam fail if you try to use the new
client certificate environment variables incorrectly.
In the aspack decrypt function, there's a check to make sure that
backbytes doesn't exceed 57, because it is used as an index in
init_array. However, it is mathematically impossible for backbytes to
exceed 57, so this commit removes the check.
`cli_getpagesize()` may return -1 in an error condition.
If it does, let's just treat it as 4096.
I believe the actual coverity complaint is a false positive, but it's
fair to account for the error case and this should shut it up.
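A minimal sketch of the fallback, written as a standalone helper so that it does not depend on `cli_getpagesize()` itself. The helper name is hypothetical, and treating zero the same as -1 is an extra assumption on top of what is described above.

```c
#define DEFAULT_PAGE_SIZE 4096

/* Takes the value reported by the page-size query and returns a usable
 * page size: the reported value if positive, otherwise the default. */
long page_size_or_default(long reported)
{
    return reported > 0 ? reported : DEFAULT_PAGE_SIZE;
}
```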
The bytes_remaining variable may be set negative by mistake, when really
we just want to decrement it.
This issue may result in a 1-byte over-read but does not cause any
crash.
We determined that this issue is not a vulnerability.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58475
Some change in Cargo/Rust version 1.70 or 1.71 appears to have broken
the build on Windows because we are incorrectly attempting to check the
native static libraries by compiling an empty file (/dev/null) which
does not exist on Windows.
A simple fix is to make an empty file of our own and use that instead.
Fixes: https://github.com/Cisco-Talos/clamav/issues/990
In `cli_scanudf()` it's possible to `goto done;` before memset'ing the
fileIdentifierList and fileEntryList. This would likely cause
uninitialized pointer reads.
Instead of memset, just initialize the structs with `= {0};`
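The pattern can be sketched with a hypothetical list struct; `pointer_list` and `scan_sketch` are stand-ins and do not mirror the real UDF parser types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a list that owns a heap allocation. */
typedef struct {
    int *items;
    size_t count;
} pointer_list;

int scan_sketch(bool fail_early)
{
    int status        = -1;
    pointer_list list = {0}; /* items == NULL and count == 0 from the start */

    if (fail_early) {
        goto done; /* safe: the cleanup code sees an initialized struct */
    }

    list.items = calloc(4, sizeof(*list.items));
    if (list.items == NULL) {
        goto done;
    }
    list.count = 4;
    status     = 0;

done:
    free(list.items); /* free(NULL) is a no-op */
    return status;
}
```

Because the initializer runs at the declaration, no early exit can reach the cleanup label with indeterminate pointers, which a later `memset` cannot guarantee.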
Also:
- Rename to use FRESHCLAM_CLIENT_CERT, FRESHCLAM_CLIENT_KEY instead of
prefixing them with "CURL_". Unlike CURL_CA_BUNDLE, these variable names
are not used by the `curl` program and so do not piggyback on that
existing functionality.
- Add FRESHCLAM_CLIENT_KEY_PASSWD environment variable to support
password protected private key PEM files, as described in:
https://curl.se/libcurl/c/CURLOPT_SSLCERT.html
- Document the new environment variable options in the manpage and in
the `freshclam --help` message. Also add missing documentation in the
freshclam and clamsubmit help-messages for CURL_CA_BUNDLE.
- Update the NEWS.md file to credit jedrzej for the new feature.
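As an illustration of how the client certificate variables might be validated, here is a sketch under one stated assumption: that the certificate and key must be provided together. The enum and function names are hypothetical and this is not the Freshclam implementation.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    CERT_NOT_CONFIGURED, /* neither variable is set */
    CERT_CONFIGURED,     /* both variables are set */
    CERT_MISCONFIGURED   /* only one of the two is set */
} cert_config;

/* Classifies the values of FRESHCLAM_CLIENT_CERT and
 * FRESHCLAM_CLIENT_KEY. NULL or "" counts as unset. */
cert_config check_client_cert_env(const char *cert, const char *key)
{
    bool have_cert = cert != NULL && cert[0] != '\0';
    bool have_key  = key != NULL && key[0] != '\0';

    if (!have_cert && !have_key) {
        return CERT_NOT_CONFIGURED;
    }
    if (have_cert && have_key) {
        return CERT_CONFIGURED;
    }
    return CERT_MISCONFIGURED;
}
```

A caller would pass `getenv("FRESHCLAM_CLIENT_CERT")` and `getenv("FRESHCLAM_CLIENT_KEY")` (from `<stdlib.h>`) and fail on `CERT_MISCONFIGURED` rather than continue without a client certificate.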
These symbols are used by an internal python tool for generating
signatures:
- fuzzy_hash_calculate_image
- ffierror_fmt
`ffierror_fmt` is required to free the error structure passed back in
case of an error.
Since version 1.1.0 started using libclamav.map again, we need to
explicitly export these symbols.