Fixes an fmap leak in the bytecode switch_input() API. The
switch_input() API provides a way to read from an extracted file instead
of reading from the current file. The issue is that the current
implementation fails to free the fmap created to read from the extracted
file on cleanup or when switching back to the original fmap. In
addition, it fails to use the cli_bytecode_context_setfile() function
to restore the file_size in the context for the current fmap.
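A minimal sketch of the intended cleanup (the context field and map
variable names here are hypothetical; only funmap() and
cli_bytecode_context_setfile() are taken from the names in this
message, and the exact signature is an assumption):

    if (bcs->extracted_map) {
        funmap(bcs->extracted_map);   /* free the extracted file's fmap */
        bcs->extracted_map = NULL;
    }
    /* restore the context's file_size for the current (original) fmap */
    cli_bytecode_context_setfile(bcs, original_map);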
Fixes a couple of fmap leaks in the unit tests.
Specifically, this fixes use of cli_map_scandesc().
The cli_map_scandesc() function used to override the current fmap
settings with a new size and offset, performing a scan of the embedded
content. This broke the ability to iterate backwards through the fmap
recursion array when an alert occurs to check each map's hash for
whitelist matches.
In order to fix this issue, it needed to be possible to duplicate an
fmap header for the scan of the embedded file without duplicating the
actual map/data. This wasn't feasible with the POSIX fmap handle
implementation, where the fmap header, bitmap array, and memory map
were all contiguous. This commit makes it possible by extracting the
fmap header and bitmap array from the mmap region, instead storing
pointers to both the bitmap array and the mmap'd data. As a result,
the POSIX fmap handle implementation now works much like the existing
Windows implementation.
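For illustration, the shape of the change looks roughly like this
(field names are illustrative, not ClamAV's exact layout):

    /* Before: one mmap'd region held the header, the bitmap, and the
     * file data, so the header couldn't be duplicated independently.
     * After: the header is a small separate struct holding pointers. */
    typedef struct fmap {
        uint64_t   *bitmap;  /* page-status bits, allocated separately */
        const char *data;    /* pointer to the mmap'd file contents */
        size_t      len;     /* current view length */
        size_t      offset;  /* current view offset */
        /* ... */
    } fmap_t;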
In addition to the above changes, this commit fixes:
- fmap recursion tracking for cli_scandesc()
- a recursion tracking issue in cli_scanembpe() error handling
Signature alerts on content extracted into a new fmap, such as
normalized HTML, resulted in checking FP signatures against the fmap's
hash value, which was initialized to all zeroes and never computed.
This patch enables FP signatures to work for normalized HTML files and
other content extracted to a new fmap. It doesn't resolve the issue
that most people write FP signatures targeting the original file, not
the normalized file, and thus won't see much benefit from this
bug-fix.
Additional work is needed to traverse the fmap recursion lists and
FP-check all parent fmaps when an alert occurs. In addition, the HTML
normalization method of temporarily overriding the ctx->fmap instead of
increasing the recursion depth and doing ctx->fmap++/-- will need to be
corrected for fmap reverse recursion traversal to work.
If the clamd.conf enables the LocalSocket option and sets the unix
socket file in a directory that does not exist, clamd creates the
missing directory, but with invalid 000 permission bits, causing socket
creation to fail.
This patch temporarily sets the umask to allow creation of the
directory with drwxrw-rw- (0766) permissions.
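A minimal sketch of the approach (standard POSIX calls; the directory
path variable is illustrative):

    #include <sys/stat.h>
    #include <sys/types.h>

    mode_t old_umask = umask(0);  /* let the requested mode bits through */
    if (mkdir(sockdir, 0766) != 0) {
        /* handle mkdir failure */
    }
    umask(old_umask);             /* restore the original umask */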
ClamAV doesn't handle the compressed attribute for HFS+ file catalog
entries.
This patch adds support for FLATE compressed files.
To accomplish this, we had to find and parse the root/header node
of the attributes file, if one exists. Then, parse the attribute map
to check if the compressed attribute exists. If compressed, parse the
compression header to determine how to decompress it. Support is
included for both inline compressed files as well as compressed
resource forks.
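The compression header is the fixed prefix of the decmpfs attribute; a
sketch of the layout as commonly documented (treat the details as
assumptions, not ClamAV's exact struct):

    struct decmpfs_header {
        uint32_t magic;             /* 'cmpf' */
        uint32_t compression_type;  /* e.g. 3 = zlib, inline;
                                       4 = zlib, in the resource fork */
        uint64_t uncompressed_size;
        /* inline-compressed data, if any, follows immediately */
    };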
Inflating inline compressed files is straightforward.
Inflating a compressed resource fork requires more work (see the
sketch after this list):
- Find location and size of the resource.
- Parse the resource block table.
- Inflate and write each block to a temporary file to be scanned.
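A sketch of the block table walk (the layout follows common
descriptions of decmpfs type-4 resource forks; read_le32(), tbl, and
the offsets are assumptions):

    uint32_t nblocks = read_le32(tbl);  /* count, then {off,len} pairs */
    for (uint32_t i = 0; i < nblocks; i++) {
        uint32_t off = read_le32(tbl + 4 + i * 8);  /* block offset */
        uint32_t len = read_le32(tbl + 8 + i * 8);  /* block length */
        /* inflate `len` bytes at the resource data start + `off`,
           appending the output to the temp file to be scanned */
    }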
Additional changes needed for this work:
- Make hfsplus_fetch_node work for both catalog and attributes.
- Figure out node size.
- Handle nodes that span several blocks.
- If the attributes are missing or invalid, extraction continues.
This behavior is to support malformed files which would also
extract on macOS and perhaps other systems.
This patch also:
- Adds filename extraction for the HFS+ parser.
- Skips embedded file type detection for GPT image file types. This
prevents double extraction of embedded files, or misclassification
of GPT images as MHTML, for example. This resolves bb12335.
The PDF parser currently prints verbose error messages when attempting
to shrink a buffer down to actual data length after decoding if it turns
out that the decoded stream was empty (0 bytes). Aside from the
verbose error messages, there's no real behavioral issue.
This commit fixes the issue by checking if any bytes were decoded before
attempting to shrink the buffer.
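The fix amounts to guarding the shrink, roughly as follows
(cli_realloc() is libclamav's allocator; the buffer names are
illustrative):

    /* Only shrink when the decoder actually produced bytes; shrinking
     * to zero triggered the verbose error messages. */
    if (decoded_len > 0 && decoded_len < buf_size) {
        uint8_t *tmp = cli_realloc(buf, decoded_len);
        if (tmp)
            buf = tmp;
    }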
Scans performed in the RTC SCAN_CLEANUP macro by the state.cb_end()
callback function never save the return value and thus fail to record a
detection. This patch sets `ret` so the detection isn't lost.
Fixed a leak where host and port were not being properly cleaned up.
Cleaned up error handling in the make_connection_real() function.
Added various NULL parameter checks.
A problem existed in which configuring with --enable-libclamav-only
would fail if curl was not installed on the system.
This fix puts a check in place to ensure the curl check code is not
run when that option is turned on.
In the future, if curl becomes required by libclamav, this check will
need to be removed.
The newer freshclam uses libcurl for downloads and downloads the
updates via https. There are systems which don't have a "default CA
store" but instead the administrator maintains a CA-bundle of certs
they trust.
This patch allows users to specify their own CA cert path by
setting the environment variable CURL_CA_BUNDLE to the path of their
choice.
Patch courtesy of Sebastian A. Siewior
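A minimal sketch of the mechanism (getenv(3) and libcurl's
CURLOPT_CAINFO are real APIs; the surrounding context is illustrative):

    #include <stdlib.h>
    #include <curl/curl.h>

    const char *bundle = getenv("CURL_CA_BUNDLE");
    if (bundle && *bundle)
        curl_easy_setopt(curl, CURLOPT_CAINFO, bundle);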
These opcodes specify a function or keyword by number
instead of by name. The corresponding lookup tables
still have a few entries without names, but the majority
of them have been determined and verified.
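Purely illustrative of the table shape (these entries are
placeholders, not the verified names):

    static const char *keyword_names[] = {
        "if",   /* 0x00 */
        "then", /* 0x01 */
        NULL,   /* 0x02: not yet identified */
        "else", /* 0x03 */
    };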
The PROFILE_HASHTABLE preprocessor definition can be set at build
time and is intended to be used to enable profiling capabilities
for developers working with hash table and set data structure
profiling. This hashtable profiling functionality was added into
the code a while back and isn't currently functional, but would
ultimately be nice to have. This commit is a first step towards
getting it working.
When PROFILE_HASHTABLE is set, it causes several counters used for
collecting performance metrics to be inserted into the core hashtable
structures. When PROFILE_HASHTABLE is not set, however, these
counters are omitted, and the other members of the structure only
ever contain constant data. I'm guessing that at some point, as an
optimization in the latter case, ClamAV began declaring the hashtable
structures `const`, causing gcc (and maybe other compilers) to put
the structures in the read-only data section. Thus, the code
crashes when PROFILE_HASHTABLE is defined and the counters in the
read-only data section try to get incremented. The fix for this is
to just not mark these structures as `const` if PROFILE_HASHTABLE
is defined.
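The shape of the fix, sketched (the macro and struct names are
illustrative, not the exact source):

    #ifdef PROFILE_HASHTABLE
    /* profiling counters are written at runtime, so the tables must
     * not be placed in the read-only data section */
    #define HASHTABLE_CONST
    #else
    #define HASHTABLE_CONST const
    #endif

    HASHTABLE_CONST struct hashtable ht = { /* ... */ };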
The commit "Freshclam create database directory if missing" inadvertently
broke the build, because it was blindly rebased and merged after a prior
commit relocated the required `statbuf` variable.
This commit adds back the missing `statbuf` variable.
This fixes issues in CVD download when network speed is slow.
The setting is passed to libcurl's CURLOPT_TIMEOUT option. The original
default of 60s was not enough when network speed is limited. Curl
handles this as the total time allowed for the HTTP(S) transfer.
https://curl.haxx.se/libcurl/c/CURLOPT_TIMEOUT.html
Also changes the commented-out ReceiveTimeout setting in the example
configs to a more sensible value (1800s).
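For reference, the setting maps onto libcurl like so (CURLOPT_TIMEOUT
is a real option taking seconds; the variable name is illustrative):

    /* ReceiveTimeout becomes the total transfer timeout */
    curl_easy_setopt(curl, CURLOPT_TIMEOUT, (long)receive_timeout);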
Signed-off-by: Tuomo Soini <tis@foobar.fi>
On initialization, freshclam will create the database directory if it is
missing.
If running as root, freshclam will assign ownership of the new directory
to the DatabaseOwner account.
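A minimal sketch of the behavior (standard POSIX calls, error handling
trimmed; the 0755 mode and variable names are assumptions):

    #include <sys/stat.h>
    #include <pwd.h>
    #include <unistd.h>

    if (mkdir(dbdir, 0755) == 0 && geteuid() == 0) {
        struct passwd *pw = getpwnam(db_owner); /* DatabaseOwner */
        if (pw)
            chown(dbdir, pw->pw_uid, pw->pw_gid);
    }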
Disable line wrap when printing the progress bar so that small terminal
windows do not see excessive lines printed.
Reduce the number of characters in the progress bar to accommodate
80-char width terminals.
Correctly display the number of kibibytes (KiB) in the progress bar.
Previously it showed the number of MiB but printed "KiB".
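One way to disable wrapping, sketched with the DEC auto-wrap escape
sequences (illustrative; the exact sequences used may differ):

    fprintf(stdout, "\033[?7l");  /* disable terminal line wrap */
    /* ... redraw the progress bar in place using '\r' ... */
    fprintf(stdout, "\033[?7h");  /* re-enable line wrap */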
Freshclam creates a tmp directory in the database directory used to
store downloaded patches or databases before they replace current
databases. The tmp directory was previously created when freshclam
initialized and deleted when freshclam exited. This was problematic
if freshclam was run in daemon mode and then run manually while the
daemon was already running.
This commit alters the behavior to create the tmp directory with a random
suffix before the update begins and remove this directory when the
update ends, allowing freshclam to be run manually without causing the
freshclam daemon to fail later.
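A sketch of the new behavior using mkdtemp(3) (path handling is
illustrative):

    char tmpdir[4096];
    snprintf(tmpdir, sizeof(tmpdir), "%s/tmp.XXXXXX", dbdir);
    if (mkdtemp(tmpdir) == NULL) {
        /* handle the error */
    }
    /* ... download into tmpdir; remove it when the update ends ... */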
Users have reported that freshclam will fail to download daily.cvd to an
empty database directory shortly after a new database is published due
to a mirror synchronization error. This may occur if the CDN serves up
an older version while a newer version has just been published. Clearing
the CDN cache has not eliminated the issue.
This commit allows freshclam to accept a database one version older
than advertised by DNS, but it will still fail if the database is more
than one version out of date.
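The tolerance check amounts to something like this (variable names are
illustrative):

    /* accept the CDN being one version behind DNS, but no more */
    if (downloaded_version + 1 < advertised_version) {
        /* fail: database too far out of date */
    }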
Removed a problematic call to convert file descriptors to filepaths.
Added filename and tempfile names to scandesc calls in clamd.
Added a general scan option to treat the scan engine as unprivileged,
meaning that the scan engine will not have read access to the file.
Added a check to drop a temp file for RARs where we don't have read
access to the filepath provided (i.e., unprivileged is set, or the
access() check fails).
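The check is roughly as follows (access(2) is a real call; the option
flag and variable names are illustrative):

    #include <unistd.h>

    if (unprivileged || access(filepath, R_OK) != 0) {
        /* dump the file descriptor's contents to a temp file and hand
           the RAR unpacker the temp file's path instead */
    }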
Users have reported slow scan speeds of some PDF documents. The scan
speed was very slow on Windows in particular. Investigation indicated
significant time spent in cli_realloc.
Performance is particularly bad when a relatively large data stream is
decompressed using small chunk sizes and the final buffer is
reallocated to a larger size each time a chunk is added.
This commit replaces BUFSIZ, which varies from 256 B to 8192 B, with
INFLATE_CHUNK_SIZE, set to 256kB, the chunk size recommended by the zlib
documentation for efficient inflate performance. The output buffer is
shrunk (reallocated) down to the final decoded buffer length so as not
to waste memory when many small buffers must be decompressed.
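A self-contained sketch of the pattern (the zlib calls are real; the
helper itself is illustrative, with growth caps and error reporting
trimmed):

    #include <stdlib.h>
    #include <string.h>
    #include <zlib.h>

    #define INFLATE_CHUNK_SIZE (256 * 1024)

    /* Inflate in[0..in_len) into a malloc'd buffer, growing by
     * INFLATE_CHUNK_SIZE and shrinking once at the end. Returns the
     * buffer (caller frees) or NULL on failure. */
    static unsigned char *inflate_all(const unsigned char *in,
                                      size_t in_len, size_t *out_len)
    {
        z_stream strm;
        unsigned char *out = NULL;
        size_t cap = 0, used = 0;
        int zret = Z_OK;

        memset(&strm, 0, sizeof(strm));
        if (inflateInit(&strm) != Z_OK)
            return NULL;
        strm.next_in  = (Bytef *)in;
        strm.avail_in = (uInt)in_len;

        while (zret == Z_OK) {
            unsigned char *tmp = realloc(out, cap + INFLATE_CHUNK_SIZE);
            if (!tmp)
                break;  /* stop growing; return what we have */
            out = tmp;
            cap += INFLATE_CHUNK_SIZE;
            strm.next_out  = out + used;
            strm.avail_out = (uInt)(cap - used);
            zret = inflate(&strm, Z_NO_FLUSH);
            used = cap - strm.avail_out;
        }
        inflateEnd(&strm);
        *out_len = used;
        if (zret == Z_STREAM_END && used > 0 && used < cap) {
            unsigned char *tmp = realloc(out, used); /* shrink once */
            if (tmp)
                out = tmp;
        }
        return out;
    }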
A followup fix should provide a standard way to do zlib decompression
across libclamav, where a linked list of decompressed chunks is
assembled and then the final output buffer is allocated at the end.