Fixes a shell compatibility issue with string comparisons in the
clamonacc and libclamav-only M4 files:
test(1) uses `=` for string equality; `==` is a bashism.
XLM is a macro language in Excel that was used before VBA (before
1996). It is still parsed and executed by modern Excel and is gaining
popularity with malware authors.
This patch adds rudimentary support for detecting and extracting
Excel 4.0 (XLM) macros.
The code is based on Didier Stevens' plugin_biff for oledump.py.
Fixes a bug in the PtrVerifier pass when using LLVM >= v3.5 for the
bytecode signature runtime.
LLVM 3.5 changed the meaning of "use" and introduced "user". This fix
swaps out "use" keywords for "user" so the code functions correctly when
using LLVM 3.5+.
Add the credit card-only DLP option "StructuredCCOnly" to the win32
sample clamd config.
Also update NEWS.md to credit John Schember and Alexander Sulfrian for
the DLP CC-only mode contribution.
An integer overflow causes an out-of-bounds read that results in
a crash. The crash may occur when using the optional
Data-Loss-Prevention (DLP) feature to block content that contains credit
card numbers. This commit fixes the issue by using a signed index variable.
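A minimal sketch of the failure mode (illustrative code, not the actual DLP routine): with an unsigned index, a backwards scan wraps around instead of going negative, so the loop bound is never reached and the read runs out of bounds; a signed index makes the `>= 0` check meaningful.

```c
#include <stddef.h>

/* Hypothetical example: find the last digit in a buffer by scanning
 * backwards. With `size_t i`, the decrement past 0 would wrap to
 * SIZE_MAX and the loop condition would never fail; a signed index
 * terminates correctly. */
static int last_digit_pos(const char *buf, size_t len)
{
    for (long i = (long)len - 1; i >= 0; i--) {
        if (buf[i] >= '0' && buf[i] <= '9')
            return (int)i;
    }
    return -1; /* no digit found; also handles len == 0 safely */
}
```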
Add Data-Loss-Prevention option to detect credit cards only, excluding
debit and private label cards where possible.
You can select the credit card-only DLP mode for clamscan with the
`--structured-cc-mode` command-line option.
You can select the credit card-only DLP mode for clamd with the
`StructuredCCOnly` clamd.conf config option.
This patch also adds credit card matching for additional vendors:
- Mastercard 2016
- China Union Pay
- Discover 2009
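Vendor matching of this kind is typically paired with a Luhn checksum over the candidate digit run. A minimal Luhn sketch (the standard algorithm, not ClamAV's actual matcher):

```c
#include <string.h>

/* Standard Luhn check: double every second digit from the right,
 * subtract 9 from doubled values over 9, and require the sum to be
 * divisible by 10. */
static int luhn_valid(const char *digits)
{
    int sum = 0, alt = 0;
    if (*digits == '\0')
        return 0;
    for (int i = (int)strlen(digits) - 1; i >= 0; i--) {
        int d = digits[i] - '0';
        if (d < 0 || d > 9)
            return 0; /* non-digit: not a card number */
        if (alt) {
            d *= 2;
            if (d > 9)
                d -= 9;
        }
        sum += d;
        alt = !alt;
    }
    return sum % 10 == 0;
}
```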
Adds LZMA and BZip2 decompression routines to the bytecode API.
The ability to decompress LZMA and BZip2 streams is particularly
useful for bytecode signatures that extend clamav executable
unpacking capabilities.
Of note, the LZMA format is not well standardized. This API
expects the stream to start with the LZMA_Alone header.
Also fixes a bug in the LZMA dictionary size setting.
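For illustration, the 13-byte LZMA_Alone header this API expects can be parsed as follows (a sketch of the documented layout, not the ClamAV implementation):

```c
#include <stdint.h>
#include <stddef.h>

/* LZMA_Alone header: 1 properties byte, a 4-byte little-endian
 * dictionary size, and an 8-byte little-endian uncompressed size
 * (all-0xFF meaning "unknown"). */
typedef struct {
    uint8_t  props;       /* lc/lp/pb packed as (pb * 5 + lp) * 9 + lc */
    uint32_t dict_size;
    uint64_t uncomp_size;
} lzma_alone_hdr;

static int parse_lzma_alone(const uint8_t *buf, size_t len,
                            lzma_alone_hdr *h)
{
    if (len < 13)
        return -1;
    h->props = buf[0];
    if (h->props >= 9 * 5 * 5) /* invalid lc/lp/pb combination */
        return -1;
    h->dict_size = (uint32_t)buf[1] | ((uint32_t)buf[2] << 8) |
                   ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
    h->uncomp_size = 0;
    for (int i = 0; i < 8; i++)
        h->uncomp_size |= (uint64_t)buf[5 + i] << (8 * i);
    return 0;
}
```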
- Existing VBA extraction code uses undocumented cache structures.
This code uses the documented way of accessing VBA projects.
- Adds additional detail to the dumped information:
Project name, Project doc string, ...
- All VBA projects are dumped into a single file.
- Malware authors are currently evading detection by spreading
malicious code over several projects. It is hard to write
signatures if only part of the malicious code is visible.
Fixes an fmap leak in the bytecode switch_input() API. The
switch_input() API provides a way to read from an extracted file instead
of reading from the current file. The issue is that the current
implementation fails to free the fmap created to read from the extracted
file on cleanup or when switching back to the original fmap. In
addition, it fails to use the cli_bytecode_context_setfile() function
to restore the file_size in the context for the current fmap.
Fixes a couple of fmap leaks in the unit tests.
Specifically, this fixes the use of cli_map_scandesc().
The cli_map_scandesc() function used to override the current fmap
settings with a new size and offset, performing a scan of the embedded
content. This broke the ability to iterate backwards through the fmap
recursion array when an alert occurs to check each map's hash for
whitelist matches.
In order to fix this issue, it needed to be possible to duplicate an
fmap header for the scan of the embedded file without duplicating the
actual map/data. This wasn't feasible with the posix fmap handle
implementation, where the fmap header, bitmap array, and memory map
were all contiguous. This commit makes it possible by extracting the
fmap header and bitmap array from the mmap region, using instead a
pointer for both the bitmap array and the mmap/data. As a result, the
posix fmap handle implementation ended up working more like the
existing Windows implementation.
In addition to the above changes, this commit fixes:
- fmap recursion tracking for cli_scandesc()
- a recursion tracking issue in cli_scanembpe() error handling
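The layout change can be sketched roughly as follows (struct and function names here are hypothetical, not ClamAV's actual fmap API): once the header holds pointers instead of living inside the mapped region, a duplicate header for a nested scan can share the underlying bitmap and data.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative header-with-pointers layout: the bitmap and data are
 * no longer contiguous with the header, so the header alone can be
 * copied cheaply. */
typedef struct {
    uint64_t *bitmap;        /* page-status bitmap, heap-allocated */
    uint8_t  *data;          /* the mapped file contents */
    size_t    len;           /* length of the visible window */
    size_t    nested_offset; /* window start for a nested scan */
} fmap_hdr;

/* Duplicate only the header for scanning embedded content; the bitmap
 * and data remain shared with the parent map. */
static fmap_hdr *fmap_dup(const fmap_hdr *src, size_t offset, size_t len)
{
    fmap_hdr *dup = malloc(sizeof(*dup));
    if (!dup)
        return NULL;
    memcpy(dup, src, sizeof(*dup));
    dup->nested_offset = src->nested_offset + offset;
    dup->len = len;
    return dup;
}
```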
Signature alerts on content extracted into a new fmap such as normalized
HTML resulted in checking FP signatures against the fmap's hash value
that was initialized to all zeroes, and never computed.
This patch enables FP signatures of normalized HTML files or other
content that is extracted to a new fmap. It doesn't resolve the issue
that users will typically write FP signatures targeting the original
file, not the normalized file, and thus won't really see benefit from
this bug-fix.
Additional work is needed to traverse the fmap recursion lists and
FP-check all parent fmaps when an alert occurs. In addition, the HTML
normalization method of temporarily overriding the ctx->fmap instead of
increasing the recursion depth and doing ctx->fmap++/-- will need to be
corrected for fmap reverse recursion traversal to work.
If the clamd.conf enables the LocalSocket option and sets the unix
socket file in a directory that does not exist, clamd creates the
missing directory but with invalid 000 permissions bits, causing socket
creation to fail.
This patch sets the umask temporarily to allow creation of the
directory with drwxrw-rw- (0766) permissions.
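A sketch of the umask dance (function name and path handling illustrative, not the clamd code):

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Temporarily set a permissive umask so the socket directory is
 * created with usable permission bits, then restore the caller's
 * umask. With umask 0011, a mkdir mode of 0777 yields 0766. */
static int make_socket_dir(const char *path)
{
    mode_t old = umask(0011);
    int ret = mkdir(path, 0777); /* effective mode: 0777 & ~0011 == 0766 */
    umask(old);                  /* restore the original umask */
    return ret;
}
```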
ClamAV doesn't handle the compressed attribute for hfs+ file catalog
entries.
This patch adds support for FLATE compressed files.
To accomplish this, we had to find and parse the root/header node
of the attributes file, if one exists. Then, parse the attribute map
to check if the compressed attribute exists. If compressed, parse the
compression header to determine how to decompress it. Support is
included for both inline compressed files as well as compressed
resource forks.
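For reference, the compression header lives in the com.apple.decmpfs extended attribute; a sketch of parsing its fixed part (layout per Apple's decmpfs.h, an illustration rather than the ClamAV parser):

```c
#include <stdint.h>
#include <stddef.h>

#define DECMPFS_MAGIC  0x636d7066u /* 'cmpf', stored little-endian */
#define CMP_ZLIB_XATTR 3           /* FLATE data inline in the xattr */
#define CMP_ZLIB_RSRC  4           /* FLATE data in the resource fork */

typedef struct {
    uint32_t magic;
    uint32_t type;
    uint64_t uncompressed_size;
} decmpfs_hdr;

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the 16-byte fixed portion of the compression header and
 * validate the magic. */
static int parse_decmpfs(const uint8_t *buf, size_t len, decmpfs_hdr *h)
{
    if (len < 16)
        return -1;
    h->magic = le32(buf);
    h->type  = le32(buf + 4);
    h->uncompressed_size =
        (uint64_t)le32(buf + 8) | ((uint64_t)le32(buf + 12) << 32);
    return (h->magic == DECMPFS_MAGIC) ? 0 : -1;
}
```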
Inflating inline compressed files is straightforward.
Inflating a compressed resource fork requires more work:
- Find location and size of the resource.
- Parse the resource block table.
- Inflate and write each block to a temporary file to be scanned.
Additional changes needed for this work:
- Make hfsplus_fetch_node work for both catalog and attributes.
- Figure out node size.
- Handle nodes that span several blocks.
- If the attributes are missing, or invalid, extraction continues.
This behavior is to support malformed files which would also
extract on macOS and perhaps other systems.
This patch also:
- Adds filename extraction for the hfs+ parser.
- Skips embedded file type detection for GPT image file types. This
  prevents double extraction of embedded files, or misclassification
  of GPT images as MHTML, for example. This resolves bb12335.
The PDF parser currently prints verbose error messages when attempting
to shrink a buffer down to actual data length after decoding if it turns
out that the decoded stream was empty (0 bytes). Aside from the
verbose error messages, there's no real behavioral issue.
This commit fixes the issue by checking if any bytes were decoded before
attempting to shrink the buffer.
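A sketch of the guard (names hypothetical, not the PDF parser's actual code):

```c
#include <stdlib.h>

/* Only shrink the decode buffer when something was actually decoded;
 * a zero-length shrink took the noisy error path for no benefit. */
static unsigned char *shrink_decoded(unsigned char *buf, size_t decoded_len)
{
    unsigned char *tmp;
    if (decoded_len == 0)
        return buf;         /* nothing decoded; leave the buffer as-is */
    tmp = realloc(buf, decoded_len);
    return tmp ? tmp : buf; /* on realloc failure keep the original */
}
```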
Scans performed in the RTF SCAN_CLEANUP macro by the state.cb_end()
callback function never save the return value and thus fail to record a
detection. This patch sets `ret` so the detection isn't lost.
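The pattern can be sketched like this (a simplified illustration, not the actual macro):

```c
enum { CL_CLEAN = 0, CL_VIRUS = 1 }; /* simplified return codes */

static int cb_end_verdict;           /* what the cleanup callback reports */
static int cb_end(void) { return cb_end_verdict; }

/* The cleanup macro must assign the callback's result to `ret`
 * rather than merely invoking the callback and dropping the value. */
#define SCAN_CLEANUP(ret)       \
    do {                        \
        int cb_ret = cb_end();  \
        if (cb_ret != CL_CLEAN) \
            (ret) = cb_ret;     \
    } while (0)

static int scan(void)
{
    int ret = CL_CLEAN;
    SCAN_CLEANUP(ret);
    return ret; /* a detection in cleanup is no longer lost */
}
```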
Fixed a leak where host and port were not being properly cleaned up.
Cleaned up error handling in the make_connection_real() function.
Added various NULL parameter checks.
A problem existed in which specifying --enable-libclamav-only would
fail if curl was not installed on the system.
This fix adds a check to ensure the curl detection code is not run if
the option is enabled.
In the future, if curl becomes required in libclamav, this check will
need to be removed.
The newer freshclam uses libcurl for downloads and downloads the
updates via https. There are systems which don't have a "default CA
store" but instead the administrator maintains a CA-bundle of certs
they trust.
This patch allows users to specify their own CA cert path by
setting the environment variable CURL_CA_BUNDLE to the path of their
choice.
Patch courtesy of Sebastian A. Siewior
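The mechanism can be sketched as follows (illustrative, not the exact freshclam code); the returned path would then be handed to libcurl via `CURLOPT_CAINFO`:

```c
#include <stdlib.h>

/* Read the administrator's CA bundle path from the environment.
 * Unset and empty are treated the same: fall back to curl's default
 * CA store. */
static const char *ca_bundle_path(void)
{
    const char *bundle = getenv("CURL_CA_BUNDLE");
    return (bundle && *bundle) ? bundle : NULL;
}
/* usage sketch:
 *   if (ca_bundle_path())
 *       curl_easy_setopt(curl, CURLOPT_CAINFO, ca_bundle_path()); */
```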
These opcodes specify a function or keyword by number
instead of by name. The corresponding lookup tables
still have a few entries without names, but the majority
of them have been determined and verified.
The PROFILE_HASHTABLE preprocessor definition can be set at build
time and is intended to enable profiling capabilities for developers
working with the hash table and set data structures. This hashtable
profiling functionality was added into
the code a while back and isn't currently functional, but would
ultimately be nice to have. This commit is a first step towards
getting it working.
When PROFILE_HASHTABLE is set, it causes several counters used for
collecting performance metrics to be inserted into the core hashtable
structures. When PROFILE_HASHTABLE is not set, however, these
counters are omitted, and the other members of the structure only
ever contain constant data. I'm guessing that at some point, as an
optimization in the latter case, ClamAV began declaring the hashtable
structures `const`, causing gcc (and maybe other compilers) to put
the structures in the read-only data section. Thus, the code
crashes when PROFILE_HASHTABLE is defined and the counters in the
read-only data section try to get incremented. The fix for this is
to just not mark these structures as `const` if PROFILE_HASHTABLE
is defined.
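The fix can be sketched like this (member names illustrative, not ClamAV's actual structures):

```c
#include <stddef.h>

/* With PROFILE_HASHTABLE the structure carries mutable counters, so it
 * must not be declared const; without it, const lets the compiler place
 * the table in the read-only data section. */
struct hashtable {
    const char *const *keys;
    size_t capacity;
#ifdef PROFILE_HASHTABLE
    size_t lookups;    /* incremented on every lookup: must be writable */
    size_t collisions;
#endif
};

#ifdef PROFILE_HASHTABLE
#define HTABLE_CONST       /* counters mutate: no const */
#else
#define HTABLE_CONST const /* constant data: may live in .rodata */
#endif

static HTABLE_CONST struct hashtable kw_table = { NULL, 64 };
```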