Python 3.5 compatibility fixes for Debian 9 and other systems that lack 3.6+.
Change a Python f-string to an old-style `"".format()` call.
Convert Path objects to strings for older `shutil` APIs that don't
accept Paths.
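A minimal sketch of the two changes above. The helper name
`copy_artifact()` and the file names are hypothetical and are not taken
from the actual scripts:

```python
import shutil
import tempfile
from pathlib import Path


def copy_artifact(src, dst_dir):
    """Copy src into dst_dir in a way that also works on Python 3.5."""
    # Python 3.6+ only: f"Copying {src.name} to {dst_dir}"
    message = "Copying {} to {}".format(src.name, dst_dir)
    # Older shutil APIs expect str paths, so convert the Path objects.
    shutil.copy(str(src), str(dst_dir))
    return message


# Usage with throwaway files (illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "example.txt"
    source.write_text("hello")
    target_dir = Path(tmp) / "out"
    target_dir.mkdir()
    print(copy_artifact(source, target_dir))
```

Calling `str()` on a Path is harmless on newer Python versions too, so
the same code runs unchanged on 3.5 and 3.6+.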
Fix missing return values for progress callbacks.
Fix Windows build.
The cli_debug_flag variable is not exported on Windows. The correct way
to check if in debug-mode is to check the command line options.
Added a test to verify that clamscan can extract images from an XLS
document. The document has 2 images: a PNG and JPEG version of the
clamav demon/logo. The test requires the JSON metadata feature to verify
that the MD5s of the images are correct.
No other image formats were tested because despite the format allegedly
supporting other image formats, Excel converts TIFF, BMP, and GIF images
to PNG files when you insert them.
The split test files are flagged by some AVs because they look like
broken executables. Instead of splitting the test files to prevent
detections, we should encrypt them. This commit replaces the "reassemble
testfiles" script with a basic "XOR testfiles" script that can be used
to encrypt or decrypt test files. This commit also replaces all of the
split files with XOR'ed files.
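As a rough illustration of why one script can both encrypt and decrypt:
XOR with a repeating key is its own inverse. The key and sample bytes
below are made up for this sketch. The real script's key and
command-line interface are not described here.

```python
from itertools import cycle


def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical key and sample data, for illustration only.
KEY = b"\x5a\xa5"
sample = b"MZ\x90\x00 pretend executable"
encrypted = xor_bytes(sample, KEY)
restored = xor_bytes(encrypted, KEY)
print(restored == sample)
```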
The test and unit_tests directories were a bit of a mess, so I
reorganized them all into unit_tests with all of the test files placed
under "unit_tests/input" using subdirectories for different types of files.
Fix up input/output params to be annotated with [in,out], not [in/out].
Note: skipped some other incorrectly annotated [out] params that are
already staged to be fixed in a different PR.
The previous image extraction logic would search from the beginning of
the drawing group for the image file type magic bytes and then just
assume the rest of the file is that type. This was super hacky, didn't
support more than one image extraction, and resulted in "image files"
that contain a bunch of extra garbage data (which may include more
images or maybe just some metadata about how the images are used).
This commit implements part of the office draw file specification to
correctly identify the start and size of each embedded image. Instead of
processing the drawing group as though it is one image to be extracted,
it collects the drawing group data into a single buffer, and then parses
the records within to identify the images within Blip records.
Based on: https://interoperability.blob.core.windows.net/files/MS-ODRAW/%5bMS-ODRAW%5d.pdf
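A simplified sketch of walking drawing records in a collected buffer.
It is not the code from this commit. It assumes my reading of the
MS-ODRAW record header (8 bytes: 4-bit version, 12-bit instance, 16-bit
type, 32-bit length, little-endian), that a version of 0xF marks a
container, and that record types 0xF018 through 0xF117 are Blip records.
It only descends into containers, so Blips wrapped in other record types
are not found, and the reported payload still includes the Blip's own
header fields ahead of the image bytes.

```python
import struct

HEADER = struct.Struct("<HHI")  # version/instance, record type, record length
CONTAINER_VERSION = 0xF         # assumption: marks a container record
BLIP_FIRST, BLIP_LAST = 0xF018, 0xF117  # assumption: Blip record type range


def iter_blips(buf, start=0, end=None):
    """Yield (record_type, payload_offset, payload_length) per Blip record."""
    end = len(buf) if end is None else end
    pos = start
    while pos + HEADER.size <= end:
        ver_inst, rec_type, rec_len = HEADER.unpack_from(buf, pos)
        body = pos + HEADER.size
        if body + rec_len > end:
            break  # truncated or malformed record: stop instead of over-reading
        if ver_inst & 0xF == CONTAINER_VERSION:
            # Containers hold child records; walk them recursively.
            yield from iter_blips(buf, body, body + rec_len)
        elif BLIP_FIRST <= rec_type <= BLIP_LAST:
            yield rec_type, body, rec_len
        pos = body + rec_len


def make_record(version, instance, rec_type, payload):
    """Build a synthetic record for the demo below."""
    return HEADER.pack((instance << 4) | version, rec_type, len(payload)) + payload


# Synthetic buffer: one container holding one fake Blip record.
demo = make_record(0xF, 0, 0xF001, make_record(0, 0x6E0, 0xF01E, b"fake-png"))
for rec_type, offset, length in iter_blips(demo):
    print(hex(rec_type), offset, length)
```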
Also resolved the following issue:
If XLM (and now images) are found when parsing an ole2 file, the
following other embedded content may not be processed:
- document summary metadata
- embedded ole10 files
- ole2 temp subdirectories (i.e. recursion)
The logic to process the above ole2 extracted temp files was present in
the function which processes extracted VBA. When we added support for
extracting XLM macros, processing of this other data was lost.
Really, the above need to be processed if any temp files were saved.
I fixed this by restructuring the features to extract any type of temp
file into separate functions per type of temp file. I then wrapped
those in an ole2 temp dir scanning function. OLE2 temp directory scanning
is recursive if there are subdirectories.
Added a feature to extract images from OLE2 BIFF streams.
This work was derived from InQuest's blog post about extracting XLM and
images from XLS files:
https://inquest.net/blog/2019/01/29/Carving-Sneaky-XLM-Files
Assorted ole2 parser code cleanup and massive error handling cleanup.
Also fixed the following:
- The XLS parser may fail to process all BIFF records if some of the
records contain unexpected data or are otherwise malformed. Because the
record size is already known, we can skip over the "malformed" record
and continue with the rest.
- Fixed an issue where the ole2 header size was improperly calculated,
failing to account for the new "has_xlm" boolean added for context.
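To illustrate the first fix above, here is a sketch of skipping a
malformed record by relying on its declared size. It assumes the usual
BIFF record layout of a 2-byte opcode followed by a 2-byte length, both
little-endian. The handler, opcodes, and payloads are invented for the
demo and are not the parser's actual code.

```python
import struct


def walk_records(stream, handler):
    """Call handler(opcode, payload) per record; return how many were skipped."""
    pos, skipped = 0, 0
    while pos + 4 <= len(stream):
        opcode, length = struct.unpack_from("<HH", stream, pos)
        payload = stream[pos + 4:pos + 4 + length]
        if len(payload) < length:
            break  # the stream ends mid-record, so there is nothing to skip to
        try:
            handler(opcode, payload)
        except ValueError:
            skipped += 1  # malformed contents: note it and carry on
        # The length field alone tells us where the next record starts.
        pos += 4 + length
    return skipped


def make_record(opcode, payload):
    """Build a synthetic record for the demo below."""
    return struct.pack("<HH", opcode, len(payload)) + payload


seen = []


def demo_handler(opcode, payload):
    # Hypothetical rule: pretend opcode 0x0085 needs at least 6 bytes.
    if opcode == 0x0085 and len(payload) < 6:
        raise ValueError("record too short")
    seen.append(opcode)


stream = (make_record(0x0809, b"\x00\x06")
          + make_record(0x0085, b"\x01")
          + make_record(0x000A, b""))
skipped_count = walk_records(stream, demo_handler)
print(skipped_count, [hex(op) for op in seen])
```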
Trusted SHA256-based Authenticode hashes can now be loaded in
from .cat files. In addition:
- Files that are covered by Authenticode hashes loaded in from
.cat files will now be treated as VERIFIED like executables
where the embedded Authenticode sig is deemed to be trusted
based on .crb rules. This fixes a regression introduced in
0.102 (I think).
- The Authenticode hashes for signed EXEs without .crb coverage
will no longer be computed in cli_check_auth_header unless
hashes from .cat rules have been loaded. This fixes a slight
performance regression introduced in 0.102 (I think).
Add progress callbacks to libclamav for:
- database load
- engine compile
- engine free
Add a progress bar to clamscan for load & compile.
These are disabled if you run with --debug or stdout is not a TTY or you
are using one of --quiet, --infected, or --no-summary.
Added code so you can test the engine-free callback by building with
ENABLE_ENGINE_FREE_PROGRESSBAR defined.
The compile & free progress callbacks pre-calculate the number of
tasks to complete to estimate the progress. Some tasks may take longer
than others so the progress speed may appear to vary a little.
The callbacks' return type is cl_error_t, but the return value isn't
currently used. It is reserved for future use.
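A language-neutral sketch of the task-counting approach, written in
Python for brevity. The function names are hypothetical and do not
correspond to the libclamav API.

```python
def run_with_progress(tasks, on_progress):
    """Run each task in order and report (completed, total) after every one."""
    total = len(tasks)  # counted up front so progress can be estimated
    for completed, task in enumerate(tasks, start=1):
        task()
        # The return value is ignored here, mirroring a callback whose
        # return type is reserved for future use.
        on_progress(completed, total)


def render_bar(completed, total, width=20):
    """Format a fixed-width text progress bar."""
    filled = width * completed // total
    return "[{}{}] {}/{}".format(
        "#" * filled, "-" * (width - filled), completed, total)


# Usage: three no-op tasks, printing one bar line per completed task.
run_with_progress([lambda: None] * 3, lambda c, t: print(render_bar(c, t)))
```

Because every task counts the same regardless of how long it takes, the
bar advances in equal steps at uneven speed.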
Minor formatting change in matcher-ac.c to counteract weird
clang-format behavior, and to make it easier to read.
Added progress callbacks and clamscan progress bars to the news.
Adds a basic test to validate that ExcludePath correctly excludes a
subdirectory but does not exclude subsequent files. As with the other
ClamD/Scan tests, it will test in each mode: regular, stream, and
fdpass (if available).
Unlike the other tests, this one tests ClamDScan with Valgrind instead
of ClamD.
Refactored the clamd_test.py file to reduce duplicate code, and support
enabling and disabling valgrind when running ClamDScan and ClamD.
Add pytest to the github actions environments because the results when
using pytest are far easier to read.
ClamDScan will leak the memory for the scan target filename if using
`--fdpass` or using `--stream`. This commit fixes that leak.
Resolves: https://bugzilla.clamav.net/show_bug.cgi?id=12648
ClamDScan will fail to scan any file after running into an
"ExcludePath" exclusion when using `--fdpass` or `--stream` AND
--multiscan (-m). The cause is that the parallel_callback()
callback function used by the file tree walk (ftw) feature returns an
error code for excluded files rather than "success".
Memory for the accidentally-excluded paths for a given directory also
appears to be leaked.
This commit resolves this accidental-abort issue and the memory leak.
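The failure mode can be sketched with a stand-in tree walker (not
`cli_ftw()` or `parallel_callback()` themselves; all names below are
hypothetical). The walker aborts on the first non-success status, so
the callback has to report an excluded path as success:

```python
import re

SUCCESS, ERROR = 0, 1


def walk(paths, callback):
    """Visit each path; stop at the first callback that reports an error."""
    for path in paths:
        status = callback(path)
        if status != SUCCESS:
            return status  # any error aborts the rest of the walk
    return SUCCESS


def make_callback(exclude, scanned):
    """Build a callback that records scanned paths and skips excluded ones."""
    def callback(path):
        if exclude.search(path):
            # Excluded is not a failure. Returning ERROR here would stop
            # the walk and leave every later path unscanned.
            return SUCCESS
        scanned.append(path)
        return SUCCESS
    return callback


scanned = []
status = walk(["/data/a", "/data/skip/b", "/data/c"],
              make_callback(re.compile(r"/skip/"), scanned))
print(status, scanned)
```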
There was an additional single file path memory leak when using
`--fdpass` caused by bad error handling in `cli_ftw()`.
This was fixed by removing the confusing ternaries, and using
separate pointers for each filename copy.
ClamDScan with ExcludePath regex may fail to exclude absolute paths
when performing relative scans because the exclude-check function may
match using the provided relative path (e.g. `/some/path/../another/path`)
rather than an absolute path (e.g. `/some/another/path`).
This issue is resolved by getting the real path at the start of the
scan, eliminating `.` and `..` relative pathing from all filepaths.
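A small sketch of why normalizing first matters for regex exclusions.
It uses a purely lexical normalization (`posixpath.normpath`) so the
result does not depend on the local filesystem. Getting the real path
additionally resolves symlinks. The function name and patterns are
hypothetical.

```python
import posixpath
import re


def is_excluded(path, patterns, cwd="/"):
    """Normalize path against cwd, then test it against each exclude regex."""
    absolute = posixpath.normpath(posixpath.join(cwd, path))
    return any(re.search(pattern, absolute) for pattern in patterns)


patterns = [r"^/some/another"]
raw = "/some/path/../another/path"
# Matching the raw path misses the exclusion; the normalized path hits it.
print(bool(re.search(patterns[0], raw)), is_excluded(raw, patterns))
```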
TODO 1: In addition to being recursive (bad for stack safety), the
File Tree Walk (FTW) implementation is spaghetti code and should
be refactored.
TODO 2: ExcludePath will print out "Excluded" for each path that is
excluded when using `--fdpass` or `--stream`, and for each path
directly scanned that is directly excluded. But in a recursive
regular-scan, the "Excluded" message for those paths is missing.
There appear to be minor leaks in clamd that can occur when shutting
down immediately after a command (e.g. RELOAD).
These are causing intermittent clamd test failures.
It seems like they're caused by a thread leaking occasionally,
due to not exiting before the program terminates.
I don't believe these to be a serious issue. Tracking down the exact
cause and crafting a fix for the leaks isn't worth the effort.
This commit adds valgrind suppression rules to stabilize the tests.
Added feature to start FreshClam & Clamd as Windows services
Special thanks to Gianluigi Tiesi for allowing us to integrate this
feature from ClamWin directly into ClamAV.
Added internal --service-mode option for FreshClam and ClamD
This is used when Windows starts FreshClam or ClamD as a service so
that they will register with the service manager.
Code found in service.c.
Windows XP had a maximum section count of 96, and this has been
the max for ClamAV forever as well. Raising this prevents malicious
executables from being able to evade certain ClamAV signatures by
having 97 or more sections.
The non-existent file test has a hack to "expect" a weird error message
caused by the '\v' character rather than the file not actually existing.
Recently something(?) changed and the test started reporting yet
another message, or no message.
Removing the '\v' special character fixes the test so it actually tests
a non-existent file and returns the same message as on other operating
systems.
Previously we'd not clang-formatted the c++ bytecode files because:
A) It's a massive difference in format
B) I wasn't sure, at the time, which code was "ours"
Reformatting now that the LLVM source is all removed and before it gets
updated to support modern LLVM versions.
Add a test where freshclam received a zero-byte cdiff to trigger a whole
CVD database download, and the CVD served is older than advertised.
This is a regression test for a bug found & fixed by Andrew Williams.
This commit fixes a bug in the libfreshclam error handling where, if
either of the following scenarios is encountered, the CVD download
attempt may be retried multiple times and always result in failure:
Scenario 1:
- Incremental downloads via CDIFFs are stopped because an empty CDIFF
file is encountered, and
- The CVD downloaded from the configured mirror is older than the
version advertised via DNS (for example, due to caching)
Scenario 2:
- Incremental downloads via CDIFFs fail, and
- The local database is more than 1 version out of date, and
- The CVD downloaded from the configured mirror is older than the
version advertised via DNS (for example, due to caching)
This bug was discovered by Coverity:
317956 Logically dead code
In updatedb: Code can never be reached because of a logical
contradiction
Adds 3 tests to validate that:
1. a CDIFF update works
2. a CDIFF partial update (with 1 missing CDIFF) works
and that a subsequent update is ok with being 1 behind
3. a CDIFF partial update (with 2 missing CDIFFs) works
and that a subsequent update will try to get the WHOLE CVD -
because being 2+ CDIFFs behind without any update isn't good enough.
Also fixed a minor bug so that the database name is properly displayed
when a partial update occurs instead of displaying "(null)".
Also changed the freshclam test port to 8001 to deconflict with
CVD-Update, in case that's running in the background.
TODO: Make the tests smarter so they find an open port instead of
hoping that 8001 is available.
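One common way to do this is to bind to port 0 and let the OS pick an
unused port. This is a sketch of that technique, not something the
tests currently do, and `find_free_port()` is a hypothetical helper:

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port on host and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "any available port"
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The port is released when the socket closes, so another process could
still grab it before the test server binds to it. The window is small
but not zero.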
The URL registry.hub.docker.com was apparently deprecated for a while,
and started to give 404 errors as of today for some repos. The correct
URL is index.docker.io, so let's use that instead.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
Cloudflare deprecated the __cfduid cookie which caused ClamSubmit
failures on systems that stopped receiving the cookie.
This commit removes support for the __cfduid cookie.
Also made the session cookie optional, in case that disappears too.
Changed error messages over to use the logg() function like our other apps.
Tidied up some of the logic, and changed "cleanup" label to "done" to
match other code.
The for loop in cli_bcomp_scanbuf contains a few "continue" directives
that do not free the three-bytes subsigid buffer allocated within the
loop. This code path is triggered only when a signature contains more
than one byte compare subsignature. Over a significant amount of time,
for example when using clamd, this leads to memory exhaustion.
The `cli_append_virus()` function does an FP check. If it is an FP, it
will return `CL_CLEAN` and the match/alert/virus should be discarded.
This fix respects FP verdicts when appending the virus name in the ac
and bm matchers in all-match mode.
If zip content is detected within a file by way of the embedded file
type recognition scan (in `scanraw()`), a raw scan of that "ZIPSFX" will
detect all subsequent zip entries as new ZIPSFX's. Though they aren't
actually scanned later, it shows up in the metadata JSON. This commit
prevents embedded file type detection for ZIPSFX like we already have
for ZIP.
Semi-related, the mach-o unibin parser presently allows scanning of FAT
partitions anywhere in the fmap, to include the very beginning of the
fmap. This would be an infinite loop, scanning the same file over and
over again, were it not for the scan recursion limit. With the recursion
limit, it's ok, but still bad behavior. This commit prevents scanning
FAT files from the mach-o unibin parser where the offset is less than
the end of the headers.
Also fixed an unsigned integer comparison in the OLE2 parser that
might overflow.
This commit updates the ordering of the internal FTM sigs to
match what's in daily.ftm today. No FTM signature changes are
included as part of this commit (only re-ordering).
The template includes a comment block at the top to direct security
issue reports towards the SECURITY.md instructions.
A comment block at the bottom provides instructions for how to share
files needed to reproduce the bug.
These comment blocks disappear when the report is submitted.
The older style markdown headers are used to match the headers printed
by the ClamConf tool, so that copy-pasted output from ClamConf looks
good in the bug report.
We would like to switch from Bugzilla to GitHub Issues. This will make
issue reporting more accessible (more folks have a GitHub account than a
bugzilla.clamav.net account) in addition to the benefits of a more
modern issue tracker.
However, GitHub Issues reports are always public. Vulnerability reports
will have to go somewhere else. The preferred option is to report them
to Cisco PSIRT after which PSIRT will coordinate with the ClamAV team
and the reporter to resolve the issue.
The mail parser uses asserts extensively to detect error conditions.
It's lazy error handling; good for prototyping but bad for production.
Release mode builds are fine in 0.103 with autotools and Visual
Studio, but CMake release builds will crash because asserts are enabled
even for release.
In particular this assert(0) is a possible error condition in a
malformed mail file and should be handled properly.
This resolves:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31782#c2
The condition triggering Heuristics.PNG.CVE-2010-1205 is more common
than expected. Considering this type of malformed PNG is somewhat common
and the CVE is more than 10 years old, it is reasonable to place this
detection behind the --alert-broken-media (SCAN_HEURISTIC_BROKEN_MEDIA)
option.