Includes rudimentary support for getting slices from FMap's and for
interacting with libclamav's context structure.
For now will use a Cisco-Talos org fork of the onenote_parser
until the feature to read open a onenote section from a slice (instead
of from a filepath) is added to the upstream.
This commit adds a feature to find, decode, and scan each image found
within HTML <style> tags where the image data is embedded in `url()`
function parameters a base64 blob
In C in the html normalization process we extract style tag contents
to new buffer for processing. We call into a new feature in Rust code to
find and decode each image (if there are multiple).
Once extracted, the images are scanned as contained files of unknown
type, and file type identifcation will determine the actual type.
Rework the append_virus mechanism to store evidence (strong indicators,
pua indicators, and eventually weak indicators) in vectors. When
appending a "virus", we will return CLEAN when in allmatch-mode, and
simply add the indicator to the appropriate vector.
Later we can check if there were any alerts to return a vector by
summing the lengths of the strong and pua indicator vectors.
This does away with storing the latest "virname" in the scan context.
Instead, we can query for the last indicator in the evidence, giving
priority to strong indicators.
When heuristic-precendence is enabled, add PUA as Strong instead of
as PotentiallyUnwanted. This way, they will be treated equally and
reported in order in allmatch mode.
Also document reason for disabling cache with metadata JSON enabled
* Windows: Fix utf8 filename support issue
The function used to verify the real path for the file being
scanned fails on some utf8 filenames, like:
file_with_ümlaut.eicar.com
The scan continues despite the failure, but the error reported from
clamd and warnings from clamscan are confusing because it looks like
the scan failed even if a verdict shows up later.
The fix here is to convert the path from utf8 to utf16 and then use
the CreateFileW() API to grab the file handle for the real path check.
Resolves: https://github.com/Cisco-Talos/clamav/issues/418
Resolves: CLAM-1782
* Windows: Fix utf8 libclamav logging issues
Print to the log using rust eprint() instead of fputs()
Rust's eprint() function supports full utf8, while fputs() gets
confused and prints stuff like:
file_with_├╝mlaut.eicar.com
instead of:
file_with_ümlaut.eicar.com
* Broke out the variants of error/result handling in `frs_error.rs`.
Made syntax slightly cleaner for `frs_call!`, explicitly moving
the output variables *out* of the function call so as not to make
the parameter order confusing.
* Wrapped the FuzzyHash map into a container rather than exposing
the HashMap directly. Simplifies casting, and allows it to feel
more like a class with methods.
* Fixed various clippy complaints regarding unsafe, etc.
* Rename `frs_error.rs` to `ffi_utils.rs` and migrated ffi-specific
features like the `validate_str_param!()` macro to this new module.
The new frs_error module makes it so you can painlessly transfer Rust
errors to-and-from C.
On the Rust-side, on failure it returns an error struct to the C caller.
The C caller detects the failure by checking the return value.
On failure, the caller may use an additional frs_error Rust API to get a
formatted error message.
Two variants are provided:
- One for functions that return a boolean and have an out-parameter for
function output.
- One for functions that return a pointer, where NULL means failure and
non-NULL contains the successful function output.
Add a new logical signature subsignature type for matching on images
with image fuzzy hashes.
Image fuzzy hash subsigantures follow this format:
fuzzy_img#<hash>#<dist>
In this initial implementation, the hamming distance (dist) is ignored
and only exact fuzzy hash matches will alert.
Fuzzy hash matching is only performed for supported image types.
Also: removed some excessive debug log messages on start-up.
Fixed an issue where the signature name (virname) is being allocated and
stored for every subsignature or even ever sub-pattern in an AC-pattern
(i.e. NDB sig or LDB subsig) containing a `{n-m}` or `*` wildcard.
This fix is only for LDB subsigs though. NDB signatures are still
allocaing one virname per sub-pattern.
This fix was required because I needed a place to store the virname with
fuzzy-hash subsignatures. Storing it in the fuzzy-hash subsig
metadatathe way AC-pattern, PCRE, and BComp subsigs were doing it
wouldn't work because it would cross the C-Rust FFI boundary and giving
pointers to Rust allocated stuff is dicey. Not to mention native Rust
strings are different thatn C strings. Anyways, the correct thing to do
was to store the virname with the actual logical signature.
TODO: Keep track of NDB signatures in the same way and store the virname
for NDB sigs there instead of in AC-patterns so that we can get rid of
the virname field in the AC-pattern struct.
Use bindgen to generate Rust-bindings for some libclamav internal
functions and structures in an new "sys.rs" module.
"sys.rs" will be generated at build-time and unfortunately must be
dropped in the source/libclamav_rust/src directory rather than under the
build directory. As far as I know, Cargo/Rust provide no way to set an
include path to the build directory to separately place generated files.
TODO: Verify if this is true and move sys.rs to the build directory if
possible.
Using the new bindings with:
- the logging module.
- the cdiff module.
Also:
- Removed clamav_rust.h from .gitignore, because we generate it in the
build directory now.
- Removed the hand-written bindings from the cbindgen exclusions.
lib.rs has an annotation that prevents cbindgen from looking at sys.rs.
- Fixed a `u8` -> `c_char` type issue in cdiff in the cli_getdsig() call
parameters.
Rustify cdiff_apply() and clean up error handling:
- Restore [some] safety and clean up error handling.
- Use rust-crypto sha2 instead of OpenSSL's.
Fix signedness of cli_versig2() dsig parameter.
c_char may be an i8 or u8 depending on platform:
https://doc.rust-lang.org/src/std/os/raw/mod.rs.html#91-133
Rustify cmd_close():
- Consolidate DEL/XCHG records.
- Tidy up ADD handling.
- Various error handling cleanup, etc.
Apply both .cdiff and .script CVD patches.
Note: A script is a non-compressed and unsigned file containing cdiff
commands. There is no header or footer that should be processed.
This Rust-based implementation of the cdiff-apply feature includes
equivalent features as found in the C-based implementation:
- cdiff file signature validation against sha256 of the file contents
- Gz decoding of file contents
- File open command
- File close command
- Signature add command
- Line delete command
- Xchg command
- Move command
- Unlink command
This Rust implementation adds cdiff-apply unit tests to verify correct
functionality.