clamav

Commit Graph

Author	SHA1	Message	Date
mko-x	a21cc6dcd7	Add explicit log level parameter to application logging API * Added loglevel parameter to logg() * Fix logg and mprintf internals with new loglevels * Update all logg calls to set loglevel * Update all mprintf calls to set loglevel * Fix hidden logg calls * Executed clam-format	4 years ago
micasnyd	140c88aa4e	Bump copyright for 2022 Includes minor format corrections.	4 years ago
Mickey Sola	6cbb648113	Update rust logging inclusion based on feedback Convert cli_dbgmsg to inline function to ensure ctx check for debug flag is always run Add copyright and licensing info Fix valgrind uninitialized buffer issue in cliunzip.c Windows build fix	4 years ago
Mickey Sola	e8d78c9627	Add logging interface via the log crate for libclamav_rust modules	4 years ago
Micah Snyder	01ca0a2edd	Fix issues spotted by Coverity from fmap recursion fix CID 361074: fmap.c: Possible invalid dereference if status != success and the new map was not yet allocated. CID 361077: others.c: Structurally dead code revealed a bug in the cli_recursion_stack_get_size() function. CID 361080, 361078, 361083: sigtool.c: Inverted check for if engine needs to be free'd, could leak the engine structure. CID 361075: sigtool.c: Missed a `return -1` that should've been `goto done;` and would leak the new_map buffer. CID 361079: sigtool/vba.c: Checking if we should free the new_map on failure only if ctx also needs to be free'd, which would leak the new_map if ctx was not allocated yet.	4 years ago
Micah Snyder	ffce672622	Fix recursion limit alert The previous commit broke alerting when exceeding the recursion limit because recursion tracking is so effective that by limiting the final layer of recursion to a scan of the fmap, we prevented it from ever hitting the recursion limit. This commit removes that restriction where it only does an fmap scan (aka "raw scan") of files that are at their limit so that we can actually hit the recursion limit and alert as intended. Also tidied up the cache_clean check so it checks the `fmap->dont_cache_flag` at the right point (before caching) instead of before setting the "CLEAN" verdict. Note: The `cache_clean` variable appears to be used to record the clean status so the `ret` variable can be re-used without losing the verdict. This is of course only required because the verdict is stored in the error enum. cough Also fixed a couple typos.	4 years ago
Micah Snyder	0354482e16	Fix issues reading from uncompressed nested files The fmap module provides a mechanism for creating a mapping into an existing map at an offset and length that's used when a file is found with an uncompressed archive or when embedded files are found with embedded file type recognition in scanraw(). This is the "fmap_duplicate()" function. Duplicate fmaps just reference the original fmap's 'data' or file handle/descriptor while allowing the caller to treat it like a new map using offsets and lengths that don't account for the original/actual file dimensions. fmaps keep track of this with m->nested_offset & m->real_len, which admittedly have confusing names. I found incorrect uses of these in a handful of locations. Notably: - In cli_magic_scan_nested_fmap_type(). The force-to-disk feature would have been checking incorrect sizes and may have written incorrect offsets for duplicate fmaps. - In XDP parser. - A bunch of places from the previous commit when making dupe maps. This commit fixes those and adds lots of documentation to the fmap.h API to try to prevent confusion in the future. nested_offset should never be referenced outside of fmap.c/h. The fmap_* functions for accessing or reading map data have two implementations, mem_* or handle_*, depending on the data source. I found issues with some of these so I made a unit test that covers each of the functions I'm concerned about for both types of data sources and for both original fmaps and nested/duplicate fmaps. With the tests, I found and fixed issues in these fmap functions: - handle_need_offstr(): must account for the nested_offset in dupe maps. - handle_gets(): must account for nested_offset and use len & real_len correctly. - mem_need_offstr(): must account for nested_offset in dupe maps. - mem_gets(): must account for nested_offset and use len & real_len correctly. Moved CDBRANGE() macro out of function definition for better legibility. Fixed a few warnings.	4 years ago
Micah Snyder	db013a2bfd	libclamav: Fix scan recursion tracking Scan recursion is the process of identifying files embedded in other files and then scanning them, recursively. Internally this process is more complex than it may sound because a file may have multiple layers of types before finding a new "file". At present we treat the recursion count in the scanning context as an index into both our fmap list AND our container list. These two lists are conceptually a part of the same thing and should be unified. But what's concerning is that the "recursion level" isn't actually incremented or decremented at the same time that we add a layer to the fmap or container lists but instead is more touchy-feely, increasing when we find a new "file". To account for this shadiness, the size of the fmap and container lists has always been a little longer than our "max scan recursion" limit so we don't accidentally overflow the fmap or container arrays (!). I've implemented a single recursion-stack as an array, similar to before, which includes a pointer to each fmap at each layer, along with the size and type. Push and pop functions add and remove layers whenever a new fmap is added. A boolean argument when pushing indicates if the new layer represents a new buffer or new file (descriptor). A new buffer will reset the "nested fmap level" (described below). This commit also provides a solution for an issue where we detect embedded files more than once during scan recursion. 
For illustration, imagine a tarball named foo.tar.gz with this structure:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz                | GZ    | 0         | 0                 |
| └── foo.tar               | TAR   | 1         | 0                 |
|     ├── bar.zip           | ZIP   | 2         | 1                 |
|     │   └── hola.txt      | ASCII | 3         | 0                 |
|     └── baz.exe           | PE    | 2         | 1                 |

But suppose baz.exe embeds a ZIP archive and a 7Z archive, like this:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| baz.exe                   | PE    | 0         | 0                 |
| ├── sfx.zip               | ZIP   | 1         | 1                 |
| │   └── hello.txt         | ASCII | 2         | 0                 |
| └── sfx.7z                | 7Z    | 1         | 1                 |
|     └── world.txt         | ASCII | 2         | 0                 |

(A) If we scan for embedded files at any layer, we may detect:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz                | GZ    | 0         | 0                 |
| ├── foo.tar               | TAR   | 1         | 0                 |
| │   ├── bar.zip           | ZIP   | 2         | 1                 |
| │   │   └── hola.txt      | ASCII | 3         | 0                 |
| │   ├── baz.exe           | PE    | 2         | 1                 |
| │   │   ├── sfx.zip       | ZIP   | 3         | 1                 |
| │   │   │   └── hello.txt | ASCII | 4         | 0                 |
| │   │   └── sfx.7z        | 7Z    | 3         | 1                 |
| │   │       └── world.txt | ASCII | 4         | 0                 |
| │   ├── sfx.zip           | ZIP   | 2         | 1                 |
| │   │   └── hello.txt     | ASCII | 3         | 0                 |
| │   └── sfx.7z            | 7Z    | 2         | 1                 |
| │       └── world.txt     | ASCII | 3         | 0                 |
| ├── sfx.zip               | ZIP   | 1         | 1                 |
| └── sfx.7z                | 7Z    | 1         | 1                 |

(A) is bad because it scans content more than once. Note that for the GZ layer, it may detect the ZIP and 7Z if the signature hits on the compressed data, which it might, though extracting the ZIP and 7Z will likely fail. The reason the above doesn't happen now is that we restrict embedded type scans for a bunch of archive formats to include GZ and TAR.
(B) If we scan for embedded files at the foo.tar layer, we may detect:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz                | GZ    | 0         | 0                 |
| └── foo.tar               | TAR   | 1         | 0                 |
|     ├── bar.zip           | ZIP   | 2         | 1                 |
|     │   └── hola.txt      | ASCII | 3         | 0                 |
|     ├── baz.exe           | PE    | 2         | 1                 |
|     ├── sfx.zip           | ZIP   | 2         | 1                 |
|     │   └── hello.txt     | ASCII | 3         | 0                 |
|     └── sfx.7z            | 7Z    | 2         | 1                 |
|         └── world.txt     | ASCII | 3         | 0                 |

(B) is almost right. But we can achieve it easily enough by only scanning for embedded content in the current fmap when the "nested fmap level" is 0. The upside is that it should safely detect all embedded content, even if it may think the sfx.zip and sfx.7z are in foo.tar instead of in baz.exe.

The biggest risk I can think of affects ZIPs. SFXZIP detection is identical to ZIP detection, which is why we don't allow SFXZIP to be detected if inside of a ZIP. If we only allow embedded type scanning at fmap-layer 0 in each buffer, this will fail to detect the embedded ZIP if the bar.exe was not compressed in foo.zip and if non-compressed files extracted from ZIPs aren't extracted as new buffers:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.zip                   | ZIP   | 0         | 0                 |
| └── bar.exe               | PE    | 1         | 1                 |
|     └── sfx.zip           | ZIP   | 2         | 2                 |

Provided that we ensure all files extracted from zips are scanned in new buffers, option (B) should be safe.
(C) If we scan for embedded files at the baz.exe layer, we may detect:

| description               | type  | rec level | nested fmap level |
| ------------------------- | ----- | --------- | ----------------- |
| foo.tar.gz                | GZ    | 0         | 0                 |
| └── foo.tar               | TAR   | 1         | 0                 |
|     ├── bar.zip           | ZIP   | 2         | 1                 |
|     │   └── hola.txt      | ASCII | 3         | 0                 |
|     └── baz.exe           | PE    | 2         | 1                 |
|         ├── sfx.zip       | ZIP   | 3         | 1                 |
|         │   └── hello.txt | ASCII | 4         | 0                 |
|         └── sfx.7z        | 7Z    | 3         | 1                 |
|             └── world.txt | ASCII | 4         | 0                 |

(C) is right. But it's harder to achieve. For this example we can get it by restricting 7ZSFX and ZIPSFX detection to only when scanning an executable. But that may mean losing detection of archives embedded elsewhere. And we'd have to identify allowable container types for each possible embedded type, which would be very difficult. So this commit aims to solve the issue the (B)-way.

Note that in all situations, we still have to scan with file typing enabled to determine if we need to reassign the current file type, such as re-identifying a Bzip2 archive as a DMG that happens to be Bzip2-compressed. Detection of DMG and a handful of other types relies on finding data partway through or near the end of a file before reassigning the entire file as the new type.

Other fixes and considerations in this commit:
- The utf16 HTML parser has weak error handling, particularly with respect to creating a nested fmap for scanning the ascii decoded file. This commit cleans up the error handling and wraps the nested scan with the recursion-stack push()/pop() for correct recursion tracking. Before this commit, each container layer had a flag to indicate if the container layer is valid. We need something similar so that the cli_recursion_stack_get_() functions ignore normalized layers. Details... Imagine an LDB signature for HTML content that specifies a ZIP container.
If the signature actually alerts on the normalized HTML and you don't ignore normalized layers for the container check, it will appear as though the alert is in an HTML container rather than a ZIP container. This commit accomplishes this with a boolean you set in the scan context before scanning a new layer. Then when the new fmap is created, it will use that flag to set a similar flag for the layer. The context flag is reset so that anything after this doesn't have that flag. The flag allows the new recursion_stack_get() function to ignore normalized layers when iterating the stack to return a layer at a requested index, negative or positive. Scanning normalized extracted/normalized javascript and VBA should also use the 'layer is normalized' flag.
- This commit also fixes Heuristic.Broken.Executable alert for ELF files to make sure that: A) these only alert if cli_append_virus() returns CL_VIRUS (aka it respects the FP check). B) all broken-executable alerts for ELF only happen if the SCAN_HEURISTIC_BROKEN option is enabled.
- This commit also cleans up the error handling in cli_magic_scan_dir(). This was needed so we could correctly apply the layer-is-normalized-flag to all VBA macros extracted to a directory when scanning the directory.
- Also fix an issue where exceeding scan maximums wouldn't cause embedded file detection scans to abort. Granted we don't actually want to abort if max filesize or max recursion depth are exceeded... only if max scansize, max files, and max scantime are exceeded. Add 'abort_scan' flag to scan context, to protect against depending on correct error propagation for fatal conditions. Instead, setting this flag in the scan context should guarantee that a fatal condition deep in scan recursion isn't lost, which would result in more stuff being scanned instead of aborting.
This shouldn't be necessary, but some status codes like CL_ETIMEOUT never used to be fatal and it's easier to do this than to verify every parser only returns CL_ETIMEOUT and other "fatal status codes" in fatal conditions.
- Remove duplicate is_tar() prototype from filetypes.c and include is_tar.h instead.
- Presently we create the fmap hash when creating the fmap. This wastes a bit of CPU if the hash is never needed. Now that we're creating fmaps for all embedded files discovered with file type recognition scans, this is a much more frequent occurrence and really slows things down. This commit fixes the issue by only creating fmap hashes as needed. This should not only resolve the performance impact of creating fmaps for all embedded files, but also should improve performance in general.
- Add allmatch check to the zip parser after the central-header meta match. That way we don't get multiple alerts with the same match except in allmatch mode. Clean up error handling in the zip parser a tiny bit.
- Fixes to ensure that the scan limits such as scansize, filesize, recursion depth, # of embedded files, and scantime are always reported if AlertExceedsMax (--alert-exceeds-max) is enabled.
- Fixed an issue where non-fatal alerts for exceeding scan maximums may mask signature matches later on. I changed it so these alerts use the "possibly unwanted" alert-type and thus only alert if no other alerts were found or if all-match or heuristic-precedence are enabled.
- Added the "Heuristics.Limits.Exceeded." events to the JSON metadata when the --gen-json feature is enabled. These will show up once under "ParseErrors" the first time a limit is exceeded. In the present implementation, only one limits-exceeded event will be added, so as to prevent a malicious or malformed sample from filling the JSON buffer with millions of events and using a tonne of RAM.	4 years ago
Micah Snyder	4a9cff9214	CMake: support Xcode builds Xcode (and perhaps some other generators?) do not like targets that have only object files. See: https://cmake.org/cmake/help/latest/command/add_library.html#object-libraries And: https://cmake.org/pipermail/cmake/2016-May/063479.html This issue manifests when using `-G Xcode` on macOS as the library dylibs being missing when linking with other binaries. This commit removes the object libraries for libclamav, libfreshclam, libclamunrar_iface, libclamunrar, libclammspack, and (lib)common because they were used by static or shared libs that didn't themselves have any added sources. Add getter & setter for the debug flag, so it isn't referenced by unit tests or other code that links with libclamav. This is needed because global variables are exported symbols on Windows.	4 years ago
Micah Snyder	090c8990e3	libclamav, clamscan: load/unload callbacks & progress meters Add progress callbacks to libclamav for: - database load - engine compile - engine free Add a progress bar to clamscan for load & compile. These are disabled if you run with --debug or stdout is not a TTY or you are using one of --quiet, --infected, or --no-summary. Added code so you can test the engine-free callback by building with ENABLE_ENGINE_FREE_PROGRESSBAR defined. The compile & free progress callbacks pre-calculate the number of tasks to complete to estimate the progress. Some tasks may take longer than others so the progress speed may appear to vary a little. The callbacks' return type is cl_error_t, but the returned value doesn't currently do anything. It is reserved for future use. Minor formatting change in matcher-ac.c to counteract weird clang-format behavior, and to make it easier to read. Added progress callbacks and clamscan progress bars to the news.	4 years ago
Micah Snyder	54903eec99	clang-format housekeeping	5 years ago
Micah Snyder	a746d344df	Remove Autotools build system & built-in LLVM CMake is now required to build. The built-in LLVM is no longer available. Also removed support for libltdl calls, which is not used in the CMake builds, was only used when building with Autotools. TODO: Fix CMake LLVM support & update to work with modern versions.	5 years ago
Micah Snyder	6096605cc6	CMake: Fix Windows build, UnRAR dll extension The previous UnRAR vuln fix worked for the (old, gone) Visual Studio solution but broke the CMake build. We didn't notice because we tested and approved it with Visual Studio before switching to CMake. This change switches it to use the correct macro for the libclamunrar_iface extension which is ".dll" instead of "lib".	5 years ago
Jonas Zaddach (jzaddach)	c63ccbf878	Do not search for versioned dynamic libraries on Windows	5 years ago
Micah Snyder (micasnyd)	b9ca6ea103	Update copyright dates for 2021 Also fixes up clang-format.	5 years ago
Micah Snyder	451279876e	CMake: Add fuzz support. Enabled the metadata collection feature, scan heuristics, and all-match mode when fuzzing in the interest of better code coverage. Also remove deprecated STREAM command.	5 years ago
Micah Snyder	2552cfd0d1	CMake: Add CTest support to match Autotools checks An ENABLE_TESTS CMake option is provided so that users can disable testing if they don't want it. Instructions for how to use this are included in the INSTALL.cmake.md file. If you run `ctest`, each testcase will write out a log file to the <build>/unit_tests directory. As with Autotools' make check, the test files are from test/.split and unit_tests/.split files, but for CMake these are generated at build time instead of at test time. On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled libraries can be loaded when running tests. On Windows systems, CTest will identify and collect all library dependencies and assemble a temporary install under the build/unit_tests directory so that the libraries can be loaded when running tests. The same feature is used on Windows when using CMake to install to collect all DLL dependencies so that users don't have to install them manually afterwards. Each of the CTest tests are run using a custom wrapper around Python's unittest framework, which is also responsible for finding and inserting valgrind into the valgrind tests on Posix systems. Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by default, if Valgrind can be found. There's no need to set VG=1. CTest's memcheck module is NOT supported, because we use Python to orchestrate our tests. Added a bunch of Windows compatibility changes to the unit tests. These were primarily changing / to PATHSEP and making adjustments to use Win32 C headers and ifdef out the POSIX ones which aren't available on Windows. Also disabled a bunch of tests on Win32 that don't work on Windows, notably the mmap ones and FD-passing (i.e. FILEDES) ones. Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate warnings on Windows where json.h is included after inttypes.h because json-c's inttypes replacement relies on it.
This is a bit of a hack and may be removed if json-c fixes their inttypes header stuff in the future. Add preprocessor definitions on Windows to disable MSVC warnings about CRT secure and nonstandard functions. While there may be a better solution, this is needed to be able to see other more serious warnings. Add missing file comment block and copyright statement for clamsubmit.c. Also change json-c/json.h include filename to json.h in clamsubmit.c. The directory name is not required. Changed the hash table data integer type from long, which is poorly defined, to size_t -- which is capable of storing a pointer. Fixed a bunch of casts regarding this variable to eliminate warnings. Fixed two bugs causing utf8 encoding unit tests to fail on Windows: - The in_size variable should be the number of bytes, not the character count. This was causing the SHIFT_JIS (Japanese codepage) to UTF8 transcoding test to only transcode half the bytes. - It turns out that the MultiByteToWideChar() API can't transcode UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer and flip the bytes on each uint16_t. This bug was causing the UTF16-BE to UTF8 tests to fail. I also split up the utf8 transcoding tests into separate tests so I could see all of the failures instead of just the first one. Added a flags parameter to the unit test function to open testfiles because it turns out that on Windows if a file contains the \r\n it will replace it with just \n if you opened the file as a text file instead of as binary. However, if we open the CBC files as binary, then a bunch of bytecode tests fail. So I've changed the tests to open the CBC files in the bytecode tests as text files and open all other files as binary. Ported the feature tests from shell scripts to Python using a modified version of our QA test-framework, which is largely compatible and will allow us to migrate some QA tests into this repo.
I'd like to add GitHub Actions pipelines in the future so that all public PR's get some testing before anyone has to manually review them. The clamd --log option was missing from the help string, though it definitely works. I've added it in this commit. It appears that clamd.c was never clang-format'd, so this commit also reformats clamd.c. Some of the check_clamd tests expected the path returned by clamd to match character for character with original path sent to clamd. However, as we now evaluate real paths before a scan, the path returned by clamd isn't going to match the relative (and possibly symlink-ridden) path passed to clamdscan. I fixed this test by changing the test to search for the basename: <signature> FOUND within the response instead of matching the exact path. Autotools: Link check_clamd with libclamav so we can use our utility functions in check_clamd.c.	5 years ago
Micah Snyder	c522f45267	clamonacc: Reduce warning log verbosity Users have complained about two specific log events that are extremely verbose in non-critical error conditions: - clamonacc reports "ERROR: Can't send to clamd: Bad address" This may occur when small files are created/destroyed before they can be sent to be scanned. The log message probably should only be reported in verbose mode. - clamonacc reports "ClamMisc: $/proc/XXX vanished before UIDs could be excluded; scanning anyway" This may occur when a process that accessed a file exits before clamonacc finds out who accessed the file. This is a fairly frequent occurrence. While it can still be problematic if `clamd` was the process which accessed the file (like a clamd temp file if watching /tmp), generally it's not an issue and we want to silently scan it anyways. Also addressed copypaste issue in onas_send_stream() wherein fd is set to 0 (aka STDIN) if the provided fd == 0 (should've been -1 for invalid FD) and if filename == NULL. In fact clamonacc never scans STDIN so the scan should fail if filename == NULL and the provided FD is invalid (-1). I also found that "Access denied. ERROR" is easily provoked when using --fdpass or --stream using this simple script: for i in {1..5000}; do echo "blah $i" > tmp-$i && rm tmp-$i; done Clamdscan does not allow for scans to fail quietly because the file does not exist, but for clamonacc it's a common thing and we don't want to output an error. To solve this, I changed it so a return length of -1 will still result in an "internal error" message but return len 0 failures will be silently ignored. I've added a static variable to onas_client_scan() that keeps state in case clamd is stopped and started - that way it won't print an error message for every event when offline. Instead it will log an error for the first connection failure, and log again when the connection is re-established for a future scan.
Calls to onas_client_scan() are already wrapped with the onas_scan_lock mutex so the static variable should be safe. Finally, there were a couple of error responses from clamd that can occur if the file isn't found which we want to silently ignore, so I've tweaked the code which checks for specific error messages to account for these.	5 years ago
Micah Snyder (micasnyd)	9e20cdf6ea	Add CMake build tooling This patch adds experimental-quality CMake build tooling. The libmspack build required a modification to use "" instead of <> for header #includes. This will hopefully be included in the libmspack upstream project when adding CMake build tooling to libmspack. Removed use of libltdl when using CMake. Flex & Bison are now required to build. If -DMAINTAINER_MODE, then GPERF is also required, though it currently doesn't actually do anything. TODO! I found that the autotools build system was generating the lexer output but not actually compiling it, instead using previously generated (and manually renamed) lexer c source. As a consequence, changes to the .l and .y files weren't making it into the build. To resolve this, I removed generated flex/bison files and fixed the tooling to use the freshly generated files. Flex and bison are now required build tools. On Windows, this adds a dependency on the winflexbison package, which can be obtained using Chocolatey or may be manually installed. CMake tooling only has partial support for building with external LLVM library, and no support for the internal LLVM (to be removed in the future). I.e. The CMake build currently only supports the bytecode interpreter. Many files used include paths relative to the top source directory or relative to the current project, rather than relative to each build target. Modern CMake support requires including internal dependency headers the same way you would external dependency headers (albeit with "" instead of <>). This meant correcting all header includes to be relative to the build targets and not relative to the workspace. For example, ...
```c
#include "../libclamav/clamav.h"
#include "clamd/clamd_others.h"
```
... becomes:
```c
// libclamav
#include "clamav.h"

// clamd
#include "clamd_others.h"
```
Fixes header name conflicts by renaming a few of the files.
Converted the "shared" code into a static library, which depends on libclamav. The ironically named "shared" static library provides features common to the ClamAV apps which are not required in libclamav itself and are not intended for use by downstream projects. This change was required for correct modern CMake practices but was also required to use the automake "subdir-objects" option. This eliminates warnings when running autoreconf which, in the next version of autoconf & automake are likely to break the build. libclamav used to build in multiple stages where an earlier stage is a static library containing utils required by the "shared" code. Linking clamdscan and clamdtop with this libclamav utils static lib allowed these two apps to function without libclamav. While this is nice in theory, the practical gains are minimal and it complicates the build system. As such, the autotools and CMake tooling was simplified for improved maintainability and this feature was thrown out. clamdtop and clamdscan now require libclamav to function. Removed the nopthreads version of the autotools libclamav_internal_utils static library and added pthread linking to a couple apps that may have issues building on some platforms without it, with the intention of removing needless complexity from the source. Kept the regular version of libclamav_internal_utils.la though it is no longer used anywhere but in libclamav. Added an experimental doxygen build option which attempts to build clamav.h and libfreshclam doxygen html docs. The CMake build tooling also may build the example program(s), which isn't a feature in the Autotools build system. Changed C standard to C90+ due to inline linking issues with socket.h when linking libfreshclam.so on Linux. Generate common.rc for win32. Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef from misc.c. Add CMake files to the automake dist, so users can try the new CMake tooling w/out having to build from a git clone. 
clamonacc changes: - Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other similar macros. - Added a new clamav-clamonacc.service systemd unit file, based on the work of ChadDevOps & Aaron Brighton. - Added missing clamonacc man page. Updates to clamdscan man page, add missing options. Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use libclamav). Rename Windows mspack.dll to libmspack.dll so all ClamAV-built libraries have the lib-prefix with Visual Studio as with CMake.	5 years ago
Mickey Sola	9ea3b93018	Recurse all fpmaps when doing fpchecks Changes cli_checkfp_virus to a recursive function which checks all parent fmaps in the context for false positives Simplifies params needed for cli_checkfp_virus to use the current digest and fmap length that resides within the fmap struct itself	5 years ago
Andrew	1d66184a7d	More Coverity bug fixes Looking through the list of issues, I spotted some easy ones and submitted some fixes: - 225229 - In cli_rarload: Leak of memory or pointers to system resources. If finding the necessary libunrar functions fails (should be rare),we now dlclose libunrar. 225224 - In main (freshclam.c): A copied piece of code is inconsistent with the original (CWE-398). A minor copy-paste error was present, and optOutList could be cleaned up in one of the failure edge cases. 225228 - In decodecdb: Out-of-bounds access to a buffer (CWE-119). Off by one error when tokenizing certain CDB sig fields for printing with sigtool. Ex: $ cat test.cdb a:CL_TYPE_7Z:1-2-3:/./:1-2-3:1-2-3:0:1-2-3:: $ cat test.cdb \| ../installed/bin/sigtool --decode VIRUS NAME: a CONTAINER TYPE: CL_TYPE_7Z CONTAINER SIZE: WITHIN RANGE 1 to 2 FILENAME REGEX: /./ COMPRESSED FILESIZE: WITHIN RANGE 1 to 2 UNCOMPRESSED FILESIZE: WITHIN RANGE 1 to 2 ENCRYPTION: NO FILE POSITION: ================================================================= ==17245==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffe3136d10 at pc 0x7f0f31c3f414 bp 0x7fffe3136c70 sp 0x7fffe3136c60 WRITE of size 8 at 0x7fffe3136d10 thread T0 #0 0x7f0f31c3f413 in cli_strtokenize ../../libclamav/str.c:524 #1 0x559e9797dc91 in decodecdb ../../sigtool/sigtool.c:2929 #2 0x559e9797ea66 in decodesig ../../sigtool/sigtool.c:3058 #3 0x559e9797f31e in decodesigs ../../sigtool/sigtool.c:3162 #4 0x559e97981fbc in main ../../sigtool/sigtool.c:3638 #5 0x7f0f3100fb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96) #6 0x559e9795a1d9 in _start (/home/zelda/workspace/clamav-devel/installed/bin/sigtool+0x381d9) Address 0x7fffe3136d10 is located in stack of thread T0 at offset 48 in frame #0 0x559e9797d113 in decodecdb ../../sigtool/sigtool.c:2840 This frame has 1 object(s): [32, 48) 'range' <== Memory access at offset 48 overflows this variable HINT: this may be a false positive if your program uses 
some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions are supported) SUMMARY: AddressSanitizer: stack-buffer-overflow ../../libclamav/str.c:524 in cli_strtokenize - 225223 - In cli_egg_deflate_decompress: Reads an uninitialized pointer or its target (CWE-457). Certain fail cases would call inflateEnd on an uninitialized stream. Now it’s only called after initialization occurs. - 225220 - In buildcld: Use of an uninitialized variable (CWE-457). Certain fail cases would result in oldDir being used before initialization. It now gets zeroed before the first fail case. - 225219 - In cli_egg_open: Leak of memory or pointers to system resources (CWE-404). If certain realloc’s failed, several structures would not be cleaned up - 225218 - In cli_scanhwpml: Code block is unreachable because of the syntactic structure of the code (CWE-561). With certain macros set, there could be two consecutive return statements.	6 years ago
Micah Snyder	e01ba94e36	bb12506: Fix phishing/heuristic alert verbosity Some detections, like phishing, are considered heuristic alerts because they match based on behavior more than on content. A subset of these are considered "potentially unwanted" (low-severity). These low-severity alerts include: - phishing - PDFs with obfuscated object names - bytecode signature alerts that start with "BC.Heuristics" The concept is that unless you enable "heuristic precedence" (a method of lowering the threshold to immediately alert on low-severity detections), the scan should continue after a match in case a higher severity match is found. Only at the end will it print the low-severity match if nothing else was found. The current implementation is buggy though. Scanning of archives does not correctly bail out for the entire archive if one email contains a phishing link. Instead, it sets the "heuristic found" flag then and alerts for every subsequent file in the archive because it doesn't know if the heuristic was found in an embedded file or the target file. Because it's just a heuristic and the status is "clean", it keeps scanning. This patch corrects the behavior by checking if any low-severity alerts were found at the end of scanning the target file, instead of at the end of each embedded file. Additionally, this patch fixes an issue with phishing alerts wherein heuristic precedence mode did not cause a scan to stop after the first alert. The above changes required restructuring to create an fmap inside of cl_scandesc_callback() so that scan_common() could be modified to require an fmap and set up so that the current *ctx->fmap pointer is never NULL when scan_common() evaluates match results. Also fixed a couple minor bugs in the phishing unit tests and cleaned up the test code for improved legibility and type safety.	6 years ago
Micah Snyder	9f2de39e04	New tmp sub-dir per scan; JSON meta improvements This commit improves the layout of the tmp file output and the JSON metadata output when using the --leave-temps and --gen-json options. For all scans, each scan target will get a unique tmp sub-directory. If using --leave-temps, that subdir will include the basename of the original file to make it easier to identify. Additionally, when using the --leave-temps option, all extracted objects will have their subdirectories extracted in recursive subdirectories including filename prefixes where available. When not using the --leave-temps option, the layout of the tmp sub-directory will remain flat, so as to alleviate the possibility of exceeding PATH_MAX. The JSON metadata generated by the --gen-json option is now generated for all file types, not just a select few. The format is also pretty-printed for readability and now includes filenames and file paths when available. Also: - Added missing ALLMATCH check when determining if bytecode hooks should be run. - Added cl_engine_get_str API to windows libclamav symbol export file.	6 years ago
Micah Snyder	206dbaefe8	Update copyright dates for 2020	6 years ago
Micah Snyder (micasnyd)	6a0abb897a	Adds --max-scantime clamscan option and MaxScanTime clamd config option. --max-scantime replaces the --timelimit clamscan option that had been experimental. Default max-scantime set to 2 minutes (120000 milliseconds).	6 years ago
Micah Snyder	ee40795fe2	Converted mpool calls to macros when USE_MPOOL is defined to clearly differentiate between function and macro behavior.	6 years ago
Micah Snyder	5f4f69102d	Correcting types from int to cl_error_t where appropriate. Eliminating unused variables and referencing unused parameters to remove warnings.	6 years ago
Micah Snyder (micasnyd)	1c996e8872	bb12238 - Removing support for deprecated readdir_r() function. The readdir() function is thread safe so long as you don't share a dir object between threads. If you do, it requires a mutex.	6 years ago
Andrew	3cf1b1c58d	Add ability to whitelist leaf certificates	6 years ago
Andrew	92088f91f1	Add support for cert blacklisting and whitelisting upfront Instead of checking the Authenticode header as an FP prevention mechanism, we now check it in the beginning if it exists. Also, we can now do actual blacklisting with .crb rules (previously, a blacklist rule just let you override a whitelist rule).	6 years ago
Micah Snyder	52cddcbcfd	Updating and cleaning up copyright notices.	6 years ago
Micah Snyder	b3e82e5e61	Replacing libclamav/cltypes.h with clamav-types.h.in, which generates a header clamav-types.h that we install alongside clamav.h.	6 years ago
Micah Snyder	72fd33c8b2	clang-format'd using new .clang-format rules.	6 years ago
Russ Kubik	6a591aa48e	Prevent shared libraries from being loaded by libclam when statically linking unrar libraries (#148)	7 years ago
Micah Snyder	d39cb6581f	Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames.	7 years ago
Micah Snyder	d7979d4ff7	Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files. Updated libclamav documentation detailing new scan options structure. Renamed references to 'algorithmic' detection to 'heuristic' detection. Renamed references to 'properties' to 'collect metadata'. Renamed references to 'scan all' to 'scan all match'. Renamed a couple of 'Hueristic.' signature names as 'Heuristics.' signatures (plural) to match majority of other heuristics.	7 years ago
Josh Soref	7cd9337a70	Spelling Adjustments (#30) * spelling: accessed * spelling: alignment * spelling: amalgamated * spelling: answers * spelling: another * spelling: acquisition * spelling: apitid * spelling: ascii * spelling: appending * spelling: appropriate * spelling: arbitrary * spelling: architecture * spelling: asynchronous * spelling: attachments * spelling: argument * spelling: authenticode * spelling: because * spelling: boundary * spelling: brackets * spelling: bytecode * spelling: calculation * spelling: cannot * spelling: changes * spelling: check * spelling: children * spelling: codegen * spelling: commands * spelling: container * spelling: concatenated * spelling: conditions * spelling: continuous * spelling: conversions * spelling: corresponding * spelling: corrupted * spelling: coverity * spelling: crafting * spelling: daemon * spelling: definition * spelling: delivered * spelling: delivery * spelling: delimit * spelling: dependencies * spelling: dependency * spelling: detection * spelling: determine * spelling: disconnects * spelling: distributed * spelling: documentation * spelling: downgraded * spelling: downloading * spelling: endianness * spelling: entities * spelling: especially * spelling: empty * spelling: expected * spelling: explicitly * spelling: existent * spelling: finished * spelling: flexibility * spelling: flexible * spelling: freshclam * spelling: functions * spelling: guarantee * spelling: hardened * spelling: headaches * spelling: heighten * spelling: improper * spelling: increment * spelling: indefinitely * spelling: independent * spelling: inaccessible * spelling: infrastructure Conflicts: docs/html/node68.html * spelling: initializing * spelling: inited * spelling: instream * spelling: installed * spelling: initialization * spelling: initialize * spelling: interface * spelling: intrinsics * spelling: interpreter * spelling: introduced * spelling: invalid * spelling: latency * spelling: lawyers * spelling: libclamav * 
spelling: likelihood * spelling: loop * spelling: maximum * spelling: million * spelling: milliseconds * spelling: minimum * spelling: minzhuan * spelling: multipart * spelling: misled * spelling: modifiers * spelling: notifying * spelling: objects * spelling: occurred * spelling: occurs * spelling: occurrences * spelling: optimization * spelling: original * spelling: originated * spelling: output * spelling: overridden * spelling: parenthesis * spelling: partition * spelling: performance * spelling: permission * spelling: phishing * spelling: portions * spelling: positives * spelling: preceded * spelling: properties * spelling: protocol * spelling: protos * spelling: quarantine * spelling: recursive * spelling: referring * spelling: reorder * spelling: reset * spelling: resources * spelling: resume * spelling: retrieval * spelling: rewrite * spelling: sanity * spelling: scheduled * spelling: search * spelling: section * spelling: separator * spelling: separated * spelling: specify * spelling: special * spelling: statement * spelling: streams * spelling: succession * spelling: suggests * spelling: superfluous * spelling: suspicious * spelling: synonym * spelling: temporarily * spelling: testfiles * spelling: transverse * spelling: turkish * spelling: typos * spelling: unable * spelling: unexpected * spelling: unexpectedly * spelling: unfinished * spelling: unfortunately * spelling: uninitialized * spelling: unlocking * spelling: unnecessary * spelling: unpack * spelling: unrecognized * spelling: unsupported * spelling: usable * spelling: wherever * spelling: wishlist * spelling: white * spelling: infrastructure * spelling: directories * spelling: overridden * spelling: permission * spelling: yesterday * spelling: initialization * spelling: intrinsics * space adjustment for spelling changes * minor modifications by klin	8 years ago
Micah Snyder	d0cba11ea7	adding back changes to eliminate warnings from mspack, matcher, others, and readdb.	8 years ago
Micah Snyder	169af0fc67	Revert "eliminating warnings. mostly correcting variable types. also correcting struct initialization in a couple instances (var = {0} does not zero the memory on all platforms). Also some minor formatting corrections in areas I was already working. eliminated some unused variables." This reverts commit `84a7f40288`.	8 years ago
Micah Snyder	84a7f40288	eliminating warnings. mostly correcting variable types. also correcting struct initialization in a couple instances (var = {0} does not zero the memory on all platforms). Also some minor formatting corrections in areas I was already working. eliminated some unused variables.	8 years ago
Steven Morgan	48d3f284db	fix for 0.99.3 false negative of Andr.Trojan.SMSsend-2.	9 years ago
Steven Morgan	167c007929	fix 0.99.3 false negative of virus Pdf.Exploit.CVE_2016_1046-1.	9 years ago
Steven Morgan	56eed3edf7	Fix for regression FN's.	9 years ago
Steven Morgan	cbf5017a7d	bb11805 fix multiple results. Refactor false positive and heuristic precedence logic.	9 years ago
Mickey Sola	631cb6a005	Fixes and updates to intermediate container sig rules based on code review	9 years ago
klin	031fe00a4d	restructure container typing system to use array (#2)	9 years ago
Steven Morgan	90f29efa62	Suppress multiple viruses for BLOCKMAX edge case.	9 years ago
Steven Morgan	312b7e5391	bb11522 - enable clamscan option --blockmax to flag files as virus Heuristic.Limits.Exceeded when --max-filesize, --max-scansize, or --max-recursion is exceeded.	9 years ago
Kevin Lin	5eaf0b320a	bb#11003 - fix dconf and option handling for nocert and dumpcert	10 years ago
Kevin Lin	059ca61484	compiler warning suppression	10 years ago

262 Commits (2e55c901b1b0571a340602ee2681255876f63f1a)