Visual Studio projects removed in favor of CMake because it's far easier
to build and maintain. Also removed the old InnoSetup installer now that
CMake's CPack provides installer creation.
While working on this I found that the THIS_IS_CLAMAV macro was missing,
resulting in warnings for the `have_rar` and `have_clamjit` exported
global variables.
I also stumbled across some code duplication and more cl_error_t / int
type issues in the pcre code, so this commit includes a little cleanup.
Also creates a ZIP for non-Admin (per-user) installs.
WIX requires the license file to have a .txt or .rtf extension so I
added the .txt extension. I've taken the opportunity to migrate the 3rd
party licenses to a COPYING subdirectory and have added licensing
details to the README.md file.
To build the installer, install WIX and simply run `cpack -C Release`.
Also removed the explicit --config option from the
clamav-clamonacc.service file because it should not be required and
isn't being generated correctly when using autotools anyways, especially
after changes in this commit.
Enabled the metadata collection feature, scan heuristics, and all-match
mode when fuzzing in the interest of better code coverage.
Also remove deprecated STREAM command.
An ENABLE_TESTS CMake option is provided so that users can disable
testing if they don't want it. Instructions for how to use this are
included in the INSTALL.cmake.md file.
If you run `ctest`, each testcase will write out a log file to the
<build>/unit_tests directory.
As with Autotools' make check, the test files are from test/.split
and unit_tests/.split files, but for CMake these are generated at
build time instead of at test time.
On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled
libraries can be loaded when running tests.
On Windows systems, CTest will identify and collect all library
dependencies and assemble a temporary install under the
build/unit_tests directory so that the libraries can be loaded when
running tests.
The same feature is used on Windows when using CMake to install to
collect all DLL dependencies so that users don't have to install them
manually afterwards.
Each of the CTest tests is run using a custom wrapper around Python's
unittest framework, which is also responsible for finding and inserting
valgrind into the valgrind tests on Posix systems.
Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by
default, if Valgrind can be found. There's no need to set VG=1.
CTest's memcheck module is NOT supported, because we use Python to
orchestrate our tests.
Added a bunch of Windows compatibility changes to the unit tests.
These were primarily changing / to PATHSEP and making adjustments
to use Win32 C headers and ifdef out the POSIX ones which aren't
available on Windows. Also disabled a bunch of tests on Win32
that don't work on Windows, notably the mmap ones and FD-passing
(i.e. FILEDES) ones.
Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate
warnings on Windows where json.h is included after inttypes.h because
json-c's inttypes replacement relies on it.
This is a bit of a hack and may be removed if json-c fixes their
inttypes header stuff in the future.
Add preprocessor definitions on Windows to disable MSVC warnings about
CRT secure and nonstandard functions. While there may be a better
solution, this is needed to be able to see other more serious warnings.
Add missing file comment block and copyright statement for clamsubmit.c.
Also change json-c/json.h include filename to json.h in clamsubmit.c.
The directory name is not required.
Changed the hash table data integer type from long, which is poorly
defined, to size_t -- which is capable of storing a pointer. Fixed a
bunch of casts regarding this variable to eliminate warnings.
Fixed two bugs causing utf8 encoding unit tests to fail on Windows:
- The in_size variable should be the number of bytes, not the character
count. This was causing the SHIFT_JIS (Japanese codepage) to UTF8
transcoding test to only transcode half the bytes.
- It turns out that the MultiByteToWideChar() API can't transcode
UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer
and flip the bytes on each uint16_t. This bug was causing the UTF16-BE
to UTF8 tests to fail.
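The byte flip described for the UTF16-BE case can be sketched as below.
This is a minimal illustration, not the actual ClamAV code;
`utf16_swap_bytes` is a hypothetical name, and `count` is the number of
16-bit code units rather than the number of bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit code unit in place. This converts
 * UTF-16BE to UTF-16LE (and back again if applied a second time).
 * `count` is the number of uint16_t code units, not the byte count. */
void utf16_swap_bytes(uint16_t *buf, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        buf[i] = (uint16_t)((buf[i] << 8) | (buf[i] >> 8));
    }
}
```

After the swap, the buffer can be handed to the regular UTF16-LE to UTF8
conversion path.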
I also split up the utf8 transcoding tests into separate tests so I
could see all of the failures instead of just the first one.
Added a flags parameter to the unit test function to open testfiles
because it turns out that on Windows if a file contains the \r\n it will
replace it with just \n if you opened the file as a text file instead of
as binary. However, if we open the CBC files as binary, then a bunch of
bytecode tests fail. So I've changed the tests to open the CBC files in
the bytecode tests as text files and open all other files as binary.
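A rough sketch of what a flags parameter on the test-file opener could
look like. The function body and the `O_BINARY` fallback are assumptions
for illustration; on POSIX systems there is no text/binary distinction, so
the flag is defined as 0 there.

```c
#include <fcntl.h>

#ifndef O_BINARY
#define O_BINARY 0 /* POSIX: no text/binary distinction, flag is a no-op */
#endif

/* Open a test file read-only. Pass O_BINARY in `flags` to suppress the
 * \r\n -> \n translation that Windows applies to text-mode files, or 0
 * to open the file in text mode. Returns a descriptor, or -1 on error. */
int open_testfile(const char *path, int flags)
{
    return open(path, O_RDONLY | flags);
}
```

Under this sketch a bytecode test would call `open_testfile(path, 0)` for
a CBC file, while every other test would pass `O_BINARY`.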
Ported the feature tests from shell scripts to Python using a modified
version of our QA test-framework, which is largely compatible and will
allow us to migrate some QA tests into this repo. I'd like to add GitHub
Actions pipelines in the future so that all public PRs get some testing
before anyone has to manually review them.
The clamd --log option was missing from the help string, though it
definitely works. I've added it in this commit.
It appears that clamd.c was never clang-format'd, so this commit also
reformats clamd.c.
Some of the check_clamd tests expected the path returned by clamd to
match character for character with original path sent to clamd. However,
as we now evaluate real paths before a scan, the path returned by clamd
isn't going to match the relative (and possibly symlink-ridden) path
passed to clamdscan. I fixed this test by changing the test to search
for the basename: <signature> FOUND within the response instead of
matching the exact path.
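The basename-based check could look roughly like the sketch below. The
helper name, the fixed buffer size, and the '/'-only separator handling
are assumptions made for the example; the real test code may differ.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `response` contains "<basename>: <signature> FOUND", where
 * <basename> is the last component of `path`. Only '/' is treated as a
 * path separator in this sketch. */
int response_reports_found(const char *response, const char *path,
                           const char *signature)
{
    char expected[512];
    const char *base = strrchr(path, '/');
    int len;

    base = (base != NULL) ? base + 1 : path;
    len  = snprintf(expected, sizeof(expected), "%s: %s FOUND", base, signature);
    if (len < 0 || (size_t)len >= sizeof(expected)) {
        return 0; /* expected string did not fit in the buffer */
    }
    return strstr(response, expected) != NULL;
}
```

Because only the basename is compared, the check passes whether clamd
reports the relative path or the resolved real path.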
Autotools: Link check_clamd with libclamav so we can use our utility
functions in check_clamd.c.
The clamd TOCTOU access check fix introduced an expectation that the
scanfile API will set errno if access was denied. We should instead use
the cl_error_t error code enum.
Also added Duane Waddle to the 0.104 contributors acknowledgements.
The fmap_duplicate function is used to create a new fmap with a view into
an existing fmap. When the new view is a different size than the old
fmap, a new hash must be calculated for the duplicate fmap. However,
when the duplicated fmap is the same size as the original fmap, the hash
will be the same and there's no point recalculating.
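The idea of reusing the hash for a same-size duplicate can be sketched
with a simplified stand-in structure. `struct view`, `view_duplicate`, and
`DIGEST_LEN` are hypothetical names for this example and do not mirror the
real fmap layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16

/* Hypothetical, heavily simplified stand-in for an fmap. */
struct view {
    const unsigned char *data;
    size_t len;
    bool have_hash;
    unsigned char hash[DIGEST_LEN];
};

/* Make `dst` a view of `len` bytes starting at `offset` within `src`.
 * Returns false if the requested range does not fit inside `src`. */
bool view_duplicate(const struct view *src, size_t offset, size_t len,
                    struct view *dst)
{
    if (offset > src->len || len > src->len - offset) {
        return false;
    }
    dst->data = src->data + offset;
    dst->len  = len;
    if (len == src->len && src->have_hash) {
        /* Same bytes as the original: reuse the hash already computed. */
        memcpy(dst->hash, src->hash, DIGEST_LEN);
        dst->have_hash = true;
    } else {
        /* Different window: the hash has to be calculated again later. */
        dst->have_hash = false;
    }
    return true;
}
```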
The issue is apparent when scanning large EXE files because the hash was
being calculated at the beginning and end of the scan.
Digging into this issue revealed that hash calculations for fmaps were
also being performed at the wrong place. For scans of maps we use
fmap_duplicate() early in the process to apply the name API argument to
the duplicate fmap. Fixing the logic so we don't recalculate the hash
revealed that we never calculated hashes for fmaps created from buffers
in the first place, so that also had to be fixed by relocating where the
hash is calculated.
I also found that fmap_duplicate()'s offset argument used an off_t,
though it and all caller offsets are not allowed to be negative. This
was a bit of a tangent to fix a bunch of off_t variables and parameters
that should've been size_t.
Added a couple unit tests to verify that making duplicate fmaps, and
duplicate-duplicate fmaps works as expected after the change.
Changed CLI_ISCONTAINED() and CLI_ISCONTAINED2() macros to cast to
size_t, because pointers and buffer sizes may not be negative, and these
two macros do not rely on subtraction.
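A containment check that uses only unsigned comparisons and additions
could look like the following sketch. It is written as a function over
plain offsets for readability; the name is hypothetical and the exact set
of conditions in the real macros may differ.

```c
#include <stddef.h>

/* Return 1 if the non-empty range [sb, sb + sb_size) lies entirely inside
 * the non-empty range [bb, bb + bb_size). All values are unsigned, and
 * the check uses additions and comparisons only, never subtraction. */
int is_contained(size_t bb, size_t bb_size, size_t sb, size_t sb_size)
{
    return bb_size > 0 && sb_size > 0 &&
           sb_size <= bb_size &&
           sb >= bb &&
           sb + sb_size <= bb + bb_size &&
           sb + sb_size > bb &&
           sb < bb + bb_size;
}
```

With signed types, a negative size or offset could slip through checks
like these; casting to size_t turns such values into very large numbers
that fail the comparisons instead.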
Reduced the verbosity of a GPT parser warning that occurs frequently
when parsing DMG files prior to DMG file type recognition.
DMG files support a handful of compression formats. File type
recognition for DMG presently works by doing "embedded" file type
recognition during the raw scan after having already identified the file
type by traditional file type magic checks. I found that when DMG uses
bzip2 for compression, we identify an MBR type containing a BZ type, at
which point the raw scan detects it as DMG. The previous commits broke
this by disabling embedded file type recognition for BZ and other
compression & archive types. Ideally the fix would be to do DMG file
type detection by checking the end of the file; perhaps adding negative
offset support for FTM sigs could fix it. Until we can implement that or
another/better solution for DMG file type detection, we'll have to allow
embedded file type recognition for BZ files.
Also added some comments to narrate the scan process.
PNG file image data is decompressed to determine if it exceeds the
calculated image size as a heuristic for CVE-2010-1205. However, the
image data isn't actually scanned so we can reduce scan time and RAM
usage by not allocating space for the entire decompressed image.
Also skipped decompression of image data if the calculated image data
exceeds max-scansize.
ClamAV's embedded file type recognition detects some files found in
non-archive formats but for archive formats and compressed data streams
like bzip2 and gzip, it will often detect file type magic bytes of
compressed files and then attempt to parse the compressed data as if
they were whole files, resulting in wasted CPU cycles and confusing
warnings.
This patch prevents embedded file type recognition for CL_TYPE_GZ and
CL_TYPE_BZ.
Also revert the UTF8 Byte Order Mark (BOM) detection and associated
scanning of all text types as HTML files that had been added in 0.103.
Scanning a file as HTML is not performant because it creates temp files
and normalizes the original files 3 ways.
Better text type detection, transcoding, and HTML detection is probably
still needed, but will have to wait. Scanning any embedded content that
looked like text with the HTML parser impacts performance too much.
Integrated the JPEG exploit check into the JPEG parser and removed it
from special.c.
As a happy consequence of this, the photoshop file detection and
embedded JPEG thumbnail exploit check was merged in as well, which means
that the embedded thumbnails can also be scanned as embedded JPEG files.
Adds debug output to the JPEG format validator to help resolve issues
with unusually formatted JPEGs and to validate that the JPEG parser is
working correctly.
Relaxes the rules around duplicate application markers or application
markers that appear later than expected, due to prior XMP metadata, etc.
Removed the requirement for an application marker to exist, as some
older JPEGs don't appear to use JFIF, Exif, or SPIFF application
extensions.
I tested against a relatively large data set of JPEGs from Mac & Windows
stock photos, personal photos, and assorted downloaded photos. FP rates
when alerting on broken media should be very low.
Added a new scan option to alert on broken media (graphics) file
formats. This feature mitigates the risk of malformed media files
intended to exploit vulnerabilities in other software. At present
media validation exists for JPEG, TIFF, PNG, and GIF files.
To enable this feature, set `AlertBrokenMedia yes` in clamd.conf, or
use the `--alert-broken-media` option when using `clamscan`.
These options are disabled by default for now.
Application developers may enable this scan option by enabling
`CL_SCAN_HEURISTIC_BROKEN_MEDIA` for the `heuristic` scan option bit
field.
Fixed PNG parser logic bugs that caused an excess of parsing errors
and fixed a stack exhaustion issue affecting some systems when
scanning PNG files. PNG file type detection was disabled via
signature database update for 0.103.0 to mitigate effects from these
bugs.
Fixed an issue where PNG and GIF files no longer work with Target:5
(graphics) signatures if detected as CL_TYPE_PNG/GIF rather than as
CL_TYPE_GRAPHICS. Target types now support up to 10 possible file
types to make way for additional graphics types in future releases.
Scanning JPEG, TIFF, PNG, and GIF files will no longer return "parse"
errors when file format validation fails. Instead, the scan will alert
with the "Heuristics.Broken.Media" signature prefix and a descriptive
suffix to indicate the issue, provided that the "alert broken media"
feature is enabled.
GIF format validation will no longer fail if the GIF image is missing
the trailer byte, as this appears to be a relatively common issue in
otherwise functional GIF files.
Added a TIFF dynamic configuration (DCONF) option, which was missing.
This will allow us to disable TIFF format validation via signature
database update in the event that it proves to be problematic.
This feature already exists for many other file types.
Added CL_TYPE_JPEG and CL_TYPE_TIFF types.
This improvement looks up the filename given the file descriptor.
This is supported on Mac and Linux but not presently supported
on other UNIX operating systems. FD-passing is not available on
Windows.
On supported systems, the verdict in the clamd log and the VirusEvent
will show the actual file path instead of something like fd[14].
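For reference, one common way to recover a path from a descriptor on
Linux and macOS is sketched below: reading the `/proc/self/fd/<n>` symlink
on Linux and using `fcntl(F_GETPATH)` on macOS. The function name is
hypothetical and this is not necessarily how the patch implements it.

```c
#define _GNU_SOURCE /* readlink() under strict -std= modes on glibc */
#include <fcntl.h>  /* fcntl() and F_GETPATH (macOS) */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __APPLE__
#include <sys/param.h> /* MAXPATHLEN */
#endif

/* Write the path of the open descriptor `fd` into `out`.
 * Returns 0 on success, -1 on failure or on unsupported platforms. */
int path_from_fd(int fd, char *out, size_t out_size)
{
#if defined(__linux__)
    char link[64];
    ssize_t n;

    if (out_size == 0) {
        return -1;
    }
    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
    n = readlink(link, out, out_size - 1);
    if (n < 0) {
        return -1;
    }
    out[n] = '\0'; /* readlink() does not NUL-terminate */
    return 0;
#elif defined(__APPLE__)
    if (out_size < MAXPATHLEN) {
        return -1; /* F_GETPATH needs room for MAXPATHLEN bytes */
    }
    return (fcntl(fd, F_GETPATH, out) == -1) ? -1 : 0;
#else
    (void)fd;
    (void)out;
    (void)out_size;
    return -1; /* no lookup available: caller falls back to fd[N] */
#endif
}
```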
Users have complained about two specific log events that are extremely
verbose in non-critical error conditions:
- clamonacc reports "ERROR: Can't send to clamd: Bad address"
This may occur when small files are created/destroyed before they can
be sent to be scanned. The log message probably should only be
reported in verbose mode.
- clamonacc reports "ClamMisc: $/proc/XXX vanished before UIDs could be
excluded; scanning anyway"
This may occur when a process that accessed a file exits before
clamonacc finds out who accessed the file. This is a fairly frequent
occurrence. While it can still be problematic if `clamd` was the process
which accessed the file (like a clamd temp file if watching /tmp),
generally it's not an issue and we want to silently scan it anyways.
Also addressed copypaste issue in onas_send_stream() wherein fd is set
to 0 (aka STDIN) if the provided fd == 0 (should've been -1 for invalid
FD) and if filename == NULL. In fact clamonacc never scans STDIN so the
scan should fail if filename == NULL and the provided FD is invalid
(-1).
I also found that "Access denied. ERROR" is easily provoked when using
--fdpass or --stream using this simple script:
for i in {1..5000}; do echo "blah $i" > tmp-$i && rm tmp-$i; done
Clamdscan does not allow for scans to fail quietly because the file does
not exist, but for clamonacc it's a common thing and we don't want to
output an error. To solve this, I changed it so a return length of -1
will still result in an "internal error" message but return len 0
failures will be silently ignored.
I've added a static variable to onas_client_scan() that keeps state in
case clamd is stopped and started - that way it won't print an error
message for every event when offline. Instead it will log an error for
the first connection failure, and log again when the connection is
re-established for a future scan. Calls to onas_client_scan() are
already wrapped with the onas_scan_lock mutex so the static variable
should be safe.
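The log-once behaviour can be sketched as a small state machine around a
static flag. The function name and message strings below are made up for
the example, and, as noted above, the caller is assumed to hold a lock
because the static variable is not protected on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the message to log for this connection attempt, or NULL when
 * the state has not changed since the previous call.
 * Not thread-safe by itself: the caller must serialize calls. */
const char *connection_event(bool connected)
{
    static bool was_connected = true; /* assume online until a failure */

    if (!connected && was_connected) {
        was_connected = false;
        return "ERROR: lost connection to clamd";
    }
    if (connected && !was_connected) {
        was_connected = true;
        return "Connection to clamd re-established";
    }
    return NULL; /* repeated failure or repeated success: stay quiet */
}
```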
Finally, there were a couple of error responses from clamd that can
occur if the file isn't found which we want to silently ignore, so I've
tweaked the code which checks for specific error messages to account for
these.
The security improvement to perform file realpath lookups prior to a
scan has the adverse effect of causing file scans to fail on Windows
when scanning on some filesystems.
Specifically, it was observed that the ImDisk driver doesn't handle the
IRP_MJ_QUERY_INFORMATION message so the call to look up the realpath
using GetFinalPathNameByHandleW() doesn't work.
There are two other APIs I've found which can query the real file path.
The first is to create a file mapping of the target file and then use
GetMappedFileNameW() to get the file path. The other is to use the
NtQueryObject() undocumented NT API to get the file path. Each of
these should return roughly the same thing. For files in an ImDisk
RAM-disk drive, the resulting filepath for R:\clam.exe would
be \\Device\ImDisk0\clam.exe. The trouble is, mapping
\\Device\ImDisk0\clam.exe back to R:\clam.exe would rely on an
assumption that ImDisk is using the default drive letter, which is a tad
hacky.
Instead, this patch simply allows the scan to proceed if the realpath
lookup failed. If the user is using the quarantine (remove/move)
features AND if the scan target filepath has a directory junction (soft
link), then the quarantine action will fail. It's not ideal but it is
quite unlikely.
On systems with both libiconv and built-in iconv (libc), the compile
test must include the libiconv header path because it _will_ fail if
it builds against libiconv's iconv.h and doesn't link with libiconv.
This fix is similar to the one for Snort3, here:
https://github.com/snort3/snort3/issues/62
Remove the "-rc2" from the version string.
Also bump FLEVEL from 120 -> 121.
Also fixes two issues:
- The VERSION_SUFFIX defined by clamav-config.h.cmake.in must be defined
with #define instead of #cmakedefine, so it is defined as an empty
string even if there is no suffix (eg for an actual release)
- Removed a bashism in the libcheck detection code for autotools,
resolving https://bugzilla.clamav.net/show_bug.cgi?id=12598
At least some unicode filenames may fail to scan in 0.102.4+ because
while Windows char* strings may be UTF8, the GetFinalPathNameByHandleA
function does not return UTF8 strings and instead does lossy conversion
to ASCII. To fix this, we need to use GetFinalPathNameByHandleW instead
and then convert from UTF16-LE to UTF8.
While fixing this bug, I found and fixed a couple other serious issues
with the Win32 implementation of cli_codepage_to_utf8().
If a file is on a network share, the realpath comes back with a path
name that looks like "\\\\?\\UNC\\<host>\\<share>\\...". In this case,
the "\\\\?\\UNC\\" prefix is critical or else clamscan.exe won't be able
to open the file. This patch checks for the "\\\\?\\UNC" prefix and if
it exists, it keeps the prefix, else it trims the "\\\\?\\" portion as
before. This should fix scanning of files on network shares.
Fixes error handling issues in ARJ parser wherein FALSE is mistakenly
returned instead of a CL_E* error code, as the return type is
`int`, but in reality a cl_error_t enum value is expected.
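To illustrate why this kind of mistake is easy to miss: FALSE is 0, and 0
is also the success value of cl_error_t, so the compiler accepts the
return and the caller sees success. The sketch below uses a simplified
stand-in enum and hypothetical function names, not the real ARJ parser
code.

```c
#ifndef FALSE
#define FALSE 0
#endif

/* Simplified stand-in for the relevant part of cl_error_t. */
typedef enum {
    STATUS_SUCCESS = 0,
    STATUS_EFORMAT
} status_t;

/* Buggy pattern: an `int` return type hides the mistake, and FALSE (0)
 * is indistinguishable from success for the caller. */
int parse_header_buggy(int header_ok)
{
    if (!header_ok) {
        return FALSE;
    }
    return STATUS_SUCCESS;
}

/* Fixed pattern: return the enum type and a real error code. */
status_t parse_header_fixed(int header_ok)
{
    if (!header_ok) {
        return STATUS_EFORMAT;
    }
    return STATUS_SUCCESS;
}
```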
Flex and Bison are generally available but not particularly easy to
install and on macOS, the Bison version is relatively ancient and not
compatible. Homebrew doesn't necessarily play nice with Xcode, so to
make CMake builds work on macOS without mandating the use of Homebrew,
our best option is to make Flex & Bison optional.
Flex and Bison generated files will be kept in revision control and will
get re-generated only if you use -DMAINTAINER_MODE=ON which will
introduce the Flex and Bison tool dependencies.
CMake: don't emit fullpath in yara generated source
Autotool's ylwrap script has a hack that prevents the full path of the
bison & flex generated source from being included in the debug line
numbers and in the preprocessor include guard macros. CMake doesn't have
this, so when it sets the output file to the full path, the current
user's path is leaked into the generated source.
Added `%output "yara_grammar.c"` to yara_grammar.y and re-generated the
.c & .h file with this change. This overrides the "FILE" setting used
when generating those line numbers and include guard macro names so that
the path isn't included.
Similarly, added `%option outfile="yara_lexer.c"` to yara_lexer.l and
re-generated the .c file with this change. This has the same effect but
for flex so that full filepaths are not emitted into the source.
Revert NEWS.md item regarding Flex, Bison change.
Revert placing yara grammar/lexer files in win32 compat.
The pcre2.h header dependency is propagated to the bytecode runtime,
lzma_sdk, yara, and regex build targets within the libclamav build
because it is included by matcher.h which is included all over the
place.
This patch adds the pcre2 dependency to the affected build targets so
that systems where pcre2 isn't in the standard include path can still
build.
Also removed CMake `PCRE2_DIR` from documentation, as it doesn't apply
to this PCRE2 detection logic that we settled on.
Update the NEWS to add and correct content prior to the release
candidate.
Changed the version string to have the -rc suffix.
Also fixed a couple of --help and manpage issues.
This patch adds experimental-quality CMake build tooling.
The libmspack build required a modification to use "" instead of <> for
header #includes. This will hopefully be included in the libmspack
upstream project when adding CMake build tooling to libmspack.
Removed use of libltdl when using CMake.
Flex & Bison are now required to build.
If -DMAINTAINER_MODE, then GPERF is also required, though it currently
doesn't actually do anything. TODO!
I found that the autotools build system was generating the lexer output
but not actually compiling it, instead using previously generated (and
manually renamed) lexer c source. As a consequence, changes to the .l
and .y files weren't making it into the build. To resolve this, I
removed generated flex/bison files and fixed the tooling to use the
freshly generated files. Flex and bison are now required build tools.
On Windows, this adds a dependency on the winflexbison package,
which can be obtained using Chocolatey or may be manually installed.
CMake tooling only has partial support for building with external LLVM
library, and no support for the internal LLVM (to be removed in the
future). I.e. The CMake build currently only supports the bytecode
interpreter.
Many files used include paths relative to the top source directory or
relative to the current project, rather than relative to each build
target. Modern CMake support requires including internal dependency
headers the same way you would external dependency headers (albeit
with "" instead of <>). This meant correcting all header includes to
be relative to the build targets and not relative to the workspace.
For example, ...
```c
#include "../libclamav/clamav.h"
#include "clamd/clamd_others.h"
```
... becomes:
```c
// libclamav
#include "clamav.h"
// clamd
#include "clamd_others.h"
```
Fixes header name conflicts by renaming a few of the files.
Converted the "shared" code into a static library, which depends on
libclamav. The ironically named "shared" static library provides
features common to the ClamAV apps which are not required in
libclamav itself and are not intended for use by downstream projects.
This change was required for correct modern CMake practices but was
also required to use the automake "subdir-objects" option.
This eliminates warnings when running autoreconf which, in the next
version of autoconf & automake are likely to break the build.
libclamav used to build in multiple stages where an earlier stage is
a static library containing utils required by the "shared" code.
Linking clamdscan and clamdtop with this libclamav utils static lib
allowed these two apps to function without libclamav. While this is
nice in theory, the practical gains are minimal and it complicates
the build system. As such, the autotools and CMake tooling was
simplified for improved maintainability and this feature was thrown
out. clamdtop and clamdscan now require libclamav to function.
Removed the nopthreads version of the autotools
libclamav_internal_utils static library and added pthread linking to
a couple apps that may have issues building on some platforms without
it, with the intention of removing needless complexity from the
source. Kept the regular version of libclamav_internal_utils.la
though it is no longer used anywhere but in libclamav.
Added an experimental doxygen build option which attempts to build
clamav.h and libfreshclam doxygen html docs.
The CMake build tooling also may build the example program(s), which
isn't a feature in the Autotools build system.
Changed C standard to C90+ due to inline linking issues with socket.h
when linking libfreshclam.so on Linux.
Generate common.rc for win32.
Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef
from misc.c.
Add CMake files to the automake dist, so users can try the new
CMake tooling w/out having to build from a git clone.
clamonacc changes:
- Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other
similar macros.
- Added a new clamav-clamonacc.service systemd unit file, based on
the work of ChadDevOps & Aaron Brighton.
- Added missing clamonacc man page.
Updates to clamdscan man page, add missing options.
Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use
libclamav).
Rename Windows mspack.dll to libmspack.dll so all ClamAV-built
libraries have the lib-prefix with Visual Studio as with CMake.
Real-path checks are still needed in clamdscan when doing fd-passing and
streaming. This commit remedies that and improves some of the error
handling.
In addition, some cleanup to eliminate warnings on Windows added to the
shared code.
This patch relocates the real-path check from clamdscan and clamonacc
to clamd. While clamonacc is unlikely to send directories or symlinks
to be scanned, clamdscan may send directories. Real-path checks have
to be performed on the files, not the directories -- both because the
directories may contain symlinks and because the cli_realpath()
function wasn't written to support directories on Windows.
Using file type recognition scan mode for disk images and other raw
archive formats is problematic. One simple reason is that the contained
files will be detected and parsed and scanned twice, first when detected
by the type recog scan, and later when the archive is extracted and the
files are properly scanned. Another reason is an increased likelihood
for incorrect type recognition, as seen with supposed MHTML files (they
weren't) found in GPT disk images.
Though a previous patch disabled embedded type recognition for GPT
files, this one extends this to the following:
- CL_TYPE_CPIO_OLD
- CL_TYPE_ZIP
- CL_TYPE_OLD_TAR
- CL_TYPE_POSIX_TAR
ZIP is included because file entries in a ZIP are incorrectly detected
as ZIPSFX's and though we also ensure not to scan ZIPSFX's found in
ZIP's, it's more efficient not to do the type recognition in the first
place and it prevents us from adding those bogus ZIPSFX entries into the
scan properties JSON.
This patch also fixes what appears to be a copy-paste typo, where
CL_TYPE_ISHIELD_MSI types were accidentally having their container value
set to CL_TYPE_AUTOIT.
Exit early from VBA scanning loop if virus found.
Add VBA/XLM suffix to ContainsMacros heuristics.
Fix setting status code for error and virus conditions.
Increment/decrement recursion counter when scanning vba dir.
Notably the commit adds a heuristic alert when VBA is extracted using
the new VBA extraction code and similarly adds "HasMacros":true to the
JSON scan properties.
In addition, a change was added to the cli_sanitize_filepath() function
so it converts posix pathseps to Windows pathseps on Windows and also
outputs a sanitized basename pointer (optional) which is used when
generating a temporary filename so that using a prefix with pathseps in
it won't cause file creation failures (observed with --leave-temps where
original filenames are incorporated into temporary filenames).
Included some error handling improvements for cli_vba_scandir() to
better track alert and macro detections.
Downgraded utf8 conversion error messages to debug messages because they
are too verbose in files with invalid filenames (observed in some
malware).
Changed the xlm macro and vba project temp filenames to include
"xlm_macros" and "vba_project" prefix, to make it easier to find them.
Relocated XLM and VBA temp files from the top-level tmp directory to the
current sub_tmpdir, so tempfiles for a given scan are more organized.
Fix an infinite loop in the new XLM macro parser.
Fix error handling, resource cleanup in OLE2 parser.
Fix issues tracking detected "viruses" in VBA & OLE2 parsers affecting
non-allmatch (regular) scan mode, wherein multiple viruses may be found
but each record lost and the overall detection comes up clean.
Also silence switch() fall-through warning for WORD/PPT/XL/HWP (OOXML)
file type fall-throughs to the ZIP parser (because they are zips).
Also silence switch() fall-through warning when handling the limits-
exceeded error types, checking for the limits-exceeded heuristic, and
continuing on to bail out with a clean verdict.
Changes cli_checkfp_virus to a recursive function which checks all
parent fmaps in the context for false positives
Simplifies params needed for cli_checkfp_virus to use the current digest
and fmap length that resides within the fmap struct itself
Add missing ping_clamd() declaration in client.h
Fix check for ping option to first check if ping option is NULL before
strdup'ing and checking if the alloc failed.
Fix format string for uint64_t print.
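For context, the portable way to print a uint64_t is the `PRIu64` macro
from `<inttypes.h>`, since `%lu` and `%llu` each match the type only on
some platforms. The helper below is a hypothetical example, not the code
touched by this commit.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit size into `out`. Returns the number of characters that
 * would have been written, as snprintf() does. */
int format_size(char *out, size_t out_size, uint64_t value)
{
    return snprintf(out, out_size, "size: %" PRIu64 " bytes", value);
}
```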
Correctly assign name pointer to stack buffer in cpio parser.
Remove vestigial variables from insert_list() function matcher-ac.c,
left over from before the load-time optimizations completely
restructured everything.
Silence warnings about unused parameters in progress bar callback
function.
Valgrind reports uninitialized `tag` stack buffer being used. While this
appears to be a false positive, it can't hurt to initialize this and
similar buffers in this function.
Fixes bound checks in recently rewritten VBA parser code (i.e. issue
does not affect prior versions).
Also improves VBA terminator header parsing to better match the spec,
per recommendation by Jonas Zaddach.