Users have reported slow scan speeds of some PDF documents. The scan
speed was very slow on Windows in particular. Investigation indicated
significant time spent in cli_realloc.
Performance is particularly bad with a relatively large data stream is
decompressed using small chunk sizes and the final buffer is reallocated
to a larger buffer size each time a chunk is added.
This commit replaces BUFSIZ, which varies from 256B -> 8192B, with
INFLATE_CHUNK_SIZE, set to 256kB, the chunk size recommended by the zlib
documentation for efficient inflate performance. The output buffer is
shrunk (reallocated) down to the final decoded buffer length so as not
to waste memory when many small buffers must be decompressed.
A followup fix should provide a standard way to do zlib decompression
across libclamav where a linked list of decompressed chunks are
assembled and then the final output buffer is allocated at the end.
Implemented several maximums in parsing MIME messages to avoid Denial of
Service attempts, as well as improving parsing logic to avoid repeatedly
calling the realloc function. These are in response to excessively long scan
times for specially crafted files.
This is in response to CVE-2019-15961.
The limits added are
1. Limit on number of MIME parts per message.
2. Limit on number of header bytes.
3. Limit on number of email headers.
4. Limit on number of line folds.
5. Limit on numbef of MIME arguments.
Certs can omit the boolean field in the Basic Constraints section,
since the RFC specifies a default value for this field. This fixes
the following error:
LibClamAV debug: asn1_expect_objtype: expected type 01, got 02
LibClamAV debug: asn1_get_x509: An error occurred when parsing x509 extensions
LibClamAV debug: asn1_parse_mscat: skipping x509 certificate with errors
Ex: 05de45fd6a406dc147a4c8040a54eee947cd6eba02f28c0279ffd1a229e17130
Allow UTCDate fields in x509 certs to omit the seconds. Technically
this is disallowed by RFC5280, but Windows Authenticode verification
routines don't seem to mind it, so we'll allow it too. This fixes
the following error:
LibClamAV debug: asn1_getnum: expecting digits, found 'Z'
LibClamAV debug: asn1_get_time: invalid second 4294967295
LibClamAV debug: asn1_get_x509: unable to extract the notBefore time
LibClamAV debug: asn1_parse_mscat: skipping x509 certificate with errors
Ex: d577010638f208ad8b6dab1a33dc583b2ec6b0c719021fb9f083dd595ede27e8
Also, add a check on CRT_RAWMAXLEN, since if it's > 256 problems
will arise
The previous commit allowed a CRB cert's exponent to be ignored
when evaluating blacklist rules, but this commit also allows
the exponent to be ignored for whitelist rules as well.
Also, previous versions of ClamAV allowed the serial number hash
field in a CRB rule to be blank, effectively wildcarding the serial
number. This functionality broke with some of the changes introduced
in 0.102.0, so this commit addresses that.
CRB rules allow the exponent to be specified, but currently this value
gets ignored and hardcoded to 65537. It turns out that most certs I
tested against (12,000 from VT) use e==65537, but a handful don't.
This commit addresses the signature load time issue in the following steps:
1. Loaded list items are allocated but left unattached; only a node reference is set on them for further processing. This is done with no increase of memory usage. See changes in insert_list and matcher-ac.h
2. Before the tries are built, the whole list of entries is sorted by node, then by pattern, then by partno. This requires O(N log(N)) time.
3. The list is processed linearly, one node at a time and the `next_same` chain is built. Each next_same chain head is also extracted. This requires O(N) time.
4. The list of heads is sorted by partno. This requires O(M log(M)) time on average with M<=N.
5. The list of heads is processed linearly and the `next` chain is built. This has O(M) complexity.
And improves scantime performance, by adding checks to:
1. Place longer lists earlier in the trie.
2. Keep close patterns close, rather than scattering them further apart.
This reduced memory cache faults to improve load and scan time performance.
Bumps the version from 0.102.0 to 0.103.0-devel-<date>.
Bumps the FLEVEL from 111 to 120.
Bumps the libclamav and libfreshclam revision numbers from 4 -> 5, and 0 -> 1, respectively.