Includes rudimentary support for getting slices from FMaps and for
interacting with libclamav's context structure.
For now this will use a Cisco-Talos org fork of the onenote_parser
until the ability to open a OneNote section from a slice (instead
of from a file path) is added upstream.
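
To illustrate why a slice-based entry point matters here, below is a
minimal sketch of a parser function that is generic over `std::io::Read`,
so the same code can be fed from an in-memory slice (such as one taken
from an FMap) or from a file. The names `read_magic` and
`read_magic_from_slice` are hypothetical and are not part of the
onenote_parser API; the 4-byte "magic" header is also just an assumption
for the example.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical parser entry point: reads the first 4 bytes of the input.
/// Accepting any `Read` means the caller decides where the bytes come from.
fn read_magic<R: Read>(mut reader: R) -> io::Result<[u8; 4]> {
    let mut magic = [0u8; 4];
    // `read_exact` returns an error if fewer than 4 bytes are available.
    reader.read_exact(&mut magic)?;
    Ok(magic)
}

/// Hypothetical slice-based wrapper: no file path is needed, the data is
/// already in memory.
fn read_magic_from_slice(data: &[u8]) -> io::Result<[u8; 4]> {
    read_magic(Cursor::new(data))
}
```

A path-based caller could pass a `std::fs::File` to the same `read_magic`
function, which is why a slice-based API is the more general of the two.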
I'm unsure why, but building with `cmake -D MAINTAINER_MODE=ON` is
failing right now. Updating to a newer version of bindgen appears to
resolve the issue.
I was able to update it by changing the version specified in
libclamav_rust/Cargo.toml, and then running `cargo update -p bindgen`.
Not that I expect anyone else to be running maintainer-mode, but I did
also confirm using `cargo-msrv` that the minimum supported version of
rust did not change as a result of this commit.
When processing UTF-8 HTML code, the image extraction logic may panic if
the string contains a multi-byte grapheme that includes a '(', ')',
whitespace, or one of the other characters used to split the text when
searching for the base64 image content.
The panic occurs because the `split_at()` method panics if the split
index is not on a UTF-8 character boundary, such as in the middle of a
multi-byte unicode grapheme.
This commit fixes the issue by processing the HTML string one grapheme
at a time instead of one character (byte) at a time.
The `grapheme_indices()` method is used to get the correct position of
the start of each grapheme for splitting the string.
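
The idea can be sketched with the standard library alone. Note that this
is not the code from this commit: it swaps in `char_indices()` from std
for `grapheme_indices()` (which, as far as I know, comes from the
unicode-segmentation crate) so that the example has no dependencies. The
function name `split_at_first_delimiter` and the delimiter set are
assumptions made for illustration. The key point is the same: take the
split position from an iterator that yields byte offsets on valid
boundaries, rather than from a raw byte counter.

```rust
/// Hypothetical helper: split `text` at the first '(' , ')' or whitespace
/// character. The byte offset comes from `char_indices()`, so it always
/// lands on a UTF-8 character boundary and `split_at()` cannot panic.
fn split_at_first_delimiter(text: &str) -> Option<(&str, &str)> {
    text.char_indices()
        .find(|&(_, c)| c == '(' || c == ')' || c.is_whitespace())
        .map(|(idx, _)| text.split_at(idx))
}

/// Returns true if `idx` is a position where `split_at()` is safe.
fn is_safe_split(text: &str, idx: usize) -> bool {
    text.is_char_boundary(idx)
}
```

For example, in the string "é" the letter occupies two bytes, so
`is_safe_split("é", 1)` is false and `"é".split_at(1)` would panic, while
offsets 0 and 2 are safe.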