clamav

Commit Graph

Author	SHA1	Message	Date
Micah Snyder (micasnyd)	7a70a03ba0	fuzz-29454: fix buffer overread in PDF parser The size of the UE buffer for the new Adobe Reader X encryption support was not properly recorded and may result in reading too far into the UE buffer. This patch checks the size of the UE buffer and rejects it if the length is not 32, as it does with the other AES256 CBC method.	4 years ago
Micah Snyder (micasnyd)	b9ca6ea103	Update copyright dates for 2021 Also fixes up clang-format.	4 years ago
Micah Snyder (micasnyd)	205d8dcd6e	fuzz-24408: Fix NULL-deref bug in PDF parser Fixes a NULL-dereference bug recently added when improving support for PDF decryption. Issue does not affect prior versions.	5 years ago
Micah Snyder	e2f59af30a	Clang-format touchup	5 years ago
Clement Lecigne	4c96f017f9	pdf: do not override pdf->fileIDlen if there is no new fileID.	5 years ago
Andrew	319bfb51a5	Fix several coverity warnings 290424 Missing break in switch - In hash_match: Missing break statement between cases in switch statement 290414 Resource leak - In cli_scanishield_msi: Leak of memory or pointers to system resources. Memory leak in a fail case 288197 Resource leak - In decrypt_any: Leak of memory or pointers to system resources. Memory leak in a fail case 290426 Resource leak - In cli_magic_scan: Leak of memory or pointers to system resources. Leaked a file prefix when running with --save-temps 192923 Resource leak - In cli_scanrar: Leak of memory or pointers to system resources. Leaked a file descriptor if a virus was found in a RAR file comment 225146 Resource leak - In cli_scanegg: Leak of memory or pointers to system resources. Leaked a file descriptor if unable to write a comment file to disk 290425 Resource leak - In scan_common: Leak of memory or pointers to system resources. Memory leaks in various fail cases. Also changes cli_scanrar to write out the file comment only if --leave-temps is specified and scan the buffer (like what is done in cli_scanegg) instead of writing the file out, scanning that, and then deleting the file if --leave-temps is not specified. The unit tests stopped working when correcting an issue with a switch statement that determined what type of signature had matched on a Google SafeBrowsing GDB rule. Looking into the unit tests, it looks like the code had always assumed that the test cases would be detected by a malware test rule in unit_tests/input/daily.gdb, but now some of the tests get matched on the phishing test rule. I updated the test logic to be more clear, and added tests for both cases now. Fix some memory leaks in libclamav/scanners.c	5 years ago
Micah Snyder (micasnyd)	cdbc833a32	PDF: Delay Javascript detection until JS found PDFs may contain Javascript actions in objects. Those may actually be indirect references to other objects where the Javascript resides rather than storing the Javascript in the current object. The thing is, sometimes the indirect object is empty (no actual script), in which case the PDF may not have any active content. This commit changes the Javascript detection logic so it only records the stats/JSON after detecting the "Javascript" and "JS" tags in the object (indirect or direct) where the Javascript is supposed to reside. This moves the logic from an action callback when the object dictionary key "Javascript" is detected over into the object extraction logic, after all of the objects have been parsed.	5 years ago
Micah Snyder	11ef77007b	Improve tmp sub-directory names At present many parsers create tmp subdirectories to store extracted files. For parsers like the vba parser, this is required as the directory is later scanned. For other parsers, these subdirectories are probably not helpful now that we provide recursive sub-dirs when --leave-temps is enabled. It's not quite as simple as removing the extra subdirectories, however. Certain parsers, like autoit, don't create very unique filenames and would result in file name collisions when --leave-temps is not enabled. The best thing to do would be to make sure each parser uses unique filenames and doesn't rely on cli_magic_scan_dir() to scan extracted content before removing the extra subdirectory. In the meantime, this commit gives the extra subdirectories meaningful names to improve readability. This commit also: - Provides the 'bmp' prefix for extracted PE icons. - Removes empty tmp subdirs when extracting rtf files, to eliminate clutter. - The PDF parser sometimes creates tmp files when decompressing streams before it knows if there is actually any content to decompress. This resulted in a large number of empty files. While it would be best to avoid creating empty files in the first place, that's not quite as easy as it sounds. This commit does the next best thing and deletes the tmp files if nothing was actually extracted, even if --leave-temps is enabled. - Removes the "scantemp" prefix for unnamed fmaps scanned with cli_magic_scan(). The 5-character hashes given to tmp files with prefixes resulted in occasional file name collisions when extracting certain file types with thousands of embedded files. - The VBA and TAR parsers mistakenly used NAME_MAX instead of PATH_MAX, resulting in truncated file paths and failed extraction when --leave-temps is enabled and a lot of recursion is in play. This commit switches them from NAME_MAX to PATH_MAX.	5 years ago
Micah Snyder	9b9999d778	Rename core scanning functions Many of the core scanning functions' names no longer represent their specific purpose or arguments. This commit aims to make the names more intuitive. Names are now prefixed with "magic" if they involve file-typing and file-type parsing. In addition, each function now includes the type of input being scanned, whether it's "desc", "fmap", or "buff". Some of the APIs also now specify "type" to indicate that a type other than "ANY" may be passed in to select the type rather than use file type magic for type recognition. Renames (current name -> new name): magic_scandesc() -> cli_magic_scan(); cli_magic_scandesc_type() -> <delete>; cli_magic_scandesc() -> cli_magic_scan_desc(); cli_base_scandesc() -> cli_magic_scan_desc_type(); cli_partition_scandesc() -> <delete>; cli_map_scandesc() -> magic_scan_nested_fmap_type(); cli_map_scan() -> cli_magic_scan_nested_fmap_type(); cli_mem_scandesc() -> cli_magic_scan_buff(); cli_scanbuff() -> cli_scan_buff(); cli_scandesc() -> cli_scan_desc(); cli_fmap_scandesc() -> cli_scan_fmap(); cli_scanfile() -> cli_magic_scan_file(); cli_scandir() -> cli_magic_scan_dir(); cli_filetype2() -> cli_determine_fmap_type(); cli_filetype() -> cli_compare_ftm_file(); cli_partitiontype() -> cli_compare_ftm_partition(); cli_scanraw() -> scanraw()	5 years ago
Micah Snyder	005cbf5a37	Record names of extracted files A way is needed to record scanned file names for two purposes: 1. File names (and extensions) must be stored in the json metadata properties recorded when using the --gen-json clamscan option. Future work may use this to compare file extensions with detected file types. 2. File names are useful when interpreting tmp directory output when using the --leave-temps option. This commit enables file name retention for later use by storing file names in the fmap header structure, if a file name exists. To store the names in fmaps, an optional name argument has been added to any internal scan APIs that create fmaps and every call to these APIs has been modified to pass a file name or NULL if a file name is not required. The zip and gpt parsers required some modification to record file names. The NSIS and XAR parsers fail to collect file names at all and will require future work to support file name extraction. Also: - Added recursive extraction to the tmp directory when the --leave-temps option is enabled. When not enabled, the tmp directory structure remains flat to reduce the likelihood of exceeding MAX_PATH. The current tmp directory is stored in the scan context. - Made the cli_scanfile() internal API non-static and added it to scanners.h so it would be accessible outside of scanners.c in order to remove code duplication within libmspack.c. - Added function comments to scanners.h and matcher.h - Converted TDB-type macros and LSIG-type macros to enums for improved type safety. - Converted more return status variables from `int` to `cl_error_t` for improved type safety, and corrected ooxml file typing functions so they use `cli_file_t` exclusively rather than mixing types with `cl_error_t`. - Restructured the magic_scandesc() function to use goto's for error handling and removed the early_ret_from_magicscan() macro and magic_scandesc_cleanup() function. This makes the code easier to read and made it easier to add the recursive tmp directory cleanup to magic_scandesc(). - Corrected zip, egg, rar filename extraction issues. - Removed use of extra sub-directory layer for zip, egg, and rar file extraction. For Zip, this also involved changing the extracted filenames to be randomly generated rather than using the "zip.###" file name scheme.	5 years ago
Mickey Sola	706dd7d7bc	pdf - fixup Aldo's PR based on review by team	5 years ago
Aldo Mazzeo	7d2ce0b32c	Adding support for Adobe Reader X encryption scheme	5 years ago
Micah Snyder (micasnyd)	4b7a738152	fuzz-21329: Fix out-of-bounds read in PDF parser Fix for an out-of-bounds read in the PDF parser when initializing aes crypto routines that may result in a crash. Bug found by OSS-Fuzz. Also added checks for the arc4 init routine to mitigate the risk of a similar issue.	5 years ago
Jonas Zaddach (jzaddach)	d5a733ef90	XLM (Excel 4.0) macro detection and extraction XLM is a macro language in Excel that was used before VBA (before 1996). It is still parsed and executed by modern Excel and is gaining popularity with malware authors. This patch adds rudimentary support for detecting and extracting Excel 4.0 (XLM) macros. The code is based on Didier Stevens' plugin_biff for oletools.py.	5 years ago
Mickey Sola	5d411c68fb	bb12461 - error out properly when pdf parser fails to allocate a map; normalize/sanitize user supplied filename and comment info when parsing arj headers; add better bound checking and error handling to arj header parsers	5 years ago
Micah Snyder	898c08f08b	Formatting touch-up	5 years ago
Micah Snyder	206dbaefe8	Update copyright dates for 2020	5 years ago
Micah Snyder	88ce6b8170	Fix to dereference pdf pointer after NULL check, not before.	6 years ago
Micah Snyder	4524c398f3	Argument and return types for fmap_readn(), cli_writen(), cli_readn() converted to use size_t instead of int.	6 years ago
Micah Snyder	ca8b4c466e	Assortment of warning fixes.	6 years ago
Micah Snyder (micasnyd)	88d271cbf5	Added pdf max object checks to limit max # of objects but continue scanning those that have already been found.	6 years ago
Micah Snyder	df52009b40	pdf.c formatting fixes.	6 years ago
Clement Lecigne	3e77daa791	pdf: fix octal conversion in pdf_readstring.	6 years ago
Clement Lecigne	e2b774d791	pdf: handle dictionary object with newlines.	6 years ago
Micah Snyder	a8ca96687a	Clean up of PDF object finding logic. Changes include recording object sizes as objects are found, identifying object streams in the object parsing section instead of the PDF parsing section, and limiting of stream and other object parsing to the size of the object instead of the size of the PDF. It is also easier to read and includes more inline documentation.	6 years ago
Micah Snyder	25d72538cd	fuzz - 12181 - Fixed 1-byte buffer over-read in PDF parser.	6 years ago
Micah Snyder	1e50361baf	fuzz - 12168 - Fix for 1 byte out of bounds read in PDF parser. Fix includes a check to ensure that it is safe to index -1 from the start of an object as well as additional checks to invalidate some negative integer values.	6 years ago
Micah Snyder	da15bcfd37	fuzz - 12149 - Fix for out of bounds read in PDF object stream parsing code.	6 years ago
Micah Snyder	479a9a235a	Fixes for issues identified by coverity.	6 years ago
Micah Snyder	da8d941cc8	fuzz - 12131, 12132, 12205 - Speed up PDF parse speed for truncated (or otherwise malformed) PDFs.	6 years ago
Micah Snyder	52cddcbcfd	Updating and cleaning up copyright notices.	6 years ago
Micah Snyder	72fd33c8b2	clang-format'd using new .clang-format rules.	6 years ago
Micah Snyder (micasnyd)	9280b4ea0f	Fix for 3 pdf parsing bugs introduced with the addition of object stream parsing, identified in regression testing.	7 years ago
Micah Snyder	d77b8ae0fb	Fixes to a handful of bugs identified during regression testing of PDF and UnRAR changes. Fix for minor memory leak in fmap_dump_to_file(). Fix to PDF object stream logic, accounting for a realloc() issue when the only pdf object stream fails to parse, and for when pdf objects in a stream appear to extend further than the size of the stream. Fix for memory leak cleaning up PDF object stream buffer in error condition. Fix to bug in pdf_decodestream wherein objects were found in an object stream, but the object stream could later be free'd if max scansize was exceeded, resulting in a NULL dereference. General cleanup of pdf_decodestream/pdf_decodestream_internal exit code logic.	7 years ago
Micah Snyder	d39cb6581f	Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames.	7 years ago
Micah Snyder (micasnyd)	f61e92da8f	Changing numerous scan options' names, primarily those of heuristic signature alert options. Original options (command line and clamd) will remain as deprecated & undocumented for a couple releases. Added 2 extra scan options to allow users to differentiate between alerting on encrypted archives vs encrypted documents (bb11911).	7 years ago
Micah Snyder	d7979d4ff7	Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files. Updated libclamav documentation detailing new scan options structure. Renamed references to 'algorithmic' detection to 'heuristic' detection. Renaming references to 'properties' to 'collect metadata'. Renamed references to 'scan all' to 'scan all match'. Renamed a couple of 'Hueristic.' signature names as 'Heuristics.' signatures (plural) to match majority of other heuristics.	7 years ago
Micah Snyder (micasnyd)	89d5207b31	Added new pdf object stream parsing capability.	7 years ago
Micah Snyder	f842e965fe	Replacing strntol with strntoul to ensure proper (un)signedness when parsing numbers from PDFs.	7 years ago
Micah Snyder	3955b36133	Adjustment to pdf find_obj logic to allow the parser to skip, continue when it finds objects that cannot be parsed and may not in fact be objects at all.	7 years ago
Micah Snyder	2176b2c358	Uncommenting len adjustment that is clearly correct, needed.	7 years ago
Micah Snyder	bf6e777fa7	bb12133: Wrapping cli_strntol to provide easy error detection. Applying cli_strntol_wrap with error checking. Adding logic to identify when a parsing error is in fact a new revision of the PDF.	7 years ago
Micah Snyder	53cbdee38a	bb12133: Implementing cli_strntol based on gnu gcc's strtol implementation with modifications to limit string buffer length for non-null terminated strings. Using cli_strntol in pdf.c for added safety.	7 years ago
Micah Snyder (micasnyd)	a79be7590e	bb12134: Adding missing len decrement and adding additional len check.	7 years ago
Micah Snyder	69b4a22370	bb12006: correction to dictionary length checks when parsing pdf objects.	7 years ago
Micah Snyder	53c957a9da	bb12004: adding check for min pdf size needed to check pdf version	7 years ago
Micah Snyder	4a2576fefd	Removing hard-coded heuristic signature that flags when a PDF has an abnormally high number of filters. Removing due to false positive and because in its current form it cannot be disabled or modified without recompiling ClamAV.	7 years ago
Micah Snyder	c9a070c9d3	More cleanup re: variables possibly used before initialized.	7 years ago
Steven Morgan	a5e2b97d24	bb11981 - fix for some unit tests.	8 years ago
Mickey Sola	c8ba4ae2e4	11942 - fixing heap overflow in handle_pdfname. Patch submitted by Suleman Ali.	8 years ago
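
Commits 53cbdee38a and bf6e777fa7 above describe a length-limited integer parser (cli_strntol) for buffers that are not NUL-terminated, plus a wrapper that makes errors easy to detect. The sketch below illustrates that general idea only; it is not ClamAV's code. It copies a bounded slice into a local buffer and calls the standard strtol(), which is a simpler technique than adapting the strtol implementation itself, and the name `bounded_strtol_sketch` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT ClamAV's cli_strntol: parse a base-10 long from
 * at most `len` bytes of `buf`, which need not be NUL-terminated.
 * Returns 0 on success and stores the value in *out; returns -1 on error.
 * Simplification: only the first 31 bytes of the input are considered. */
static int bounded_strtol_sketch(const char *buf, size_t len, long *out)
{
    char tmp[32];
    char *end = NULL;
    long value;

    assert(out != NULL);
    if (buf == NULL || len == 0)
        return -1;

    /* Copy only the bytes we are allowed to read, then terminate. */
    if (len > sizeof(tmp) - 1)
        len = sizeof(tmp) - 1;
    memcpy(tmp, buf, len);
    tmp[len] = '\0';

    errno = 0;
    value = strtol(tmp, &end, 10);
    if (end == tmp)      /* no digits were consumed */
        return -1;
    if (errno == ERANGE) /* value does not fit in a long */
        return -1;

    *out = value;
    return 0;
}
```

Returning a status code and writing the value through `out` keeps "parsed zero" distinct from "failed to parse", which is the kind of easy error detection the wrapper commit mentions.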


290 Commits (9f407d83b3dd2f18b2ffb764da71ccd992f16872)
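
Commit 7a70a03ba0 in the list above says the fix rejects the UE buffer unless its length is exactly 32. A minimal sketch of that kind of check follows, assuming a hypothetical `ue_field_sketch` container and `copy_ue_checked` helper; ClamAV's actual PDF structures and function names differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UE_EXPECTED_LEN 32 /* length the commit message says is required */

/* Hypothetical container for a parsed UE value and its recorded size. */
struct ue_field_sketch {
    const unsigned char *data;
    size_t len;
};

/* Copy the UE value into `out` only when its recorded length is exactly
 * UE_EXPECTED_LEN, so later code never reads past the end of the buffer.
 * Returns 0 on success, -1 if the field is missing or has the wrong size. */
static int copy_ue_checked(const struct ue_field_sketch *ue,
                           unsigned char out[UE_EXPECTED_LEN])
{
    assert(out != NULL);
    if (ue == NULL || ue->data == NULL || ue->len != UE_EXPECTED_LEN)
        return -1;

    memcpy(out, ue->data, UE_EXPECTED_LEN);
    return 0;
}
```

Validating the recorded length before any fixed-size read is what prevents the over-read; a value that is too short is rejected outright rather than padded or partially used.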