Török Edvin
4956690d99
add an #ifdef NOISY to pdf.c
...
Usage: uncomment #define NOISY, rebuild, and scan some PDF files
with an empty DB. It should print info messages for successful
extraction/decryption, and warnings where it fails.
14 years ago
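The NOISY switch described in the commit above follows a common compile-time logging pattern. A minimal sketch in C, assuming hypothetical macro names (NOISY_INFO, NOISY_WARN) and plain fprintf output rather than whatever logging calls pdf.c actually uses:

```c
#include <stdio.h>

/* Uncomment to enable verbose diagnostics (mirrors the commit's usage note). */
/* #define NOISY */

#ifdef NOISY
/* Hypothetical helpers: print to stderr when NOISY is defined. */
#define NOISY_INFO(...) fprintf(stderr, "info: " __VA_ARGS__)
#define NOISY_WARN(...) fprintf(stderr, "warning: " __VA_ARGS__)
#define NOISY_ENABLED 1
#else
/* Without NOISY the helpers compile to nothing. */
#define NOISY_INFO(...) ((void)0)
#define NOISY_WARN(...) ((void)0)
#define NOISY_ENABLED 0
#endif

/* Hypothetical call site: report whether decrypting an object succeeded.
 * Returns 1 if diagnostics are compiled in, 0 otherwise. */
static int report_decrypt(int obj_id, int ok)
{
    (void)obj_id; /* unused when NOISY is not defined */
    if (ok)
        NOISY_INFO("decrypted object %d\n", obj_id);
    else
        NOISY_WARN("failed to decrypt object %d\n", obj_id);
    return NOISY_ENABLED;
}
```

With the define left commented out, the calls cost nothing at run time; uncommenting it and rebuilding turns the messages on.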
Török Edvin
7719760b66
pdf: implement text extraction (bb #2022)
14 years ago
Török Edvin
21a3345714
Fix Win32 build (bb #4083).
...
Thanks to Sherpya.
14 years ago
Török Edvin
8db140ffbd
pdf: disable the newline workaround when parsing encrypted streams
14 years ago
Török Edvin
f3199751a7
pdf: XRef streams are not encrypted
14 years ago
Török Edvin
27c8b02b74
pdf: support PDF 1.5 Crypt filters.
14 years ago
Török Edvin
bcc6856753
pdf: support scanning inside encrypted strings
14 years ago
Török Edvin
bbfad9bad0
pdf: support for AESV3, V 5 security handler, and encrypted linearized PDFs.
14 years ago
Török Edvin
374be101f4
scan inside encrypted PDF streams (bb #2794).
...
TODO: scan encrypted PDF strings, and PDF1.5/1.6 CryptFilters.
14 years ago
Tomasz Kojm
664be8da02
bb #3421, #3244, #3732
14 years ago
Török Edvin
6dcf6031db
fix trailer_end check
14 years ago
Török Edvin
82c0e6bc31
fix bug introduced by bb #3364 fix.
14 years ago
Török Edvin
8bf6e781e9
fix distcheck.
14 years ago
Török Edvin
3f8016ce90
fix encrypted linearized pdf detection (bb #3364).
14 years ago
Török Edvin
c16b3abb8c
flag and dump PDF objects with /Launch (bb #3514).
14 years ago
Török Edvin
b8b055c52b
pdf: fix incorrect blocking of some encrypted PDFs with empty user passwords. (bb #3364)
...
Length was not found because of the order in which we read the values.
14 years ago
Török Edvin
7261220f67
fix encrypted pdf detection (bb #2988)
14 years ago
Török Edvin
0b073f2e10
pdf.c: fix pdf_handle_enc
14 years ago
Török Edvin
618f62dbe6
Encrypted.PDF -> Heuristics.Encrypted.PDF
...
Be consistent: all engine detections are prefixed with Heuristics.
14 years ago
Török Edvin
7606789f91
Better detection for encrypted PDFs (bb #2448)
...
If --block-encrypted is specified then we can detect Encrypted.PDF if:
- PDF is encrypted with R 2,3,4 or 5
- PDF is not displayable without specifying a password
If PDF is encrypted, but is displayable without specifying a password, then it
is not detected as Encrypted.PDF.
14 years ago
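The detection rules listed in the commit above reduce to a small predicate. A hedged sketch in C; the function name and parameters are hypothetical and only restate the commit message, they are not taken from pdf.c:

```c
#include <stdbool.h>

/* Hypothetical predicate mirroring the rules in the commit message:
 * flag the file only when blocking is requested, the encryption
 * revision (R) is 2, 3, 4 or 5, and the document cannot be displayed
 * without supplying a password. */
static bool should_flag_encrypted(bool block_encrypted, int revision,
                                  bool opens_without_password)
{
    if (!block_encrypted)
        return false;              /* --block-encrypted not specified */
    if (revision < 2 || revision > 5)
        return false;              /* revision outside the listed set */
    return !opens_without_password;
}
```

For example, an R 4 document that opens without a password is encrypted but not flagged, which matches the last sentence of the commit message.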
Török Edvin
4619289aef
pdf: Fix missed detection (bb #2455).
15 years ago
Török Edvin
a91013cde7
pdf: fix another uninit (bb #2404).
15 years ago
Török Edvin
b5ed1fe6d3
pdf: fix uninit value (bb #2455).
15 years ago
Török Edvin
019f195519
fix crashes (bb #2358, bb #2380, bb #2396).
...
Thanks to Arkadiusz Miskiewicz <arekm*maven.pl> for bb #2380.
15 years ago
Török Edvin
a95d300f6b
pdf: fix "Unknown error code ERROR".
15 years ago
Török Edvin
78f2c1d94f
pdf: fix false positive
...
Fix sid 17816118: only consider valid hexadecimal digits after #, and
consider ( as terminator.
15 years ago
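The false-positive fix above concerns "#xx" escapes in PDF name objects. A minimal sketch in C of the stated rule, counting an escape only when two valid hexadecimal digits follow the '#', and stopping at '(' among other terminators. The function names and the exact terminator set are assumptions for illustration, not the code in pdf.c:

```c
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Counts valid #xx escapes in a name that starts at s (leading '/').
 * Scanning stops at whitespace, '/', or '(' (hypothetical terminator set). */
static size_t count_name_escapes(const char *s, size_t len)
{
    size_t i = (len > 0 && s[0] == '/') ? 1 : 0;
    size_t escapes = 0;

    while (i < len) {
        unsigned char c = (unsigned char)s[i];
        if (c == '(' || c == '/' || c == ' ' ||
            c == '\r' || c == '\n' || c == '\t')
            break;
        if (c == '#' && len - i > 2 &&
            hex_value((unsigned char)s[i + 1]) >= 0 &&
            hex_value((unsigned char)s[i + 2]) >= 0) {
            escapes++;   /* a real escape: '#' plus two hex digits */
            i += 3;
        } else {
            i++;         /* ordinary character, or '#' without hex digits */
        }
    }
    return escapes;
}
```

Under these assumptions, a '#' inside a string that follows the name, as in "/JS(#41)", is never reached because scanning stops at '('.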
Török Edvin
5af966d317
bb #2295.
15 years ago
Török Edvin
2db6eb291d
Keep parsing after %%EOF (bb #2264).
15 years ago
Török Edvin
e142504b07
Fix 'Unknown error code ERROR' (bb #2296).
...
CL_BREAK from cli_bytecode_runhook must not be allowed to escape.
Affects only pdf.c, since in pdf.c we return CL_CLEAN when ret != CL_VIRUS.
15 years ago
Török Edvin
8f18920f99
Fix crash on 64-bit Solaris Intel (bb #2314).
...
memcmp does 8-byte reads if length > 8, which might cross a page-boundary and
crash. Not strictly a memcmp bug, since manpage doesn't say that memcmp must stop at
first difference. Linux doesn't crash because it only does 4/8-byte reads on 4/8-byte aligned
addresses, hence it can never cross a page boundary.
Fix this by making sure that what we request from memcmp is entirely readable.
15 years ago
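The memcmp fix above comes down to never asking memcmp to read more bytes than are known to be readable. A minimal sketch in C, assuming a hypothetical helper name and a caller that tracks how many bytes remain in the mapping:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer at p, which has `avail` readable bytes,
 * starts with the NUL-terminated token `tok`; 0 otherwise.
 * The length check comes first, so memcmp is only ever given a
 * range that lies entirely inside the readable buffer. */
static int starts_with(const char *p, size_t avail, const char *tok)
{
    size_t n = strlen(tok);

    if (avail < n)
        return 0;   /* too short to match: do not call memcmp at all */
    return memcmp(p, tok, n) == 0;
}
```

A call such as starts_with(q, end - q, "endobj") then stays safe even when q is only a few bytes before the end of the mapped region.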
Török Edvin
8f6bf9fc08
Fix mmap failed(2) on 32-bit FreeBSD (bb #2300).
...
off_t is 64-bit, but size_t is still 32-bit, and that causes an unexpected
integer promotion here:
map_off = map->len - 2048
First the unsigned subtraction is performed, and then the unsigned (!) result
is zero-extended to 64-bit. Hence a value that should be negative becomes a
large positive one, which is wrong.
15 years ago
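The promotion problem in the commit above can be reproduced on any platform by standing in uint32_t for a 32-bit size_t and int64_t for a 64-bit off_t. A hedged sketch in C; the function names and the clamp-to-zero policy in the fixed variant are assumptions, not the actual change made in pdf.c:

```c
#include <stdint.h>

/* Buggy pattern: the subtraction happens in unsigned 32-bit arithmetic
 * and wraps around when len < 2048; the wrapped value is then widened
 * to 64 bits and stays a large positive number. */
static int64_t tail_offset_buggy(uint32_t len)
{
    return (int64_t)(uint32_t)(len - 2048u);
}

/* One possible fix (hypothetical): compare before subtracting and do
 * the arithmetic in the signed 64-bit type. */
static int64_t tail_offset_fixed(uint32_t len)
{
    if (len <= 2048u)
        return 0;
    return (int64_t)len - 2048;
}
```

For a 1000-byte file the buggy variant yields 4294966248 (2^32 - 1048) instead of a value at or below zero, which is the kind of offset that makes the subsequent mmap fail.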
Török Edvin
dc5143b466
Add missing boundscheck to pdf code (bb #2226).
15 years ago
Török Edvin
f73212dc62
Fix bytecode virusname reporting (bb #2255).
...
Also adds the possibility to stop a hook from executing, and to mark
a virus as heuristic (by using a BC.Heuristic* name).
15 years ago
Török Edvin
bdbae20323
Improve handling of pdf objs (bb #2216).
15 years ago
Török Edvin
b220bb3058
Fix wrong bounds for stream in pdf.
...
If the PDF is truncated, the length could become negative (endstream before stream).
Make sure this doesn't happen.
15 years ago
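The bounds fix above guards against an "endstream" marker that sits before the start of the stream data. A minimal sketch in C, assuming hypothetical offsets measured from the start of the file and a policy of treating such streams as empty:

```c
#include <stddef.h>

/* Computes the length of a stream from the offset where its data
 * starts and the offset of the matching "endstream" keyword.
 * Offsets past the end of the file are clamped, and an end offset at
 * or before the start yields 0 instead of wrapping around. */
static size_t stream_length(size_t stream_start, size_t endstream_pos,
                            size_t file_len)
{
    if (endstream_pos > file_len)
        endstream_pos = file_len;   /* truncated file */
    if (endstream_pos <= stream_start)
        return 0;                   /* endstream before stream */
    return endstream_pos - stream_start;
}
```

Using an unsigned type with an explicit comparison keeps the result from ever being interpreted as a negative length.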
Török Edvin
38c9fc17cd
pdf: stream is sometimes not followed by EOL immediately.
...
Retry inflate after skipping past the EOL.
See sample id0009445634.
15 years ago
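The retry described in the commit above needs the offset of the first byte after the end-of-line that follows the "stream" keyword. A hedged sketch in C; the helper name is hypothetical, and treating CR, LF, and CRLF all as EOL is an assumption:

```c
#include <stddef.h>

/* Returns the offset of the first byte after the first EOL in buf.
 * CRLF counts as a single EOL. If no EOL is present, returns len,
 * meaning there is nothing left to retry on. */
static size_t offset_after_eol(const char *buf, size_t len)
{
    size_t i = 0;

    while (i < len && buf[i] != '\r' && buf[i] != '\n')
        i++;
    if (i == len)
        return len;                               /* no EOL found */
    if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
        return i + 2;                             /* CRLF */
    return i + 1;                                 /* lone CR or LF */
}
```

A caller could first try to inflate from the byte right after "stream" and, if that fails, try again from the offset this helper returns.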
Török Edvin
4d808a8664
Dump JPG images from PDFs.
...
Sometimes a JPG is not a JPG, and may contain HTML malware.
See sample id0008931254.
15 years ago
Török Edvin
5e2b776b11
Fix parsing of some PDFs.
...
/Filter[ was parsed incorrectly. The [ is not part of the name.
15 years ago
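The "/Filter[" fix above depends on ending a name at the first PDF delimiter, so that the '[' opening an array is not swallowed into the name. A minimal sketch in C with hypothetical helper names; the delimiter and whitespace sets follow the PDF syntax rules as commonly stated:

```c
#include <stddef.h>
#include <string.h>

/* True for PDF whitespace and delimiter characters. */
static int is_delim_or_space(unsigned char c)
{
    if (c == '\0' || c == '\t' || c == '\n' || c == '\f' ||
        c == '\r' || c == ' ')
        return 1;
    return strchr("()<>[]{}/%", c) != NULL;
}

/* Length of the name token at s, including the leading '/'.
 * Returns 0 if s does not start with '/'. */
static size_t pdf_name_len(const char *s, size_t len)
{
    size_t n;

    if (len == 0 || s[0] != '/')
        return 0;
    n = 1;
    while (n < len && !is_delim_or_space((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns 1 if the name token at s is exactly `key` (e.g. "/Filter"). */
static int name_is(const char *s, size_t len, const char *key)
{
    size_t n = pdf_name_len(s, len);

    return n == strlen(key) && memcmp(s, key, n) == 0;
}
```

With this, "/Filter[/FlateDecode]" yields the 7-byte name "/Filter", and the '[' is left for the array parser.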
Török Edvin
57549ff480
Obey HeuristicScanPrecedence for pdf.
15 years ago
Török Edvin
cf0f529bb3
pdf: give low priority to Heuristic signature.
15 years ago
Török Edvin
76cdacdd92
pdf: flush on stream end too.
15 years ago
Török Edvin
89590e9974
Output partially extracted blocks in pdf.
...
Sometimes PDF claims the zlib data is longer/shorter than it really is.
We always prefer the longest one, which can lead zlib to return an error
when we run off the end.
So dump the remaining extracted data from zlib's buffer to disk; it usually
contains all we need already (and if not, we're going to dump the raw inflate
stream anyway).
This fixes 3 missed samples of Exploit.PDF-60 in the regression test.
15 years ago
Török Edvin
d1a28db048
Fix off by two in new pdf parser.
15 years ago
Török Edvin
dc200c6b19
Add bytecode API for pdf.
15 years ago
Török Edvin
f14bf644de
Guard Heuristics.PDF.ObfuscatedNameObject by CL_SCAN_ALGORITHMIC.
15 years ago
Török Edvin
9acc81d603
pdf: improve handling of truncated files, and fix some filter handling bugs.
...
Also don't dump images by default; this will be overridable from bytecode.
15 years ago
Török Edvin
cacd0927b4
pdf: fix uninitialized values, and bytesleft.
15 years ago
Török Edvin
e8c7cc2185
pdf: avoid negative lengths.
...
Thanks to nitrox for reporting.
15 years ago
Török Edvin
80db7712ec
Add filter abbreviations used in images.
15 years ago
Török Edvin
b835a528db
Some more fixes for signed/encrypted pdfs.
15 years ago