Changes needed to build at all:
- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
pgcrypto, to use the resource owner mechanism to ensure that we don't
leak OpenSSL handles, now that we can't embed them in other structs
anymore.
- RAND_SSLeay() -> RAND_OpenSSL()
Changes that were needed to silence deprecation warnings, but were not
strictly necessary:
- RAND_pseudo_bytes() -> RAND_bytes().
- SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
- ASN1_STRING_data() -> ASN1_STRING_get0_data()
- DH_generate_parameters() -> DH_generate_parameters()
- Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
riddance!)
Also change references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
immemorial.
Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
Regenerate the test certificates with that. The "openssl" binary, used to
generate the certificates, is also now more picky, and throws an error
if an X509 extension is specified in "req_extensions", but that section
is empty.
Backpatch to 9.5 and 9.6, per popular demand. The file structure was
somewhat different in earlier branches, so I didn't bother to go further
than that. In back-branches, we still support OpenSSL 0.9.7 and above.
OpenSSL 0.9.6 should still work too, but I didn't test it. In master, we
only support 0.9.8 and above.
Patch by Andreas Karlsson, with additional changes by me.
Discussion: <20160627151604.GD1051@msg.df7cb.de>
OpenSSL has an unfortunate tendency to mix per-session state error
handling with per-thread error handling. This can cause problems when
programs that link to libpq with OpenSSL enabled have some other use of
OpenSSL; without care, one caller of OpenSSL may cause problems for the
other caller. Backend code might similarly be affected, for example
when a third party extension independently uses OpenSSL without taking
the appropriate precautions.
To fix, don't trust other users of OpenSSL to clear the per-thread error
queue. Instead, clear the entire per-thread queue ahead of certain I/O
operations when it appears that there might be trouble (these I/O
operations mostly need to call SSL_get_error() to check for success,
which relies on the queue being empty). This is slightly aggressive,
but it's pretty clear that the other callers have a very dubious claim
to ownership of the per-thread queue. Do this is both frontend and
backend code.
Finally, be more careful about clearing our own error queue, so as to
not cause these problems ourself. It's possibly that control previously
did not always reach SSLerrmessage(), where ERR_get_error() was supposed
to be called to clear the queue's earliest code. Make sure
ERR_get_error() is always called, so as to spare other users of OpenSSL
the possibility of similar problems caused by libpq (as opposed to
problems caused by a third party OpenSSL library like PHP's OpenSSL
extension). Again, do this is both frontend and backend code.
See bug #12799 and https://bugs.php.net/bug.php?id=68276
Based on patches by Dave Vitek and Peter Eisentraut.
From: Peter Geoghegan <pg@bowt.ie>
Per discussion, the original name was a bit misleading, and
PQsslAttributeNames() seems more apropos. It's not quite too late to
change this in 9.5, so let's change it while we can.
Also, make sure that the pointer array is const, not only the pointed-to
strings.
Minor documentation wordsmithing while at it.
Lars Kanis, slight adjustments by me
Thom Brown reported that SSL connections didn't seem to work on Windows in
9.5. Asif Naeem figured out that the cause was my_sock_read() looking at
"errno" when it needs to look at "SOCK_ERRNO". This mistake was introduced
in commit 680513ab79, which cloned the
backend's custom SSL BIO code into libpq, and didn't translate the errno
handling properly. Moreover, it introduced unnecessary errno save/restore
logic, which was particularly confusing because it was incomplete; and it
failed to check for all three of EINTR, EAGAIN, and EWOULDBLOCK in
my_sock_write. (That might not be necessary; but since we're copying
well-tested backend code that does do that, it seems prudent to copy it
faithfully.)
If someone else already set the callbacks, don't overwrite them with
ours. When unsetting the callbacks, only unset them if they point to
ours.
Author: Jan Urbański <wulczer@wulczer.org>
This makes it possible to query for things like the SSL version and cipher
used, without depending on OpenSSL functions or macros. That is a good
thing if we ever get another SSL implementation.
PQgetssl() still works, but it should be considered as deprecated as it
only works with OpenSSL. In particular, PQgetSslInUse() should be used to
check if a connection uses SSL, because as soon as we have another
implementation, PQgetssl() will return NULL even if SSL is in use.
strncpy() has a well-deserved reputation for being unsafe, so make an
effort to get rid of nearly all occurrences in HEAD.
A large fraction of the remaining uses were passing length less than or
equal to the known strlen() of the source, in which case no null-padding
can occur and the behavior is equivalent to memcpy(), though doubtless
slower and certainly harder to reason about. So just use memcpy() in
these cases.
In other cases, use either StrNCpy() or strlcpy() as appropriate (depending
on whether padding to the full length of the destination buffer seems
useful).
I left a few strncpy() calls alone in the src/timezone/ code, to keep it
in sync with upstream (the IANA tzcode distribution). There are also a
few such calls in ecpg that could possibly do with more analysis.
AFAICT, none of these changes are more than cosmetic, except for the four
occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength
source leads to a non-null-terminated destination buffer and ensuing
misbehavior. These don't seem like security issues, first because no stack
clobber is possible and second because if your values of sslcert etc are
coming from untrusted sources then you've got problems way worse than this.
Still, it's undesirable to have unpredictable behavior for overlength
inputs, so back-patch those four changes to all active branches.
The RFCs say that the CN must not be checked if a subjectAltName extension
of type dNSName is present. IOW, if subjectAltName extension is present,
but there are no dNSNames, we can still check the CN.
Alexey Klyukin
This patch makes libpq check the server's hostname against DNS names listed
in the X509 subjectAltName extension field in the server certificate. This
allows the same certificate to be used for multiple domain names. If there
are no SANs in the certificate, the Common Name field is used, like before
this patch. If both are given, the Common Name is ignored. That is a bit
surprising, but that's the behavior mandated by the relevant RFCs, and it's
also what the common web browsers do.
This also adds a libpq_ngettext helper macro to allow plural messages to be
translated in libpq. Apparently this happened to be the first plural message
in libpq, so it was not needed before.
Alexey Klyukin, with some kibitzing by me.
This refactoring is in preparation for adding support for other SSL
implementations, with no user-visible effects. There are now two #defines,
USE_OPENSSL which is defined when building with OpenSSL, and USE_SSL which
is defined when building with any SSL implementation. Currently, OpenSSL is
the only implementation so the two #defines go together, but USE_SSL is
supposed to be used for implementation-independent code.
The libpq SSL code is changed to use a custom BIO, which does all the raw
I/O, like we've been doing in the backend for a long time. That makes it
possible to use MSG_NOSIGNAL to block SIGPIPE when using SSL, which avoids
a couple of syscall for each send(). Probably doesn't make much performance
difference in practice - the SSL encryption is expensive enough to mask the
effect - but it was a natural result of this refactoring.
Based on a patch by Martijn van Oosterhout from 2006. Briefly reviewed by
Alvaro Herrera, Andreas Karlsson, Jeff Janes.
Commit 820f08cabd claimed to make the server
and libpq handle SSL protocol versions identically, but actually the server
was still accepting SSL v3 protocol while libpq wasn't. Per discussion,
SSL v3 is obsolete, and there's no good reason to continue to accept it.
So make the code really equivalent on both sides. The behavior now is
that we use the highest mutually-supported TLS protocol version.
Marko Kreen, some comment-smithing by me
Per report from Jeffrey Walton, libpq has been accepting only TLSv1
exactly. Along the lines of the backend code, libpq will now support
new versions as OpenSSL adds them.
Marko Kreen, reviewed by Wim Lewis.
In libpq, we set up and pass to OpenSSL callback routines to handle
locking. When we run out of SSL connections, we try to clean things
up by de-registering the hooks. Unfortunately, we had a few calls
into the OpenSSL library after these hooks were de-registered during
SSL cleanup which lead to deadlocking. This moves the thread callback
cleanup to be after all SSL-cleanup related OpenSSL library calls.
I've been unable to reproduce the deadlock with this fix.
In passing, also move the close_SSL call to be after unlocking our
ssl_config mutex when in a failure state. While it looks pretty
unlikely to be an issue, it could have resulted in deadlocks if we
ended up in this code path due to something other than SSL_new
failing. Thanks to Heikki for pointing this out.
Back-patch to all supported versions; note that the close_SSL issue
only goes back to 9.0, so that hunk isn't included in the 8.4 patch.
Initially found and reported by Vesa-Matti J Kari; many thanks to
both Heikki and Andres for their help running down the specific
issue and reviewing the patch.
We should really be reporting a useful error along with returning
a valid return code if pthread_mutex_lock() throws an error for
some reason. Add that and back-patch to 9.0 as the prior patch.
Pointed out by Alvaro Herrera
I've been working with Nick Phillips on an issue he ran into when
trying to use threads with SSL client certificates. As it turns out,
the call in initialize_SSL() to SSL_CTX_use_certificate_chain_file()
will modify our SSL_context without any protection from other threads
also calling that function or being at some other point and trying to
read from SSL_context.
To protect against this, I've written up the attached (based on an
initial patch from Nick and much subsequent discussion) which puts
locks around SSL_CTX_use_certificate_chain_file() and all of the other
users of SSL_context which weren't already protected.
Nick Phillips, much reworked by Stephen Frost
Back-patch to 9.0 where we started loading the cert directly instead of
using a callback.
We had two copies of this function in the backend and libpq, which was
already pretty bogus, but it turns out that we need it in some other
programs that don't use libpq (such as pg_test_fsync). So put it where
it probably should have been all along. The signal-mask-initialization
support in src/backend/libpq/pqsignal.c stays where it is, though, since
we only need that in the backend.
Both libpq and the backend would truncate a common name extracted from a
certificate at 32 bytes. Replace that fixed-size buffer with dynamically
allocated string so that there is no hard limit. While at it, remove the
code for extracting peer_dn, which we weren't using for anything; and
don't bother to store peer_cn longer than we need it in libpq.
This limit was not so terribly unreasonable when the code was written,
because we weren't using the result for anything critical, just logging it.
But now that there are options for checking the common name against the
server host name (in libpq) or using it as the user's name (in the server),
this could result in undesirable failures. In the worst case it even seems
possible to spoof a server name or user name, if the correct name is
exactly 32 bytes and the attacker can persuade a trusted CA to issue a
certificate in which that string is a prefix of the certificate's common
name. (To exploit this for a server name, he'd also have to send the
connection astray via phony DNS data or some such.) The case that this is
a realistic security threat is a bit thin, but nonetheless we'll treat it
as one.
Back-patch to 8.4. Older releases contain the faulty code, but it's not
a security problem because the common name wasn't used for anything
interesting.
Reported and patched by Heikki Linnakangas
Security: CVE-2012-0867
This makes it possible to use a libpq app with home directory set
to /dev/null, for example - treating it the same as if the file
doesn't exist (which it doesn't).
Per bug #6302, reported by Diego Elio Petteno
On balance, the need to cover this case changes my mind in favor of pushing
all error-message generation duties into the two fe-secure.c routines.
So do it that way.
In many cases, pqsecure_read/pqsecure_write set up useful error messages,
which were then overwritten with useless ones by their callers. Fix this
by defining the responsibility to set an error message to be entirely that
of the lower-level function when using SSL.
Back-patch to 8.3; the code is too different in 8.2 to be worth the
trouble.
This disables an entirely unnecessary "sanity check" that causes failures
in nonblocking mode, because OpenSSL complains if we move or compact the
write buffer. The only actual requirement is that we not modify pending
data once we've attempted to send it, which we don't. Per testing and
research by Martin Pihlak, though this fix is a lot simpler than his patch.
I put the same change into the backend, although it's less clear whether
it's necessary there. We do use nonblock mode in some situations in
streaming replication, so seems best to keep the same behavior in the
backend as in libpq.
Back-patch to all supported releases.
Instead, just act as though the certificate file(s) are not present.
There is only one case where this need be a hard failure condition: when
sslmode is verify-ca or verify-full, not having a root cert file is an
error. Change the logic so that we complain only in that case, and
otherwise fall through cleanly. This is how it used to behave pre-9.0,
but my patch 4ed4b6c54e of 2010-05-26 broke
the case. Per report from Christian Kastner.
parameter against server cert's CN field) to succeed in the case where
both host and hostaddr are specified. As with the existing precedents
for Kerberos, GSSAPI, SSPI, it is the calling application's responsibility
that host and hostaddr match up --- we just use the host name as given.
Per bug #5559 from Christopher Head.
In passing, make the error handling and messages for the no-host-name-given
failure more consistent among these four cases, and correct a lie in the
documentation: we don't attempt to reverse-lookup host from hostaddr
if host is missing.
Back-patch to 8.4 where SSL cert verification was introduced.
additional cases correctly. The original coding failed to load additional
(chain) certificates from the client cert file, meaning that indirectly signed
client certificates didn't work unless one hacked the server's root.crt file
to include intermediate CAs (not the desired approach). Another problem was
that everything got loaded into the shared SSL_context object, which meant
that concurrent connections trying to use different sslcert settings could
well fail due to conflicting over the single available slot for a keyed
certificate.
To fix, get rid of the use of SSL_CTX_set_client_cert_cb(), which is
deprecated anyway in the OpenSSL documentation, and instead just
unconditionally load the client cert and private key during connection
initialization. This lets us use SSL_CTX_use_certificate_chain_file(),
which does the right thing with additional certs, and is lots simpler than
the previous hacking about with BIO-level access. A small disadvantage is
that we have to load the primary client cert a second time with
SSL_use_certificate_file, so that that one ends up in the correct slot
within the connection's SSL object where it can get paired with the key.
Given the other overhead of making an SSL connection, that doesn't seem
worth worrying about.
Per discussion ensuing from bug #5468.
at least in some Windows versions, these functions are capable of returning
a failure indication without setting errno. That puts us into an infinite
loop if the previous value happened to be EINTR. Per report from Brendan
Hill.
Back-patch to 8.2. We could take it further back, but since this is only
known to be an issue on Windows and we don't support Windows before 8.2,
it does not seem worth the trouble.
attacks where an attacker would put <attack>\0<propername> in the field and
trick the validation code that the certificate was for <attack>.
This is a very low risk attack since it reuqires the attacker to trick the
CA into issuing a certificate with an incorrect field, and the common
PostgreSQL deployments are with private CAs, and not external ones. Also,
default mode in 8.4 does not do any name validation, and is thus also not
vulnerable - but the higher security modes are.
Backpatch all the way. Even though versions 8.3.x and before didn't have
certificate name validation support, they still exposed this field for
the user to perform the validation in the application code, and there
is no way to detect this problem through that API.
Security: CVE-2009-4034
sockopt(SO_NOSIGPIPE) or the MSG_NOSIGNAL flag to send().
We assume these features are available if (1) the symbol is defined at
compile time and (2) the kernel doesn't reject the call at runtime.
It might turn out that there are some platforms where (1) and (2) are
true and yet the signal isn't really blocked, in which case applications
would die on server crash. If that sort of thing gets reported, then
we'll have to add additional defenses of some kind.
Jeremy Kerr