Since partitioned tables and partitioned indexes lack storage, the "is encrypted"
property is not relevant to them because encryption is done at the SMGR
level. Therefore we should either throw an error or return NULL, and
here we choose to return NULL to make the function easier to work with.
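
The NULL-on-no-storage behavior can be sketched as a three-valued
result. This is a standalone illustration and not the pg_tde code: the
RelKindSketch and TriBoolSketch types and the function names are
hypothetical, and the SMGR-level check is reduced to a boolean argument.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL or pg_tde types. */
typedef enum
{
    SKETCH_PLAIN_TABLE,
    SKETCH_PLAIN_INDEX,
    SKETCH_PARTITIONED_TABLE,
    SKETCH_PARTITIONED_INDEX
} RelKindSketch;

typedef enum
{
    SKETCH_NULL,   /* property does not apply: relation has no storage */
    SKETCH_FALSE,
    SKETCH_TRUE
} TriBoolSketch;

/* Partitioned tables and partitioned indexes have no storage of their own. */
bool
sketch_has_storage(RelKindSketch kind)
{
    return kind == SKETCH_PLAIN_TABLE || kind == SKETCH_PLAIN_INDEX;
}

/*
 * Return NULL instead of raising an error when the relation has no
 * storage; otherwise report what the storage layer says.
 */
TriBoolSketch
sketch_is_encrypted(RelKindSketch kind, bool smgr_says_encrypted)
{
    if (!sketch_has_storage(kind))
        return SKETCH_NULL;
    return smgr_says_encrypted ? SKETCH_TRUE : SKETCH_FALSE;
}
```

Returning NULL lets a caller run the function over every relation in a
catalog query without having to filter out partitioned parents first.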
Since only WAL encryption uses the cached context, it should not be part
of every internal key. Instead store it in the WAL decryption key cache
and the backend global state for WAL encryption.
It is unclear to me why we stopped encrypting the FSM and visibility
map forks in commit e514ac5cc1, which sadly does not explain the
change. Both forks leak metadata that we do not have to leak, and the
write load to them is not heavy enough to excuse it for performance
reasons. So instead of leaking data and having to justify that, we just
encrypt all forks.
We never took advantage of the fact that the relation keys were stored
in two different files, one for the RelFileNumber and flags and one for
the actual keys, so this just complicated the code for almost no gain.
Bumps the version of the file format.
Updating of principal keys has not been used since commit
06885e3559 and the code is broken anyway
due to the byte offset being wrong after writing a header to the map
file. That broken code will be removed in a future commit.
The WAL record has a field for which principal key was used to encrypt
the internal key, but it was never initialized. The reason it worked is
that if we were lucky enough to have the data on the stack be zero, we
would treat it as if no principal key info was passed, and that code
path worked well since the parameter is optional.
Alternatively we could drop the field from the WAL record and pass NULL
to pg_tde_write_map_entry() instead, since that seems to have worked so
far.
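
A minimal sketch of the initialization issue, assuming a record laid
out roughly as described. The struct, its field names and sizes, and
both helper functions are hypothetical and do not mirror the real
pg_tde WAL record. The point is only that zeroing the record before
filling it makes the "no principal key info" case deterministic instead
of dependent on stack contents.

```c
#include <string.h>

#define SKETCH_KEY_NAME_LEN 16  /* hypothetical size */

/* Hypothetical WAL record layout, for illustration only. */
typedef struct
{
    unsigned int rel_number;
    char         principal_key_name[SKETCH_KEY_NAME_LEN];
} SketchKeyRecord;

/*
 * Build a record. Zeroing the whole struct first means an omitted
 * principal key is always seen as "not provided".
 */
SketchKeyRecord
sketch_make_record(unsigned int rel_number, const char *key_name)
{
    SketchKeyRecord rec;

    memset(&rec, 0, sizeof(rec));
    rec.rel_number = rel_number;
    if (key_name != NULL)
        strncpy(rec.principal_key_name, key_name, SKETCH_KEY_NAME_LEN - 1);
    return rec;
}

/* An all-zero name is treated as "no principal key info was passed". */
int
sketch_has_principal_key_info(const SketchKeyRecord *rec)
{
    return rec->principal_key_name[0] != '\0';
}
```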
smgr_create is called for all forks. It is possible that additional
forks for existing tables are created during a tde creation event.
In practice this happens quite often with CREATE INDEX CONCURRENTLY.
Without this fix, pg_tde created encryption keys for these existing
tables, and later writes and reads tried to use these keys, with all
the resulting issues. In practice some of the file got encrypted, some
didn't, and earlier records that weren't encrypted became unreadable.
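
One way such a guard could look is sketched below. The text above does
not state the exact condition used by the fix, so the sketch assumes
that a key is generated only when the main fork is created. All types
and names are hypothetical and not part of PostgreSQL or pg_tde.

```c
#include <stdbool.h>

/* Hypothetical fork identifiers, for illustration only. */
typedef enum
{
    SKETCH_MAIN_FORK,
    SKETCH_FSM_FORK,
    SKETCH_VM_FORK,
    SKETCH_INIT_FORK
} SketchFork;

/* Hypothetical per-relation state. */
typedef struct
{
    bool has_key;         /* relation is treated as encrypted */
    int  keys_generated;  /* counter used to observe the behavior */
} SketchRelation;

/*
 * Called once per fork. Assumption: only creation of the main fork
 * marks a new relation, so only then may a key be generated. Extra
 * forks added later to an existing relation leave its state untouched.
 */
void
sketch_smgr_create(SketchRelation *rel, SketchFork fork)
{
    if (fork != SKETCH_MAIN_FORK)
        return;
    if (rel->has_key)
        return;
    rel->has_key = true;
    rel->keys_generated++;
}
```

With this guard, creating an FSM or visibility map fork for an existing
unencrypted relation leaves it unencrypted, which avoids the mixed
encrypted and unencrypted state described above.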
We should not XLog a default key copy. It doesn’t make sense for
reliability as we write the results directly to disk. And as for the
replicas, they would just try to get a key, retrieve the default one
and do the copying on their own instead of reading that from the XLog.
XLogging a new use of default keys creates the following issue:
On server start, the WAL init tries to get a current WAL key to decide
what to do next - create a new encrypted or unencrypted key, or do
nothing - and for that it needs a principal key. During the
GetPrincipalKey call, if there is only a default principal key, the
server will create a new principal key for WAL (in this case) by
copying the default key with the WAL Oid. Hence, a new principal key
is created and XLogInserts are generated.
Fixes PG-1476
By first looking up the principal key and then the relation key we hid
errors when we have lost the principal key but some relations are still
encrypted. This could, for example, lead to trying to read an encrypted
table as if it was unencrypted, causing errors like the following:
ERROR: invalid page in block 0 of relation base/5/16448
This does not solve the much scarier issue when we get another principal
key back from the key server but we at least get better error messages
for this common case.
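
The effect of the lookup order can be sketched as two small decision
functions. This is an illustration of the reasoning above, not the
pg_tde implementation: the enum and function names are hypothetical,
and key lookups are reduced to boolean inputs.

```c
#include <stdbool.h>

/* Hypothetical outcome of resolving how to read a relation. */
typedef enum
{
    SKETCH_READ_UNENCRYPTED,
    SKETCH_READ_ENCRYPTED,
    SKETCH_ERROR_NO_PRINCIPAL_KEY
} SketchLookup;

/*
 * Old order: principal key first. A lost principal key looks the same
 * as "nothing is encrypted", so an encrypted relation is read as plain
 * data and fails later with an invalid page error.
 */
SketchLookup
sketch_lookup_principal_first(bool has_rel_key, bool has_principal_key)
{
    if (!has_principal_key)
        return SKETCH_READ_UNENCRYPTED;
    return has_rel_key ? SKETCH_READ_ENCRYPTED : SKETCH_READ_UNENCRYPTED;
}

/*
 * New order: relation key first. If the relation has a key but the
 * principal key is gone, report that directly.
 */
SketchLookup
sketch_lookup_relation_first(bool has_rel_key, bool has_principal_key)
{
    if (!has_rel_key)
        return SKETCH_READ_UNENCRYPTED;
    if (!has_principal_key)
        return SKETCH_ERROR_NO_PRINCIPAL_KEY;
    return SKETCH_READ_ENCRYPTED;
}
```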
When PostgreSQL throws an error it releases all LWLocks, so explicitly
releasing them in some places while not in others just confuses the
reader. The simplest solution is to never release them in the
error path unless there is a really good reason.
Since we now automatically generate keys this sanity check no longer
makes any sense. Arguably we could replace it with another sanity
check but I think we can just trust the code to do the right thing.
Some includes should be in .c files while some were not needed at all.
On top of that we move some defines into the right .c file and remove
an unused typedef.
The new and old principal keys were switched for the rotate function,
and as we do not have principal key validation for tdemap data, the
function doesn't notice this.
The problem also isn't visible until a server restart / new connection
because of internal key caching, which means the SQL tests also failed
to detect this.