Before this commit, we excluded 1664_providers from being rewritten
during a backup, since this is one of the files that the user has to
copy to the destination before the backup starts. This was done so the
user can use different global providers and keys to encrypt the
backup's WAL than the source server uses. However, it raised several
issues when the server creates new providers or modifies existing ones
during the backup: those changes would be lost in the backup, data
related to such providers might become unreadable, and redo might
struggle to perform a rotation.
This commit treats 1664_providers like the rest of the files and makes
provider changes safe during the backup.
We still don't rewrite wal_keys, as we generate a unique WAL key for
the backup, and the server can't generate new WAL keys during the
backup.
Fixes PG-1895
This structure isn't really a key but metadata for a range of WAL
using a given key. Rework these structures to better represent this.
Also decouple the structure from the file format and use InternalKey for
the actual decrypted key data.
There is still plenty of cleanup to do in relation to this, but I
believe this is at least easier to understand than what we currently
have. I wish the ends of the ranges were handled better, but that is a
bit out of scope for this commit.
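A rough sketch of the direction, with illustrative names and fields
rather than the actual definitions:

    /* XLogRecPtr from access/xlogdefs.h; uint8 from c.h */

    /* Decrypted key material only; nothing about the file format. */
    typedef struct InternalKey
    {
        uint8       key[32];
    } InternalKey;

    /* Metadata for a range of WAL encrypted with a given key. */
    typedef struct WalKeyRangeMeta
    {
        XLogRecPtr  start_lsn;  /* first LSN covered by this key */
        InternalKey key;        /* decrypted key for the range */
    } WalKeyRangeMeta;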
It gets confusing when this structure is used for both encrypted and
decrypted key data. With this change it's obvious that any allocation of
InternalKey can potentially leak key material to swap files.
Also be more explicit about the empty alignment padding in the
TDEMapEntry struct, as multiple levels of anonymous structs to
maintain alignment would get a bit messy.
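Roughly, the idea is to spell out the padding instead of nesting
anonymous structs; the field layout and sizes here are made up for
illustration:

    /* RelFileLocator from storage/relfilelocator.h */
    typedef struct TDEMapEntry
    {
        RelFileLocator  rel;            /* relation this key belongs to */
        uint32          flags;
        uint8           _padding[4];    /* explicit alignment padding */
        InternalKey     key;
    } TDEMapEntry;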
Exposing and setting the environment variables everywhere makes it
harder to refactor and to understand what is going on. Instead we can
just write a couple of helper functions and use Perl scalar variables.
This only affects EXEC_BACKEND/Windows builds, which we currently do
not support, but we fix this anyway to make the code more consistent
and easier to understand, since we try to care about this in other
places.
In the future we may want to add CI and proper support for EXEC_BACKEND
builds.
The issue was originally found by Zsolt Parragi.
Archiving being enabled during the setup of the SMGR environment
caused one of the test suites for WAL archiving to fail, so we disable
it while running queries in single-user mode.
Previously we ignored the extra arguments to initdb when initializing
the pg_tde directory and just copied the directory from a database
initialized without the extra arguments, or, if available, from the
cache. Now make sure that when extra arguments are supplied we do not
use the cache and that we copy the pg_tde directory from a database
initialized with the extra arguments.
As far as I know this is only relevant to the --allow-group-access flag
but we may as well make the solution generic.
This is currently only added for meson, but could also be added for
make.
This has to be set up before the actual TAP tests are run, as they run
in parallel and as such would all try to set up the template at the
same time if we let them use the same folder for it without it being
pre-generated.
This seems to shorten the test suite run-time by ~25% on my laptop, so
it seems worth doing.
This enables table encryption by default in TAP tests when TDE_MODE=1.
Use TDE_MODE_SMGR=0 to turn off table encryption when running with
pg_tde loaded.
The setup for running regress with TDE turned on has been slightly
modified to match what is done for TAP tests, to let tests that run
the regress suite under TAP work.
This enables WAL encryption by default when the TAP tests are run with
TDE_MODE=1. Use TDE_MODE_WAL=0 to disable WAL encryption while still
having pg_tde enabled.
This makes sure pg_tde is loaded and keys are set up when running the
PostgreSQL TAP suite. No TDE features are enabled at this point.
Single-user mode is used to generate a template of pg_tde setup files,
which are then copied to each created cluster's data directory.
We used to assume that the only errors which could happen were ones
which set errno, but that is not the case. We also want to give nice
output on non-zero return values and when the process was killed by a
signal.
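A sketch of what interpreting the child status looks like with the
standard wait macros; the function name is made up:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>

    static void
    report_command_result(const char *cmd, int status)
    {
        if (status == -1)
            fprintf(stderr, "%s: failed to execute: %s\n",
                    cmd, strerror(errno));
        else if (WIFSIGNALED(status))
            fprintf(stderr, "%s: killed by signal %d\n",
                    cmd, WTERMSIG(status));
        else if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
            fprintf(stderr, "%s: exited with status %d\n",
                    cmd, WEXITSTATUS(status));
    }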
Before, pg_basebackup would encrypt streamed WAL according to the keys
in pg_tde/wal_keys in the destination dir.
This commit introduces a number of changes:
pg_basebackup encrypts WAL only if the "-E --encrypt-wal" flag is
provided. In that case, it extracts the principal key, truncates
pg_tde/wal_keys, and encrypts WAL with a newly generated WAL key. We
still expect pg_tde/wal_keys and pg_tde/1664_providers in the
destination dir. If these files are not provided but "-E" is
specified, it fails with an error.
We also emit a warning if pg_basebackup runs without -E but there is a
wal_keys file on the source, as the WAL might be compromised and the
backup broken.
For PG-1603, PG-1857
There was a race condition in the WAL archiving tests: if the
end-of-recovery checkpoint had completed, the tests for the WAL
contents were nonsensical and racy. Solve this by explicitly promoting
the server only after we have looked at the WAL contents, while still
making sure to wait until all WAL has been replayed.
Additionally, improve the tests by actually making sure the replica
starts in a good state where all WAL is encrypted, and by testing both
the plaintext and the encrypted scenarios.
Unfortunately the logic for generating a new key to protect the stream
cipher used to encrypt the WAL stream in our restore command was based
on totally incorrect assumptions due to how recovery is implemented.
Recovery is a state machine which can go back and forth between one
mode where it streams from a primary and another where it first tries
to fetch WAL from the archive and, if that fails, from the pg_wal
directory, and in the pg_wal directory we may have files which are
encrypted with whatever keys were there originally.
To handle all the possible scenarios we remove the ability of
pg_tde_restore_encrypt to generate new keys and just have it use
whatever keys there are in the key file. This unfortunately means we open
ourselves to some attacks on the stream cipher if the system is tricked
into encrypting a different WAL stream at the same TLI and LSN as we
already have encrypted. As far as I know this should be rare under
normal operations since normally e.g. the WAL should be the same in the
archive as the one in pg_wal or which we receive through streaming.
Ideally we would want to fix this but for now it is better to have WAL
encryption with this weakness than to not have it at all.
This also incidentally fixes a bug we discovered: generating a new key
invalidated only one key rather than all keys which should have become
invalid. Since we no longer generate a new key, the bug cannot occur.
It seems like there are cases where the postmaster has "restarted"
after a backend crash and the WAL cache inherited from the postmaster
is wrong.
I'm not at all sure exactly how and why this happens, but this patch
fixes a bug with this and allows recovery/013_crash_restart to pass with
WAL encryption enabled.
According to the documentation, each backend is supposed to hold
AddinShmemInitLock when calling ShmemInitStruct. We only did that for
half of our calls before this patch.
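For reference, the documented pattern looks roughly like this; the
struct and its contents are illustrative:

    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    typedef struct TdeSharedState
    {
        int     dummy;      /* the real shared state differs */
    } TdeSharedState;

    static TdeSharedState *shared_state;

    static void
    tde_shmem_init(void)
    {
        bool    found;

        LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
        shared_state = ShmemInitStruct("pg_tde shared state",
                                       sizeof(TdeSharedState), &found);
        if (!found)
            memset(shared_state, 0, sizeof(TdeSharedState));
        LWLockRelease(AddinShmemInitLock);
    }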
There is at least one corner case scenario where we have to load the
last record into the cache during a write:
* replica crashes, receives last segment from primary
* replica replays last segment, reaches end
* replica activates new key
* replica replays prepared transaction, has to use old keys again
* old key write function sees that we generated a new key, tries to load
it
In this scenario we could get away with detecting that we are in a
write and asserting if we tried to use the last key. But in a release
build assertions do not fire, and we would end up writing some
unencrypted data to disk, and later failing if we have to run
recovery.
It could be a FATAL, but that would still crash the server, and the next
startup would crash again and again...
Instead, to properly avoid this situation, we preallocate memory for
one more key in the cache during initialization. Since we can only add
one extra key to the cache during the server's run, this means we no
longer try to allocate in the critical section in any corner case.
While this is not the nicest solution, it is simple and keeps the
current cache and decrypt/encrypt logic the same as before. Any other
solution would be more complex, and even more of a hack, as it would
require dealing with a possibly out of date cache.
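A minimal sketch of the idea, with made-up names; the real cache is
more involved:

    /* XLogRecPtr from access/xlogdefs.h; palloc0 from utils/palloc.h */

    typedef struct WalKeyCacheEntry
    {
        XLogRecPtr  start_lsn;
        uint8       key[32];
    } WalKeyCacheEntry;

    typedef struct WalKeyCache
    {
        int                 nkeys;
        int                 capacity;   /* nkeys + 1 reserved slot */
        WalKeyCacheEntry   *entries;
    } WalKeyCache;

    static void
    wal_key_cache_init(WalKeyCache *cache, int nkeys)
    {
        /*
         * Reserve room for one extra key: at most one new key can be
         * generated during a server run, so adding it later never has
         * to allocate inside a critical section.
         */
        cache->nkeys = nkeys;
        cache->capacity = nkeys + 1;
        cache->entries = palloc0(cache->capacity * sizeof(WalKeyCacheEntry));
    }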
To not break recovery when we replay encrypted WAL while WAL
encryption is disabled, the simplest way is to treat disabled WAL
encryption just like enabled WAL encryption. The issue is not big in
practice, since it should only hit users who disable WAL encryption
and then crash the database, but treating both cases the same way
makes the code simpler to understand.
Previously we simply set the LSN for the new key to the first write
location.
This is however not correct, as there are many corner cases around this:
* recovery / replication might write old LSNs
* we can't handle multiple keys with the same TLI/LSN, which can happen
with quick restarts without writes
To support this, this commit modifies the following:
* We only activate new keys outside crash recovery, or immediately if
encryption is turned off
* We also take the already existing last key into account (if it
exists), and only activate a new key if we progressed past its start
location (see the sketch at the end of this message)
The remaining changes are just supporting infrastructure for this:
* Since we might rewrite old records, we use the already existing keys
for those writes, not the active last keys
* We prefetch existing keys during initialization, so loading them
doesn't accidentally happen in the critical section during a write
There is a remaining bug with stopping WAL encryption, also mentioned
in a TODO message in the code. This will be addressed in a later PR,
as this fix already took too long.
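A hedged sketch of the resulting activation rule; all names are
illustrative:

    /* XLogRecPtr from access/xlogdefs.h */
    static bool
    should_activate_new_key(bool has_last_key, XLogRecPtr last_key_start,
                            XLogRecPtr current_lsn,
                            bool in_crash_recovery,
                            bool encryption_enabled)
    {
        /* With encryption turned off we can activate immediately. */
        if (!encryption_enabled)
            return true;

        /* Never activate new keys while replaying old records. */
        if (in_crash_recovery)
            return false;

        /* Only activate once we are past the last key's start. */
        return !has_last_key || current_lsn > last_key_start;
    }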
The min/max comparisons of LSNs assumed that everything is on the same
timeline. In practice, with replication + recovery combinations, it is
possible that keys span at least 3 timelines, which means that the
timeline has to be included in both comparisons, as in other timelines
the restrictions are less strict.
Use a single argument for the wrapped command in the archiving
wrappers.
Instead of giving all of the arguments of the command separately and
trying to figure out which one should be replaced by the path to the
unencrypted WAL segment, we take a single argument and do % parameter
replacement similar to what postgres does with archive_command and
restore_command.
This also means that we can simplify by using system() instead of
exec().
We also clean up usage instructions and make the two wrappers more
symmetrical by requiring the same parameters.
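A simplified sketch of the substitution, showing only %p; buffer
sizing and error handling are elided, and the function name is made
up:

    #include <stdio.h>

    /* Replace %p in the command template with the WAL segment path. */
    static void
    build_command(char *dst, size_t dstlen,
                  const char *template, const char *wal_path)
    {
        size_t  used = 0;

        for (const char *src = template; *src && used + 1 < dstlen; src++)
        {
            if (src[0] == '%' && src[1] == 'p')
            {
                used += snprintf(dst + used, dstlen - used, "%s", wal_path);
                if (used >= dstlen)
                    used = dstlen - 1;
                src++;          /* skip the 'p' */
            }
            else
                dst[used++] = *src;
        }
        dst[used] = '\0';
    }

The resulting string is then passed straight to system().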
Co-authored-by: Andreas Karlsson <andreas.karlsson@percona.com>
Instead of first deleting any leftover key and then writing the new key
we do a single pass through the file where we replace any old key that
we find. To make this happen on redo too we need to stop generating a
separate WAL record for the key deletion for encrypted tables and only
generate that record for unencrypted tables where we still need a key
deletion record.
We expect this optimization to primarily be visible during WAL replay,
where only a single backend is used to replay everything, but it also
speeds up table creation in general on workloads with many tables.
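Schematically, the write now looks like this; a hedged sketch with
made-up names and error handling omitted:

    #include <unistd.h>
    /* RelFileLocator{,Equals} from storage/relfilelocator.h */

    /* Hypothetical on-disk entry; the real format differs. */
    typedef struct MapEntry
    {
        RelFileLocator  rel;
        uint8           encrypted_key[32];
    } MapEntry;

    /*
     * Scan the key map; overwrite a leftover entry for the same relation
     * if one exists, otherwise append at EOF.
     */
    static void
    tde_write_key_entry(int fd, const MapEntry *new_entry)
    {
        MapEntry    entry;
        off_t       pos = 0;

        while (pread(fd, &entry, sizeof(entry), pos) == sizeof(entry))
        {
            if (RelFileLocatorEquals(entry.rel, new_entry->rel))
                break;          /* found a leftover key to replace */
            pos += sizeof(entry);
        }

        /* pos is either at the old entry or at EOF; one write covers both. */
        pwrite(fd, new_entry, sizeof(*new_entry), pos);
    }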
We forgot to have a check against trying to delete leftover SMGR keys
for temporary tables, which is a useless operation since those keys
are stored in memory.
Additionally we forgot to prevent WAL from being written when creating
or removing a key in smgrcreate() for temporary tables.
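A hedged sketch of the two guards; the helper functions are
hypothetical, while RelFileLocatorBackendIsTemp() is the standard
test:

    /* reln is the SMgrRelation passed to smgrcreate() */
    bool    is_temp = RelFileLocatorBackendIsTemp(reln->smgr_rlocator);

    if (!is_temp)
        tde_delete_leftover_key(reln);  /* pointless for in-memory temp keys */

    /* Don't write WAL for key creation/removal on temporary tables. */
    tde_create_key(reln, /* write_wal = */ !is_temp);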
- Fix whitespace
- Make sure to use the right languages
- Do not wrap short SQL queries unnecessarily
- Add missing end of code block
- Add missing semicolon to SQL query
Instead of having the WAL key code include the headers for the SMGR keys
we move the shared code into a separate header file. Additionally we
clean up some minor header issues.
Just logging that the function was called at DEBUG2 is not very
helpful to anyone and is presumably just a leftover from someone's
attempt at debugging a particular issue they had at some point.
Breaking these particular snippets out as separate functions did not
improve readability and was only done because they used to be called
from multiple locations.
This change has already been done in the WAL key code.
When WAL is streamed during the backup (default mode), it comes in
unencrypted. But we need keys to encrypt it. For now, we expect that
the user would put `pg_tde` dir containing the `1664_key` and
`1664_providers` into the destination directory before starting the
backup. We encrypt the streamed WAL according to internal keys. No
`pg_tde` dir means no streamed WAL encryption.
Also rename enum variants for consistency and renumber the types for
the WAL keys, which is fine since this file is newly introduced, so
breaking backwards compatibility is not an issue.
Before this commit, WAL keys didn't take the TLI into account at all.
But after pg_rewind, for example, pg_wal/ may contain segments from
two timelines, and the WAL reader choosing the key may pick the wrong
one because LSNs of different TLIs may overlap. There was also another
bug: there is a key with the start LSN 0/30000 in TLI 1, and after the
start in TLI 2, the WAL writer creates a new key with the LSN 0/30000,
but in TLI 2. The reader wouldn't fetch the latest key because without
the TLI, these keys look the same.
This commit adds the TLI to the Internal keys and makes use of it
along with the LSN for key comparisons.
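A minimal sketch of the comparison: keys compare by (TLI, LSN)
lexicographically, so a key on a newer timeline sorts after one on an
older timeline even when the LSNs overlap. Names are illustrative:

    /* TimeLineID, XLogRecPtr from access/xlogdefs.h */
    static int
    wal_key_cmp(TimeLineID a_tli, XLogRecPtr a_lsn,
                TimeLineID b_tli, XLogRecPtr b_lsn)
    {
        if (a_tli != b_tli)
            return a_tli < b_tli ? -1 : 1;
        if (a_lsn != b_lsn)
            return a_lsn < b_lsn ? -1 : 1;
        return 0;
    }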