Before this commit, we excluded 1664_providers from being rewritten
during a backup, since it is one of the files that the user has to
copy to the destination before the backup starts. This was done so
that the user could use different global providers and keys to encrypt
the backup's WAL than the source server does. However, this raised
several issues when the server creates new providers or modifies
existing ones during the backup: those changes would be lost in the
backup, data related to such providers might become unreadable, and
redo might struggle to perform a rotation.
This commit treats 1664_providers like the rest of the files and makes
provider changes safe during the backup.
We still don't rewrite wal_keys, as we generate a unique WAL key for
the backup, and the server can't generate new WAL keys during the
backup.
Fixes PG-1895
This structure isn't really a key but metadata for a range of WAL using
a given key. Rework these structures to better represent this.
Also decouple the structure from the file format and use InternalKey for
the actual decrypted key data.
There is still plenty of cleanup to do in relation to this, but I
believe this is at least easier to understand than what we currently
have. I wish the end of the ranges were handled better, but that is a
bit out of scope for this commit.
It gets confusing when this structure is used for both encrypted and
decrypted key data. With this change it's obvious that any allocation of
InternalKey can potentially leak key material to swap files.
Also be more explicit with the empty alignment padding in the
TDEMapEntry struct, since using multiple levels of anonymous structs to
maintain alignment would get a bit messy.
Exposing and setting the environment variables everywhere makes it
harder to refactor and to understand what is going on. Instead we can
just write a couple of helper functions and use Perl scalar variables.
This only affects EXEC_BACKEND/Windows builds which we currently do not
support, but we fix this anyway to make the code more consistent and
easier to understand since we try to care about this in other places.
In the future we may want to add CI and proper support for EXEC_BACKEND
builds.
The issue was originally found by Zsolt Parragi.
Archiving being enabled during the setup of the SMGR environment caused
one of the test suites for WAL archiving to fail, so we disable it while
running queries in single-user mode.
Previously we ignored the extra arguments to initdb when initializing
the pg_tde directory and just copied the directory from a database
initialized without the extra arguments, or, if available, from the
cache. Now we make sure that when extra arguments are supplied we do
not use the cache and that we copy the pg_tde directory from a database
initialized with the extra arguments.
As far as I know this is only relevant to the --allow-group-access flag
but we may as well make the solution generic.
This is currently only added for meson, but could also be added for
make.
This has to be set up before the actual TAP tests are run, as they run
in parallel and as such would all try to set up the template at the same
time if we let them use the same folder for it without it being
pre-generated.
This seems to shorten the test suite run-time by ~25% on my laptop, so
it seems worth doing.
This enables table encryption by default in TAP tests when TDE_MODE=1.
Use TDE_MODE_SMGR=0 to turn off table encryption when running with
pg_tde loaded.
The setup for running regress with TDE turned on has been slightly
modified to match what is done for TAP tests, to let tests that run the
regress suite under TAP work.
This enables WAL encryption by default when the TAP tests are run with
TDE_MODE=1. Use TDE_MODE_WAL=0 to disable WAL encryption while still
having pg_tde enabled.
This makes sure pg_tde is loaded and keys are set up when running the
PostgreSQL TAP suite. No TDE features are enabled at this point.
Single user mode is used to generate a template of pg_tde setup files
which are then copied to each created cluster's data directory.
- add as known issue in release notes
- fix a broken link in features.md (not related to issue...)
- add a warning to global key providers about using a keyring provider
  with WAL encryption
- add new subtopic in Backup WAL about key rotations during backups for
file-based key providers
Based on PG-1895 description.
Before this commit, we XLogged the provider ID (keyringId) of the old
key. Yet we then attempt to fetch the new key from the old provider
during redo, which obviously fails and crashes the recovery.
So the following steps lead to a recovery stalemate:
- Create new provider (with new destination - mount_path, url etc).
- Create new server/global key.
- Rotate key.
- <Crash!>
This commit fixes it by XLogging the new key's provider ID.
For: PG-1895
There is no reason to do durable_unlink before durable_rename: rename
can handle an existing file. But with this sequence, the cluster may
end up in an unrecoverable state should the server crash between these
two operations, as there would be no "_keys" file at all.
The current sequence may also cause an issue with backups:
<durable_unlink>, <pg_basebackup gets a file list>, <durable_rename>,
leaving no "_keys" file in the backup as a result.
- add new topic called # Backing up with WAL encryption enabled
- add two subtopics for other WAL methods and restoring a backup created with WAL encryption
- reword to short form option flags
- remove (tech preview)
- remove mentions of WAL being BETA and warning notes
- add WAL tool support to limitations, improve flow, add button to setup
- add limitation regarding WAL shipping standby not supported with WAL encryption
- add mention of open source and enterprise editions being supported for pg_tde
- add none method to basebackup and link to topic
- add Example Patroni configuration for Patroni tool
- improve supported vs unsupported tools section in Limitations
Continued from #523
- add pg_tde archive and restore commands
- update cli-tools.md with paragraphs explaining New and extended tools
- update pg-tde-restore-encrypt tool with new information and better descriptions for clarity
- update the Features topic button for better clarity
We used to assume that the only errors which could happen were ones
which set errno, but that is not the case. We also want to give nice
output on non-zero return values and when the process was killed by a
signal.
Before, pg_basebackup would encrypt streamed WAL according to the keys
in pg_tde/wal_keys in the destination dir.
This commit introduces a number of changes:
pg_basebackup encrypts WAL only if the "-E --encrypt-wal" flag is
provided. In that case, it extracts the principal key, truncates
pg_tde/wal_keys and encrypts WAL with a newly generated WAL key. We
still expect pg_tde/wal_keys and pg_tde/1664_providers in the
destination dir. If these files are not provided but "-E" is
specified, it fails with an error.
We also throw a warning if pg_basebackup runs without -E but there is
a wal_keys file on the source, as the WAL might be compromised and the
backup broken.
For PG-1603, PG-1857
There was a race condition in the WAL archiving tests: if the
end-of-recovery checkpoint had completed, the tests for the WAL contents
were nonsensical and racy. Solve this by explicitly promoting the
server only after we have looked at the WAL contents, while still
making sure to wait until all WAL has been replayed.
Additionally improve the tests by actually making sure the replica
starts in a good state where all WAL is encrypted and testing both
the plaintext and the encrypted scenarios.
Unfortunately the logic for generating a new key to protect the stream
cipher used to encrypt the WAL stream in our restore command was based
on totally incorrect assumptions due to how the recovery is implemented.
Recovery is a state machine which can go back and forth between one
mode where it streams from a primary and another where it first tries to
fetch WAL from the archive and, if that fails, from the pg_wal
directory; in the pg_wal directory we may have files which are encrypted
with whatever keys were there originally.
To handle all the possible scenarios we remove the ability of
pg_tde_restore_encrypt to generate new keys and just have it use
whatever keys there are in the key file. This unfortunately means we
open ourselves to some attacks on the stream cipher if the system is
tricked into encrypting a different WAL stream at the same TLI and LSN
as we have already encrypted. As far as I know this should be rare
under normal operations, since normally the WAL in the archive should
be the same as the one in pg_wal or the one we receive through
streaming.
Ideally we would want to fix this but for now it is better to have WAL
encryption with this weakness than to not have it at all.
This also incidentally fixes a bug we discovered caused by generating a
new key only invalidating one key rather than all keys which should have
become invalid, since we no longer generate a new key.
It seems like there are cases when the postmaster has "restarted"
after a backend crash where the WAL cache inherited from the postmaster
is wrong.
I'm not at all sure exactly how and why this happens, but this patch
fixes a bug with this and allows recovery/013_crash_restart to pass with
WAL encryption enabled.
According to the documentation, each backend is supposed to hold
AddinShmemInitLock when calling ShmemInitStruct. We only did that for
half of our calls before this patch.
There is at least one corner case scenario where we have to load the
last record into the cache during a write:
* replica crashes, receives last segment from primary
* replica replays last segment, reaches end
* replica activates new key
* replica replays prepared transaction, has to use old keys again
* old key write function sees that we generated a new key, tries to load
  it
In this scenario we could get away with detecting that we are in a
write and asserting if we tried to use the last key.
But in a release build assertions do not fire, and we would end up
writing some non-encrypted data to disk, and recovery would later fail
if we had to run it.
It could be a FATAL, but that would still crash the server, and the next
startup would crash again and again...
Instead, to properly avoid this situation we preallocate memory for one
more key in the cache during initialization. Since we can only add one
extra key to the cache during the server's run, this means we no longer
try to allocate in the critical section in any corner case.
While this is not the nicest solution, it is simple and keeps the
current cache and decrypt/encrypt logic the same as before. Any other
solution would be more complex, and even more of a hack, as it would
require dealing with a possibly out of date cache.
To not break recovery when we replay encrypted WAL but WAL encryption
is disabled, the simplest way is to treat disabled WAL encryption just
like enabled WAL encryption. The issue is not big in practice, since it
should only hit users who disable WAL encryption and then crash the
database, but treating both cases the same way keeps the code simple to
understand.
Previously we simply set the LSN for the new key to the first write
location.
This is however not correct, as there are many corner cases around this:
* recovery / replication might write old LSNs
* we can't handle multiple keys with the same TLI/LSN, which can happen
with quick restarts without writes
To support this in this commit we modify the following:
* We only activate new keys outside crash recovery, or immediately if
encryption is turned off
* We also take the already existing last key into account (if exists),
and only activate a new key if we progressed past its start location
The remaining changes are just support infrastructure for this:
* Since we might rewrite old records, we use the already existing keys
for those writes, not the active last keys
* We prefetch existing keys during initialization, so it doesn't
accidentally happen in the critical section during a write
There is a remaining bug with stopping wal encryption, also mentioned in
a TODO message in the code. This will be addressed in a later PR as this
fix already took too long.
The min/max comparisons of LSNs assumed that everything is in the same
timeline. In practice, with replication + recovery combinations, it is
possible that keys span at least 3 timelines, which means that the
timeline has to be included in both comparisons, as in other timelines
the restrictions are less strict.
Use a single argument for the wrapped command in the archiving
wrappers.
Instead of giving all of the arguments of the command separately and
trying to figure out which one should be replaced by the path to the
unencrypted WAL segment, we take a single argument and do % parameter
replacement similar to what postgres does with archive_command and
restore_command.
This also means that we can simplify by using system() instead of exec().
We also clean up usage instructions and make the two wrappers more
symmetrical by requiring the same parameters.
Co-authored-by: Andreas Karlsson <andreas.karlsson@percona.com>