When a relation is moved to a new location, its relfilenode id changes. Hence
we must re-encrypt its internal key and store it under the new id. We also have
to store the changed internal key in the new physical location and copy the
principal key info and keyring data there.
Fixes https://perconadev.atlassian.net/browse/PG-1038
* PG-1058 Fix MergeJoin issue
Resolved an issue in MergeJoin by ensuring the decrypted buffer contents are
also copied from the source to the destination tuple slot during
slot copy operations.
Co-authored-by: Andrew Pogrebnoy <absourd.noise@gmail.com>
Co-authored-by: Artem Gavrilov <artem.gavrilov@percona.com>
* PG-1056 Add failing test
* PG-1056 Use proper AM in test
* Fix UPDATE SET ... RETURNING processing for encrypted tuples
If `get_heap_tuple` is NULL, the core uses `copy_heap_tuple` instead. The former returns a pointer to the tuple in the slot, while the latter makes a copy of it. For UPDATE SET, the core uses the slot for the INSERT and later for RETURNING processing. If we copy the tuple, the following happens:
1. The core creates a slot with the generic tuple.
2. It is passed to `pg_tdeam_tuple_update()`, which gets a copy of the tuple here [6d4f7e5b7b/src17/access/pg_tdeam_handler.c (L336)].
3. This generic tuple is filled with the proper data and used for the update here [6d4f7e5b7b/src17/access/pg_tdeam_handler.c (L343)].
4. Later on, RETURNING processing uses the slot's tuple, but it is still the generic, unmodified one because of the copy.
5. That results in wrong RETURNING data.
To avoid this, we should return a pointer to the slot's tuple instead of copying it.
Fixes PG-1056
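The pointer-vs-copy distinction above can be modeled in a few lines. The sketch below is purely illustrative Python (the real interface is the C table-AM callbacks); it only shows why filling a copy leaves the slot's tuple stale for RETURNING.

```python
# Toy model: a slot holding a "generic" tuple, updated via either a
# copying accessor (like copy_heap_tuple) or a pointer accessor
# (like get_heap_tuple). All names here are illustrative.
class Slot:
    def __init__(self, tuple_):
        self.tuple = tuple_          # "generic" tuple placed by the core

def update_with(get_tuple, slot, new_values):
    tup = get_tuple(slot)
    tup.clear()
    tup.update(new_values)           # tuple filled with the proper data

slot = Slot({"a": "generic"})
update_with(lambda s: dict(s.tuple), slot, {"a": "updated"})  # copying accessor
assert slot.tuple == {"a": "generic"}     # RETURNING would see stale data

slot = Slot({"a": "generic"})
update_with(lambda s: s.tuple, slot, {"a": "updated"})        # pointer accessor
assert slot.tuple == {"a": "updated"}     # RETURNING sees the new row
```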
* PG-1056 Split 'update' testcase for tde_heap and tde_heap_basic
---------
Co-authored-by: Andrew Pogrebnoy <absourd.noise@gmail.com>
* Remove 'percona_tde.pg_tde_key_provider' user catalog and introduce a provider info file for key providers
This commit removes the 'percona_tde.pg_tde_key_provider' user catalog and
replaces it with a provider info file to save key providers.
This change ensures that the key provider information can be accessed
during recovery, when user catalogs cannot be relied upon.
The commit maintains the current API functions, so callers will not experience
any differences in functionality or usage after this change.
Additionally, the commit adjusts how the shared memory manager retrieves
information about the number of LWLocks required by the extension, optimizing
the process.
TODO: Implement xlog message for cleaning up the provider info file during
recovery operations to ensure consistency and avoid potential issues.
* TDE TupleTableSlot for storing decrypted tuple along with the buffer tuple
Tuple data in the shared buffer is encrypted. To store the tuple in the
TupleTableSlot, the tuple data is decrypted into allocated memory. This memory
needs to be properly cleaned up. However, with the existing
BufferHeapTupleTableSlot, there is no way to free this memory until the end of
the current query executor cycle.
To address this, the commit introduces TDEBufferHeapTupleTableSlot, a clone of
BufferHeapTupleTableSlot that keeps a reference to the allocated decrypted tuple
and frees it when the tuple slot is cleared. Most of the code is borrowed from
the BufferHeapTupleTableSlot implementation, ensuring that
TDEBufferHeapTupleTableSlot can be cast to BufferHeapTupleTableSlot.
Apart from the above, a workaround is added to TDEBufferHeapTupleTableSlot to
clear the decrypted tuple pointer for cases when the slot is reused after the
previously decrypted tuple was freed by MemoryContext deletion instead of
through the slot cleanup callback.
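The ownership idea can be sketched as follows; this is illustrative Python, not the actual C slot implementation, and all names are assumptions:

```python
# Sketch of the ownership model behind TDEBufferHeapTupleTableSlot: the
# slot keeps a reference to the decrypted copy and releases it when the
# slot is cleared, instead of leaking it until the end of the executor
# cycle. XOR stands in for real decryption.
class TDESlot:
    def __init__(self):
        self.buffer_tuple = None     # encrypted tuple in the shared buffer
        self.decrypted = None        # slot-owned decrypted copy

    def store(self, encrypted):
        self.buffer_tuple = encrypted
        self.decrypted = bytes(b ^ 0x5A for b in encrypted)  # stand-in decrypt

    def clear(self):
        self.buffer_tuple = None
        self.decrypted = None        # released here, not at executor end

slot = TDESlot()
slot.store(bytes(b ^ 0x5A for b in b"row data"))
assert slot.decrypted == b"row data"
slot.clear()
assert slot.decrypted is None
```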
* Fix issue-153: Server crash and database corruption
We can't use the Tuple CID as an IV because it changes when the tuple is deleted.
If we have a trigger function that needs the deleted tuple, it will get the
wrong IV when decrypting. This happens because the CID used to encrypt the tuple
(during INSERT/UPDATE) is different from the CID passed to the decryption
function (during delete).
To fix this, we need to stop using the CID for IV calculation.
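A toy model of the IV problem, with an XOR stand-in for the real cipher and made-up helper names:

```python
# Illustrative only: if the IV mixes in the CID, a tuple encrypted at
# INSERT time (one CID) cannot be decrypted correctly when a DELETE
# trigger later reads it with a different CID.
def iv_with_cid(block, offset, cid):
    return (block << 16) ^ (offset << 8) ^ cid

def iv_stable(block, offset):
    return (block << 16) ^ (offset << 8)

def crypt(data, iv):
    # XOR "cipher" keyed by the IV bytes; symmetric, so crypt == decrypt
    return bytes(b ^ ((iv >> (8 * (i % 4))) & 0xFF) for i, b in enumerate(data))

tup = b"secret row"
enc = crypt(tup, iv_with_cid(7, 42, cid=1))          # encrypted during INSERT
assert crypt(enc, iv_with_cid(7, 42, cid=2)) != tup  # trigger gets wrong data
enc2 = crypt(tup, iv_stable(7, 42))                  # CID-free IV
assert crypt(enc2, iv_stable(7, 42)) == tup          # always decryptable
```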
* Update test case to produce the same result in all environments
* A function to get the current master key info.
The commit adds a tde_master_key_info() function that returns
information about the master key for the database.
SELECT * FROM tde_master_key_info();
-[ RECORD 1 ]------------+------------------------------
master_key_name | test-db-master-key
key_provider_name | file-vault
key_provider_id | 1
master_key_internal_name | test-db-master-key_7
master_key_version | 7
key_creation_time | 2024-03-26 22:28:20.998034+05
* Disallow deletion of keyring used by Master key.
The commit adds a before-delete trigger on the keyring catalog to ensure that
a key provider used by the master key cannot be deleted.
* Introducing catalog table for managing key providers
This commit introduces a user catalog table, percona_tde.pg_tde_key_provider,
within the percona_tde schema, as part of the pg_tde extension. The purpose of
this table is to store essential provider information. The catalog accommodates
various key providers, present and future, utilizing a JSON type
options field to capture provider-specific details.
To facilitate the creation of key providers,
the commit introduces new SQL interfaces:
- pg_tde_add_key_provider(provider_type VARCHAR(10),
provider_name VARCHAR(128), options JSON)
- pg_tde_add_key_provider_file(provider_name VARCHAR(128),
file_path TEXT)
- pg_tde_add_key_provider_vault_v2(provider_name VARCHAR(128),
vault_token TEXT, vault_url TEXT,
vault_mount_path TEXT, vault_ca_path TEXT)
Additionally, the commit implements the C interface for catalog
interaction, detailed in the 'tde_keyring.h' file.
These changes lay the foundation for implementing multi-tenancy in pg_tde by
eliminating the necessity of a 'keyring.json' file for configuring a
cluster-wide key provider. With this enhancement, each database can have its
dedicated key provider, added via SQL interface, removing the need
for DBA intervention in TDE setup.
* Establishing a Framework for Master Key and Shared Cache Management
Up until now, pg_tde relied on a hard-coded master key name, primarily for
proof-of-concept purposes. This commit introduces a more robust infrastructure
for configuring the master key and managing a dynamic shared memory-based
master-key cache to enhance accessibility.
For user interaction, a new SQL interface is provided:
- pg_tde_set_master_key(master_key_name VARCHAR(255), provider_name VARCHAR(255));
This interface enables users to set a master key for a specific database, a
further step toward implementing multi-tenancy.
In addition to the public SQL interface, the commit optimizes the internal
master-key API. It introduces straightforward Get and Set functions,
handling locking, retrieval, caching, and seamlessly assigning a master key for
a database.
The commit also introduces a unified internal interface for requesting and
utilizing shared memory, contributing to a more cohesive and efficient
master key and cache management system.
* Revamping the Keyring API Interface and Integrating Master Key
This commit unifies the master-key and key-provider modules with the core of
pg_tde, marking a significant evolution in the architecture.
As part of this integration, the keyring API undergoes substantial changes
to enhance flexibility and remove unnecessary components such as the key cache.
As a result of the keyring refactoring, the file keyring is also rewritten,
offering a template for implementing additional key providers for the extension.
The modifications make the keyring API more pluggable, streamlining
interactions and paving the way for future enhancements.
* An Interface for Informing the Shared Memory Manager about Lock Requirements
This commit addresses PostgreSQL core's requirement for upfront information
regarding the number of locks the extension needs. Given the connection
between locks and the shared memory interface, a new callback routine
is introduced. This routine allows modules to specify
the number of locks they require.
In addition to this functionality, the commit includes code cleanups
and adjustments to nomenclature for improved clarity and consistency.
* Adjusting test cases
* Extension Initialization and Cleanup Mechanism
This commit enhances the extension by adding a new mechanism to facilitate
cleanup or setup procedures when the extension is installed in a database.
The core addition is a function "pg_tde_extension_initialize" invoked upon
executing the database's 'CREATE EXTENSION' command.
The commit introduces a callback registration mechanism to streamline
future development and ensure extensibility. This enables any module
to specify a callback function (registered using on_ext_install() ) to be
invoked during extension creation.
As of this commit, the callback functionality is explicitly utilized by the
master key module to handle the cleanup of the master key information file.
This file might persist in the database directory if the extension had been
previously deleted in the same database.
This enhancement paves the way for a more modular and maintainable extension
architecture, allowing individual modules to manage their specific
setup and cleanup tasks seamlessly.
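The registration mechanism described above can be sketched roughly like this (illustrative Python; `on_ext_install` and `pg_tde_extension_initialize` are the names from the commit, everything else is made up):

```python
# Modules register a callback via on_ext_install(); all registered
# callbacks run when pg_tde_extension_initialize fires on
# CREATE EXTENSION. The registry itself is an assumption for
# illustration purposes.
_install_callbacks = []

def on_ext_install(cb):
    _install_callbacks.append(cb)

def pg_tde_extension_initialize(dbid):
    for cb in _install_callbacks:
        cb(dbid)

cleaned = []
# e.g. the master key module registering its stale-file cleanup
on_ext_install(lambda dbid: cleaned.append(dbid))
pg_tde_extension_initialize(42)
assert cleaned == [42]
```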
* Adjusting Vault-V2 key provider to use new keyring architecture
If TOAST data gets compressed, it has an extended header containing
compression info. We used to encrypt this header along with the actual
data which in turn caused a crash as PG needs this data in later
stages. So it should be taken into account while encrypting data during
externalisation.
Then, during detoasting, we should not decrypt this compression header, as it
is extracted with the data in the first TOAST chunk. So we copy the first N
bytes (currently 4 bytes) of the first chunk as-is and decrypt the rest of the
data.
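The chunk handling can be sketched as follows; this is illustrative Python with an XOR stand-in for AES, and `TOAST_COMPRESS_HDR` plus the function names are assumptions, not the actual pg_tde symbols:

```python
# The first TOAST_COMPRESS_HDR bytes (the compression header, 4 bytes
# here) are stored as-is; only the payload after them is crypted.
TOAST_COMPRESS_HDR = 4

def xor_crypt(buf, key=0x5A):
    # symmetric stand-in for AES, so the same call encrypts and decrypts
    return bytes(b ^ key for b in buf)

def toast_externalize(chunk0):
    """Encrypt the first TOAST chunk, leaving the compression header alone."""
    return chunk0[:TOAST_COMPRESS_HDR] + xor_crypt(chunk0[TOAST_COMPRESS_HDR:])

def toast_detoast(chunk0):
    """Decrypt the first chunk: copy the header as-is, decrypt the rest."""
    return chunk0[:TOAST_COMPRESS_HDR] + xor_crypt(chunk0[TOAST_COMPRESS_HDR:])

raw = b"HDR0" + b"compressed-payload"
stored = toast_externalize(raw)
assert stored[:4] == b"HDR0"          # header stays readable by PG core
assert toast_detoast(stored) == raw   # round-trip
```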
Fixes https://github.com/Percona-Lab/postgres-tde-ext/issues/63
Encryption/decryption of the same data should be identical as long as the
offset is the same. But as we encode in 16-byte blocks, the size of
`encKey` is always a multiple of 16. We start from the `aes_block_no`-th
index of encKey[], so the N-th byte will be crypted with the same encKey byte
regardless of which start_offset `pg_tde_crypt()` was called with.
For example `start_offset = 10; MAX_AES_ENC_BATCH_KEY_SIZE = 6`:
```
data: [10 11 12 13 14 15 16]
encKeys: [...][0 1 2 3 4 5][0 1 2 3 4 5]
```
so the 10th data byte is encoded with the 4th byte of the 2nd encKey
etc. We need this shift so each byte is coded the same regardless of the
initial offset.
Let's see the same data but sent to the func starting from the offset 0:
```
data: [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
encKeys: [0 1 2 3 4 5][0 1 2 3 4 5][ 0 1 2 3 4 5]
```
again, the 10th data byte is encoded with the 4th byte of the 2nd
`encKey` etc.
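The offset-alignment invariant can be demonstrated with a small sketch (illustrative Python, an XOR keystream standing in for AES; `BATCH` plays the role of the batch size, and the keystream formula is made up):

```python
# Because the key byte depends only on the absolute offset, the same
# data byte is always crypted with the same key byte, regardless of
# which start_offset the call uses.
BATCH = 6

def key_byte(abs_off):
    """Key byte for an absolute offset: the (abs_off % BATCH)-th byte of
    the (abs_off // BATCH)-th key batch. A toy keystream, not AES."""
    block_no = abs_off // BATCH
    idx = abs_off % BATCH
    return (block_no * 31 + idx * 7) & 0xFF

def crypt(data, start_offset):
    """XOR-crypt `data` that begins at `start_offset` in the stream."""
    return bytes(b ^ key_byte(start_offset + i) for i, b in enumerate(data))

msg = bytes(range(17))
# Encrypt as one chunk from offset 0, decrypt in two chunks: round-trips,
# just like TOASTed data crypted whole but decrypted chunk by chunk.
enc = crypt(msg, 0)
dec = crypt(enc[:10], 0) + crypt(enc[10:], 10)
assert dec == msg
```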
The issue in `pg_tde_crypt()` was that along with shifting encKeys in
the first batch, we started skipping `aes_block_no` bytes in `data`
with every batch after the first one. That led to:
1. Some bytes remained unencrypted.
2. If data was encrypted and decrypted in different batches, these
skipped bytes would differ and hence the decrypted data would be wrong.
TOASTed data, for example, is encrypted as one chunk but decrypted in
`TOAST_MAX_CHUNK_SIZE` chunks.
The issue with pg_tde_move_encrypted_data() was that encryption and
decryption were happening in the same loop, but `encKeys` may have had
different shifting, as the input and output data may have different
start_offsets.
It wasn't an issue if the data was less than `DATA_BYTES_PER_AES_BATCH`.
However, it resulted in data corruption while moving the larger tuples.
Plus, it makes sense to reuse `pg_tde_crypt()` instead of copying its
tricky code.
Also, encKey had a maximum size of 1632 bytes, but at most 1600 bytes
could ever have been used. We don't need that extra 32-byte buffer
anymore.
The same goes for `dataLen` in `Aes128EncryptedZeroBlocks2()` - I don't
see why we need that extra block.
Fixes https://github.com/Percona-Lab/postgres-tde-ext/issues/72
Issue: the code cleanup introduced in PR #52 modified the original
tuple in the update method instead of decrypting the tuple data
into a copy. This caused data corruption and crashes in some tests.
Fix: reintroduce the missing palloc in the update method.
Fixes #61, Fixes #62, Fixes #64
1. Inserts and Updates are now encrypted in WAL.
We encrypt new tuples directly in the Buffer after they are inserted there. To
pass them to XLog we could memcpy the Buffer data into the tuple, but the tuple
later has to be unencrypted for index insertions etc. So we pass the data from
the Buffer directly into XLog.
2. Log into WAL and replicate *.tde forks creation.
3. Added docker-compose for the streaming replication test setup.
(not perfect - needs two `up -d` in a row to start the secondary)
4. Added tests for multi inserts. Need tests for replications though.
Fork file encryption was missing from the previous pull requests,
resulting in the server not initializing the keyring during normal execution
and, because of that, not detecting when the keyring configuration wasn't
specified at all.
Fixes#46