Encryption and decryption of the same data must produce identical results
as long as the offset is the same. Since we encode in 16-byte blocks, the
size of `encKey` is always a multiple of 16. We start from the
`aes_block_no`-th index of `encKey[]`, so the N-th byte is always crypted
with the same `encKey` byte regardless of which `start_offset`
`pg_tde_crypt()` was called with.
For example `start_offset = 10; MAX_AES_ENC_BATCH_KEY_SIZE = 6`:
```
data: [10 11 12 13 14 15 16]
encKeys: [...][0 1 2 3 4 5][0 1 2 3 4 5]
```
so the 10th data byte is encoded with the 4th byte of the 2nd `encKey`,
and so on. We need this shift so that each byte is encoded the same way
regardless of the initial offset.
Now consider the same data, but passed to the function starting from offset 0:
```
data: [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
encKeys: [0 1 2 3 4 5][0 1 2 3 4 5][ 0 1 2 3 4 5]
```
again, the 10th data byte is encoded with the 4th byte of the 2nd
`encKey`, and so on.
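The alignment above can be sketched as a tiny mapping (illustrative Python, not the actual pg_tde code): the key byte used for a data byte depends only on the byte's absolute position, so any `start_offset` yields the same mapping.

```python
# Toy value taken from the example above; names are illustrative only.
MAX_AES_ENC_BATCH_KEY_SIZE = 6

def key_byte_index(absolute_pos):
    """Index into encKey[] for the data byte at this absolute position."""
    return absolute_pos % MAX_AES_ENC_BATCH_KEY_SIZE

# Byte 10 maps to the same key byte whether the call started at offset 10
# or at offset 0:
assert key_byte_index(10) == 4  # 4th byte of the 2nd encKey batch

with_offset_10 = [key_byte_index(p) for p in range(10, 17)]
with_offset_0 = [key_byte_index(p) for p in range(17)][10:]
assert with_offset_10 == with_offset_0
```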
The issue in `pg_tde_crypt()` was that, along with shifting `encKeys` in
the first batch, we also started skipping `aes_block_no` bytes in `data`
with every batch after the first one. That led to:
1. Some bytes remained unencrypted.
2. If data was encrypted and decrypted in different batches, these
skipped bytes would differ, hence the decrypted data would be wrong.
TOASTed data, for example, is encrypted as one chunk but decrypted in
`TOAST_MAX_CHUNK_SIZE` chunks.
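A minimal sketch of the invariant this fix restores, modeling the cipher as XOR with a position-addressable keystream (a hashlib stand-in for AES-CTR; all names here are illustrative, not the real pg_tde code): data encrypted as one chunk must decrypt correctly in smaller chunks, as long as every batch derives its keystream from absolute offsets.

```python
import hashlib

def keystream_byte(key: bytes, pos: int) -> int:
    # Toy position-addressable keystream (stand-in for AES-CTR).
    return hashlib.sha256(key + pos.to_bytes(8, "big")).digest()[0]

def crypt(key: bytes, data: bytes, start_offset: int) -> bytes:
    # XOR each byte with the keystream byte for its ABSOLUTE position,
    # so the result is independent of how the data is batched.
    return bytes(b ^ keystream_byte(key, start_offset + i)
                 for i, b in enumerate(data))

key = b"demo-key"
plain = bytes(range(100))
cipher = crypt(key, plain, 0)  # encrypted as one chunk

# Decrypt in smaller chunks (chunk size chosen arbitrarily here),
# passing each chunk's absolute offset:
CHUNK = 32
recovered = b"".join(crypt(key, cipher[off:off + CHUNK], off)
                     for off in range(0, len(cipher), CHUNK))
assert recovered == plain
```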
The issue with `pg_tde_move_encrypted_data()` was that encryption and
decryption happened in the same loop, but `encKeys` may have had
different shifting, as the in and out data may have different `start_offset`s.
It wasn't an issue if the data was smaller than `DATA_BYTES_PER_AES_BATCH`,
but it resulted in data corruption while moving larger tuples.
Plus, it makes sense to reuse `pg_tde_crypt()` instead of copying its
tricky code.
Also, `encKey` had a maximum size of 1632 bytes, but at most 1600 bytes
could have been used. We don't need that extra 32-byte buffer anymore.
The same goes for `dataLen` in `Aes128EncryptedZeroBlocks2()` - I don't
see why we need that extra block.
Fixes https://github.com/Percona-Lab/postgres-tde-ext/issues/72
As we are using #ifdef checks instead of #if, debug settings defined to
zero still enable debug output. This causes increased log sizes and
decreased performance.
Issue: with the fix in #67, pgtde decrypts tuples during update into
a new memory region, and changes the t_data pointer to this new region.
Because of this, later updates to tuple flags also happen in the new
data, and the original persisted tuple flags are never updated.
Fix: after the update statement is done with the decrypted data,
restore the t_data pointer to the original. This way, flag changes
happen where they should.
Fixes #68
Issue: the code cleanup introduced in PR #52 modified the original
tuple in the update method instead of decrypting the tuple data
into a copy. This caused data corruption crashes in some tests.
Fix: reintroduce the missing palloc to the update method.
Fixes #61
Fixes #62
Fixes #64
1. Inserts and updates are now encrypted in WAL.
We encrypt new tuples directly in the Buffer after they are inserted there.
To pass them to XLog we could memcpy the Buffer data into the tuple, but the
tuple later has to be decrypted for index insertions etc. So we pass the
data from the Buffer directly into XLog.
2. Log into WAL and replicate *.tde forks creation.
3. Added docker-compose for the streaming replication test setup.
(not perfect - needs two `up -d` in a row to start the secondary)
4. Added tests for multi inserts. Tests for replication are still needed, though.
* A few enhancements and code cleanup around tuple encryption/decryption
This commit contains the following noteworthy changes:
-- Getting rid of VLAs from the code base
-- Add an interface to move the encrypted data from one location to another.
-- Make the encryption and decryption happen in batches to eliminate the
requirement of dynamic allocation in crypt functions.
Toast values are inserted into toast relations directly by the core, so the
trick is sending the encrypted data to the core before it gets externalized.
Similarly, when creating a tuple with a toasted value, decrypt the toasted
chunks before constructing the resulting tuple.
This commit disables padding for the fork file encryption to fix the
above warning, and also contains related test / logging improvements.
The meson test runner is also restricted to one process to work around
issues where multiple processes write the same keyring data file,
resulting in randomly failing test executions.
Fork file encryption was missing from the previous pull requests,
resulting in the server not initializing the keyring on normal executions,
and, because of this, not detecting when the keyring configuration wasn't
specified at all.
Fixes #46
This commit adds a new SQL function, `pgtde_is_encrypted(tablename)` which returns a boolean value:
true if the table is encrypted with pg_tde, false otherwise.
As the pg_tde access method only writes encrypted data, the function just checks whether the table uses that AM.
There is a check during UPDATE for changed indexes.
Sometimes it caused a crash, since the new tuple is unencrypted while the
old one is encrypted.
Besides, the comparison didn't work properly, which led to taking
a suboptimal code path, and probably to computing a wrong replica identity
and freeing a still-used tuple in some cases.
Issue: OpenSSL AES context setup takes more time than the actual AES-CTR
encryption of small blocks.
To work around this, instead of reinitializing AES-CTR with different
parameters for each block, this commit relies on the implementation
details of AES-CTR: it just encrypts the counter using AES-ECB and
then xors it with the data. Using this information, it is possible
to keep a single AES context for each encryption key, and use it
to encrypt/decrypt any offset, without reinitializing the context.
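The identity being exploited can be sketched as follows (a keyed hash stands in for the AES block cipher, since the point is the CTR construction rather than AES itself; all names are illustrative):

```python
import hashlib

BLOCK = 16  # AES block size in bytes

def block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for one raw (ECB-style) block-cipher call; in the real
    # code this is AES with a single long-lived context per key.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ctr_crypt(key: bytes, iv: int, data: bytes, offset: int) -> bytes:
    # CTR mode: encrypt the block counter, XOR the result with the data.
    # Any offset can be handled by computing the counter for its block,
    # without reinitializing anything.
    out = bytearray()
    for i, b in enumerate(data):
        pos = offset + i
        counter = (iv + pos // BLOCK).to_bytes(BLOCK, "big")
        pad = block_encrypt(key, counter)   # E_k(counter)
        out.append(b ^ pad[pos % BLOCK])    # XOR keystream byte with data
    return bytes(out)

key, iv = b"k", 0
msg = b"hello tde, this spans multiple 16-byte blocks"
ct = ctr_crypt(key, iv, msg, 0)

# Decrypting an arbitrary slice only needs its offset, not a new context:
assert ctr_crypt(key, iv, ct[20:30], 20) == msg[20:30]
assert ctr_crypt(key, iv, ct, 0) == msg
```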
1. If the tde config file is writable by the DB, that shouldn't prevent
it from being loaded.
2. If `keyringCheckConfigFile()` returns 0, the config won't be loaded.
Hence `keyringAssignConfigFile()` and all its checks will be skipped, and
the user gets a rather confusing error when accessing a tde table:
```
PANIC: Couldn't write keyring data:
```
So a check was added that `keyringFileDataFileName` is set. The error now reads:
```
ERROR: Keyring datafile is not set
```
3. Add quote marks around file names in log messages so it's easier to spot
if the file name is empty.
`Keyring file not found, not loading existing keys.`
vs
`Keyring file '' not found, not loading existing keys.`
It covers the case when line pointers (lp) point to unsorted offsets
(tuples):
```
SELECT lp, lp_off, t_ctid FROM heap_page_items(get_raw_page('sbtest1', 0));
lp | lp_off | t_ctid
----+--------+--------
1 | 7960 | (0,1)
2 | 7264 | (0,2)
3 | 7032 | (0,3)
4 | 6800 | (0,4)
5 | 7728 | (0,5)
6 | 0 |
7 | 7496 | (0,7)
(7 rows)
```
This condition can be achieved by deleting some tuples and running
VACUUM.
Also, printing of the encrypted bytes is now hidden behind a higher
ENCRYPTION_DEBUG level, as it floods the log otherwise.
VACUUM FULL rewrites tuples in order to possibly compact them after
column drops etc. When it calls `raw_pg_tde_insert`, the buffer of
the `RewriteState` is not valid, as the Page is still being built. So we
have to use the block number from `RewriteState`.
Issue: the heap AM has a function which automatically compacts pages
when certain conditions are met. When this happens, it moves the
tuples around within the page. As encryption uses the offset of tuples
for decrypting them, this results in garbage data and possible crashes.
Fix: this commit copies the two compaction functions from the server code,
and modifies them to re-encrypt data when moved. This is not optimized at
all, if needed, we can improve this by a lot.
Also, for now only one of the two execution paths is handled, as that's
the only one hit by sysbench. We'll have to figure out a test case for
the other and fix it too; for now, it only contains an assert(0).
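The hazard can be illustrated with a toy position-keyed cipher (XOR with an offset-derived keystream; illustrative only, not the pg_tde code): copying encrypted bytes to a new page offset and decrypting them there yields garbage, so a move must decrypt at the old offset and re-encrypt at the new one.

```python
import hashlib

def ks(pos: int) -> int:
    # Toy offset-derived keystream byte (stand-in for the real cipher).
    return hashlib.sha256(pos.to_bytes(8, "big")).digest()[0]

def crypt(data: bytes, offset: int) -> bytes:
    # Symmetric XOR cipher keyed by absolute offset.
    return bytes(b ^ ks(offset + i) for i, b in enumerate(data))

tuple_bytes = b"row-payload"
old_off, new_off = 7264, 7960  # offsets like those in lp_off above
stored = crypt(tuple_bytes, old_off)  # encrypted at its old offset

# Naive move: bytes copied to the new offset, decrypted there -> garbage.
assert crypt(stored, new_off) != tuple_bytes

# Fix: decrypt at the old offset, re-encrypt at the new one.
moved = crypt(crypt(stored, old_off), new_off)
assert crypt(moved, new_off) == tuple_bytes
```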
heap code changes.
The script generates file-wise patches between two PG commits and applies
them to the TDE extension source.
By default, it only performs a dry run of the patch application. See the usage
options for applying clean patches or forcefully applying all patches.
It clones both the PG and TDE repositories in the working directory. If the
TDE path is specified either with its usage option or via the environment
variable, then the script will use the given TDE source code.