mirror of https://github.com/postgres/postgres
Tag:
Branch:
Tree:
b14e9ce7d5
REL2_0B
REL6_4
REL6_5_PATCHES
REL7_0_PATCHES
REL7_1_STABLE
REL7_2_STABLE
REL7_3_STABLE
REL7_4_STABLE
REL8_0_STABLE
REL8_1_STABLE
REL8_2_STABLE
REL8_3_STABLE
REL8_4_STABLE
REL8_5_ALPHA1_BRANCH
REL8_5_ALPHA2_BRANCH
REL8_5_ALPHA3_BRANCH
REL9_0_ALPHA4_BRANCH
REL9_0_ALPHA5_BRANCH
REL9_0_STABLE
REL9_1_STABLE
REL9_2_STABLE
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
REL_12_STABLE
REL_13_STABLE
REL_14_STABLE
REL_15_STABLE
REL_16_STABLE
REL_17_STABLE
REL_18_STABLE
Release_1_0_3
WIN32_DEV
ecpg_big_bison
master
PG95-1_01
PG95-1_08
PG95-1_09
REL2_0
REL6_1
REL6_1_1
REL6_2
REL6_2_1
REL6_3
REL6_3_2
REL6_4_2
REL6_5
REL6_5_1
REL6_5_2
REL6_5_3
REL7_0
REL7_0_2
REL7_0_3
REL7_1
REL7_1_1
REL7_1_2
REL7_1_3
REL7_1_BETA
REL7_1_BETA2
REL7_1_BETA3
REL7_2
REL7_2_1
REL7_2_2
REL7_2_3
REL7_2_4
REL7_2_5
REL7_2_6
REL7_2_7
REL7_2_8
REL7_2_BETA1
REL7_2_BETA2
REL7_2_BETA3
REL7_2_BETA4
REL7_2_BETA5
REL7_2_RC1
REL7_2_RC2
REL7_3
REL7_3_1
REL7_3_10
REL7_3_11
REL7_3_12
REL7_3_13
REL7_3_14
REL7_3_15
REL7_3_16
REL7_3_17
REL7_3_18
REL7_3_19
REL7_3_2
REL7_3_20
REL7_3_21
REL7_3_3
REL7_3_4
REL7_3_5
REL7_3_6
REL7_3_7
REL7_3_8
REL7_3_9
REL7_4
REL7_4_1
REL7_4_10
REL7_4_11
REL7_4_12
REL7_4_13
REL7_4_14
REL7_4_15
REL7_4_16
REL7_4_17
REL7_4_18
REL7_4_19
REL7_4_2
REL7_4_20
REL7_4_21
REL7_4_22
REL7_4_23
REL7_4_24
REL7_4_25
REL7_4_26
REL7_4_27
REL7_4_28
REL7_4_29
REL7_4_3
REL7_4_30
REL7_4_4
REL7_4_5
REL7_4_6
REL7_4_7
REL7_4_8
REL7_4_9
REL7_4_BETA1
REL7_4_BETA2
REL7_4_BETA3
REL7_4_BETA4
REL7_4_BETA5
REL7_4_RC1
REL7_4_RC2
REL8_0_0
REL8_0_0BETA1
REL8_0_0BETA2
REL8_0_0BETA3
REL8_0_0BETA4
REL8_0_0BETA5
REL8_0_0RC1
REL8_0_0RC2
REL8_0_0RC3
REL8_0_0RC4
REL8_0_0RC5
REL8_0_1
REL8_0_10
REL8_0_11
REL8_0_12
REL8_0_13
REL8_0_14
REL8_0_15
REL8_0_16
REL8_0_17
REL8_0_18
REL8_0_19
REL8_0_2
REL8_0_20
REL8_0_21
REL8_0_22
REL8_0_23
REL8_0_24
REL8_0_25
REL8_0_26
REL8_0_3
REL8_0_4
REL8_0_5
REL8_0_6
REL8_0_7
REL8_0_8
REL8_0_9
REL8_1_0
REL8_1_0BETA1
REL8_1_0BETA2
REL8_1_0BETA3
REL8_1_0BETA4
REL8_1_0RC1
REL8_1_1
REL8_1_10
REL8_1_11
REL8_1_12
REL8_1_13
REL8_1_14
REL8_1_15
REL8_1_16
REL8_1_17
REL8_1_18
REL8_1_19
REL8_1_2
REL8_1_20
REL8_1_21
REL8_1_22
REL8_1_23
REL8_1_3
REL8_1_4
REL8_1_5
REL8_1_6
REL8_1_7
REL8_1_8
REL8_1_9
REL8_2_0
REL8_2_1
REL8_2_10
REL8_2_11
REL8_2_12
REL8_2_13
REL8_2_14
REL8_2_15
REL8_2_16
REL8_2_17
REL8_2_18
REL8_2_19
REL8_2_2
REL8_2_20
REL8_2_21
REL8_2_22
REL8_2_23
REL8_2_3
REL8_2_4
REL8_2_5
REL8_2_6
REL8_2_7
REL8_2_8
REL8_2_9
REL8_2_BETA1
REL8_2_BETA2
REL8_2_BETA3
REL8_2_RC1
REL8_3_0
REL8_3_1
REL8_3_10
REL8_3_11
REL8_3_12
REL8_3_13
REL8_3_14
REL8_3_15
REL8_3_16
REL8_3_17
REL8_3_18
REL8_3_19
REL8_3_2
REL8_3_20
REL8_3_21
REL8_3_22
REL8_3_23
REL8_3_3
REL8_3_4
REL8_3_5
REL8_3_6
REL8_3_7
REL8_3_8
REL8_3_9
REL8_3_BETA1
REL8_3_BETA2
REL8_3_BETA3
REL8_3_BETA4
REL8_3_RC1
REL8_3_RC2
REL8_4_0
REL8_4_1
REL8_4_10
REL8_4_11
REL8_4_12
REL8_4_13
REL8_4_14
REL8_4_15
REL8_4_16
REL8_4_17
REL8_4_18
REL8_4_19
REL8_4_2
REL8_4_20
REL8_4_21
REL8_4_22
REL8_4_3
REL8_4_4
REL8_4_5
REL8_4_6
REL8_4_7
REL8_4_8
REL8_4_9
REL8_4_BETA1
REL8_4_BETA2
REL8_4_RC1
REL8_4_RC2
REL8_5_ALPHA1
REL8_5_ALPHA2
REL8_5_ALPHA3
REL9_0_0
REL9_0_1
REL9_0_10
REL9_0_11
REL9_0_12
REL9_0_13
REL9_0_14
REL9_0_15
REL9_0_16
REL9_0_17
REL9_0_18
REL9_0_19
REL9_0_2
REL9_0_20
REL9_0_21
REL9_0_22
REL9_0_23
REL9_0_3
REL9_0_4
REL9_0_5
REL9_0_6
REL9_0_7
REL9_0_8
REL9_0_9
REL9_0_ALPHA4
REL9_0_ALPHA5
REL9_0_BETA1
REL9_0_BETA2
REL9_0_BETA3
REL9_0_BETA4
REL9_0_RC1
REL9_1_0
REL9_1_1
REL9_1_10
REL9_1_11
REL9_1_12
REL9_1_13
REL9_1_14
REL9_1_15
REL9_1_16
REL9_1_17
REL9_1_18
REL9_1_19
REL9_1_2
REL9_1_20
REL9_1_21
REL9_1_22
REL9_1_23
REL9_1_24
REL9_1_3
REL9_1_4
REL9_1_5
REL9_1_6
REL9_1_7
REL9_1_8
REL9_1_9
REL9_1_ALPHA1
REL9_1_ALPHA2
REL9_1_ALPHA3
REL9_1_ALPHA4
REL9_1_ALPHA5
REL9_1_BETA1
REL9_1_BETA2
REL9_1_BETA3
REL9_1_RC1
REL9_2_0
REL9_2_1
REL9_2_10
REL9_2_11
REL9_2_12
REL9_2_13
REL9_2_14
REL9_2_15
REL9_2_16
REL9_2_17
REL9_2_18
REL9_2_19
REL9_2_2
REL9_2_20
REL9_2_21
REL9_2_22
REL9_2_23
REL9_2_24
REL9_2_3
REL9_2_4
REL9_2_5
REL9_2_6
REL9_2_7
REL9_2_8
REL9_2_9
REL9_2_BETA1
REL9_2_BETA2
REL9_2_BETA3
REL9_2_BETA4
REL9_2_RC1
REL9_3_0
REL9_3_1
REL9_3_10
REL9_3_11
REL9_3_12
REL9_3_13
REL9_3_14
REL9_3_15
REL9_3_16
REL9_3_17
REL9_3_18
REL9_3_19
REL9_3_2
REL9_3_20
REL9_3_21
REL9_3_22
REL9_3_23
REL9_3_24
REL9_3_25
REL9_3_3
REL9_3_4
REL9_3_5
REL9_3_6
REL9_3_7
REL9_3_8
REL9_3_9
REL9_3_BETA1
REL9_3_BETA2
REL9_3_RC1
REL9_4_0
REL9_4_1
REL9_4_10
REL9_4_11
REL9_4_12
REL9_4_13
REL9_4_14
REL9_4_15
REL9_4_16
REL9_4_17
REL9_4_18
REL9_4_19
REL9_4_2
REL9_4_20
REL9_4_21
REL9_4_22
REL9_4_23
REL9_4_24
REL9_4_25
REL9_4_26
REL9_4_3
REL9_4_4
REL9_4_5
REL9_4_6
REL9_4_7
REL9_4_8
REL9_4_9
REL9_4_BETA1
REL9_4_BETA2
REL9_4_BETA3
REL9_4_RC1
REL9_5_0
REL9_5_1
REL9_5_10
REL9_5_11
REL9_5_12
REL9_5_13
REL9_5_14
REL9_5_15
REL9_5_16
REL9_5_17
REL9_5_18
REL9_5_19
REL9_5_2
REL9_5_20
REL9_5_21
REL9_5_22
REL9_5_23
REL9_5_24
REL9_5_25
REL9_5_3
REL9_5_4
REL9_5_5
REL9_5_6
REL9_5_7
REL9_5_8
REL9_5_9
REL9_5_ALPHA1
REL9_5_ALPHA2
REL9_5_BETA1
REL9_5_BETA2
REL9_5_RC1
REL9_6_0
REL9_6_1
REL9_6_10
REL9_6_11
REL9_6_12
REL9_6_13
REL9_6_14
REL9_6_15
REL9_6_16
REL9_6_17
REL9_6_18
REL9_6_19
REL9_6_2
REL9_6_20
REL9_6_21
REL9_6_22
REL9_6_23
REL9_6_24
REL9_6_3
REL9_6_4
REL9_6_5
REL9_6_6
REL9_6_7
REL9_6_8
REL9_6_9
REL9_6_BETA1
REL9_6_BETA2
REL9_6_BETA3
REL9_6_BETA4
REL9_6_RC1
REL_10_0
REL_10_1
REL_10_10
REL_10_11
REL_10_12
REL_10_13
REL_10_14
REL_10_15
REL_10_16
REL_10_17
REL_10_18
REL_10_19
REL_10_2
REL_10_20
REL_10_21
REL_10_22
REL_10_23
REL_10_3
REL_10_4
REL_10_5
REL_10_6
REL_10_7
REL_10_8
REL_10_9
REL_10_BETA1
REL_10_BETA2
REL_10_BETA3
REL_10_BETA4
REL_10_RC1
REL_11_0
REL_11_1
REL_11_10
REL_11_11
REL_11_12
REL_11_13
REL_11_14
REL_11_15
REL_11_16
REL_11_17
REL_11_18
REL_11_19
REL_11_2
REL_11_20
REL_11_21
REL_11_22
REL_11_3
REL_11_4
REL_11_5
REL_11_6
REL_11_7
REL_11_8
REL_11_9
REL_11_BETA1
REL_11_BETA2
REL_11_BETA3
REL_11_BETA4
REL_11_RC1
REL_12_0
REL_12_1
REL_12_10
REL_12_11
REL_12_12
REL_12_13
REL_12_14
REL_12_15
REL_12_16
REL_12_17
REL_12_18
REL_12_19
REL_12_2
REL_12_20
REL_12_21
REL_12_22
REL_12_3
REL_12_4
REL_12_5
REL_12_6
REL_12_7
REL_12_8
REL_12_9
REL_12_BETA1
REL_12_BETA2
REL_12_BETA3
REL_12_BETA4
REL_12_RC1
REL_13_0
REL_13_1
REL_13_10
REL_13_11
REL_13_12
REL_13_13
REL_13_14
REL_13_15
REL_13_16
REL_13_17
REL_13_18
REL_13_19
REL_13_2
REL_13_20
REL_13_21
REL_13_22
REL_13_23
REL_13_3
REL_13_4
REL_13_5
REL_13_6
REL_13_7
REL_13_8
REL_13_9
REL_13_BETA1
REL_13_BETA2
REL_13_BETA3
REL_13_RC1
REL_14_0
REL_14_1
REL_14_10
REL_14_11
REL_14_12
REL_14_13
REL_14_14
REL_14_15
REL_14_16
REL_14_17
REL_14_18
REL_14_19
REL_14_2
REL_14_20
REL_14_3
REL_14_4
REL_14_5
REL_14_6
REL_14_7
REL_14_8
REL_14_9
REL_14_BETA1
REL_14_BETA2
REL_14_BETA3
REL_14_RC1
REL_15_0
REL_15_1
REL_15_10
REL_15_11
REL_15_12
REL_15_13
REL_15_14
REL_15_15
REL_15_2
REL_15_3
REL_15_4
REL_15_5
REL_15_6
REL_15_7
REL_15_8
REL_15_9
REL_15_BETA1
REL_15_BETA2
REL_15_BETA3
REL_15_BETA4
REL_15_RC1
REL_15_RC2
REL_16_0
REL_16_1
REL_16_10
REL_16_11
REL_16_2
REL_16_3
REL_16_4
REL_16_5
REL_16_6
REL_16_7
REL_16_8
REL_16_9
REL_16_BETA1
REL_16_BETA2
REL_16_BETA3
REL_16_RC1
REL_17_0
REL_17_1
REL_17_2
REL_17_3
REL_17_4
REL_17_5
REL_17_6
REL_17_7
REL_17_BETA1
REL_17_BETA2
REL_17_BETA3
REL_17_RC1
REL_18_0
REL_18_1
REL_18_BETA1
REL_18_BETA2
REL_18_BETA3
REL_18_RC1
Release_1_0_2
Release_2_0
Release_2_0_0
release-6-3
${ noResults }
185 Commits (b14e9ce7d55c75ffa160b07765eb9dffde70b5fa)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
b14e9ce7d5 |
Extend PgStat_HashKey.objid from 4 to 8 bytes
This opens the possibility to define keys for more types of statistics kinds in PgStat_HashKey, the first case being 8-byte query IDs for statistics like pg_stat_statements. This increases the size of PgStat_HashKey from 12 to 16 bytes, while PgStatShared_HashEntry, entry stored in the dshash for pgstats, keeps the same size due to alignment. xl_xact_stats_item, that tracks the stats items to drop in commit WAL records, is increased from 12 to 16 bytes. Note that individual chunks in commit WAL records should be multiples of sizeof(int), hence 8-byte object IDs are stored as two uint32, based on a suggestion from Heikki Linnakangas. While on it, the field of PgStat_HashKey is renamed from "objoid" to "objid", as for some stats kinds this field does not refer to OIDs but just IDs, like for replication slot stats. This commit bumps the following format variables: - PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file for non-serialized stats kinds in the dshash table. - XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item. - Catalog version, for the SQL function pg_stat_have_stats(). Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/ZsvTS9EW79Up8I62@paquier.xyz |
1 year ago |
|
|
2b5f57977f |
Add const qualifiers to XLogRegister*() functions
Add const qualifiers to XLogRegisterData() and XLogRegisterBufData(). Several unconstify() calls can be removed. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/dd889784-9ce7-436a-b4f1-52e4a5e577bd@eisentraut.org |
1 year ago |
|
|
402b586d0a |
Do not summarize WAL if generated with wal_level=minimal.
To do this, we must include the wal_level in the first WAL record covered by each summary file; so add wal_level to struct Checkpoint and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY. This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the Checkpoint is also stored in the control file, also PG_CONTROL_VERSION. It's not great to do that so late in the release cycle, but the alternative seems to ship v17 without robust protections against this scenario, which could result in corrupted incremental backups. A side effect of this patch is that, when a server with wal_level=replica is started with summarize_wal=on for the first time, summarization will no longer begin with the oldest WAL that still exists in pg_wal, but rather from the first checkpoint after that. This change should be harmless, because a WAL summary for a partial checkpoint cycle can never make an incremental backup possible when it would otherwise not have been. Report by Fujii Masao. Patch by me. Review and/or testing by Jakub Wartak and Fujii Masao. Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com |
1 year ago |
|
|
f83d709760 |
Merge prune, freeze and vacuum WAL record formats
The new combined WAL record is now used for pruning, freezing and 2nd pass of vacuum. This is in preparation for changing VACUUM to write a combined prune+freeze record per page, instead of separate two records. The new WAL record format now supports that, but the code still always writes separate records for pruning and freezing. This reserves separate XLOG_HEAP2_* info codes for when the pruning record is emitted for on-access pruning or VACUUM, per Peter Geoghegan's suggestion. The record format is identical, but having separate info codes makes it easier analyze pruning and vacuuming with pg_waldump. The function to emit the new WAL record, log_heap_prune_and_freeze(), is in pruneheap.c. The existing heap_log_freeze_plan() and its subroutines are moved to pruneheap.c without changes, to keep them together with log_heap_prune_and_freeze(). Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/CAAKRu_azf-zH%3DDgVbquZ3tFWjMY1w5pO8m-TXJaMdri8z3933g@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAAKRu_b2oE4GL%3Dq4g9mcByS9yT7wTQvEH9OLpabj28e%2BWKFi2A@mail.gmail.com |
2 years ago |
|
|
29275b1d17 |
Update copyright for 2024
Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12 |
2 years ago |
|
|
611806cd72 |
Add trailing commas to enum definitions
Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org |
2 years ago |
|
|
afd12774ae |
During online checkpoints, insert XLOG_CHECKPOINT_REDO at redo point.
This allows tools that read the WAL sequentially to identify (possible) redo points when they're reached, rather than only being able to detect them in retrospect when XLOG_CHECKPOINT_ONLINE is found, possibly much later in the WAL stream. There are other possible applications as well; see the discussion links below. Any redo location that precedes the checkpoint location should now point to an XLOG_CHECKPOINT_REDO record, so add a cross-check to verify this. While adjusting the code in CreateCheckPoint() for this patch, I made it call WALInsertLockAcquireExclusive a bit later than before, since there appears to be no need for it to be held while checking whether the system is idle, whether this is an end-of-recovery checkpoint, or what the current timeline is. Bump XLOG_PAGE_MAGIC. Patch by me, based in part on earlier work from Dilip Kumar. Review by Dilip Kumar, Amit Kapila, Andres Freund, and Michael Paquier. Discussion: http://postgr.es/m/CA+TgmoYy-Vc6G9QKcAKNksCa29cv__czr+N9X_QCxEfQVpp_8w@mail.gmail.com Discussion: http://postgr.es/m/20230614194717.jyuw3okxup4cvtbt%40awork3.anarazel.de Discussion: http://postgr.es/m/CA+hUKG+b2ego8=YNW2Ohe9QmSiReh1-ogrv8V_WZpJTqP3O+2w@mail.gmail.com |
2 years ago |
|
|
6af1793954 |
Add info in WAL records in preparation for logical slot conflict handling
This commit only implements one prerequisite part for allowing logical decoding. The commit message contains an explanation of the overall design, which later commits will refer back to. Overall design: 1. We want to enable logical decoding on standbys, but replay of WAL from the primary might remove data that is needed by logical decoding, causing error(s) on the standby. To prevent those errors, a new replication conflict scenario needs to be addressed (as much as hot standby does). 2. Our chosen strategy for dealing with this type of replication slot is to invalidate logical slots for which needed data has been removed. 3. To do this we need the latestRemovedXid for each change, just as we do for physical replication conflicts, but we also need to know whether any particular change was to data that logical replication might access. That way, during WAL replay, we know when there is a risk of conflict and, if so, if there is a conflict. 4. We can't rely on the standby's relcache entries for this purpose in any way, because the startup process can't access catalog contents. 5. Therefore every WAL record that potentially removes data from the index or heap must carry a flag indicating whether or not it is one that might be accessed during logical decoding. Why do we need this for logical decoding on standby? First, let's forget about logical decoding on standby and recall that on a primary database, any catalog rows that may be needed by a logical decoding replication slot are not removed. This is done thanks to the catalog_xmin associated with the logical replication slot. But, with logical decoding on standby, in the following cases: - hot_standby_feedback is off - hot_standby_feedback is on but there is no a physical slot between the primary and the standby. Then, hot_standby_feedback will work, but only while the connection is alive (for example a node restart would break it) Then, the primary may delete system catalog rows that could be needed by the logical decoding on the standby (as it does not know about the catalog_xmin on the standby). So, it’s mandatory to identify those rows and invalidate the slots that may need them if any. Identifying those rows is the purpose of this commit. Implementation: When a WAL replay on standby indicates that a catalog table tuple is to be deleted by an xid that is greater than a logical slot's catalog_xmin, then that means the slot's catalog_xmin conflicts with the xid, and we need to handle the conflict. While subsequent commits will do the actual conflict handling, this commit adds a new field isCatalogRel in such WAL records (and a new bit set in the xl_heap_visible flags field), that is true for catalog tables, so as to arrange for conflict handling. The affected WAL records are the ones that already contain the snapshotConflictHorizon field, namely: - gistxlogDelete - gistxlogPageReuse - xl_hash_vacuum_one_page - xl_heap_prune - xl_heap_freeze_page - xl_heap_visible - xl_btree_reuse_page - xl_btree_delete - spgxlogVacuumRedirect Due to this new field being added, xl_hash_vacuum_one_page and gistxlogDelete do now contain the offsets to be deleted as a FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment. It's not needed on the others struct where isCatalogRel has been added. This commit just introduces the WAL format changes mentioned above. Handling the actual conflicts will follow in future commits. Bumps XLOG_PAGE_MAGIC as the several WAL records are changed. Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> Author: Andres Freund <andres@anarazel.de> (in an older version) Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version) Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> |
3 years ago |
|
|
a6cd1fc692 |
Change xl_hash_vacuum_one_page.ntuples from int to uint16.
This will create two bytes of padding space in xl_hash_vacuum_one_page which can be used for future patches. This makes the datatype of xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which is advisable as both are used for the same purpose. Author: Bertrand Drouvot Reviewed-by: Nathan Bossart Discussion: https://postgr.es/m/b0e20c40-cb7a-fc1c-c607-2a78dac5021e@gmail.com |
3 years ago |
|
|
c8e1ba736b |
Update copyright for 2023
Backpatch-through: 11 |
3 years ago |
|
|
1eda3ce802 |
Mark argument of RegisterCustomRmgr() as const.
|
3 years ago |
|
|
9e5405993c |
Deduplicate freeze plans in freeze WAL records.
Make heapam WAL records that describe freezing performed by VACUUM more space efficient by storing each distinct "freeze plan" once, alongside an array of associated page offset numbers (one per freeze plan). The freeze plans required for most heap pages tend to naturally have a great deal of redundancy, so this technique is very effective in practice. It often leads to freeze WAL records that are less than 20% of the size of equivalent WAL records generated using the previous approach. The freeze plan concept was introduced by commit |
3 years ago |
|
|
507ba16b28 |
Convert macros to static inline functions (xlog_internal.h)
Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com |
3 years ago |
|
|
b0a55e4329 |
Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com |
4 years ago |
|
|
8ec569479f |
Apply PGDLLIMPORT markings broadly.
Up until now, we've had a policy of only marking certain variables in the PostgreSQL header files with PGDLLIMPORT, but now we've decided to mark them all. This means that extensions running on Windows should no longer operate at a disadvantage as compared to extensions running on Linux: if the variable is present in a header file, it should be accessible. Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com |
4 years ago |
|
|
2258e76f90 |
Add contrib/pg_walinspect.
Provides similar functionality to pg_waldump, but from a SQL interface rather than a separate utility. Author: Bharath Rupireddy Reviewed-by: Greg Stark, Kyotaro Horiguchi, Andres Freund, Ashutosh Sharma, Nitin Jadhav, RKN Sai Krishna Discussion: https://postgr.es/m/CALj2ACUGUYXsEQdKhEdsBzhGEyF3xggvLdD8C0VT72TNEfOiog%40mail.gmail.com |
4 years ago |
|
|
957aa4d87a |
Fix another buildfarm issue from commit 5c279a6d35.
Discussion: https://postgr.es/m/3280724.1649340315@sss.pgh.pa.us |
4 years ago |
|
|
5c279a6d35 |
Custom WAL Resource Managers.
Allow extensions to specify a new custom resource manager (rmgr), which allows specialized WAL. This is meant to be used by a Table Access Method or Index Access Method. Prior to this commit, only Generic WAL was available, which offers support for recovery and physical replication but not logical replication. Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com |
4 years ago |
|
|
8b1dccd37c |
pgstat: scaffolding for transactional stats creation / drop.
One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are *not* WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de |
4 years ago |
|
|
8e375ea4a0 |
Bump XLOG_PAGE_MAGIC due to the addition of wal_compression=zstd
While on it, fix a thinko in the docs, introduced by the same commit.
Oversights in
|
4 years ago |
|
|
7a5f6b4748 |
Make logical decoding a part of the rmgr.
Add a new rmgr method, rm_decode, and use that rather than a switch statement. In preparation for rmgr extensibility. Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com Discussion: https://postgr.es/m/20220118095332.6xtlcjoyxobv6cbk@jrouhaud |
4 years ago |
|
|
27b77ecf9f |
Update copyright for 2022
Backpatch-through: 10 |
4 years ago |
|
|
ff9f111bce
|
Fix WAL replay in presence of an incomplete record
Physical replication always ships WAL segment files to replicas once
they are complete. This is a problem if one WAL record is split across
a segment boundary and the primary server crashes before writing down
the segment with the next portion of the WAL record: WAL writing after
crash recovery would happily resume at the point where the broken record
started, overwriting that record ... but any standby or backup may have
already received a copy of that segment, and they are not rewinding.
This causes standbys to stop following the primary after the latter
crashes:
LOG: invalid contrecord length 7262 at A8/D9FFFBC8
because the standby is still trying to read the continuation record
(contrecord) for the original long WAL record, but it is not there and
it will never be. A workaround is to stop the replica, delete the WAL
file, and restart it -- at which point a fresh copy is brought over from
the primary. But that's pretty labor intensive, and I bet many users
would just give up and re-clone the standby instead.
A fix for this problem was already attempted in commit
|
4 years ago |
|
|
445e36ad4e |
Bump XLOG_PAGE_MAGIC for format changes related to FPW compression
Oversight in
|
5 years ago |
|
|
08aa89b326 |
Remove COMMIT_TS_SETTS record.
Commit |
5 years ago |
|
|
8523492d4e |
Remove tupgone special case from vacuumlazy.c.
Retry the call to heap_prune_page() in rare cases where there is
disagreement between the heap_prune_page() call and the call to
HeapTupleSatisfiesVacuum() that immediately follows. Disagreement is
possible when a concurrently-aborted transaction makes a tuple DEAD
during the tiny window between each step. This was the only case where
a tuple considered DEAD by VACUUM still had storage following pruning.
VACUUM's definition of dead tuples is now uniformly simple and
unambiguous: dead tuples from each page are always LP_DEAD line pointers
that were encountered just after we performed pruning (and just before
we considered freezing remaining items with tuple storage).
Eliminating the tupgone=true special case enables INDEX_CLEANUP=off
style skipping of index vacuuming that takes place based on flexible,
dynamic criteria. The INDEX_CLEANUP=off case had to know about skipping
indexes up-front before now, due to a subtle interaction with the
special case (see commit
|
5 years ago |
|
|
9f3665fbfc |
Don't consider newly inserted tuples in nbtree VACUUM.
Remove the entire idea of "stale stats" within nbtree VACUUM (stop caring about stats involving the number of inserted tuples). Also remove the vacuum_cleanup_index_scale_factor GUC/param on the master branch (though just disable them on postgres 13). The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM partially responsible for deciding when pg_class.reltuples stats needed to be updated. This seems contrary to the spirit of the index AM API, though -- it is not actually necessary for an index AM's bulk delete and cleanup callbacks to provide accurate stats when it happens to be inconvenient. The core code owns that. (Index AMs have the authority to perform or not perform certain kinds of deferred cleanup based on their own considerations, such as page deletion and recycling, but that has little to do with pg_class.reltuples/num_index_tuples.) This issue was fairly harmless until the introduction of the autovacuum_vacuum_insert_threshold feature by commit |
5 years ago |
|
|
e5d8a99903 |
Use full 64-bit XIDs in deleted nbtree pages.
Otherwise we risk "leaking" deleted pages by making them non-recyclable indefinitely. Commit |
5 years ago |
|
|
d168b66682 |
Enhance nbtree index tuple deletion.
Teach nbtree and heapam to cooperate in order to eagerly remove
duplicate tuples representing dead MVCC versions. This is "bottom-up
deletion". Each bottom-up deletion pass is triggered lazily in response
to a flood of versions on an nbtree leaf page. This usually involves a
"logically unchanged index" hint (these are produced by the executor
mechanism added by commit
|
5 years ago |
|
|
ca3b37487b |
Update copyright for 2021
Backpatch-through: 9.5 |
5 years ago |
|
|
676a9c3cc4 |
Correct several behavior descriptions in comments.
Reuse cautionary language from src/test/ssl/README in src/test/kerberos/README. SLRUs have had access to six-character segments names since commit |
5 years ago |
|
|
c55040ccd0 |
WAL Log invalidations at command end with wal_level=logical.
When wal_level=logical, write invalidations at command end into WAL so that decoding can use this information. This patch is required to allow the streaming of in-progress transactions in logical decoding. The actual work to allow streaming will be committed as a separate patch. We still add the invalidations to the cache and write them to WAL at commit time in RecordTransactionCommit(). This uses the existing XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource manager (see LogStandbyInvalidations for details). So existing code relying on those invalidations (e.g. redo) does not need to be changed. The invalidations written at command end uses a new xlog record type XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See LogLogicalInvalidations for details. These new xlog records are ignored by existing redo procedures, which still rely on the invalidations written to commit records. The invalidations are decoded and accumulated in top-transaction, and then executed during replay. This obviates the need to decode the invalidations as part of a commit record. Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS. Author: Dilip Kumar, Tomas Vondra, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma and Mahendra Singh Thalor Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com |
5 years ago |
|
|
0bead9af48 |
Immediately WAL-log subtransaction and top-level XID association.
The logical decoding infrastructure needs to know which top-level transaction the subxact belongs to, in order to decode all the changes. Until now that might be delayed until commit, due to the caching (GPROC_MAX_CACHED_SUBXIDS), preventing features requiring incremental decoding. So we also write the assignment info into WAL immediately, as part of the next WAL record (to minimize overhead) only when wal_level=logical. We can not remove the existing XLOG_XACT_ASSIGNMENT WAL as that is required for avoiding overflow in the hot standby snapshot. Bump XLOG_PAGE_MAGIC, since this introduces XLR_BLOCK_ID_TOPLEVEL_XID. Author: Tomas Vondra, Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma and Mahendra Singh Thalor Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com |
5 years ago |
|
|
a8aaa0c786
|
Morph pg_replication_slots.min_safe_lsn to safe_wal_size
The previous definition of the column was almost universally disliked, so provide this updated definition which is more useful for monitoring purposes: a large positive value is good, while zero or a negative value means danger. This should be operationally more convenient. Backpatch to 13, where the new column to pg_replication_slots (and the feature it represents) were added. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com |
6 years ago |
|
|
c6b92041d3 |
Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN. Future servers accept older WAL, so this bump is discretionary. Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org |
6 years ago |
|
|
616ae3d2b0 |
Move routine definitions of xlogarchive.c to a new header file
The definitions of the routines defined in xlogarchive.c have been part of xlog_internal.h which is included by several frontend tools, but all those routines are only called by the backend. More cleanup could be done within xlog_internal.h, but that's already a nice cut. This will help a follow-up patch for pg_rewind where handling of restore_command is added for frontends. Author: Alexey Kondratov, Michael Paquier Reviewed-by: Álvaro Herrera, Alexander Korotkov Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru |
6 years ago |
|
|
0d861bbb70 |
Add deduplication to nbtree.
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method. The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs. Deduplication is only applied at
the point where a leaf page split would otherwise be required. New
posting list tuples are formed by merging together existing duplicate
tuples. The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed. Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.
The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach. Most individual inserts of index tuples have
exactly the same overhead as before. The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits. The key space of indexes works in the same way as it has
since commit
|
6 years ago |
|
|
d2e5e20e57 |
Add xl_btree_delete optimization.
Commit
|
6 years ago |
|
|
7559d8ebfa |
Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4 |
6 years ago |
|
|
9f83468b35 |
Remove unneeded "pin scan" nbtree VACUUM code.
The REDO routine for nbtree's xl_btree_vacuum record type hasn't performed a "pin scan" since commit |
6 years ago |
|
|
9989d37d1c |
Remove XLogFileNameP() from the tree
XLogFileNameP() is a wrapper routine able to build a palloc'd string for a WAL segment name, which is used for error string generation. There were several code paths where it gets called in a critical section, where memory allocation is not allowed. This results in triggering an assertion failure instead of generating the wanted error message. Another, more annoying, problem is that if the allocation to generate the WAL segment name fails on OOM, then the failure would be escalated to a PANIC. This removes the routine and all its callers are replaced with a logic using a fixed-size buffer. This way, all the existing mistakes are fixed and future ones are prevented. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Álvaro Herrera Discussion: https://postgr.es/m/CA+fd4k5gC9H4uoWMLg9K_QfNrnkkdEw+-AFveob9YX7z8JnKTA@mail.gmail.com |
6 years ago |
|
|
43a54a3bcc |
Bump WAL version.
Oversight in commit
|
6 years ago |
|
|
c96581abe4 |
Fix inconsistencies and typos in the tree, take 11
This fixes various typos in docs and comments, and removes some orphaned definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com |
6 years ago |
|
|
0896ae561b |
Fix inconsistencies and typos in the tree
This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com |
7 years ago |
|
|
8255c7a5ee |
Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com |
7 years ago |
|
|
be76af171c |
Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us |
7 years ago |
|
|
b5f58cf213 |
Convert gist to compute page level xid horizon on primary.
Due to parallel development, gist added the missing conflict information in |
7 years ago |
|
|
558a9165e0 |
Compute XID horizon for page level index vacuum on primary.
Previously the xid horizon was only computed during WAL replay. That
had two major problems:
1) It relied on knowing what the table pointed to looks like. That was
easy enough before the introducing of tableam (we knew it had to be
heap, although some trickery around logging the heap relfilenodes
was required). But to properly handle table AMs we need
per-database catalog access to look up the AM handler, which
recovery doesn't allow.
2) Not knowing the xid horizon also makes it hard to support logical
decoding on standbys. When on a catalog table, we need to be able
to conflict with slots that have an xid horizon that's too old. But
computing the horizon by visiting the heap only works once
consistency is reached, but we always need to be able to detect
conflicts.
There's also a secondary problem, in that the current method performs
redundant work on every standby. But that's counterbalanced by
potentially computing the value when not necessary (either because
there's no standby, or because there's no connected backends).
Solve 1) and 2) by moving computation of the xid horizon to the
primary and by involving tableam in the computation of the horizon.
To address the potentially increased overhead, increase the efficiency
of the xid horizon computation for heap by sorting the tids, and
eliminating redundant buffer accesses. When prefetching is available,
additionally perform prefetching of buffers. As this is more of a
maintenance task, rather than something routinely done in every read
only query, we add an arbitrary 10 to the effective concurrency -
thereby using IO concurrency, when not globally enabled. That's
possibly not the perfect formula, but seems good enough for now.
Bumps WAL format, as latestRemovedXid is now part of the records, and
the heap's relfilenode isn't anymore.
Author: Andres Freund, Amit Khandekar, Robert Haas
Reviewed-By: Robert Haas
Discussion:
https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
https://postgr.es/m/20181214014235.dal5ogljs3bmlq44@alap3.anarazel.de
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
|
7 years ago |
|
|
ea92368cd1 |
Move max_wal_senders out of max_connections for connection slot handling
Since its introduction, max_wal_senders is counted as part of max_connections when it comes to define how many connection slots can be used for replication connections with a WAL sender context. This can lead to confusion for some users, as it could be possible to block a base backup or replication from happening because other backend sessions are already taken for other purposes by an application, and superuser-only connection slots are not a correct solution to handle that case. This commit makes max_wal_senders independent of max_connections for its handling of PGPROC entries in ProcGlobal, meaning that connection slots for WAL senders are handled using their own free queue, like autovacuum workers and bgworkers. One compatibility issue that this change creates is that a standby now requires to have a value of max_wal_senders at least equal to its primary. So, if a standby created enforces the value of max_wal_senders to be lower than that, then this could break failovers. Normally this should not be an issue though, as any settings of a standby are inherited from its primary as postgresql.conf gets normally copied as part of a base backup, so parameters would be consistent. Author: Alexander Kukushkin Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii Kliukin Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com |
7 years ago |
|
|
97c39498e5 |
Update copyright for 2019
Backpatch-through: certain files through 9.4 |
7 years ago |