postgres/src/include/replication/slot.h

/*-------------------------------------------------------------------------
* slot.h
* Replication slot management.
*
* Copyright (c) 2012-2025, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#ifndef SLOT_H
#define SLOT_H
#include "access/xlog.h"
#include "access/xlogreader.h"
#include "storage/condition_variable.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "storage/spin.h"
#include "replication/walreceiver.h"
/* directory to store replication slot data in */
#define PG_REPLSLOT_DIR "pg_replslot"
/*
* Behaviour of replication slots, upon release or crash.
*
* Slots marked as PERSISTENT are crash-safe and will not be dropped when
* released. Slots marked as EPHEMERAL will be dropped when released or after
* restarts. Slots marked TEMPORARY will be dropped at the end of a session
* or on error.
*
* EPHEMERAL is used as a not-quite-ready state when creating persistent
* slots. EPHEMERAL slots can be made PERSISTENT by calling
* ReplicationSlotPersist(). For a slot that goes away at the end of a
* session, TEMPORARY is the appropriate choice.
*/
typedef enum ReplicationSlotPersistency
{
    RS_PERSISTENT,
    RS_EPHEMERAL,
    RS_TEMPORARY,
} ReplicationSlotPersistency;
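/*
 * Illustrative sketch (assumed usage, not part of this header): per the
 * comment above, a crash-safe slot is typically created as RS_EPHEMERAL and
 * promoted only once it is fully set up, so a failure midway through
 * creation leaves nothing behind.  Using the declarations later in this
 * file (the slot name here is hypothetical), the sequence looks roughly
 * like:
 *
 *     ReplicationSlotCreate("myslot", true, RS_EPHEMERAL,
 *                           false, false, false);
 *     ... finish initializing the slot ...
 *     ReplicationSlotPersist();     becomes RS_PERSISTENT, crash-safe
 */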
/*
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
* When adding a new invalidation cause here, remember to update
* SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
*/
typedef enum ReplicationSlotInvalidationCause
{
    RS_INVAL_NONE,
    /* required WAL has been removed */
    RS_INVAL_WAL_REMOVED,
    /* required rows have been removed */
    RS_INVAL_HORIZON,
    /* wal_level insufficient for slot */
    RS_INVAL_WAL_LEVEL,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
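/*
 * Illustrative sketch (assumed reason string, not part of this header):
 * SlotInvalidationCauses maps each ReplicationSlotInvalidationCause to its
 * user-facing name, and GetSlotInvalidationCause(), declared further below,
 * performs the reverse lookup, e.g.
 *
 *     if (GetSlotInvalidationCause("wal_removed") == RS_INVAL_WAL_REMOVED)
 *         ...
 */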
/*
* On-Disk data of a replication slot, preserved across restarts.
*/
typedef struct ReplicationSlotPersistentData
{
    /* The slot's identifier */
    NameData name;

    /* database the slot is active on */
    Oid database;

    /*
     * The slot's behaviour when being dropped (or restored after a crash).
     */
    ReplicationSlotPersistency persistency;

    /*
     * xmin horizon for data
     *
     * NB: This may represent a value that hasn't been written to disk yet;
     * see notes for effective_xmin, below.
     */
    TransactionId xmin;

    /*
     * xmin horizon for catalog tuples
     *
     * NB: This may represent a value that hasn't been written to disk yet;
     * see notes for effective_xmin, below.
     */
    TransactionId catalog_xmin;

    /* oldest LSN that might be required by this replication slot */
    XLogRecPtr restart_lsn;

    /* RS_INVAL_NONE if valid, or the reason for having been invalidated */
    ReplicationSlotInvalidationCause invalidated;

    /*
     * Oldest LSN that the client has acked receipt for. This is used as the
     * start_lsn point in case the client doesn't specify one, and also as a
     * safety measure to jump forwards in case the client specifies a
     * start_lsn that's further in the past than this value.
     */
    XLogRecPtr confirmed_flush;

    /*
     * LSN at which we enabled two_phase commit for this slot or LSN at which
     * we found a consistent point at the time of slot creation.
     */
    XLogRecPtr two_phase_at;

    /*
     * Allow decoding of prepared transactions?
     */
    bool two_phase;

    /* plugin name */
    NameData plugin;
    /*
     * Was this slot synchronized from the primary server?
     */
    bool synced;
    /*
     * Is this a failover slot (sync candidate for standbys)? Only relevant
     * for logical slots on the primary server.
     */
    bool failover;
} ReplicationSlotPersistentData;
/*
* Shared memory state of a single replication slot.
*
* The in-memory data of replication slots follows a locking model based
* on two linked concepts:
* - A replication slot's in_use flag is switched when the slot is added or
* discarded. The backend owning the slot and performing the operation must
* hold the LWLock ReplicationSlotControlLock in exclusive mode while
* updating the flag, while readers (concurrent backends not owning the
* slot) need to hold it in shared mode when looking at replication slot data.
* - Individual fields are protected by mutex where only the backend owning
* the slot is authorized to update the fields from its own slot. The
* backend owning the slot does not need to take this lock when reading its
* own fields, while concurrent backends not owning this slot should take the
* lock when reading this slot's data.
*/
typedef struct ReplicationSlot
{
    /* lock, on same cacheline as effective_xmin */
    slock_t mutex;

    /* is this slot defined */
    bool in_use;

    /* Who is streaming out changes for this slot? 0 in unused slots. */
    pid_t active_pid;

    /* any outstanding modifications? */
    bool just_dirtied;
    bool dirty;

    /*
     * For logical decoding, it's extremely important that we never remove any
     * data that's still needed for decoding purposes, even after a crash;
     * otherwise, decoding will produce wrong answers. Ordinary streaming
     * replication also needs to prevent old row versions from being removed
     * too soon, but the worst consequence we might encounter there is
     * unwanted query cancellations on the standby. Thus, for logical
     * decoding, this value represents the latest xmin that has actually been
     * written to disk, whereas for streaming replication, it's just the same
     * as the persistent value (data.xmin).
     */
    TransactionId effective_xmin;
    TransactionId effective_catalog_xmin;

    /* data surviving shutdowns and crashes */
    ReplicationSlotPersistentData data;

    /* is somebody performing io on this slot? */
    LWLock io_in_progress_lock;

    /* Condition variable signaled when active_pid changes */
    ConditionVariable active_cv;

    /* all the remaining data is only used for logical slots */

    /*
     * When the client has confirmed flushes >= candidate_xmin_lsn we can
     * advance the catalog xmin. When restart_valid has been passed,
     * restart_lsn can be increased.
     */
    TransactionId candidate_catalog_xmin;
    XLogRecPtr candidate_xmin_lsn;
    XLogRecPtr candidate_restart_valid;
    XLogRecPtr candidate_restart_lsn;

    /*
     * This value tracks the last confirmed_flush LSN flushed, which is used
     * during a shutdown checkpoint to decide whether a logical slot's data
     * should be forcibly flushed or not.
     */
    XLogRecPtr last_saved_confirmed_flush;
    /*
     * The time when the slot became inactive. For synced slots on a standby
     * server, it represents the time when slot synchronization was most
     * recently stopped.
     */
    TimestampTz inactive_since;
} ReplicationSlot;
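/*
 * Illustrative sketch (assumed usage, not part of this header): following
 * the locking model described above, a backend that does not own a slot
 * takes the slot's spinlock before reading fields the owner may be
 * updating, and copies out what it needs:
 *
 *     XLogRecPtr restart_lsn;
 *
 *     SpinLockAcquire(&slot->mutex);
 *     restart_lsn = slot->data.restart_lsn;
 *     SpinLockRelease(&slot->mutex);
 *
 * The owning backend may read its own fields without the spinlock, but
 * still takes it when modifying them.
 */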
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
#define SlotIsLogical(slot) ((slot)->data.database != InvalidOid)
/*
* Shared memory control area for all of replication slots.
*/
typedef struct ReplicationSlotCtlData
{
    /*
     * This array should be declared [FLEXIBLE_ARRAY_MEMBER], but for some
     * reason you can't do that in an otherwise-empty struct.
     */
    ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
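/*
 * Illustrative sketch (assumed usage, not part of this header): all slots
 * live in this single shared array, sized according to the
 * max_replication_slots GUC, so a scan over them looks roughly like:
 *
 *     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
 *     for (int i = 0; i < max_replication_slots; i++)
 *     {
 *         ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
 *
 *         if (!s->in_use)
 *             continue;
 *         ...
 *     }
 *     LWLockRelease(ReplicationSlotControlLock);
 */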
/*
* Set slot's inactive_since property unless it was previously invalidated.
*/
static inline void
ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz ts,
                                bool acquire_lock)
{
    if (acquire_lock)
        SpinLockAcquire(&s->mutex);

    if (s->data.invalidated == RS_INVAL_NONE)
        s->inactive_since = ts;

    if (acquire_lock)
        SpinLockRelease(&s->mutex);
}
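/*
 * Illustrative sketch (assumed caller, not part of this header): callers
 * that already hold the slot's spinlock pass acquire_lock = false to avoid
 * re-acquiring it; others let the helper take the lock itself, e.g.
 *
 *     ReplicationSlotSetInactiveSince(slot, GetCurrentTimestamp(), true);
 */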
/*
* Pointers to shared memory
*/
extern PGDLLIMPORT ReplicationSlotCtlData *ReplicationSlotCtl;
extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
extern void ReplicationSlotsShmemInit(void);
/* management of individual slots */
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
bool synced);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
extern void ReplicationSlotAcquire(const char *name, bool nowait,
bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
extern void ReplicationSlotMarkDirty(void);
/* misc stuff */
extern void ReplicationSlotInitialize(void);
extern bool ReplicationSlotValidateName(const char *name, int elevel);
extern void ReplicationSlotReserveWal(void);
extern void ReplicationSlotsComputeRequiredXmin(bool already_locked);
extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
extern void ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char *syncslotname, Size szslot);
extern void ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn, char *slotname, bool missing_ok);
extern void StartupReplicationSlots(void);
extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn);
#endif /* SLOT_H */