@@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.84 2006/09/15 21:55:07 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.85 2006/09/15 22:02:21 momjian Exp $ -->

<chapter id="backup">
<title>Backup and Restore</title>
@@ -1203,6 +1203,312 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
</sect2>
</sect1>

<sect1 id="warm-standby">
<title>Warm Standby Servers for High Availability</title>

<indexterm zone="backup">
<primary>Warm Standby</primary>
</indexterm>

<indexterm zone="backup">
<primary>PITR Standby</primary>
</indexterm>

<indexterm zone="backup">
<primary>Standby Server</primary>
</indexterm>

<indexterm zone="backup">
<primary>Log Shipping</primary>
</indexterm>

<indexterm zone="backup">
<primary>Witness Server</primary>
</indexterm>

<indexterm zone="backup">
<primary>STONITH</primary>
</indexterm>

<indexterm zone="backup">
<primary>High Availability</primary>
</indexterm>

<para>
Continuous Archiving can be used to create a High Availability (HA)
cluster configuration with one or more Standby Servers ready to take
over operations in the case that the Primary Server fails. This
capability is more widely known as Warm Standby Log Shipping.
</para>

<para>
The Primary and Standby Servers work together to provide this capability,
though the servers are only loosely coupled. The Primary Server operates
in Continuous Archiving mode, while the Standby Server operates in a
continuous Recovery mode, reading the WAL files from the Primary. No
changes to the database tables are required to enable this capability,
so it offers a low administration overhead in comparison with other
replication approaches. This configuration also has a very low
performance impact on the Primary Server.
</para>

<para>
Directly moving WAL or "log" records from one database server to another
is typically described as Log Shipping. PostgreSQL implements file-based
Log Shipping, meaning WAL records are batched one file at a time. WAL
files can be shipped easily and cheaply over any distance, whether it be
to an adjacent system, another system on the same site or another system
on the far side of the globe. The bandwidth required for this technique
varies according to the transaction rate of the Primary Server.
Record-based Log Shipping is also possible with custom-developed
procedures, discussed in a later section. Future developments are likely
to include options for synchronous and/or integrated record-based log
shipping.
</para>

<para>
It should be noted that the log shipping is asynchronous, i.e. the WAL
records are shipped after transaction commit. As a result there can be a
small window of data loss, should the Primary Server suffer a
catastrophic failure. The window of data loss is minimised by the use of
the archive_timeout parameter, which can be set as low as a few seconds
if required. A very low setting can increase the bandwidth requirements
for file shipping.
</para>
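
<para>
For illustration only, a postgresql.conf fragment on the Primary might
combine archiving with a short archive_timeout. The archive path shown is
an assumption, not a requirement; any location the Standby can read will
serve:
<programlisting>
archive_command = 'cp %p /mnt/standby/archivedir/%f'   # ship each completed WAL segment
archive_timeout = 60                                   # force a segment switch at least once a minute
</programlisting>
</para>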

<para>
The Standby server is not available for access, since it is continually
performing recovery processing. Recovery performance is sufficiently
good that the Standby will typically be only minutes away from full
availability once it has been activated. As a result, we refer to this
capability as a Warm Standby configuration that offers High
Availability. Restoring a server from an archived base backup and
rollforward can take considerably longer and so that technique only
really offers a solution for Disaster Recovery, not HA.
</para>

<para>
Other mechanisms for High Availability replication are available, both
commercially and as open-source software.
</para>

<para>
In general, log shipping between servers running different release
levels will not be possible. It is the policy of the PostgreSQL Global
Development Group not to make changes to disk formats during minor release
upgrades, so it is likely that running different minor release levels
on Primary and Standby servers will work successfully. However, no
formal support for that is offered and you are advised not to allow this
to occur over long periods.
</para>

<sect2 id="warm-standby-planning">
<title>Planning</title>

<para>
On the Standby server all tablespaces and paths will refer to similarly
named mount points, so it is important to create the Primary and Standby
servers so that they are as similar as possible, at least from the
perspective of the database server. Furthermore, any CREATE TABLESPACE
commands will be passed across as-is, so any new mount points must be
created on both servers before they are used on the Primary. Hardware
need not be the same, but experience shows that maintaining two
identical systems is easier than maintaining two dissimilar ones over
the whole lifetime of the application and system.
</para>
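
<para>
For example, a new tablespace requires the same directory to exist on both
systems before the command is run on the Primary. The path and tablespace
name below are purely illustrative:
<programlisting>
# on both Primary and Standby, as the postgres operating system user:
mkdir /mnt/disk2/pg_tblspc

# then, on the Primary only:
psql -c "CREATE TABLESPACE spare LOCATION '/mnt/disk2/pg_tblspc'"
</programlisting>
</para>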

<para>
There is no special mode required to enable a Standby server. The
operations that occur on both Primary and Standby servers are entirely
normal continuous archiving and recovery tasks. The primary point of
contact between the two database servers is the archive of WAL files
that both share: Primary writing to the archive, Standby reading from
the archive. Care must be taken to ensure that WAL archives for separate
servers do not become mixed together or confused.
</para>

<para>
The magic that makes the two loosely coupled servers work together is
simply a restore_command that waits for the next WAL file to be archived
from the Primary. The restore_command is specified in the recovery.conf
file on the Standby Server. Normal recovery processing would request a
file from the WAL archive, causing an error if the file was unavailable.
For Standby processing it is normal for the next file to be unavailable,
so we must be patient and wait for it to appear. A waiting
restore_command can be written as a custom script that loops after
polling for the existence of the next WAL file. There must also be some
way to trigger failover, which should interrupt the restore_command,
break the loop and return a file not found error to the Standby Server.
This ends recovery, and the Standby will then come up as a normal
server.
</para>

<para>
Sample code for the C version of the restore_command would be:
<programlisting>
triggered = false;
while (!NextWALFileReady() &amp;&amp; !triggered)
{
    usleep(100000L);    /* wait for ~0.1 sec */
    if (CheckForExternalTrigger())
        triggered = true;
}
if (!triggered)
    CopyWALFileForRecovery();
</programlisting>
</para>
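
<para>
The same logic can be sketched as a shell script suitable for use as the
restore_command. The script name wait_for_wal.sh, the archive location
/mnt/server/archivedir and the trigger file /tmp/pgsql.trigger are all
illustrative assumptions, not part of the distribution:
<programlisting>
#!/bin/sh
# invoked by the server as: wait_for_wal.sh %f %p
WALFILE="/mnt/server/archivedir/$1"     # next WAL file requested by recovery
DESTINATION="$2"                        # where the server wants it copied
TRIGGER="/tmp/pgsql.trigger"            # created externally to force failover

while true
do
    if [ -f "$TRIGGER" ] ; then
        exit 1                          # "file not found": recovery ends, Standby comes up
    fi
    if [ -f "$WALFILE" ] ; then
        cp "$WALFILE" "$DESTINATION"
        exit $?                         # report success or failure of the copy
    fi
    sleep 1                             # wait before polling again
done
</programlisting>
A production script would also need to ensure that an archived file is
complete before copying it, for example by having the archive_command write
to a temporary name and rename the file into place only once fully written.
</para>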

<para>
PostgreSQL does not provide the system software required to identify a
failure on the Primary and notify the Standby system and then the
Standby database server. Many such tools exist and are well integrated
with other aspects of a system failover, such as IP address migration.
</para>

<para>
Triggering failover is an important part of planning and design. The
restore_command is executed in full once for each WAL file. The process
running the restore_command is therefore created and dies for each file,
so there is no daemon or server process, and so we cannot use signals and
a signal handler. A more permanent notification is required to trigger
the failover. It is possible to use a simple timeout facility,
especially if used in conjunction with a known archive_timeout setting
on the Primary. This is somewhat error prone since a network problem or
a busy Primary server might be sufficient to initiate failover. A
notification mechanism such as the explicit creation of a trigger file
is less error prone, if this can be arranged.
</para>
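
<para>
With a trigger-file arrangement such as the one sketched above, failover is
initiated simply by creating the agreed file on the Standby system, for
example (using the same illustrative path as before):
<programlisting>
touch /tmp/pgsql.trigger
</programlisting>
</para>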
</sect2>

<sect2 id="warm-standby-config">
<title>Implementation</title>

<para>
The short procedure for configuring a Standby Server is as follows. For
full details of each step, refer to previous sections as noted.
<orderedlist>
<listitem>
<para>
Set up Primary and Standby systems as near identically as possible,
including two identical copies of PostgreSQL at the same release level.
</para>
</listitem>
<listitem>
<para>
Set up Continuous Archiving from the Primary to a WAL archive located
in a directory on the Standby Server. Ensure that both <xref
linkend="guc-archive-command"> and <xref linkend="guc-archive-timeout">
are set. (See <xref linkend="backup-archiving-wal">)
</para>
</listitem>
<listitem>
<para>
Make a Base Backup of the Primary Server. (See <xref
linkend="backup-base-backup">)
</para>
</listitem>
<listitem>
<para>
Begin recovery on the Standby Server from the local WAL archive,
using a recovery.conf that specifies a restore_command that waits as
described previously; a sketch appears below. (See <xref
linkend="backup-pitr-recovery">)
</para>
</listitem>
</orderedlist>
</para>
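
<para>
A minimal recovery.conf for the final step might therefore contain nothing
more than the following, where the script location is an illustrative
assumption and the script is the waiting sketch shown earlier:
<programlisting>
restore_command = '/var/lib/pgsql/wait_for_wal.sh %f %p'
</programlisting>
</para>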

<para>
Recovery treats the WAL Archive as read-only, so once a WAL file has
been copied to the Standby system it can be copied to tape at the same
time as it is being used by the Standby database server to recover.
Thus, running a Standby Server for High Availability can be performed at
the same time as files are stored for longer term Disaster Recovery
purposes.
</para>

<para>
For testing purposes, it is possible to run both Primary and Standby
servers on the same system. This does not provide any worthwhile
improvement on server robustness, nor would it be described as HA.
</para>
</sect2>

<sect2 id="warm-standby-failover">
<title>Failover</title>

<para>
If the Primary Server fails then the Standby Server should begin
failover procedures.
</para>

<para>
If the Standby Server fails then no failover need take place. If the
Standby Server can be restarted, then the recovery process can also be
immediately restarted, taking advantage of Restartable Recovery.
</para>

<para>
If the Primary Server fails and then immediately restarts, you must have
a mechanism for informing it that it is no longer the Primary. This is
sometimes known as STONITH (Shoot The Other Node In The Head), which is
necessary to avoid situations where both systems think they are the
Primary, which can lead to confusion and ultimately data loss.
</para>

<para>
Many failover systems use just two systems, the Primary and the Standby,
connected by some kind of heartbeat mechanism to continually verify the
connectivity between the two and the viability of the Primary. It is
also possible to use a third system, known as a Witness Server, to avoid
some problems of inappropriate failover, but the additional complexity
may not be worthwhile unless it is set up with sufficient care and
rigorous testing.
</para>

<para>
At the instant that failover takes place to the Standby, we have only a
single server in operation. This is known as a degenerate state.
The former Standby is now the Primary, but the former Primary is down
and may stay down. We must now fully re-create a Standby server,
either on the former Primary system when it comes up, or on a third,
possibly new, system. Once complete, the Primary and Standby can be
considered to have switched roles. Some people choose to use a third
server to provide additional protection across the failover interval,
though clearly this complicates the system configuration and
operational processes (and this can also act as a Witness Server).
</para>

<para>
So, switching from Primary to Standby Server can be fast, but requires
some time to re-prepare the failover cluster. Regular switching from
Primary to Standby is encouraged, since it allows the regular downtime
on each system required to maintain HA. This also acts as a test of the
failover so that it definitely works when you really need it. Written
administration procedures are advised.
</para>
</sect2>

<sect2 id="warm-standby-record">
<title>Implementing Record-based Log Shipping</title>

<para>
The main features for Log Shipping in this release are based around the
file-based Log Shipping described above. It is also possible to
implement record-based Log Shipping using the pg_xlogfile_name_offset()
function, though this requires custom development.
</para>

<para>
An external program can call pg_xlogfile_name_offset() to find out the
filename and the exact byte offset within it of the latest WAL pointer.
If the external program regularly polls the server, it can find out how
far forward the pointer has moved. It can then access the WAL file
directly and copy those bytes across to a less up-to-date copy on a
Standby Server.
</para>
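
<para>
A sketch of such a poll, using the functions described above; the use of
psql here, and the copying program around it, are illustrative assumptions:
<programlisting>
psql -At -c "SELECT * FROM pg_xlogfile_name_offset(pg_current_xlog_location())"
</programlisting>
The copying program would compare the file name and byte offset returned by
each poll with the values from the previous poll, and then copy only the
newly written portion of that WAL file to the Standby.
</para>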
</sect2>
</sect1>

<sect1 id="migration">
<title>Migration Between Releases</title>