mirror of https://github.com/postgres/postgres
Some wording changes from Vadim's original text doc. Processes cleanly, but may need fixup.
REL6_5_PATCHES
parent
bb0fc46a90
commit
0807dbb294
@ -0,0 +1,545 @@ |
|||||||
|
<chapter id="mvcc"> |
||||||
|
<title>Multi-Version Concurrency Control</title> |
||||||
|
|
||||||
|
<abstract> |
||||||
|
<para> |
||||||
|
Multi-Version Concurrency Control |
||||||
|
(MVCC) |
||||||
|
is an advanced technique for improving database performance in a |
||||||
|
multi-user environment. |
||||||
|
<ulink url="mailto:vadim@krs.ru">Vadim Mikheev</ulink> provided |
||||||
|
the implementation for <productname>Postgres</productname>. |
||||||
|
</para> |
||||||
|
</abstract> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Introduction</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
Unlike most other database systems, which use locks for concurrency control,
<productname>Postgres</productname>
maintains data consistency by using a multiversion model.
This means that while querying a database, each transaction sees
a snapshot of the data (a <firstterm>database version</firstterm>)
as it was some time ago, regardless of the current state of the underlying data.
This protects the transaction from viewing inconsistent data that
could be caused by updates from other concurrent transactions to the same
data rows, providing <firstterm>transaction isolation</firstterm>
for each database session.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
The main difference between the multiversion and lock models is that,
in MVCC, locks acquired for querying (reading) data don't conflict
with locks acquired for writing data, and so reading never blocks
writing and writing never blocks reading.
||||||
|
</para> |
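<para>
The snapshot idea described above can be illustrated with a small model.
This is only a sketch under simplifying assumptions: the names
<literal>RowVersion</literal>, <literal>visible</literal> and <literal>read</literal>
are hypothetical, commit times are plain integers, and nothing here reflects
how <productname>Postgres</productname> actually stores row versions.
</para>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    """One stored version of a row (hypothetical, simplified model)."""
    value: str
    created_at: int                   # logical time the creating transaction committed
    deleted_at: Optional[int] = None  # logical time the replacing transaction committed


def visible(version: RowVersion, snapshot_time: int) -> bool:
    """A version is visible if it was committed by the time of the snapshot
    and had not yet been superseded at that time."""
    created = snapshot_time >= version.created_at
    superseded = version.deleted_at is not None and snapshot_time >= version.deleted_at
    return created and not superseded


def read(versions: List[RowVersion], snapshot_time: int) -> Optional[str]:
    """Return the value that a reader with the given snapshot sees, if any."""
    for version in versions:
        if visible(version, snapshot_time):
            return version.value
    return None


# A writer replaces "old" with "new" at logical time 10. Both versions are
# kept, so neither the reader nor the writer has to wait for the other.
history = [
    RowVersion("old", created_at=1, deleted_at=10),
    RowVersion("new", created_at=10),
]

print(read(history, snapshot_time=5))   # snapshot taken before the update
print(read(history, snapshot_time=12))  # snapshot taken after the update
```

<para>
In this model a reader whose snapshot predates the update keeps seeing the
old version, which is the sense in which reading never blocks writing.
</para>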
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Transaction Isolation</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
The <acronym>ANSI</acronym>/<acronym>ISO</acronym> <acronym>SQL</acronym> |
||||||
|
standard defines four levels of transaction |
||||||
|
isolation in terms of three phenomena that must be prevented |
||||||
|
between concurrent transactions. |
||||||
|
These undesirable phenomena are: |
||||||
|
|
||||||
|
<variablelist> |
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
dirty reads |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
A transaction reads data written by a concurrent uncommitted transaction.
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
non-repeatable reads |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
A transaction re-reads data it has previously read and finds that the data
has been modified by another transaction (that committed since the initial read).
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
phantom reads
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
A transaction re-executes a query returning a set of rows that satisfy a
search condition and finds that additional rows satisfying the condition
have been inserted by another committed transaction.
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
</variablelist> |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Accordingly, the four isolation levels are defined to be: |
||||||
|
|
||||||
|
<segmentedlist> |
||||||
|
<segtitle> |
||||||
|
Isolation Level |
||||||
|
</segtitle> |
||||||
|
<segtitle> |
||||||
|
Dirty Read |
||||||
|
</segtitle> |
||||||
|
<segtitle> |
||||||
|
Non-Repeatable Read |
||||||
|
</segtitle> |
||||||
|
<segtitle> |
||||||
|
Phantom Read |
||||||
|
</segtitle> |
||||||
|
<seglistitem> |
||||||
|
<seg> |
||||||
|
Read uncommitted |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
</seglistitem> |
||||||
|
|
||||||
|
<seglistitem> |
||||||
|
<seg> |
||||||
|
Read committed |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
</seglistitem> |
||||||
|
|
||||||
|
<seglistitem> |
||||||
|
<seg> |
||||||
|
Repeatable read |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Possible |
||||||
|
</seg> |
||||||
|
</seglistitem> |
||||||
|
|
||||||
|
<seglistitem> |
||||||
|
<seg> |
||||||
|
Serializable |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
<seg> |
||||||
|
Not possible |
||||||
|
</seg> |
||||||
|
</seglistitem> |
||||||
|
</segmentedlist> |
||||||
|
|
||||||
|
<productname>Postgres</productname> |
||||||
|
offers the read committed and serializable isolation levels. |
||||||
|
</para> |
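<para>
The table above can also be written down as data, which makes it easy to ask
which level is needed to rule out a given phenomenon. The following is a
minimal sketch; the names <literal>PERMITTED</literal>,
<literal>POSTGRES_LEVELS</literal> and
<literal>weakest_level_preventing</literal> are made up for this illustration.
</para>

```python
# Phenomena that each isolation level still permits, transcribed from the
# table above.
PERMITTED = {
    "read uncommitted": {"dirty read", "non-repeatable read", "phantom read"},
    "read committed": {"non-repeatable read", "phantom read"},
    "repeatable read": {"phantom read"},
    "serializable": set(),
}

# All four levels, weakest first, and the two that the text says Postgres offers.
LEVELS = ["read uncommitted", "read committed", "repeatable read", "serializable"]
POSTGRES_LEVELS = ["read committed", "serializable"]


def weakest_level_preventing(phenomena, candidates=LEVELS):
    """Return the weakest candidate level that permits none of the phenomena.

    `candidates` must be ordered from weakest to strongest.
    """
    unwanted = set(phenomena)
    for level in candidates:
        if not (PERMITTED[level] & unwanted):
            return level
    raise ValueError("no candidate level prevents all of: %s" % sorted(unwanted))


# Among the four standard levels, repeatable read is enough to rule out
# non-repeatable reads; among the two offered levels, serializable is needed.
print(weakest_level_preventing({"non-repeatable read"}))
print(weakest_level_preventing({"non-repeatable read"}, POSTGRES_LEVELS))
```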
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Read Committed Isolation Level</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
This is the default isolation level in <productname>Postgres</productname>.
When a transaction runs at this isolation level, a query sees only
data committed before the query began and never sees either dirty data or
changes committed by concurrent transactions during query execution.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
If a row returned by a query while executing an
<command>UPDATE</command> statement
(or <command>DELETE</command>
or <command>SELECT FOR UPDATE</command>)
is being updated by a
concurrent uncommitted transaction, then the transaction
that is trying to update this row will wait for the other transaction to
commit or roll back. In the case of rollback, the waiting transaction
can proceed to change the row. In the case of commit (and if the
row still exists; i.e., it was not deleted by the other transaction), the
query will be re-executed for this row to check that the new row
version satisfies the query search condition. If the new row version
satisfies the query search condition, then the row will be
updated (or deleted or marked for update).
||||||
|
</para> |
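<para>
The decision described above can be summarized in a few lines. This is a
sketch of the rule as stated in the text, not of the server's code: the
function name <literal>resolve_read_committed</literal> and its arguments are
hypothetical, and the <literal>"skip"</literal> outcome for a deleted or
no-longer-matching row is an assumption, since the text only spells out the
cases in which the row is changed.
</para>

```python
from typing import Callable, Optional


def resolve_read_committed(
    other_outcome: str,
    new_version: Optional[dict],
    matches: Callable[[dict], bool],
) -> str:
    """Decide what a waiting read-committed writer does once the other
    transaction has finished.

    other_outcome -- "rollback" or "commit", the fate of the other transaction
    new_version   -- the row as the other transaction left it, or None if it
                     deleted the row (only meaningful after a commit)
    matches       -- the search condition of the waiting statement
    """
    if other_outcome == "rollback":
        return "apply"   # the original row version is still current
    if other_outcome != "commit":
        raise ValueError("unknown outcome: %r" % other_outcome)
    if new_version is None:
        return "skip"    # assumption: a deleted row is simply not processed
    # Re-check the search condition against the new row version.
    return "apply" if matches(new_version) else "skip"


# Example: a statement that updates accounts with a balance above 100.
def rich(row):
    return row["balance"] > 100


print(resolve_read_committed("rollback", None, rich))
print(resolve_read_committed("commit", {"balance": 150}, rich))
print(resolve_read_committed("commit", {"balance": 50}, rich))
```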
||||||
|
|
||||||
|
<para> |
||||||
|
Note that the results of <command>SELECT</command> or
<command>INSERT</command> (with a query)
statements will not be affected by concurrent transactions.
||||||
|
</para> |
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Serializable Isolation Level</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
This level provides the highest transaction isolation. When a
transaction is on the <firstterm>serializable</firstterm> level,
a query sees only data
committed before the transaction began and never sees either dirty data
or changes committed by concurrent transactions during transaction
execution. So, this level emulates serial transaction execution,
as if transactions had been executed one after another, serially,
rather than concurrently.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
If a row returned by a query while executing an
<command>UPDATE</command>/<command>DELETE</command>/<command>SELECT FOR UPDATE</command>
statement is being updated by
a concurrent uncommitted transaction, then the transaction
that is trying to update this row will wait for the other transaction to
commit or roll back. In the case of rollback, the waiting transaction
can proceed to change the row. In the case of a concurrent
transaction commit, a serializable transaction will be rolled back
with the message
||||||
|
|
||||||
|
<programlisting> |
||||||
|
ERROR: Can't serialize access due to concurrent update |
||||||
|
</programlisting> |
||||||
|
|
||||||
|
because a serializable transaction cannot modify rows changed by |
||||||
|
other transactions after the serializable transaction began. |
||||||
|
</para> |
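<para>
For comparison with the read committed level, the same decision at the
serializable level can be sketched as follows. The names
<literal>SerializationFailure</literal> and
<literal>resolve_serializable</literal> are hypothetical and exist only for
this illustration; the error text is the one quoted above.
</para>

```python
class SerializationFailure(Exception):
    """Stands in for the server error quoted in the text (hypothetical name)."""


def resolve_serializable(other_outcome: str) -> str:
    """Decide what a waiting serializable writer does once the other
    transaction, which was updating the same row, has finished."""
    if other_outcome == "rollback":
        return "apply"   # the row was not changed after all
    if other_outcome == "commit":
        # The row was changed after this transaction began, so it may not
        # be modified here; the whole transaction is rolled back.
        raise SerializationFailure(
            "Can't serialize access due to concurrent update")
    raise ValueError("unknown outcome: %r" % other_outcome)


print(resolve_serializable("rollback"))
try:
    resolve_serializable("commit")
except SerializationFailure as error:
    print("ERROR:", error)
```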
||||||
|
|
||||||
|
<note> |
||||||
|
<para> |
||||||
|
The results of <command>SELECT</command>
or <command>INSERT</command> (with a query)
will not be affected by concurrent transactions.
||||||
|
</para> |
||||||
|
</note> |
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Locking and Tables</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
<productname>Postgres</productname> |
||||||
|
provides various lock modes to control concurrent |
||||||
|
access to data in tables. Some of these lock modes are acquired by |
||||||
|
<productname>Postgres</productname> |
||||||
|
automatically before statement execution, while others are |
||||||
|
provided to be used by applications. All lock modes (except for |
||||||
|
AccessShareLock) acquired in a transaction are held for the duration |
||||||
|
of the transaction. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
In addition to locks, short-term share/exclusive latches are used
to control read/write access to table pages in the shared buffer pool.
Latches are released immediately after a tuple is fetched or updated.
||||||
|
</para> |
||||||
|
|
||||||
|
<sect2> |
||||||
|
<title>Table-level locks</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
<variablelist> |
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
AccessShareLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
An internal lock mode acquired automatically on tables
being queried. <productname>Postgres</productname>
releases these locks after the statement is done.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with AccessExclusiveLock only. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
RowShareLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>SELECT FOR UPDATE</command> |
||||||
|
and <command>LOCK TABLE</command> |
||||||
|
for <option>IN ROW SHARE MODE</option> statements. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with ExclusiveLock and AccessExclusiveLock modes. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
RowExclusiveLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>UPDATE</command>, <command>DELETE</command>, |
||||||
|
<command>INSERT</command> and <command>LOCK TABLE</command> |
||||||
|
for <option>IN ROW EXCLUSIVE MODE</option> statements. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with ShareLock, ShareRowExclusiveLock, ExclusiveLock and |
||||||
|
AccessExclusiveLock modes. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
ShareLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>CREATE INDEX</command>
and <command>LOCK TABLE</command>
for <option>IN SHARE MODE</option>
statements.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with RowExclusiveLock, ShareRowExclusiveLock, |
||||||
|
ExclusiveLock and AccessExclusiveLock modes. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
ShareRowExclusiveLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>LOCK TABLE</command> for |
||||||
|
<option>IN SHARE ROW EXCLUSIVE MODE</option> statements. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with RowExclusiveLock, ShareLock, ShareRowExclusiveLock, |
||||||
|
ExclusiveLock and AccessExclusiveLock modes. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
ExclusiveLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>LOCK TABLE</command>
for <option>IN EXCLUSIVE MODE</option> statements.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with RowShareLock, RowExclusiveLock, ShareLock, |
||||||
|
ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock |
||||||
|
modes. |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
AccessExclusiveLock |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Acquired by <command>ALTER TABLE</command>, |
||||||
|
<command>DROP TABLE</command>, |
||||||
|
<command>VACUUM</command> and <command>LOCK TABLE</command> |
||||||
|
statements. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Conflicts with AccessShareLock, RowShareLock, RowExclusiveLock, ShareLock,
ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock
modes.
||||||
|
|
||||||
|
<note> |
||||||
|
<para> |
||||||
|
Only AccessExclusiveLock blocks a <command>SELECT</command> (without
<option>FOR UPDATE</option>) statement.
||||||
|
</para> |
||||||
|
</note> |
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
</variablelist> |
||||||
|
</para> |
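<para>
The conflict rules listed above form a symmetric matrix, which is easier to
check when written as data. The sketch below transcribes the list; the names
<literal>CONFLICTS</literal>, <literal>conflicts</literal> and
<literal>can_grant</literal> are hypothetical. AccessExclusiveLock is taken
to conflict with AccessShareLock as well, as the AccessShareLock entry and
the note on <command>SELECT</command> imply.
</para>

```python
# For each lock mode, the set of modes it conflicts with, transcribed from
# the list above.
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "RowShareLock": {"ExclusiveLock", "AccessExclusiveLock"},
    "RowExclusiveLock": {"ShareLock", "ShareRowExclusiveLock",
                         "ExclusiveLock", "AccessExclusiveLock"},
    "ShareLock": {"RowExclusiveLock", "ShareRowExclusiveLock",
                  "ExclusiveLock", "AccessExclusiveLock"},
    "ShareRowExclusiveLock": {"RowExclusiveLock", "ShareLock",
                              "ShareRowExclusiveLock", "ExclusiveLock",
                              "AccessExclusiveLock"},
    "ExclusiveLock": {"RowShareLock", "RowExclusiveLock", "ShareLock",
                      "ShareRowExclusiveLock", "ExclusiveLock",
                      "AccessExclusiveLock"},
    "AccessExclusiveLock": {"AccessShareLock", "RowShareLock",
                            "RowExclusiveLock", "ShareLock",
                            "ShareRowExclusiveLock", "ExclusiveLock",
                            "AccessExclusiveLock"},
}


def conflicts(held: str, requested: str) -> bool:
    """True if a lock in mode `requested` must wait for one held in mode `held`."""
    return requested in CONFLICTS[held]


def can_grant(requested: str, held_by_others) -> bool:
    """True if `requested` conflicts with none of the modes other
    transactions currently hold on the same table."""
    return not any(conflicts(held, requested) for held in held_by_others)


# A plain SELECT (AccessShareLock) is not blocked by LOCK TABLE ... IN
# EXCLUSIVE MODE, but an UPDATE (RowExclusiveLock) is blocked by CREATE INDEX
# (ShareLock).
print(can_grant("AccessShareLock", ["ExclusiveLock"]))
print(can_grant("RowExclusiveLock", ["ShareLock"]))
```

<para>
Two transactions can both hold ShareLock on the same table, since that mode
does not conflict with itself; this is why
<literal>can_grant("ShareLock", ["ShareLock"])</literal> is true in the sketch.
</para>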
||||||
|
</sect2> |
||||||
|
|
||||||
|
<sect2> |
||||||
|
<title>Row-level locks</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
These locks are acquired by modifying internal
fields of the row being updated, deleted, or marked for update.
<productname>Postgres</productname>
doesn't keep any information about modified rows in memory, and
so it has no limit on the number of locked rows and no lock escalation.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
However, take into account that <command>SELECT FOR UPDATE</command> will modify
selected rows to mark them, and so will result in disk writes.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Row-level locks don't affect data querying. They are used to block |
||||||
|
writers to <emphasis>the same row</emphasis> only. |
||||||
|
</para> |
||||||
|
</sect2> |
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Locking and Indices</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
Though <productname>Postgres</productname>
provides nonblocking read/write access to table
data, this is not the case for all index access methods implemented
in <productname>Postgres</productname>.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
The various index types are handled as follows: |
||||||
|
|
||||||
|
<variablelist> |
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
GiST and R-Tree indices |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Share/exclusive INDEX-level locks are used for read/write access. |
||||||
|
Locks are released after the statement is done.
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
Hash indices |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Share/exclusive PAGE-level locks are used for read/write access. |
||||||
|
Locks are released after the page is processed.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Page-level locks produce better concurrency than index-level ones
but are subject to deadlocks.
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
|
||||||
|
<varlistentry> |
||||||
|
<term> |
||||||
|
Btree |
||||||
|
</term> |
||||||
|
<listitem> |
||||||
|
<para> |
||||||
|
Short-term share/exclusive PAGE-level latches are used for |
||||||
|
read/write access. Latches are released immediately after the index |
||||||
|
tuple is inserted/fetched. |
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
Btree indices provide the highest concurrency without deadlock
conditions.
||||||
|
</para> |
||||||
|
</listitem> |
||||||
|
</varlistentry> |
||||||
|
</variablelist> |
||||||
|
</para> |
||||||
|
</sect1> |
||||||
|
|
||||||
|
<sect1> |
||||||
|
<title>Data consistency checks at the application level</title> |
||||||
|
|
||||||
|
<para> |
||||||
|
Because readers in <productname>Postgres</productname>
don't lock data, regardless of
transaction isolation level, data read by one transaction can be
overwritten by another. In other words, if a row is returned
by <command>SELECT</command>, it doesn't mean that this row really
exists at the time it is returned (i.e., sometime after the
statement or transaction began), nor
that the row is protected from deletion or update by concurrent
transactions before the current transaction commits or rolls back.
||||||
|
</para> |
||||||
|
|
||||||
|
<para> |
||||||
|
To ensure the actual existence of a row and protect it against
concurrent updates, one must use <command>SELECT FOR UPDATE</command> or
an appropriate <command>LOCK TABLE</command> statement.
This should be taken into account when porting applications using
serializable mode to <productname>Postgres</productname> from other environments.
||||||
|
|
||||||
|
<note> |
||||||
|
<para> |
||||||
|
Before version 6.5, <productname>Postgres</productname>
used read locks, and so the
above consideration also applies
when upgrading to 6.5 (or higher) from previous
<productname>Postgres</productname> versions.
||||||
|
</para> |
||||||
|
</note> |
||||||
|
</para> |
||||||
|
</sect1> |
||||||
|
</chapter> |
||||||
|
|
||||||
|
<!-- Keep this comment at the end of the file |
||||||
|
Local variables: |
||||||
|
mode: sgml |
||||||
|
sgml-omittag:nil |
||||||
|
sgml-shorttag:t |
||||||
|
sgml-minimize-attributes:nil |
||||||
|
sgml-always-quote-attributes:t |
||||||
|
sgml-indent-step:1 |
||||||
|
sgml-indent-data:t |
||||||
|
sgml-parent-document:nil |
||||||
|
sgml-default-dtd-file:"./reference.ced" |
||||||
|
sgml-exposed-tags:nil |
||||||
|
sgml-local-catalogs:"/usr/lib/sgml/catalog" |
||||||
|
sgml-local-ecat-files:nil |
||||||
|
End: |
||||||
|
--> |
||||||