MVCC in postgresql¶
This is based on my lecture on the databases, which (is still available in Polish)
There are two ways to implement proper transaction isolation:
- First is by using locking. In this padagrim whenever transaction reads data a lock is issued, and any write to that data will wait until reading transaction finishes (this might be simplified).
- Second is by using MVCC — that is multi version concurrency. It works as follows: each transaction sees database in a state at the time the transaction, so reads and writes don’t need to wait for each other (there is a problem with write skew anomaly, which is solved by the postgresql 9.1 and newer.
How does MVCC work¶
Start of the transaction¶
When transaction starts following things happen (or may happen depending on isolation level):
- Transaction is assigned a
txida transaction ID, transaction id’s are ordered 32 bit integers (that may wrap around at some point in time, but Postgres handles it).
txid``s of all committed transactions are stored (possibly in a more efficient way than storing all ``txids)
Each row contains couple of magic columns:
- This is the
txidof transaction that inserted this row
- This is the
txidof transaction that deleted this column
- Index of statement in transaction that added/deleted that row
- When a transaction inserts a row it
xminis set to
txidof this transaction.
- When a transaction deletes a row it just sets
- Updates are replaced with a delete and insert pair.
Row is visible for transaction
txid if (all statements must be true):
xmin < txid(row was inserted by a transaction before this one).
xminis commited (in case of Read Commited isolation level), or
xminwas commited before start of current transaction (other isolation levels)
xmaxis empty or
xmax > txid(row was deleted by a transaction that started after this one).
In case of transaction that issue multiple statements
used for example to have a cursor that consistently iterates over a table,
even if the same transaction alters the table.
Data can’t be deleted from disk immediately in databases using MVCC, because
ongoing transactions might not ‘see’ the delete, and still need to access
deleted row. Data can be deleted only after all transactions with
xmax have either committed or have been rolled back.
Postgresql does this (more or less) automatically, but you might call
VACUUM by hand if you need to reclaim space (this space will not necessarily
be freed to the OS, rather it will be accessible for new inserts).
- BRIUCE MOMJIAN
MVCC Unmasked: http://momjian.us/main/writings/pgsql/mvcc.pdf (I have also cached it
locally, due to permissive CC license)
- Serializable Snapshot Isolation on Postgresql Wiki,
(due to even more permissive license it is also cached
- My old lectures: http://db.fizyka.pw.edu.pl/~bzdak/bazy_danych_ed_20/wyklad10/wyk10.html#mvcc-w-postgresq
- Wikibook I have collected
(in case some future problems with Wikipedia
you can try version hosted here).