Until d2ea2d310d, the PS_USE_PS_STRINGS
option was used on GNU/Hurd. As this option got removed and
PS_USE_CLOBBER_ARGV appears to work fine nowadays on the Hurd, define
this one to re-enable process title changes on this platform.
In the 14 and 15 branches, the existing test for __hurd__ (added 25
years ago by commit 209aa77d, removed in 16 by the above commit) is left
unchanged for now as it was activating slightly different code paths and
would need investigation by a Hurd user.
Author: Michael Banck <mbanck@debian.org>
Discussion: https://postgr.es/m/CA%2BhUKGJMNGUAqf27WbckYFrM-Mavy0RKJvocfJU%3DJ2XcAZyv%2Bw%40mail.gmail.com
Backpatch-through: 16
When instrumenting a MERGE command containing both WHEN NOT MATCHED BY
SOURCE and WHEN NOT MATCHED BY TARGET actions using EXPLAIN ANALYZE, a
concurrent update of the target relation could lead to an Assert
failure in show_modifytable_info(). In a non-assert build, this would
lead to an incorrect value for "skipped" tuples in the EXPLAIN output,
rather than a crash.
This could happen if the concurrent update caused a matched row to no
longer match, in which case ExecMerge() treats the single originally
matched row as a pair of not matched rows, and potentially executes 2
not-matched actions for the single source row. This could then lead to
a state where the number of rows processed by the ModifyTable node
exceeds the number of rows produced by its source node, causing
"skipped_path" in show_modifytable_info() to be negative, triggering
the Assert.
Fix this in ExecMergeMatched() by incrementing the instrumentation
tuple count on the source node whenever a concurrent update of this
kind is detected, if both kinds of merge actions exist, so that the
number of source rows matches the number of actions potentially
executed, and the "skipped" tuple count is correct.
Back-patch to v17, where support for WHEN NOT MATCHED BY SOURCE
actions was introduced.
Bug: #19111
Reported-by: Dilip Kumar <dilipbalaut@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/19111-5b06624513d301b3@postgresql.org
Backpatch-through: 17
Commit 5e4fcbe531 added a check_rights parameter to this function
for use by ALTER TABLE commands that re-create statistics objects.
However, we intentionally ignore check_rights when verifying
relation ownership because this function's lookup could return a
different answer than the caller's. This commit adds a note to
this effect so that we remember it down the road.
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 14
Previously, when pgbench ran a custom script that triggered retriable errors
(e.g., deadlocks) followed by multiple \syncpipeline commands in pipeline mode,
the following assertion failure could occur:
    Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3594.
The issue was that discardUntilSync() assumed a pipeline sync result
(PGRES_PIPELINE_SYNC) would always be followed by either another sync result
or NULL. This assumption was incorrect: when multiple sync requests were sent,
a sync result could instead be followed by another result type. In such cases,
discardUntilSync() mishandled the results, leading to the assertion failure.
This commit fixes the issue by making discardUntilSync() correctly handle cases
where a pipeline sync result is followed by other result types. It now continues
discarding results until another pipeline sync followed by NULL is reached.
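The corrected loop can be sketched as follows. This is a minimal
simulation under stated assumptions, not the pgbench code: the SimResult
enum, the array standing in for successive PQgetResult() calls, and the
function name discard_until_sync_sim are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for what PQgetResult() could return. */
typedef enum
{
    SIM_NULL,                   /* NULL: end of the current result set */
    SIM_PIPELINE_SYNC,          /* a PGRES_PIPELINE_SYNC result */
    SIM_OTHER                   /* any other result type */
} SimResult;

/*
 * Consume results until a pipeline sync is immediately followed by NULL.
 * Returns the number of results consumed, or -1 if the stream ends first.
 */
int
discard_until_sync_sim(const SimResult *stream, size_t len)
{
    bool        received_sync = false;

    for (size_t i = 0; i < len; i++)
    {
        if (stream[i] == SIM_PIPELINE_SYNC)
            received_sync = true;
        else if (received_sync && stream[i] == SIM_NULL)
            return (int) (i + 1);   /* sync followed by NULL: done */
        else
            received_sync = false;  /* something else followed the sync */
    }
    return -1;
}
```

In this sketch a sync followed by another result type does not end the
loop; discarding simply continues until a later sync is followed by NULL.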
Backpatched to v17, where support for the \syncpipeline command in pgbench
was introduced.
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20251111105037.f3fc554616bc19891f926c5b@sraoss.co.jp
Backpatch-through: 17
On the CREATE POLICY page, the "Policies Applied by Command Type"
table was missing MERGE ... THEN DELETE and some of the policies
applied during INSERT ... ON CONFLICT and MERGE. Fix that, and try to
improve readability by listing the various MERGE cases separately,
rather than together with INSERT/UPDATE/DELETE. Mention COPY ... TO
along with SELECT, since it behaves in the same way. In addition,
document which policy violations cause errors to be thrown, and which
just cause rows to be silently ignored.
Also, a paragraph above the table states that INSERT ... ON CONFLICT
DO UPDATE only checks the WITH CHECK expressions of INSERT policies
for rows appended to the relation by the INSERT path, which is
incorrect -- all rows proposed for insertion are checked, regardless
of whether they end up being inserted. Fix that, and also mention that
the same applies to INSERT ... ON CONFLICT DO NOTHING.
In addition, in various other places on that page, clarify how the
different types of policy are applied to different commands, and
whether or not errors are thrown when policy checks do not pass.
Backpatch to all supported versions. Prior to v17, MERGE did not
support RETURNING, and so MERGE ... THEN INSERT would never check new
rows against SELECT policies. Prior to v15, MERGE was not supported at
all.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWqnfeChjK=n1V_dYZT4rt4mnq+ybf9c0qXDYTVMsy8pg@mail.gmail.com
Backpatch-through: 14
If DSM entry initialization fails, backends could try to use an
uninitialized DSM segment, DSA, or dshash table (since the entry is
still added to the registry). To fix, keep track of whether
initialization completed, and ERROR if a backend tries to attach to
an uninitialized entry. We could instead retry initialization as
needed, but that seemed complicated, error prone, and unlikely to
help most cases. Furthermore, such problems probably indicate a
coding error.
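The idea of tracking whether initialization completed can be sketched
as below. This is only an illustration: SketchRegistryEntry and every
sketch_* function are hypothetical names, and failures are modeled as
boolean return values rather than as an ERROR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registry entry; not the actual registry structure. */
typedef struct
{
    const char *name;
    bool        initialized;    /* set only after the init callback succeeds */
} SketchRegistryEntry;

/* Example init callbacks, one succeeding and one failing. */
bool sketch_init_ok(void) { return true; }
bool sketch_init_fail(void) { return false; }

/* Returns true on success; on failure the entry stays marked uninitialized. */
bool
sketch_register(SketchRegistryEntry *entry, const char *name,
                bool (*init) (void))
{
    entry->name = name;
    entry->initialized = false; /* entry exists before init completes */
    if (!init())
        return false;
    entry->initialized = true;
    return true;
}

/* Attaching to an entry whose initialization never completed is refused. */
bool
sketch_attach(const SketchRegistryEntry *entry)
{
    return entry->initialized;
}
```

Refusing the attach, instead of retrying initialization, matches the
simpler of the two options discussed above.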
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/dd36d384-55df-4fc2-825c-5bc56c950fa9%40gmail.com
Backpatch-through: 17
Before we started to freeze async notify entries (commit 8eeb4a0f7c),
no one looked at the 'xid' on an entry with invalid 'dboid'. But now
we might actually need to freeze it later. Initialize them with
InvalidTransactionId to begin with, to avoid that work later.
Álvaro pointed this out in review of commit 8eeb4a0f7c, but I forgot
to include this change there.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://www.postgresql.org/message-id/202511071410.52ll56eyixx7@alvherre.pgsql
Backpatch-through: 14
The previous commit fixed a bug where VACUUM would truncate the CLOG
that's still needed to check the commit status of XIDs in the async
notify queue, but as mentioned in the commit message, it wasn't a full
fix. If a backend is executing asyncQueueReadAllNotifications() and
has just made a local copy of an async SLRU page which contains old
XIDs, vacuum can concurrently truncate the CLOG covering those XIDs,
and the backend still gets an error when it calls
TransactionIdDidCommit() on those XIDs in the local copy. This commit
fixes that race condition.
To fix, hold the SLRU bank lock across the TransactionIdDidCommit()
calls in NOTIFY processing.
Per Tom Lane's idea. Backpatch to all supported versions.
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://www.postgresql.org/message-id/2759499.1761756503@sss.pgh.pa.us
Backpatch-through: 14
The async notification queue contains the XID of the sender, and when
processing notifications we call TransactionIdDidCommit() on the
XID. But we had no safeguards to prevent the CLOG segments containing
those XIDs from being truncated away. As a result, if a backend didn't
for some reason process its notifications for a long time, or when a
new backend issued LISTEN, you could get an error like:
    test=# listen c21;
    ERROR: 58P01: could not access status of transaction 14279685
    DETAIL: Could not open file "pg_xact/000D": No such file or directory.
    LOCATION: SlruReportIOError, slru.c:1087
To fix, make VACUUM "freeze" the XIDs in the async notification queue
before truncating the CLOG. Old XIDs are replaced with
FrozenTransactionId or InvalidTransactionId.
Note: This commit is not a full fix. A race condition remains, where a
backend is executing asyncQueueReadAllNotifications() and has just
made a local copy of an async SLRU page which contains old XIDs, while
vacuum concurrently truncates the CLOG covering those XIDs. When the
backend then calls TransactionIdDidCommit() on those XIDs from the
local copy, you still get the error. The next commit will fix that
remaining race condition.
This was first reported by Sergey Zhuravlev in 2021, with many other
people hitting the same issue later. Thanks to:
- Alexandra Wang, Daniil Davydov, Andrei Varashen and Jacques Combrink
  for investigating and providing reproducible test cases,
- Matheus Alcantara and Arseniy Mukhin for review and earlier proposed
  patches to fix this,
- Álvaro Herrera and Masahiko Sawada for reviews,
- Yura Sokolov aka funny-falcon for the idea of marking transactions
  as committed in the notification queue, and
- Joel Jacobson for the final patch version. I hope I didn't forget
  anyone.
Backpatch to all supported versions. I believe the bug goes back all
the way to commit d1e027221d, which introduced the SLRU-based async
notification queue.
Discussion: https://www.postgresql.org/message-id/16961-25f29f95b3604a8a@postgresql.org
Discussion: https://www.postgresql.org/message-id/18804-bccbbde5e77a68c2@postgresql.org
Discussion: https://www.postgresql.org/message-id/CAK98qZ3wZLE-RZJN_Y%2BTFjiTRPPFPBwNBpBi5K5CU8hUHkzDpw@mail.gmail.com
Backpatch-through: 14
Previously, if async notify processing encountered an error, we would
report the error to the client and advance our read position past the
offending entry to prevent trying to process it over and over
again. Trying to continue after an error has a few problems, however:
- We have no way of telling the client that a notification was
  lost. They get an ERROR, but that doesn't tell you much. As such,
  it's not clear if keeping the connection alive after losing a
  notification is a good thing. Depending on the application logic,
  missing a notification could cause the application to get stuck
  waiting, for example.
- If the connection is idle, PqCommReadingMsg is set and any ERROR is
  turned into FATAL anyway.
- We bailed out of the notification processing loop on first error
  without processing any subsequent notifications. The subsequent
  notifications would not be processed until another notify interrupt
  arrives. For example, if there were two notifications pending, and
  processing the first one caused an ERROR, the second notification
  would not be processed until someone sent a new NOTIFY.
This commit changes the behavior so that any ERROR while processing
async notifications is turned into FATAL, causing the client
connection to be terminated. That makes the behavior more consistent
as that's what happened in idle state already, and terminating the
connection is a clear signal to the application that it might've
missed some notifications.
The reason to do this now is that the next commits will change the
notification processing code in a way that would make it harder to
skip over just the offending notification entry on error.
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://www.postgresql.org/message-id/fedbd908-4571-4bbe-b48e-63bfdcc38f64@iki.fi
Backpatch-through: 14
Explicitly document that privileges are transferred along with the
ownership. Backpatch to all supported versions since this behavior
has always been present.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Josef Šimánek <josef.simanek@gmail.com>
Reported-by: Gilles Parc <gparc@free.fr>
Discussion: https://postgr.es/m/2023185982.281851219.1646733038464.JavaMail.root@zimbra15-e2.priv.proxad.net
Backpatch-through: 14
The range for commit_siblings was incorrectly listed as starting at 1
instead of 0 in the sample configuration file. Backpatch down to all
supported branches.
Author: Man Zeng <zengman@halodbtech.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/tencent_53B70BA72303AE9C6889E78E@qq.com
Backpatch-through: 14
pg_resetwal didn't accept multixid 0 or multixact offset UINT32_MAX,
but they are both valid values that can appear in the control file.
That caused pg_upgrade to fail if you tried to upgrade a cluster
exactly at multixid or offset wraparound, because pg_upgrade calls
pg_resetwal to restore multixid/offset on the new cluster to the
values from the old cluster. To fix, allow those values in
pg_resetwal.
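As an illustration of accepting the full 32-bit range, the following
sketch parses an option value and allows both 0 and UINT32_MAX. The
function sketch_parse_uint32 is hypothetical and is not taken from
pg_resetwal; it only shows that neither boundary value needs to be
treated as invalid.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: accepts any value in [0, UINT32_MAX]. */
bool
sketch_parse_uint32(const char *s, uint32_t *out)
{
    char       *end;
    unsigned long long v;

    if (!isdigit((unsigned char) s[0]))
        return false;           /* rejects signs, whitespace, empty input */
    errno = 0;
    v = strtoull(s, &end, 0);
    if (errno != 0 || *end != '\0' || v > UINT32_MAX)
        return false;
    *out = (uint32_t) v;        /* 0 and UINT32_MAX are both accepted */
    return true;
}
```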
Fixes bugs #18863 and #18865 reported by Dmitry Kovalenko.
Backpatch down to v15. Version 14 has the same bug, but the patch
doesn't apply cleanly there. It could be made to work but it doesn't
seem worth the effort given how rare it is to hit this problem with
pg_upgrade, and how few people are upgrading to v14 anymore.
Author: Maxim Orlov <orlovmg@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACG%3DezaApSMTjd%3DM2Sfn5Ucuggd3FG8Z8Qte8Xq9k5-%2BRQis-g@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/18863-72f08858855344a2@postgresql.org
Discussion: https://www.postgresql.org/message-id/18865-d4c66cf35c2a67af@postgresql.org
Backpatch-through: 15
The synopsis for the ALTER PUBLICATION ... DROP ... command incorrectly
implied that a column list and WHERE clause could be specified as part of
the publication object. However, these options are not allowed for
DROP operations, making the documentation misleading.
This commit corrects the synopsis to clearly show only the valid forms
of publication objects.
Backpatched to v15, where the incorrect synopsis was introduced.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsPu+47Q7b0o6h1r-qSt90U3zgbAHMHUag5o5E1Lo+=uw@mail.gmail.com
Backpatch-through: 15
Previously, error messages for oversized injection point names, libraries,
and functions showed the buffer sizes (64, 128, 128) instead of the usable
character limits (63, 127, 127), because they did not account for the
terminating zero byte, which was confusing. These messages are adjusted
to better reflect reality.
The limit enforced for the private area was also too strict by one byte,
as specifying a zone worth exactly INJ_PRIVATE_MAXLEN should be able to
work because there is no terminating zero byte in this case.
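The difference between the two limits can be sketched as follows. The
names SKETCH_NAME_BUFLEN, sketch_name_fits and sketch_private_area_fits
are hypothetical and chosen for illustration only.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_NAME_BUFLEN 64   /* hypothetical buffer size, as in "64" above */

/* A C string needs one byte for its terminator, so at most 63 characters fit. */
bool
sketch_name_fits(const char *name)
{
    return strlen(name) < SKETCH_NAME_BUFLEN;
}

/* Raw bytes carry no terminator, so exactly buflen bytes still fit. */
bool
sketch_private_area_fits(size_t size, size_t buflen)
{
    return size <= buflen;
}
```

With these checks a 63-character name is accepted and a 64-character
name is rejected, while a private area of exactly 1024 bytes fits in a
1024-byte buffer.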
This is a stylistic change (well, mostly, a private_area size of exactly
1024 bytes can be defined with this change, something that nobody seems
to care about based on the lack of complaints). However, as this is a
testing facility, let's keep the logic consistent across all the branches
where this code exists, as there is an argument in favor of out-of-core
extensions that use injection points.
Author: Xuneng Zhou <xunengzhou@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CABPTF7VxYp4Hny1h+7ejURY-P4O5-K8WZg79Q3GUx13cQ6B2kg@mail.gmail.com
Backpatch-through: 17
A similar check existed in the MSVC scripts that were removed in
v17 by 1301c80b21, but nothing of the kind was checked in meson when
building with a 4-byte off_t.
This commit adds a check that fails the build when trying to use a
relation file size larger than 1GB with a 4-byte off_t, as ./configure
does, rather than leaving these failures to be detected at runtime, since
the code is not able to handle large files in this case.
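The condition being enforced can be written down as a small predicate.
This C sketch only restates the rule from the text; the function name
and parameters are hypothetical, and the real check lives in the build
system rather than in C code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: a segment size above 1GB is only acceptable
 * when off_t is at least 8 bytes wide.
 */
bool
sketch_segsize_ok(size_t sizeof_off_t, unsigned long long segsize_bytes)
{
    const unsigned long long one_gb = 1024ULL * 1024ULL * 1024ULL;

    return sizeof_off_t >= 8 || segsize_bytes <= one_gb;
}
```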
Backpatch down to v16, where meson was introduced.
Discussion: https://postgr.es/m/aQ0hG36IrkaSGfN8@paquier.xyz
Backpatch-through: 16
This omission allowed table owners to create statistics in any
schema, potentially leading to unexpected naming conflicts. For
ALTER TABLE commands that require re-creating statistics objects,
skip this check in case the user has since lost CREATE on the
schema. The addition of a second parameter to CreateStatistics()
breaks ABI compatibility, but we are unaware of any impacted
third-party code.
Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12817
Backpatch-through: 13
Several functions could overflow their size calculations, when presented
with very large inputs from remote and/or untrusted locations, and then
allocate buffers that were too small to hold the intended contents.
Switch from int to size_t where appropriate, and check for overflow
conditions when the inputs could have plausibly originated outside of
the libpq trust boundary. (Overflows from within the trust boundary are
still possible, but these will be fixed separately.) A version of
add_size() is ported from the backend to assist with code that performs
more complicated concatenation.
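An overflow-checked addition of this kind can be sketched as below.
The helper sketch_add_size is hypothetical: it reports overflow through
its return value, which is an assumption made for the example and not
necessarily how the ported function behaves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: add two sizes, reporting overflow instead of
 * silently wrapping around.
 */
bool
sketch_add_size(size_t a, size_t b, size_t *result)
{
    if (a > SIZE_MAX - b)
        return false;           /* a + b would exceed SIZE_MAX */
    *result = a + b;
    return true;
}
```

A caller would check the return value before allocating, so that a
wrapped-around sum can never be used as a buffer size.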
Reported-by: Aleksey Solovev (Positive Technologies)
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12818
Backpatch-through: 13
generic-gcc.h maps our read and write barriers to C11 acquire and
release fences using compiler builtins, for platforms where we don't
have our own hand-rolled assembler. This is apparently enough for GCC,
but the C11 memory model is only defined in terms of atomic accesses,
and our barriers for non-atomic, non-volatile accesses were not always
respected under Clang's stricter interpretation of the standard.
This explains the occasional breakage observed on new RISC-V + Clang
animal greenfly in lock-free PgAioHandle manipulation code containing a
repeating pattern of loads and read barriers. The problem can also be
observed in code generated for MIPS and LoongArch, though we aren't
currently testing those with Clang, and on x86, though we use our own
assembler there. The scariest aspect is that we use the generic version
on very common ARM systems, but it doesn't seem to reorder the relevant
code there (or we'd have debugged this long ago).
Fix by inserting an explicit compiler barrier. It expands to an empty
assembler block declared to have memory side-effects, so registers are
flushed and reordering is prevented. In those respects this is like the
architecture-specific assembler versions, but the compiler is still in
charge of generating the appropriate fence instruction. Done for write
barriers on principle, though concrete problems have only been observed
with read barriers.
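A barrier of the shape described above could look like the following
sketch for GCC-compatible compilers. The sketch_* names are
hypothetical, and the placement of the compiler barrier before the
fence is an assumption of this example.

```c
/* Empty assembler block with a "memory" clobber: a compiler-only barrier. */
#define sketch_compiler_barrier()   __asm__ __volatile__("" ::: "memory")

/* The compiler still chooses the fence instruction for the target. */
#define sketch_read_barrier() \
    do { \
        sketch_compiler_barrier(); \
        __atomic_thread_fence(__ATOMIC_ACQUIRE); \
    } while (0)

#define sketch_write_barrier() \
    do { \
        sketch_compiler_barrier(); \
        __atomic_thread_fence(__ATOMIC_RELEASE); \
    } while (0)

/*
 * Example use: load a flag, then load data that must not be read before
 * the flag. Both loads are plain, non-atomic accesses.
 */
int
sketch_read_flag_then_data(const int *flag, const int *data)
{
    int         f = *flag;

    sketch_read_barrier();
    return f ? *data : -1;
}
```

The "memory" clobber tells the compiler that the empty block may read
or write any memory, so it cannot move the plain loads across it.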
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/d79691be-22bd-457d-9d90-18033b78c40a%40gmail.com
Backpatch-through: 13
The following parameters can only be set at server start because
their context is PGC_POSTMASTER, but this information was missing
or incorrectly documented. This commit adds or corrects
that information for the following parameters:
* debug_io_direct
* dynamic_shared_memory_type
* event_source
* huge_pages
* io_max_combine_limit
* max_notify_queue_pages
* shared_memory_type
* track_commit_timestamp
* wal_decode_buffer_size
Backpatched to all supported branches.
Author: Karina Litskevich <litskevichkarina@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGfPzcin-_6XwPgVbWTOUFVZgHF5g9ROrwLUdCTfjy=0A@mail.gmail.com
Backpatch-through: 13
If these parameters are set without units, the values are interpreted
as blocks. This detail was previously missing from the documentation,
so this commit adds it.
Backpatch to v17 where io_combine_limit was added.
Author: Karina Litskevich <litskevichkarina@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACiT8iZCDkz1bNYQNQyvGhXWJExSnJULRTYT894u4-Ti7Yh6jw@mail.gmail.com
Backpatch-through: 17
XLogRecPtrIsInvalid() is inconsistent with the affirmative form of
macros used for other datatypes, and leads to awkward double negatives
in a few places. This commit introduces XLogRecPtrIsValid(), which
allows code to be written more naturally.
This patch only adds the new macro. XLogRecPtrIsInvalid() is left in
place, and all existing callers remain untouched. This means all
supported branches can accept hypothetical bug fixes that use the new
macro, and at the same time any code that compiled with the original
formulation will continue to silently compile just fine.
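The relationship between the two macros can be sketched as follows.
The typedef and macro bodies are stand-ins written for this example,
assumed to mirror the real definitions, and sketch_can_replay_from is a
hypothetical caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in definitions; assumptions made for this sketch. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr       0
#define XLogRecPtrIsInvalid(r)  ((r) == InvalidXLogRecPtr)
#define XLogRecPtrIsValid(r)    ((r) != InvalidXLogRecPtr)

/* Hypothetical caller using the affirmative form. */
bool
sketch_can_replay_from(XLogRecPtr lsn)
{
    /* Reads more naturally than !XLogRecPtrIsInvalid(lsn). */
    return XLogRecPtrIsValid(lsn);
}
```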
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal
postgres_fdw supports EvalPlanQual testing by using the infrastructure
provided by the core with the RecheckForeignScan callback routine (cf.
commits 5fc4c26db and 385f337c9), but there has been no test coverage
for that, except that recent commit 12609fbac, which fixed an issue in
commit 385f337c9, added a test case to exercise only a code path added
by that commit to the core infrastructure. So let's add test cases to
exercise other code paths as well at this time.
Like commit 12609fbac, back-patch to all supported branches.
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK15%2B6H%3DkDA%3D-y3Y28OAPY7fbAdyMosVofZZ%2BNc769epVTQ%40mail.gmail.com
Backpatch-through: 13
If any shell command fails, the whole script should fail. To avoid
future omissions, add this even for single-command scripts that use su
with heredoc syntax, as they might be extended or copied-and-pasted.
Extracted from a larger patch that wanted to use #error during
compilation, leading to the diagnosis of this problem.
Reviewed-by: Tristan Partin <tristan@partin.io> (earlier version)
Discussion: https://postgr.es/m/DDZP25P4VZ48.3LWMZBGA1K9RH%40partin.io
Backpatch-through: 15
We've successfully used libsanitizer for a while with the undefined
and alignment sanitizers, but with some other sanitizers (at least
thread and hwaddress) it crashes due to internal recursion before
it's fully initialized itself. It turns out that that's due to the
"__ubsan_default_options" hack installed by commit f686ae82f, and we
can fix it by ensuring that __ubsan_default_options is built without
any sanitizer instrumentation hooks.
Reported-by: Emmanuel Sibi <emmanuelsibi.mec@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Emmanuel Sibi <emmanuelsibi.mec@gmail.com>
Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/F7543B04-E56C-4D68-A040-B14CCBAD38F1@gmail.com
Discussion: https://postgr.es/m/dbf77bf7-6e54-ed8a-c4ae-d196eeb664ce@gmail.com
Backpatch-through: 16
The test introduced by 17b2d5ec75 verifies that a WAL receiver
survives across a timeline jump by searching the server logs for
termination messages. However, it called restart() before the timeline
switch, which kills the WAL receiver and may log the exact message being
checked, hence failing the test. As TAP tests reuse the same log file
across restarts, a rotate_logfile() is used before the restart so that the
log matching check is not impacted by log entries generated by a
previous shutdown.
Recent changes to file handle inheritance altered I/O timing enough to
make this fail consistently while testing another patch.
While on it, this adds an extra check based on a PID comparison. This
test may lead to false positives as it could be possible that the WAL
receiver has processed a timeline jump before the initial PID is
grabbed, but it should be good enough in most cases.
Like 17b2d5ec75, backpatch down to v13.
Author: Bryan Green <dbryan.green@gmail.com>
Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/9d00b597-d64a-4f1e-802e-90f9dc394c70@gmail.com
Backpatch-through: 13
In 2a0faed9d7, which added JIT compilation support for expressions, I
accidentally used sizeof(LLVMBasicBlockRef *) instead of
sizeof(LLVMBasicBlockRef) as part of computing the size of an allocation. That
turns out to have no real negative consequences due to LLVMBasicBlockRef being
a pointer itself (and thus having the same size). It still is wrong and
confusing, so fix it.
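Why the mistake was harmless can be shown with a hypothetical opaque
handle type declared in the same style as LLVM's Ref types. The
sketch_* names below are made up for the example.

```c
#include <stddef.h>

/* Hypothetical opaque handle: the typedef is itself a pointer type. */
typedef struct SketchOpaqueBlock *SketchBlockRef;

/* Correct: the element size is that of the handle itself. */
size_t
sketch_array_bytes(size_t nelems)
{
    return nelems * sizeof(SketchBlockRef);
}

/* The mistaken form: the size of a pointer to the handle. */
size_t
sketch_array_bytes_mistaken(size_t nelems)
{
    return nelems * sizeof(SketchBlockRef *);
}
```

On mainstream platforms both functions return the same value, because
a pointer to a struct and a pointer to such a pointer have the same
size, but only the first expresses the intended element type.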
Reported by Coverity.
Backpatch-through: 13
Commit a95e3d84c0 added ActiveSnapshot push+pop when processing
work-items (BRIN autosummarization), but forgot to handle the case of
a transaction failing during the run, which drops the snapshot prematurely.
Fix by making the pop conditional on an element being actually there.
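The conditional pop can be illustrated with a toy snapshot stack. All
sketch_* names are hypothetical, and the assumption that aborting a
transaction empties the stack is part of the simulation.

```c
#include <stdbool.h>

int         sketch_snapshot_depth = 0;  /* stand-in for the snapshot stack */

void sketch_push_snapshot(void) { sketch_snapshot_depth++; }
bool sketch_snapshot_is_set(void) { return sketch_snapshot_depth > 0; }
void sketch_pop_snapshot(void) { sketch_snapshot_depth--; }

/* Simulated transaction abort: assumed to discard every pushed snapshot. */
void sketch_abort_transaction(void) { sketch_snapshot_depth = 0; }

/* Run one work item; 'fails' simulates an error inside the work item. */
void
sketch_run_workitem(bool fails)
{
    sketch_push_snapshot();
    if (fails)
        sketch_abort_transaction();
    /* Pop only if a snapshot is actually still there. */
    if (sketch_snapshot_is_set())
        sketch_pop_snapshot();
}
```

An unconditional pop after the simulated abort would drive the depth
below zero, which is the analogue of popping a snapshot that is gone.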
Author: Álvaro Herrera <alvherre@kurilemu.de>
Backpatch-through: 13
Discussion: https://postgr.es/m/202511041648.nofajnuddmwk@alvherre.pgsql
Debian Trixie CI images are generated now [1], so use them with the
following changes:
- detect_stack_use_after_return=0 option is added to the ASAN_OPTIONS
  because ASAN uses a "shadow stack" to track stack variable lifetimes
  and this confuses Postgres' stack depth check [2].
- Perl is updated to the newer version (perl5.40-i386-linux-gnu).
- LLVM-14 is no longer the default installation, so there is no need to
  force the use of LLVM-16.
- Switch MinGW CC/CXX to x86_64-w64-mingw32ucrt-* to fix build failure
  from missing _iswctype_l in mingw-w64 v12 headers.
[1] https://github.com/anarazel/pg-vm-images/commit/35a144793f
[2] https://postgr.es/m/20240130212304.q66rquj5es4375ab%40awork3.anarazel.de
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ1_B1usTskAv+AYt1bA7abVd9YH6XrUUSbr-2Z0d5Wd8w@mail.gmail.com
Backpatch: 15-, where CI support was added
Backpatch commit 7bc9a8bdd2 to 13-17. The motivation for backpatching is that
we want to update CI to Debian Trixie. Trixie contains a newer mingw
installation, which would trigger the warning addressed by 7bc9a8bdd2. The
risk of backpatching seems fairly low, given that it did not cause issues in
the branches where the commit is already present.
While CI is not present in 13-14, it seems better to be consistent across
branches.
Author: Thomas Munro <tmunro@postgresql.org>
Discussion: https://postgr.es/m/o5yadhhmyjo53svzwvaocww6zkrp63i4f32cw3treuh46pxtza@hyqio5b2tkt6
Backpatch-through: 13
A generated column may end up being part of the partition key
expression, if it's specified as an expression e.g. "(<generated
column name>)" or if the partition key expression contains a whole-row
reference, even though we do not allow a generated column to be part
of a partition key expression. Fix this hole.
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxF%3DWDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw%40mail.gmail.com
It's possible to define BRIN indexes on functions that require a
snapshot to run, but the autosummarization feature introduced by commit
7526e10224 fails to provide one. This causes autovacuum to leave a
BRIN placeholder tuple behind after a failed work-item execution, making
such indexes less efficient. Repair by obtaining a snapshot prior to
running the task, and add a test to verify this behavior.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: Giovanni Fabris <giovanni.fabris@icon.it>
Reported-by: Arthur Nascimento <tureba@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/202511031106.h4fwyuyui6fz@alvherre.pgsql
Commit b4f584f9d2 (affecting v15~, later backpatched down to 13 as of
3635a0a35a) introduced an unconditional WAL receiver shutdown when
switching from streaming to archive WAL sources. This causes problems
during a timeline switch, when a WAL receiver enters WALRCV_WAITING
state but remains alive, waiting for instructions.
The unconditional shutdown can break some monitoring scenarios as the
WAL receiver gets repeatedly terminated and re-spawned, causing
pg_stat_wal_receiver.status to show "streaming" instead of "waiting",
masking the fact that the WAL receiver is waiting for a new TLI
and a new LSN to be able to continue streaming.
This commit makes the WAL receiver shutdown conditional, while always
resetting InstallXLogFileSegmentActive to prevent the regression fixed
by b4f584f9d2a1: the WAL receiver is only terminated when it is actively
streaming (WALRCV_STREAMING, WALRCV_STARTING, or WALRCV_RESTARTING).
When in WALRCV_WAITING state, just reset the InstallXLogFileSegmentActive
flag to allow archive restoration without killing the process.
WALRCV_STOPPED and
WALRCV_STOPPING are not reachable states in this code path. For the
latter, the startup process is the one in charge of setting
WALRCV_STOPPING via ShutdownWalRcv(), waiting for the WAL receiver to
reach a WALRCV_STOPPED state after switching walRcvState, so
WaitForWALToBecomeAvailable() cannot be reached while a WAL receiver is
in a WALRCV_STOPPING state.
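The per-state decision described above can be summarized in a small
sketch. The state names come from the text; the enum declaration and
the function sketch_should_shutdown_walrcv are stand-ins written for
this example, and the unconditional reset of
InstallXLogFileSegmentActive is not modeled.

```c
#include <stdbool.h>

/* Stand-in enum; only the state names are taken from the text above. */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} SketchWalRcvState;

/* Should the WAL receiver be terminated when switching WAL sources? */
bool
sketch_should_shutdown_walrcv(SketchWalRcvState state)
{
    switch (state)
    {
        case WALRCV_STREAMING:
        case WALRCV_STARTING:
        case WALRCV_RESTARTING:
            return true;        /* actively streaming: terminate */
        case WALRCV_WAITING:
            return false;       /* waiting for instructions: keep alive */
        case WALRCV_STOPPED:
        case WALRCV_STOPPING:
            return false;       /* described above as unreachable here */
    }
    return false;
}
```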
A regression test is added to check that a WAL receiver is not stopped
on timeline jump; the test fails when the fix of this commit is reverted.
Reported-by: Ryan Bird <ryanzxg@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org
Backpatch-through: 13
The C standard says that the second and third arguments of a
conditional operator shall be both void type or both not-void
type. The Windows version of INTERRUPTS_PENDING_CONDITION()
got this wrong. It's pretty harmless because the result of
the operator is ignored anyway, but apparently recent versions
of MSVC have started issuing a warning about it. Silence the
warning by casting the dummy zero to void.
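The pattern can be sketched as below. The variables, the function
sketch_dispatch_queued and the macro SKETCH_INTERRUPTS_PENDING are
hypothetical and only loosely shaped after the macro named above.

```c
int         sketch_signal_queued = 0;   /* hypothetical "work queued" flag */
int         sketch_interrupt_pending = 0;
int         sketch_dispatch_count = 0;

/* A function returning void, used as the second operand below. */
void
sketch_dispatch_queued(void)
{
    sketch_signal_queued = 0;
    sketch_dispatch_count++;
}

/*
 * Both arms of the conditional have void type thanks to the cast. A bare
 * 0 as the third operand would mix void and int. The conditional's result
 * is discarded by the comma operator either way.
 */
#define SKETCH_INTERRUPTS_PENDING() \
    ((sketch_signal_queued ? sketch_dispatch_queued() : (void) 0), \
     sketch_interrupt_pending)

int
sketch_check(void)
{
    return SKETCH_INTERRUPTS_PENDING();
}
```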
Reported-by: Christian Ullrich <chris@chrullrich.net>
Author: Bryan Green <dbryan.green@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/cc4ef8db-f8dc-4347-8a22-e7ebf44c0308@chrullrich.net
Backpatch-through: 13
The code updates the system identifier, then runs pg_resetwal; if the
latter fails, it complains about the former, which makes no sense.
Change the error message to complain about the right thing.
Noticed while reviewing a patch touching nearby code.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Backpatch-through: 17
This commit reverts 818fefd8fd, which was introduced to address an
instability in some of the TAP tests due to the presence of random
standby snapshot WAL records, when slots are invalidated by
InvalidatePossiblyObsoleteSlot().
However, that commit also had the consequence of introducing a behavior
regression. After 818fefd8fd, the code may determine that a slot needs
to be invalidated when it does not actually require it: the slot may have
moved from a conflicting state to a non-conflicting state between the
moment when the mutex is released and the moment when we recheck the
slot, in InvalidatePossiblyObsoleteSlot(). Hence, the invalidations may
be more aggressive than they actually have to be.
105b2cb336 has tackled the test instability in a way that should
hopefully be sufficient for the buildfarm, even for slow members:
- In v18, the test relies on an injection point that bypasses the
  creation of the random records generated for standby snapshots,
  eliminating the random factor that impacted the test. This option was
  not available when 818fefd8fd was discussed.
- In v16 and v17, the problem was bypassed by disallowing a slot to
  become active in some of the scenarios tested.
While on it, this commit adds a comment to document that it is fine for
a recheck to use xmin and LSN values stored in the slot, without storing
and reusing them across multiple checks.
Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/f492465f-657e-49af-8317-987460cb68b0.mengjuan.cmj@alibaba-inc.com
Backpatch-through: 16