This change has the advantage of removing some weird type casts, caused
by offset calculations based on pgoff_t but saved as int (on older
branches we use off_t, which could be 4 or 8 bytes depending on the
environment). These are currently safe because the values are capped by
MAX_PHYSICAL_FILESIZE, but we would run into problems if we were to make
MAX_PHYSICAL_FILESIZE larger or to allow callers of these routines to
use a larger physical max size on demand.
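As an illustration, here is a minimal, self-contained sketch (names
invented, not taken from buffile.c) of the kind of narrowing involved:

    #include <stdint.h>

    typedef int64_t pgoff_t;                    /* 8 bytes, as in PostgreSQL */
    #define MAX_PHYSICAL_FILESIZE 0x40000000    /* 1GB segment cap */

    static int
    offset_within_segment(pgoff_t offset)
    {
        /*
         * The (int) cast cannot overflow while segments are capped at 1GB,
         * but it would silently truncate with a larger cap.
         */
        return (int) (offset % MAX_PHYSICAL_FILESIZE);
    }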
While on it, this improves BufFileDumpBuffer() so that it does not use
an offset type for "availbytes". It is not a file offset per se, but a
number of available bytes.
This change should lead to no functional changes.
Author: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz
Many of the operations done for attribute stats in attribute_stats.c
share the same logic as extended stats, as shown by a patch under
discussion that adds support for extended stats import and export. All
the pieces necessary for extended statistics are moved to
stats_utils.c, which is the file where common facilities for statistics
are shared.
The following renames are done:
* get_attr_stat_type() -> statatt_get_type()
* init_empty_stats_tuple() -> statatt_init_empty_tuple()
* set_stats_slot() -> statatt_set_slot()
* get_elem_stat_type() -> statatt_get_elem_type()
While on it, this commit adds more documentation for all these
functions, describing their internals in more detail and the
dependencies implied for attribute statistics. The same concepts apply
to extended statistics, to some degree.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Yu Wang <wangyu_runtime@163.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
If there are any SRFs in a PathTarget, we must separate it into
SRF-computing and SRF-free targets. This is because the executor can
only handle SRFs that appear at the top level of the targetlist of a
ProjectSet plan node.
If we find a subexpression that matches an expression already computed
in the previous plan level, we should treat it like a Var and should
not split it again. setrefs.c will later replace the expression with
a Var referencing the subplan output.
However, when processing the grouping target for grouping sets, the
planner can fail to recognize that an expression is already computed
in the scan/join phase. The root cause is a mismatch in the
nullingrels bits. Expressions in the grouping target carry the
grouping nulling bit in their nullingrels to indicate that they can be
nulled by the grouping step. However, the corresponding expressions
in the scan/join target do not have these bits.
As a result, the exact match check in list_member() fails, leading the
planner to incorrectly believe that the expression needs to be
re-evaluated from its arguments, which are often not available in the
subplan. This can lead to planner errors such as "variable not found
in subplan target list".
To fix, ignore the grouping nulling bit when checking whether an
expression from the grouping target is available in the pre-grouping
input target. This aligns with the matching logic in setrefs.c.
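A hedged sketch of the adjusted check (assuming the grouping RT index
is exposed as root->group_rtindex, with the surrounding code
simplified):

    Node   *stripped;

    /*
     * Strip the grouping step's nulling bit before the exact match,
     * mirroring the matching logic in setrefs.c.
     */
    stripped = remove_nulling_relids((Node *) expr,
                                     bms_make_singleton(root->group_rtindex),
                                     NULL);
    if (list_member(input_target->exprs, stripped))
        return true;        /* already computed; treat it like a Var */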
Backpatch to v18, where this issue was introduced.
Bug: #19353
Reported-by: Marian MULLER REBEYROL <marian.muller@serli.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/19353-aaa179bba986a19b@postgresql.org
Backpatch-through: 18
Commit 8a3e4011 introduced tab completion for the ONLY option of
VACUUM and ANALYZE, along with some code simplification using
MatchAnyN. However, it caused a regression in tab completion for
VACUUM option values. For example, neither ON nor OFF was suggested
after "VACUUM (VERBOSE". In addition, the ONLY keyword was not
suggested immediately after a completed option list.
This commit fixes the tab completion so that these cases are suggested
again. Backpatch to v18.
Author: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20251223021509.19bba68ecbbc70c9f983c2b4@sraoss.co.jp
Backpatch-through: 18
Commit 67c209 removed the WARNING for insufficient wal_level from the
expected output, but the WARNING may still appear on buildfarm members
that run with wal_level=minimal.
To avoid unstable test output depending on wal_level, this commit
changes the test to use ALTER PUBLICATION for verifying the same
behavior, ensuring that the output remains consistent regardless of the
wal_level setting.
Per buildfarm member thorntail.
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/TY4PR01MB16907680E27BAB146C8EB1A4294B2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
The pg_overexplain documentation previously used the <literal> tag for
some file names, struct names, and commands. Update the markup to
use the more appropriate tags: <filename>, <structname>, and <command>.
Backpatch to v18, where pg_overexplain was introduced.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Shixin Wang <wang-shi-xin@outlook.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEyYUzz0LjBV_fMcdwU3wgmu0NCoT+JJiozPa8DG6eeog@mail.gmail.com
Backpatch-through: 18
CREATE SUBSCRIPTION with copy_data=true and origin='none' previously
failed when the publisher was running a version earlier than PostgreSQL 19,
even though this combination should be supported.
The failure occurred because the command issued a query calling the
pg_get_publication_sequences() function on the publisher. That function
does not exist before PG19, and the query is only needed for logical
replication sequence synchronization, which is supported starting in
PG19.
This commit fixes this issue by skipping that query when the
publisher runs a version earlier than PG19.
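A hedged sketch of the guard (the connection variable name is
illustrative):

    /*
     * pg_get_publication_sequences() only exists on PG19 or newer
     * publishers, and sequence synchronization is only supported there.
     */
    if (walrcv_server_version(wrconn) >= 190000)
    {
        /* ... issue the query fetching the sequences to synchronize ... */
    }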
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEx4twHtJdiPWTyAXJhcBPLaH467SH2ajGSe-41m65giA@mail.gmail.com
The retain_dead_tuples subscription option is supported only when
the publisher runs PostgreSQL 19 or later. However, it could previously
be enabled even when the publisher was running an earlier version.
This was caused by check_pub_dead_tuple_retention() comparing
the publisher server version against 19000 instead of 190000.
Fix this typo so that the version check correctly enforces the PG19+
requirement.
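For reference, server version numbers are encoded as major * 10000 +
minor, so the corrected check looks as follows (a sketch; the error
message wording is illustrative):

    /*
     * Any supported server exceeds 19000 (e.g. PG16 is 160000), so the
     * broken check could never fail.  PG19 is 190000.
     */
    if (walrcv_server_version(wrconn) < 190000)
        ereport(ERROR,
                errmsg("retain_dead_tuples requires the publisher to run PostgreSQL 19 or later"));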
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEx4twHtJdiPWTyAXJhcBPLaH467SH2ajGSe-41m65giA@mail.gmail.com
Update the logical replication documentation to explicitly outline the
privilege requirements for each publication syntax. This will ensure users
understand the necessary permissions when creating or managing
publications.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CANhcyEXODen4U0XLk0aAwFTwGxjAfE9eRaynREenLp-JBSaFHw@mail.gmail.com
Currently, the function expr_is_nonnullable() checks only Const and
Var expressions to determine if an expression is non-nullable. This
patch extends the detection logic to handle more expression types.
This can enable several downstream optimizations, such as reducing
NullTest quals to constant truth values (e.g., "COALESCE(var, 1) IS
NULL" becomes FALSE) and converting "COUNT(expr)" to the more
efficient "COUNT(*)" when the expression is proven non-nullable.
This breaks a test case in test_predtest.sql, since we now simplify
"ARRAY[] IS NULL" to constant FALSE, preventing it from weakly
refuting a strict ScalarArrayOpExpr ("x = any(ARRAY[])"). To ensure
the refutation logic is still exercised as intended, wrap the array
argument in opaque_array().
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com
We break ROW(...) IS [NOT] NULL into separate tests on its component
fields. During this breakdown, we can improve efficiency by utilizing
expr_is_nonnullable() to detect fields that are provably non-nullable.
If a component field is proven non-nullable, it affects the outcome
based on the test type. For an IS NULL test, a single non-nullable
field refutes the whole NullTest, reducing it to constant FALSE. For
an IS NOT NULL test, the check for that specific field is guaranteed
to succeed, so we can discard it from the list of component tests.
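A hedged sketch of the per-field handling (simplified from the actual
NullTest code):

    ListCell   *lc;
    List       *newargs = NIL;

    foreach(lc, rowexpr->args)
    {
        Expr       *field = (Expr *) lfirst(lc);

        if (expr_is_nonnullable(root, field))
        {
            if (ntest->nulltesttype == IS_NULL)
                return makeBoolConst(false, false); /* refutes whole test */
            continue;       /* IS NOT NULL: this field always passes */
        }
        newargs = lappend(newargs, field);  /* still needs a runtime check */
    }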
This extends the existing optimization logic, which previously only
handled Const fields, to support any expression that can be proven
non-nullable.
In passing, update the existing constant folding of NullTests to use
expr_is_nonnullable() instead of var_is_nonnullable(), enabling it to
benefit from future improvements to that function.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com
The COALESCE function returns the first of its arguments that is not
null. When an argument is proven non-null: if it is the first argument
remaining once leading NULL constants are dropped, the entire COALESCE
expression can be replaced by that argument; if it is a subsequent
argument, all following arguments can be dropped, since they will never
be reached.
Currently, we perform this simplification only for Const arguments.
This patch extends the simplification to support any expression that
can be proven non-nullable.
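A hedged sketch of the argument scan (simplified; the real code lives
in eval_const_expressions):

    foreach(lc, coalesce->args)
    {
        Expr       *arg = (Expr *) lfirst(lc);

        if (IsA(arg, Const) && ((Const *) arg)->constisnull)
            continue;                   /* drop NULL constants */

        newargs = lappend(newargs, arg);

        if (expr_is_nonnullable(root, arg))
            break;                      /* later arguments are unreachable */
    }
    if (list_length(newargs) == 1)
        return (Node *) linitial(newargs);  /* COALESCE collapses entirely */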
This can help avoid the overhead of evaluating unreachable arguments.
It can also lead to better plans when the first argument is proven
non-nullable and replaces the expression, as the planner no longer has
to treat the expression as non-strict, and can also leverage index
scans on the resulting expression.
There is an ensuing plan change in generated_virtual.out, and we have
to modify the test to ensure that it continues to test what it is
intended to.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com
The logical replication parallel apply worker could incorrectly advance
the origin progress during an error or failed apply. This behavior risks
transaction loss because such transactions will not be resent by the
server.
Commit 3f28b2fcac addressed a similar issue for both the apply worker and
the table sync worker by registering a before_shmem_exit callback to reset
origin information. This prevents the worker from advancing the origin
when a transaction aborts during shutdown. This patch applies the same fix
to the parallel apply worker, ensuring consistent behavior across all
worker types.
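A hedged sketch of the callback and its registration, mirroring
3f28b2fcac (the callback name and the exact fields reset are
assumptions):

    static void
    replorigin_reset(int code, Datum arg)
    {
        /*
         * Forget the session origin state, so that an abort during
         * shutdown cannot advance the origin's progress.
         */
        replorigin_session_origin = InvalidRepOriginId;
        replorigin_session_origin_lsn = InvalidXLogRecPtr;
        replorigin_session_origin_timestamp = 0;
    }

    /* registered once during parallel apply worker startup */
    before_shmem_exit(replorigin_reset, (Datum) 0);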
As with 3f28b2fcac, we are backpatching through version 16, since parallel
apply mode was introduced there and the issue only occurs when changes are
applied before the transaction end record (COMMIT or ABORT) is received.
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16
Discussion: https://postgr.es/m/TY4PR01MB169078771FB31B395AB496A6B94B4A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
This one was missed in 8f1791c61, because the machines that
detected those issues don't compile this function.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
This commit updates pg_visibility_map_summary() to use the
visibilitymap_count() API, replacing its own counting mechanism. This
simplifies the function and improves performance by leveraging the
vectorized implementation introduced in commit 41c51f0c68.
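A minimal sketch of the replacement call (output variable names
illustrative):

    BlockNumber all_visible = 0;
    BlockNumber all_frozen = 0;

    /*
     * One call replaces the page-by-page counting loop, and benefits
     * from the vectorized counting added in 41c51f0c68.
     */
    visibilitymap_count(rel, &all_visible, &all_frozen);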
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAEze2WgPu-EYYuYQimy=AHQHGa7w8EvLVve5DM5eGMR6zh-7sw@mail.gmail.com
Previously, logical decoding required wal_level to be set to 'logical'
at server start. This meant that users had to incur the overhead of
logical-level WAL logging even when no logical replication slots were
in use.
This commit adds functionality to automatically control logical
decoding availability based on logical replication slot presence. The
newly introduced module logicalctl.c allows logical decoding to be
dynamically activated when needed while wal_level is set to
'replica'.
When the first logical replication slot is created, the system
automatically increases the effective WAL level so that logical-level
WAL records keep being generated. Conversely, after the last logical
slot is dropped or invalidated, the effective WAL level decreases back
to 'replica'.
While activation occurs synchronously right after creating the first
logical slot, deactivation happens asynchronously through the
checkpointer process. This design avoids a race condition at the end
of recovery: a concurrent deactivation could otherwise happen while the
startup process enables logical decoding at the end of recovery, at a
point where WAL writes are not yet permitted; instead, the checkpointer
handles the deactivation once recovery is done. Asynchronous
deactivation also avoids excessive toggling of the logical decoding
status in workloads that repeatedly create and drop a single logical
slot. On the other hand, this lazy approach can delay changes to
effective_wal_level and the disabling of logical decoding, especially
when the checkpointer is busy with other tasks. We chose this lazy
approach in all deactivation paths to keep the implementation simple,
even though laziness is strictly required only for end-of-recovery
cases. Future work might address this limitation either by using a
dedicated worker instead of the checkpointer, or by implementing
synchronous waiting during slot drops if workloads are significantly
affected by the lazy deactivation of logical decoding.
The effective WAL level, determined internally by XLogLogicalInfo, is
allowed to change within a transaction until an XID is assigned. Once
an XID is assigned, the value becomes fixed for the remainder of the
transaction. This behavior ensures that the logging mode remains
consistent within a writing transaction, similar to the behavior of
GUC parameters.
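A hedged sketch of the pinning rule (the helper names below are
illustrative, not the actual logicalctl.c API):

    bool
    XLogLogicalInfo(void)
    {
        if (wal_level >= WAL_LEVEL_LOGICAL)
            return true;            /* statically logical */

        /*
         * Once our transaction holds an XID, reuse the value decided at
         * XID assignment time so the logging mode stays consistent.
         */
        if (TransactionIdIsValid(GetTopTransactionIdIfAny()))
            return CurrentXactLogicalInfo;      /* name illustrative */

        /* otherwise, consult the shared status tracking logical slots */
        return LogicalDecodingStatusActive();   /* name illustrative */
    }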
A new read-only GUC parameter effective_wal_level is introduced to
monitor the actual WAL level in effect. This parameter reflects the
current operational WAL level, which may differ from the configured
wal_level setting.
Bump PG_CONTROL_VERSION, as this adds a new field to the CheckPoint
struct.
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCVLeLYq09pQPaWs+Jwdni5FuJ8v2jgq-u9_uFbcp6UbA@mail.gmail.com
After waiting for a concurrent updater to finish, heap_lock_tuple()
followed the update chain to lock all tuple versions. However, when
stepping from the initial tuple to the next one, it failed to check
that the next tuple's XMIN matches the initial tuple's XMAX. That's an
important check whenever following an update chain, and the recursive
part that follows the chain did it, but the initial step missed it.
Without the check, if the updating transaction aborts and the updated
tuple is vacuumed away and replaced by an unrelated tuple, the
unrelated tuple might get incorrectly locked.
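A hedged sketch of the missing check (variable names illustrative),
matching what the recursive chain-following code already did:

    /*
     * Before locking the next version, verify that it really is the
     * successor of the tuple we just processed: its XMIN must match
     * the prior tuple's XMAX.  Otherwise the old version was vacuumed
     * away and this TID now holds an unrelated tuple.
     */
    if (!TransactionIdEquals(HeapTupleHeaderGetXmin(mytup.t_data),
                             priorXmax))
        break;                  /* chain broken; do not lock */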
Author: Jasper Smit <jasper.smit@servicenow.com>
Discussion: https://www.postgresql.org/message-id/CAOG+RQ74x0q=kgBBQ=mezuvOeZBfSxM1qu_o0V28bwDz3dHxLw@mail.gmail.com
Backpatch-through: 14
Since ce0fdbfe97, a replication slot and an origin are created by each
tablesync worker, whose information is stored in both a catalog and
shared memory (once the origin is set up in the latter case). The
transaction where the origin is created is the same as the one that runs
the initial COPY, with the catalog state of the origin becoming visible
for other sessions only once the COPY transaction has committed. The
catalog state is coupled with a state in shared memory, initialized at
the same time as the origin created in the catalogs. Note that the
transaction doing the initial data sync can take a long time, time that
depends on the amount of data to transfer from a publication node to its
subscriber node.
Now, when a DROP SUBSCRIPTION is executed, all its workers are stopped
and their origins removed. The removal of each origin relies on a
catalog lookup. A worker still running the initial COPY would fail its
transaction, with the catalog state of the origin rolled back while the
shared memory state remains around. The session running the DROP
SUBSCRIPTION should be in charge of cleaning up the catalog and the
memory state, but as there is no data in the catalogs, the shared
memory state is not removed. This issue would leave orphaned origin
data in shared memory, leading to a confusing state as it would still
show up in pg_replication_origin_status. Note that this shared memory
data is sticky, being flushed to disk in replorigin_checkpoint at each
checkpoint. This prevents other origins from reusing a slot position
in the shared memory data.
To address this problem, the commit moves the creation of the origin to
the end of the transaction that precedes the one executing the initial
COPY, making the origin immediately visible in the catalogs for other
sessions, giving DROP SUBSCRIPTION a way to know about it. A different
solution would have been to clean up the shared memory state using an
abort callback within the tablesync worker. The solution of this commit
is more consistent with the apply worker that creates an origin in a
short transaction.
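A hedged sketch of the reordering (function names from origin.c, with
the signatures of recent branches; the tablesync code around them is
simplified):

    /*
     * End of the transaction preceding the initial COPY: create the
     * origin here, so that its catalog entry is committed and visible
     * to DROP SUBSCRIPTION before the long-running COPY starts.
     */
    originid = replorigin_create(originname);
    CommitTransactionCommand();

    /* the initial COPY runs in the next transaction */
    StartTransactionCommand();
    replorigin_session_setup(originid, 0);
    /* ... long-running initial data copy ... */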
A test is added in the subscription test 004_sync.pl, which is able to
demonstrate the problem. The test fails when this commit is reverted.
Reported-by: Tenglong Gu <brucegu@amazon.com>
Reported-by: Daisuke Higuchi <higudai@amazon.com>
Analyzed-by: Michael Paquier <michael@paquier.xyz>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz
Backpatch-through: 14
off_t was previously used for offsets, which is 4 bytes on Windows,
hence imposing a hard 2GB file size limit on the backend code.
Switching to pgoff_t leads to some simplification in these files,
removing some casts based on long, which is also 4 bytes on Windows.
This commit removes one comment introduced in db3c4c3a2d, not relevant
anymore as pgoff_t is a safe 8-byte alternative on Windows.
This change is surprisingly not invasive, as the callers of
BufFileTell(), BufFileSeek() and BufFileTruncateFileSet() (worker.c,
tuplestore.c, etc.) track offsets in local structures that, for the
most part, just need to switch from off_t to pgoff_t.
The file is still relying on a maximum file size of
MAX_PHYSICAL_FILESIZE (1GB). This change allows the code to make this
maximum potentially larger in the future, or larger on demand.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz
Previously, only the first option in a parenthesized option list was
suggested by tab completion. This commit enhances tab completion for
both COPY TO and COPY FROM commands to suggest options after each
comma.
Also add completion for HEADER and FREEZE option value candidates.
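A hedged sketch in tab-complete.in.c style (the match patterns and the
option list are abbreviated and illustrative):

    /* suggest options not only right after "(", but after each comma */
    else if (HeadMatches("COPY|\\copy") && TailMatches("(*|*,"))
        COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
                      "HEADER", "QUOTE", "ESCAPE", "ENCODING");
    /* option value candidates for the new completions */
    else if (HeadMatches("COPY|\\copy") && TailMatches("HEADER"))
        COMPLETE_WITH("ON", "OFF", "MATCH");
    else if (HeadMatches("COPY|\\copy") && TailMatches("FREEZE"))
        COMPLETE_WITH("ON", "OFF");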
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp
Correct the referenced location of the RangeTblEntry definition
in the pg_overexplain documentation.
Backpatched to v18, where pg_overexplain was introduced.
Author: Julien Tachoires <julien@tachoires.me>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20251218092319.tht64ffmcvzqdz7u@poseidon.home.virt
Backpatch-through: 18
nbtree index-only scans of an index that uses btree/name_ops as one of
its index column's input opclasses are no longer at any risk of reading
past the end of currTuples. We're no longer reliant on such scans being
able to at least read from the start of markTuples storage (which uses
space from the same allocation as currTuples) to avoid a segfault:
StoreIndexTuple (from nodeIndexonlyscan.c) won't actually read past the
end of a cstring datum from a name_ops index. In other words, we
already have the "special-case treatment for name_ops" that the removed
comment supposed we could avoid by relying on markTuples in this way.
Oversight in commit a63224be49, which added special case handling of
name_ops cstrings to StoreIndexTuple, but missed these comments.
An unused variable caused a compiler warning on BF animal fairywren, an
snprintf() call was redundant, and some buffer sizes were inconsistent.
Per code review from Tom Lane.
The Makefile's test ifeq ($(PORTNAME), win32) never succeeded due to a
circularity, so only Meson builds were actually compiling the new test
code, partially explaining why CI didn't tell us about the warning
sooner (the other problem being that CompilerWarnings only makes
world-bin, a problem for another commit). Simplify.
Backpatch-through: 16, like commit c507ba55
Author: Bryan Green <dbryan.green@gmail.com>
Co-authored-by: Thomas Munro <tmunro@gmail.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1086088.1765593851%40sss.pgh.pa.us
The main optimization is for LockBufHdr() to delay initializing
SpinDelayStatus, similar to what LWLockWaitListLock already did. The
initialization is sufficiently expensive, and buffer header lock
acquisitions sufficiently frequent, to make it worthwhile to instead have a
fastpath (via a likely() branch) that does not initialize the
SpinDelayStatus.
While LWLockWaitListLock() already had the aforementioned optimization, it did not
use likely(), and inspection of the assembly shows that this indeed leads to
worse code generation (also observed in a microbenchmark). Fix that by adding
the likely().
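Roughly, the resulting shape of LockBufHdr() (a sketch with details
simplified):

    uint32
    LockBufHdr(BufferDesc *desc)
    {
        SpinDelayStatus delayStatus;
        uint32          old_buf_state;

        /*
         * Fast path: the common, uncontended case pays nothing for
         * SpinDelayStatus setup.
         */
        old_buf_state = pg_atomic_fetch_or_u32(&desc->state, BM_LOCKED);
        if (likely(!(old_buf_state & BM_LOCKED)))
            return old_buf_state | BM_LOCKED;

        /* slow path: initialize the delay machinery only now */
        init_local_spin_delay(&delayStatus);
        for (;;)
        {
            perform_spin_delay(&delayStatus);
            old_buf_state = pg_atomic_fetch_or_u32(&desc->state, BM_LOCKED);
            if (!(old_buf_state & BM_LOCKED))
                break;
        }
        finish_spin_delay(&delayStatus);
        return old_buf_state | BM_LOCKED;
    }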
While the LockBufHdr() improvement is a small gain on its own, it mainly is
aimed at preventing a regression after a future commit, which requires
additional locking to set hint bits.
While touching both, also make the comments more similar to each other.
Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
In the wake of commit db6a4a985, remove most use of 'md5' from the
example configuration file. The only remainder is an example exception
for a client that doesn't support SCRAM.
Author: Mikael Gustavsson <mikael.gustavsson@smhi.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/176595607507.978865.11597773194269211255@wrigleys.postgresql.org
Discussion: https://postgr.es/m/4ed268473fdb4cf9b0eced6c8019d353@smhi.se
Backpatch-through: 18
Previously, if memory context logging was triggered repeatedly and
rapidly while a previous request was still being processed, it could
result in recursive calls to ProcessLogMemoryContextInterrupt().
This could lead to infinite recursion and potentially crash the process.
This commit adds a guard to prevent such recursion.
If ProcessLogMemoryContextInterrupt() is already in progress and
logging memory contexts, subsequent calls will exit immediately,
avoiding unintended recursive calls.
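A hedged sketch of the guard (structure illustrative):

    void
    ProcessLogMemoryContextInterrupt(void)
    {
        static bool in_progress = false;

        LogMemoryContextPending = false;

        if (in_progress)
            return;             /* re-entrant call: already logging */
        in_progress = true;

        PG_TRY();
        {
            /* ... existing code that logs the memory contexts ... */
        }
        PG_FINALLY();
        {
            in_progress = false;
        }
        PG_END_TRY();
    }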
While this scenario is unlikely in practice, it's not impossible.
This change adds a safety check to prevent such failures.
Back-patch to v14, where memory context logging was introduced.
Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Artem Gavrilov <artem.gavrilov@percona.com>
Discussion: https://postgr.es/m/CA+TgmoZMrv32tbNRrFTvF9iWLnTGqbhYSLVcrHGuwZvCtph0NA@mail.gmail.com
Backpatch-through: 14
All the code paths updated here have been using relation_close() to
close a relation that has already been opened with table_open() or
index_open(), where a relkind check is enforced.
table_close() and index_close() do the same thing as relation_close(), so
there was no harm, but being inconsistent could lead to issues if the
internals of these close() functions begin to introduce some logic
specific to each relkind in the future.
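In other words, pair each close with its open (a minimal sketch):

    Relation    rel = table_open(relid, AccessShareLock);  /* relkind-checked */

    /* ... */

    table_close(rel, AccessShareLock);      /* was relation_close() */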
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aUKamYGiDKO6byp5@ip-10-97-1-34.eu-west-3.compute.internal
Commit 0decd5e89d missed
DO_SUBSCRIPTION_REL, leading to assertion failures. In the unlikely use
case of diffing "pg_dump --binary-upgrade" output, spurious diffs were
possible. As part of fixing that, align the DumpableObject naming and
sort order with DO_PUBLICATION_REL. The overall effect of this commit
is to change sort order from (subname, srsubid) to (rel, subname).
Since DO_SUBSCRIPTION_REL is only for --binary-upgrade, accept that
larger-than-usual dump order change. Back-patch to v17, where commit
9a17be1e24 introduced DO_SUBSCRIPTION_REL.
Reported-by: vignesh C <vignesh21@gmail.com>
Author: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2x3rd7C0_HjUpJFbxpAqXgm=QtoKfkEWDVA8h+JFpa_w@mail.gmail.com
Backpatch-through: 17
Operations on unlogged relations should not be WAL-logged. The
brin_initialize_empty_new_buffer() function didn't get the memo.
The function is only called when a concurrent update to a brin page
uses up space that we're just about to insert to, which makes it
pretty hard to hit. If you do manage to hit it, a full-page WAL record
is erroneously emitted for the unlogged index. If you then crash,
crash recovery will fail on that record with an error like this:
FATAL: could not create file "base/5/32819": File exists
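The fix is the usual guard (a sketch; variable names illustrative):

    /* never WAL-log pages of unlogged (or temporary) relations */
    if (RelationNeedsWAL(idxrel))
        log_newpage_buffer(buf, true);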
Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/CALdSSPhpZXVFnWjwEBNcySx_vXtXHwB2g99gE6rK0uRJm-3GgQ@mail.gmail.com
Backpatch-through: 14
Commit 0d2d4a0ec3 introduced a test that verifies replication slot
synchronization to a standby server via SQL API. However, the test did not
configure synchronized_standby_slots. Without this setting, logical
failover slots can advance beyond the physical replication slot, causing
intermittent synchronization failures. Fix this by configuring
synchronized_standby_slots in the test.
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/TY4PR01MB16907DF70205308BE918E0D4494ABA@TY4PR01MB16907.jpnprd01.prod.outlook.com
This change has been suggested by the two authors listed in this commit,
both of them providing an incomplete solution (David's formula relied on
a "bytea *", while Bertrand's did not use palloc_array()). The solution
provided in this commit uses GBT_VARKEY instead of the inconsistent
bytea for the allocation size, with a palloc_array().
The change related to Vsrt is one I am flipping to a more consistent
style in passing.
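The resulting allocations look roughly like this (a sketch; the
surrounding variables are assumed from btree_gist):

    /* size the allocations by the types actually stored */
    GBT_VARKEY **sv = palloc_array(GBT_VARKEY *, entryvec->n);
    Vsrt        *arr = palloc_array(Vsrt, maxoff + 1);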
Author: David Geier <geidav.pg@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
Discussion: https://postgr.es/m/aTrG3Fi4APtfiCvQ@ip-10-97-1-34.eu-west-3.compute.internal
4ba012a8ed defined the "header" (pointer to the stats data) of
from_serialized_data() as a const, even though it is fine (and
expected!) for the callback to modify the shared memory entry when
loading the stats at startup.
While on it, this commit updates the callback to_serialized_data() in
the test module test_custom_stats to make the data extracted from the
"header" parameter a const since it should never be modified: the stats
are written to disk and no modifications are expected in the shared
memory entry.
This clarifies the API contract of these new callbacks.
Reported-By: Peter Eisentraut <peter@eisentraut.org>
Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/d87a93b0-19c7-4db6-b9c0-d6827e7b2da1@eisentraut.org
Commit e0f373ee4 fixed up races in Cluster::connect_fails when using
log_like. t/002_client.pl didn't get the memo, though, because it
doesn't use Test::Cluster to perform its custom hook tests. (This is
probably not an issue at the moment, since the log check is only done
after authentication success and not failure, but there's no reason to
wait for someone to hit it.)
Introduce the fix, based on debug2 logging, to its use of log_check() as
well, and move the logic into the test() helper so that any additions
don't need to continually duplicate it.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com
Backpatch-through: 18
The definition of PGoauthBearerRequest uses a temporary SOCKTYPE macro
to hide the difference between Windows and Berkeley socket handles,
since we don't surface pgsocket in our public API. This macro doesn't
need to escape the header, because implementers will choose the correct
socket type based on their platform, so I #undef'd it immediately after
use.
I didn't namespace that helper, though, so if anyone else needs a
SOCKTYPE macro, libpq-fe.h will now unhelpfully get rid of it. This
doesn't seem too far-fetched, given its proximity to existing POSIX
macro names.
Add a PQ_ prefix to avoid collisions, update and improve the surrounding
documentation, and backpatch.
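A hedged sketch of the result in libpq-fe.h (the exact type choices
are illustrative):

    /* prefixed to avoid clobbering an application's own SOCKTYPE */
    #ifdef _WIN32
    #define PQ_SOCKTYPE uintptr_t   /* wide enough to hold a SOCKET */
    #else
    #define PQ_SOCKTYPE int         /* Berkeley socket descriptor */
    #endif

    /*
     * PQ_SOCKTYPE is used within the PGoauthBearerRequest definition,
     * then immediately removed:
     */
    #undef PQ_SOCKTYPE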
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com
Backpatch-through: 18
I originally used just "regress-VERSION.mo", but that seems too
generic considering that some packagers will put this file into
a system-wide directory. Per suggestion from Christoph Berg.
Reported-by: Christoph Berg <myon@debian.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aULSW7Xqx5MqDW_1@msg.df7cb.de
The test is very sensitive to how backends start and exit, because it
tests dead-end backends which occur when all the connection slots are
in use. The test failed occasionally in the CI, when the backend that
was launched for the raw_connect_works() check lingered for a while,
and exited only later during the test. When it exited, it released a
connection slot, when the test expected all the slots to be in use at
that time.
The 002_connection_limits.pl test had a similar issue: if the backend
launched for safe_psql() in the test initialization lingers around, it
uses up a connection slot during the test, messing up the test's
connection counting. I haven't seen that in the CI, but when I added a
"sleep(1);" to proc_exit(), the test failed.
To make the tests more robust, restart the server to ensure that
lingering backends don't interfere with the later test steps.
In passing, fix a bogus test name.
Report and analysis by Jelte Fennema-Nio, Andres Freund, Thomas Munro.
Discussion: https://www.postgresql.org/message-id/CAGECzQSU2iGuocuP+fmu89hmBmR3tb-TNyYKjCcL2M_zTCkAFw@mail.gmail.com
Backpatch-through: 18