OpenBSD falls back to "C" when using an incorrect input with setlocale()
and LC_CTYPE, causing this test, introduced by 008cf04, to fail. This
removes the culprit test to avoid the portability issue.
Per report from Robert Haas, via buildfarm member curculio.
Discussion: https://postgr.es/m/CA+TgmoZ6ddh3mHD9gU8DvNYoFmuJaYYn1+4AvZNp25vTdRwCAQ@mail.gmail.com
Backpatch-through: 11
Attempting to use pg_checksums (pg_verify_checksums in 11) on a data
folder which includes tablespace paths used across multiple major
versions would cause pg_checksums to scan all directories present in
pg_tblspc, and not only those named TABLESPACE_VERSION_DIRECTORY. This
could lead to failures when for example running sanity checks on an
upgraded instance with --check. Even worse, it was possible to rewrite
on-disk pages with --enable for a cluster potentially online.
This commit makes pg_checksums skip any directories not named
TABLESPACE_VERSION_DIRECTORY, similarly to what is done for base
backups.
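
The filtering rule described above can be sketched in a few lines of
Python. This is only an illustration: the function name is hypothetical,
and the directory names are example values standing in for whatever
TABLESPACE_VERSION_DIRECTORY expands to on a given server.

```python
def version_directories_to_scan(tablespace_entries, version_directory):
    """Return only the entries that match this server's version directory.

    tablespace_entries: names found inside one tablespace location
    version_directory:  stand-in for TABLESPACE_VERSION_DIRECTORY
    """
    return [name for name in tablespace_entries if name == version_directory]


# Example values only: a tablespace location shared by two major versions.
entries = ["PG_11_201809051", "PG_12_201909212"]
to_scan = version_directories_to_scan(entries, "PG_12_201909212")
```

Directories belonging to other major versions are simply left alone, so
neither --check nor --enable touches their files.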
Reported-by: Michael Banck
Author: Michael Banck, Bernd Helmle
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
The original coding failed to properly quote those arguments, leading to
failures when using quotes in the values used. As the quoting can be
encoding-sensitive, the connection to the backend needs to be taken
before applying the correct quoting.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200214041004.GB1998@paquier.xyz
Backpatch-through: 9.5
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method. The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs. Deduplication is only applied at
the point where a leaf page split would otherwise be required. New
posting list tuples are formed by merging together existing duplicate
tuples. The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed. Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.
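
As a rough illustration of how duplicates are merged into posting list
tuples, the following Python sketch models a leaf page as a sorted list
of (key, TID) pairs. The function name and the tuple layout are
hypothetical and far simpler than the real on-page format; the point is
only that the logical contents (every key/TID pair) survive the merge.

```python
from itertools import groupby
from operator import itemgetter


def deduplicate_leaf_items(items):
    """Merge runs of equal keys into (key, [tid, ...]) posting lists.

    `items` is a list of (key, tid) pairs already sorted by key and
    then by TID, with the TID acting as a tiebreaker.
    """
    merged = []
    for key, group in groupby(items, key=itemgetter(0)):
        tids = [tid for _, tid in group]
        merged.append((key, tids))
    return merged


# Four index tuples, three of which share the same key.
page = [(1, (0, 1)), (1, (0, 2)), (1, (3, 7)), (2, (0, 5))]
deduplicated = deduplicate_leaf_items(page)
```

The key is stored once per posting list instead of once per TID, which
is where the space saving comes from.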
The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach. Most individual inserts of index tuples have
exactly the same overhead as before. The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits. The key space of indexes works in the same way as it has
since commit dd299df8 (the commit that made heap TID a tiebreaker
column).
Testing has shown that nbtree deduplication can generally make indexes
with about 10 or 15 tuples for each distinct key value about 2.5X - 4X
smaller, even with single column integer indexes (e.g., an index on a
referencing column that accompanies a foreign key). The final size of
single column nbtree indexes comes close to the final size of a similar
contrib/btree_gin index, at least in cases where GIN's posting list
compression isn't very effective. This can significantly improve
transaction throughput, and significantly reduce the cost of vacuuming
indexes.
A new index storage parameter (deduplicate_items) controls the use of
deduplication. The default setting is 'on', so all new B-Tree indexes
automatically use deduplication where possible. This decision will be
reviewed at the end of the Postgres 13 beta period.
There is a regression of approximately 2% of transaction throughput with
synthetic workloads that consist of append-only inserts into a table
with several non-unique indexes, where all indexes have few or no
repeated values. The underlying issue is that cycles are wasted on
unsuccessful attempts at deduplicating items in non-unique indexes.
There doesn't seem to be a way around it short of disabling
deduplication entirely. Note that deduplication of items in unique
indexes is fairly well targeted in general, which avoids the problem
there (we can use a special heuristic to trigger deduplication passes in
unique indexes, since we're specifically targeting "version bloat").
Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.
No bump in BTREE_VERSION, since the representation of posting list
tuples works in a way that's backwards compatible with version 4 indexes
(i.e. indexes built on PostgreSQL 12). However, users must still
REINDEX a pg_upgrade'd index to use deduplication, regardless of the
Postgres version they've upgraded from. This is the only way to set the
new nbtree metapage flag indicating that deduplication is generally
safe.
Author: Anastasia Lubennikova, Peter Geoghegan
Reviewed-By: Peter Geoghegan, Heikki Linnakangas
Discussion:
https://postgr.es/m/55E4051B.7020209@postgrespro.ru
https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4. This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.
Register an equalimage routine for almost all of the existing B-Tree
opclasses. We only need two trivial routines for all of the opclasses
that are included with the core distribution. There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).
This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.
Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
A PostgreSQL instance crashing at the wrong moment could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums. As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.
Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.
Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
Windows has this, and so do all other live platforms according to the
buildfarm, so remove the configure probe and src/port/ substitution.
Keep the probe that detects whether _LARGEFILE_SOURCE has to be
defined to get that, though ... that seems to be still relevant in
some places.
This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code. I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.
Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
This fixes and updates a couple of comments related to outdated Windows
versions. In particular, src/common/exec.c had a fallback implementation
to read a line from a pipe, because stdin/stdout/stderr do not exist in
Windows 2000. That fallback is removed to simplify src/common/, as it is
unlikely that any version of Postgres is still running on such platforms.
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/20191219021526.GC4202@paquier.xyz
This was unaccountably omitted in the original RLS patch.
The SQL syntax is basically the same as for comments on triggers,
so crib code from dumpTrigger().
Per report from Marc Munro. Back-patch to all supported branches.
Discussion: https://postgr.es/m/1581889298.18009.15.camel@bloodnok.com
Commit 0da33c762 introduced an unfortunate regression in pg_ctl on
Windows: if the log file specified with -l doesn't exist yet, and
pg_ctl is running with Administrator privileges, then the log file
might get created with permissions that prevent the postmaster from
writing on it. (It seems that whether this happens depends on whether
the log file is inside the user's home directory or not, and perhaps
on other phase-of-the-moon conditions, which may explain why we failed
to notice it sooner.)
To fix, just don't create the log file if it doesn't exist yet. The
case where we need to wait obviously only occurs with a pre-existing
log file.
In passing, switch from using fopen() to plain open(), saving a few
cycles.
Per bug #16259 from Jonathan Katz and Heath Lord. Back-patch to v12,
as the faulty commit was.
Alexander Lakhin
Discussion: https://postgr.es/m/16259-c5ebed32a262a8b1@postgresql.org
%x can be used to track if the current connection is in a transaction
block or not, and adding it by default to the prompt has the advantage
of not requiring a modification of .psqlrc, something that is not
possible in some environments.
This discussion has happened across various sources, and there was a
strong consensus in favor of this change.
Author: Vik Fearing
Reviewed-by: Fabien Coelho
Discussion: https://postgr.es/m/09502c40-cfe1-bb29-10f9-4b3fa7b2bbb2@2ndquadrant.com
The previous coding forgot to apply shell quoting to the socket
directory and the data folder, leading to failures when running
pg_upgrade. This refactors the code generating the pg_ctl command
that starts clusters so that it applies correct shell quoting. Failures are
easier to trigger in 12 and newer versions by using a value of
--socketdir that includes quotes, but it is also possible to cause
failures with quotes included in the default socket directory used by
pg_upgrade or the data folders of the clusters involved in the
upgrade.
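
To show the kind of quoting involved, here is a minimal Python sketch
that builds a pg_ctl command line with every path quoted for /bin/sh.
It is not the pg_upgrade code: the function name is hypothetical, and
it assumes that the string given to -o is parsed a second time by a
shell, which is why the socket directory is quoted twice.

```python
import shlex


def build_pg_ctl_command(pg_ctl, datadir, logfile, socketdir):
    """Return a shell command line that starts a cluster with pg_ctl."""
    # Inner level: quote the socket directory inside the server options.
    server_opts = "-c unix_socket_directories=" + shlex.quote(socketdir)
    args = [pg_ctl, "-w", "-l", logfile, "-D", datadir,
            "-o", server_opts, "start"]
    # Outer level: quote every argument for the shell running pg_ctl.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_pg_ctl_command(
    "pg_ctl",
    "/data/new cluster",
    "/tmp/pg_upgrade_server.log",
    "/tmp/it's here",
)
```

Splitting the result with shell rules gives back the original paths,
including the one containing a single quote.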
As 9.4 is going to be EOL'd with the next minor release, nobody is
likely going to upgrade to it now so this branch is not included in the
set of branches fixed.
Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Noah Misch
Backpatch-through: 9.5
This reverts commit 7bae0ad, as this is not ideal with the tar format,
and we may want to explore more options like what is done by tar with
some equivalents of --owner and --group, but for pg_basebackup.
Per complaints from Magnus Hagander and Stephen Frost.
Discussion: https://postgr.es/m/20200205172259.GW3195@tamriel.snowman.net
First, this code did not bother checking for a failure when calling
dup(). Then, per zlib, gzerror() returns NULL for a NULL input, which
can happen if passing down to gzdopen() an invalid file descriptor or if
there was an allocation failure.
No back-patch is done as this is unlikely to be a problem in the field.
Per Coverity.
Reported-by: Tom Lane
Those new assertions can be used at file scope, outside of any function
for compilation checks. This commit provides implementations for C and
C++, and fallback implementations.
Author: Peter Smith
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Dagfinn Ilmari Mannsåker,
Michael Paquier
Discussion: https://postgr.es/m/201DD0641B056142AC8C6645EC1B5F62014B8E8030@SYD1217
Similarly to pg_upgrade, pg_ctl and initdb, a root user is able to use
--version and --help, but cannot execute the actual operation to avoid
the creation of files with permissions incompatible with the
postmaster.
This is a behavior change, so no back-patching is done.
Author: Ian Barwick
Discussion: https://postgr.es/m/CABvVfJVqOdD2neLkYdygdOHvbWz_5K_iWiqY+psMfA=FeAa3qQ@mail.gmail.com
If we failed to fork a worker process, or create a communication pipe
for one, WaitForTerminatingWorkers would suffer an assertion failure
if assert-enabled, otherwise crash or go into an infinite loop. This
was a consequence of not accounting for the startup condition where
we've not yet forked all the workers.
The original bug was that ParallelBackupStart would set workerStatus to
WRKR_IDLE before it had successfully forked a worker. I made things
worse in commit b7b8cc0cf by not understanding the undocumented fact
that the WRKR_TERMINATED state was also meant to represent the case
where a worker hadn't been started yet: I changed enum T_WorkerStatus
so that *all* the worker slots were initially in WRKR_IDLE state. But
this wasn't any more broken in practice, since even one slot in the
wrong state would keep WaitForTerminatingWorkers from terminating.
In v10 and later, introduce an explicit T_WorkerStatus value for
worker-not-started, in hopes of preventing future oversights of the
same ilk. Before that, just document that WRKR_TERMINATED is supposed
to cover that case (partly because it wasn't actively broken, and
partly because the enum is exposed outside parallel.c in those branches,
so there's microscopically more risk involved in changing it).
In all branches, introduce a WORKER_IS_RUNNING status test macro
to hide which T_WorkerStatus values mean that, and be more careful
not to access ParallelSlot fields till we're sure they're valid.
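
The idea of an explicit not-started state plus a single "is running"
test can be modelled as follows. This is a Python sketch, not the
pg_dump code: only the IDLE and TERMINATED states and the
WORKER_IS_RUNNING name come from the text above, while the other
member names and the helper function are assumptions made for
illustration.

```python
from enum import Enum, auto


class WorkerStatus(Enum):
    NOT_STARTED = auto()   # assumed name: slot exists, no worker forked yet
    IDLE = auto()
    WORKING = auto()       # assumed name: worker is busy with a job
    TERMINATED = auto()


def worker_is_running(status):
    """Single place that knows which states mean 'a live worker'."""
    return status in (WorkerStatus.IDLE, WorkerStatus.WORKING)


def all_workers_terminated(statuses):
    """True once no slot holds a live worker; never-started slots count."""
    return not any(worker_is_running(status) for status in statuses)
```

With a distinct not-started state, a failed fork leaves its slot in a
state that a wait loop of this kind will not wait on forever.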
Per report from Vignesh C, though this is my patch not his.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/CALDaNm1Luv-E3sarR+-unz-BjchquHHyfP+YC+2FS2pt_J+wxg@mail.gmail.com
We used to strategically place newlines after some function call left
parentheses to make pgindent move the argument list a few chars to the
left, so that the whole line would fit under 80 chars. However,
pgindent no longer does that, so the newlines just made the code
vertically longer for no reason. Remove those newlines, and reflow some
of those lines for some extra naturality.
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
This patch creates a new extension property, "trusted". An extension
that's marked that way in its control file can be installed by a
non-superuser who has the CREATE privilege on the current database,
even if the extension contains objects that normally would have to be
created by a superuser. The objects within the extension will (by
default) be owned by the bootstrap superuser, but the extension itself
will be owned by the calling user. This allows replicating the old
behavior around trusted procedural languages, without all the
special-case logic in CREATE LANGUAGE. We have, however, chosen to
loosen the rules slightly: formerly, only a database owner could take
advantage of the special case that allowed installation of a trusted
language, but now anyone who has CREATE privilege can do so.
Having done that, we can delete the pg_pltemplate catalog, moving the
knowledge it contained into the extension script files for the various
PLs. This ends up being no change at all for the in-core PLs, but it is
a large step forward for external PLs: they can now have the same ease
of installation as core PLs do. The old "trusted PL" behavior was only
available to PLs that had entries in pg_pltemplate, but now any
extension can be marked trusted if appropriate.
This also removes one of the stumbling blocks for our Python 2 -> 3
migration, since the association of "plpythonu" with Python 2 is no
longer hard-wired into pg_pltemplate's initial contents. Exactly where
we go from here on that front remains to be settled, but one problem
is fixed.
Patch by me, reviewed by Peter Eisentraut, Stephen Frost, and others.
Discussion: https://postgr.es/m/5889.1566415762@sss.pgh.pa.us
Commit 40d964ec99 allowed the VACUUM command to leverage multiple CPUs by
invoking parallel workers to process indexes. This commit provides a
'--parallel' option to specify the parallel degree used by the VACUUM command.
Author: Masahiko Sawada, with few modifications by me
Reviewed-by: Mahendra Singh and Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
The signature of XLogReadRecord() required the caller to pass the starting
WAL position as argument, or InvalidXLogRecPtr to continue reading at the
end of previous record. That's slightly awkward to the callers, as most
of them don't want to randomly jump around in the WAL stream, but start
reading at one position and then read everything from that point onwards.
Remove the 'RecPtr' argument and add a new function XLogBeginRead() to
specify the starting position instead. That's more convenient for the
callers. Also, xlogreader holds state that is reset when you change the
starting position, so having a separate function for doing that feels like
a more natural fit.
This changes XLogFindNextRecord() function so that it doesn't reset the
xlogreader's state to what it was before the call anymore. Instead, it
positions the xlogreader to the found record, like XLogBeginRead().
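
The calling pattern this enables (pick a starting point once, then keep
reading) can be shown with a toy reader. The Python class below is a
hypothetical model, not the xlogreader API; its method names merely
mirror XLogBeginRead() and XLogReadRecord(), and records are plain
(position, payload) pairs.

```python
class ToyLogReader:
    """Toy model of a reader with a separate 'begin' call."""

    def __init__(self, records):
        # records: list of (start_position, payload), sorted by position
        self._records = records
        self._next_index = None

    def begin_read(self, position):
        """Choose the starting position and reset the reading state."""
        for index, (start, _payload) in enumerate(self._records):
            if start == position:
                self._next_index = index
                return
        raise ValueError("no record starts at position %r" % (position,))

    def read_record(self):
        """Return the next record, or None at the end of the log."""
        if self._next_index is None:
            raise RuntimeError("call begin_read() first")
        if self._next_index >= len(self._records):
            return None
        record = self._records[self._next_index]
        self._next_index += 1
        return record


reader = ToyLogReader([(0, "a"), (8, "b"), (24, "c")])
reader.begin_read(8)
first = reader.read_record()
second = reader.read_record()
```

Callers no longer pass a position on every read; the reader remembers
where the previous record ended.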
Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi
I had supposed that all versions of Readline that have filename
quoting hooks also have the rl_completion_suppress_quote variable.
But it seems OpenBSD managed to find a version someplace that does
not, so we'll have to expend a separate configure probe for that.
(Light testing suggests that this version also lacks the bugs that
make it necessary to frob that variable. Hooray!)
Per buildfarm.
The Readline library contains a fair amount of knowledge about how to
tab-complete filenames, but it turns out that that doesn't work too well
unless we follow its expectation that we use its filename quoting hooks
to quote and de-quote filenames. We were trying to do such quote handling
within complete_from_files(), and that's still what we have to do if we're
using libedit, which lacks those hooks. But for Readline, it works a lot
better if we tell Readline that single-quote is a quoting character and
then provide hooks that know the details of the quoting rules for SQL
and psql meta-commands.
Hence, resurrect the quoting hook functions that existed in the original
version of tab-complete.c (and were disabled by commit f6689a328 because
they "didn't work so well yet"), and whack on them until they do seem to
work well.
Notably, this fixes bug #16059 from Steven Winfield, who pointed out
that the previous coding would strip quote marks from filenames in SQL
COPY commands, even though they're syntactically necessary there.
Now, we not only don't do that, but we'll add a quote mark when you
tab-complete, even if you didn't type one.
Getting this to work across a range of libedit versions (and, to a
lesser extent, libreadline versions) was depressingly difficult.
It will be interesting to see whether the new regression test cases
pass everywhere in the buildfarm.
Some future patch might try to handle quoted SQL identifiers with
similar explicit quoting/dequoting logic, but that's for another day.
Patch by me, reviewed by Peter Eisentraut.
Discussion: https://postgr.es/m/16059-8836946734c02b84@postgresql.org
sigTermHandler() tried to be careful to invoke only operations that
are safe to do in a signal handler. But for some reason we forgot
that exit(3) is not among those, because it calls atexit handlers
that might do various random things. (pg_dump itself installs no
atexit handlers, but e.g. OpenSSL does.) That led to crashes or
lockups when attempting to terminate a parallel dump or restore
via a signal.
Fix by calling _exit() instead.
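
The difference between the two calls is easy to observe from a
scripting language. The Python sketch below is an analogy only, since
Python signal handling does not have the same constraints as a C signal
handler: it runs a small child program that registers an atexit
handler and then leaves either through sys.exit() or through
os._exit(). The helper name and the child script are made up for this
example.

```python
import subprocess
import sys

# Child program: registers an exit handler, then exits one of two ways.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("atexit handler ran"))
if sys.argv[1] == "exit":
    sys.exit(1)    # normal exit: registered handlers are invoked
else:
    os._exit(1)    # immediate exit: registered handlers are skipped
"""


def run_child(mode):
    """Run the child with the given exit mode; return (status, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()
```

Calling run_child("exit") shows the handler's output, while
run_child("_exit") produces none, which is the property the fix relies
on to keep library cleanup code out of the signal path.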
Per bug #16199 from Raúl Marín. Back-patch to all supported branches.
Discussion: https://postgr.es/m/16199-cb2f121146a96f9b@postgresql.org
This feature allows VACUUM to leverage multiple CPUs in order to
process indexes, by performing index vacuuming and index cleanup with
background workers. This adds a PARALLEL option to the VACUUM command,
with which the user can specify the number of workers used to perform
the command; that number is limited by the number of indexes on the
table. Specifying zero as the number of workers disables parallelism.
This option can't be used with the FULL option.
Each index is processed by at most one vacuum process. Therefore parallel
vacuum can be used when the table has at least two indexes.
The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers. An index can participate in parallel
vacuum iff its size is greater than min_parallel_index_scan_size.
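
Read literally, the rules above amount to the following computation.
This is a simplified Python sketch with a hypothetical function name
and parameters; it follows only what this message states and leaves
out details such as whether the leader process itself takes an index.

```python
def compute_parallel_degree(index_sizes, requested, max_workers,
                            min_index_size):
    """Return the number of parallel workers to use for one table.

    index_sizes:    sizes of the table's indexes
    requested:      worker count given by the user, or None if absent
    max_workers:    stand-in for max_parallel_maintenance_workers
    min_index_size: stand-in for min_parallel_index_scan_size
    """
    if requested == 0:
        return 0  # zero workers disables parallelism
    # Only indexes larger than the threshold can participate.
    eligible = sum(1 for size in index_sizes if size > min_index_size)
    if len(index_sizes) < 2 or eligible == 0:
        return 0  # needs at least two indexes to be worthwhile
    degree = eligible if requested is None else min(requested, eligible)
    return min(degree, max_workers)
```

For example, a table with three indexes of which two exceed the
threshold gets two workers when none are requested and the
configuration allows at least two.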
Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
This data was only in separate files because it was the most convenient
way to handle it with a shell script. Now that we use a general-purpose
programming language, it's easy to assemble the data into the same format
as the rest of the catalogs and output it into postgres.bki. This allows
removal of some special-purpose code from initdb.c.
Discussion: https://www.postgresql.org/message-id/CACPNZCtVFtjHre6hg9dput0qRPp39pzuyA2A6BT8wdgrRy%2BQdA%40mail.gmail.com
Author: John Naylor
Formerly, various frontend directories symlinked these two sources
and then built them locally. That's an ancient, ugly hack, and
we now have a much better way: put them into libpgcommon.
So do that. (The immediate motivation for this is the prospect
of having to introduce still more symlinking if we don't.)
This commit moves these two files absolutely verbatim, for ease of
reviewing the git history. There's some follow-on work to be done
that will modify them a bit.
Robert Haas, Tom Lane
Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
For historical reasons, libpq used a separate libpq.rc file for the
Windows builds while all other components use a common file
win32ver.rc. With a bit of tweaking, the libpq build can also use the
win32ver.rc file. This removes a bit of duplicative code.
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/ad505e61-a923-e114-9f38-9867d161073f@2ndquadrant.com
Experience so far suggests that getting these tests to pass on
all libedit versions that are out there may be impossible, or
require dumbing down the tests to the point of uselessness.
So we need to provide a way to skip them when the user knows they'll
fail. An environment variable is probably the most convenient way
to deal with this; it's easy for, e.g., a buildfarm animal's
configuration to set up.
Discussion: https://postgr.es/m/9594.1578586797@sss.pgh.pa.us
The true explanation for Peter Geoghegan's trouble report turns out
to be that he has a ~/.inputrc that affects readline's behavior
enough to break this test. Prevent readline from reading that file.
Also, the best way to prevent TERM from affecting the results seems
to be to unset it altogether, not to set it to "xterm". The latter
choice licenses readline to emit xterm escape sequences, and there's
a lot of variation in exactly what it will emit.
Revert changes that attempted to account exactly for xterm escape
sequences. We shouldn't need that with TERM unset, and it was not
looking like a maintainable solution anyway.
Discussion: https://postgr.es/m/23181.1578167938@sss.pgh.pa.us
Right at the moment, this is making things worse not better in the
buildfarm. I'm not happy with anything about the current state,
but let's at least try to have a green buildfarm report while further
investigation continues.
Discussion: https://postgr.es/m/23181.1578167938@sss.pgh.pa.us
Depending on as-yet-incompletely-explained factors, readline/libedit
might choose to emit screen-control escape sequences as part of
repainting the display. I'd tried to make the test patterns avoid
matching parts of the output that are likely to contain such, but
it seems that there's really no way around matching them explicitly
in some places, unless we want to just give up testing some behaviors
such as display of alternatives.
Per report from Peter Geoghegan.
Discussion: https://postgr.es/m/CAH2-WznPzfWHu8PQwv1Qjpf4wQVPaaWpoO5NunFz9zsYKB4uJA@mail.gmail.com
Escape non-printable characters in failure reports, by using Data::Dumper
in Useqq mode. Also, bump $Test::Builder::Level so the diagnostic
references the calling line, and use diag() instead of note(),
so it shows even in non-verbose mode (per request from Christoph Berg).
Also, give up on trying to test for the specific way that readline
chooses to overwrite existing text in the \DRD -> \drds test.
There are too many variants, it seems, at least on the libedit
side of things.
Dagfinn Ilmari Mannsåker and Tom Lane
Discussion: https://postgr.es/m/20200103110128.GA28967@msg.df7cb.de