The test inserts a row in primary server, waits until the insertion has
been replicated to a cascaded standby, and checks that it's visible there
by querying the cascaded standby. In order for that to work reliably, the
test needs to wait until the insertion WAL record has been fully replayed.
This should fix the occasional buildfarm failures. Diagnosis by Tom Lane.
Discussion: https://www.postgresql.org/message-id/606796.1605424022@sss.pgh.pa.us
On Windows, a filename cannot contain backslashes, because a backslash
is used as the directory separator. Remove tests I added in commit 9c4f5192f
that tried to do that. We could perhaps use a SKIP block to only skip
them on Windows, but I'm not sure how exactly to formulate that, so just
remove the tests to make the buildfarm green again.
Per buildfarm.
Using a hot standby server as the source has not been possible, because
pg_rewind creates a temporary table in the source system, to hold the
list of file ranges that need to be fetched. Refactor it to queue up the
file fetch requests in pg_rewind's memory, so that the temporary table
is no longer needed.
Also update the logic to compute 'minRecoveryPoint' correctly, when the
source is a standby server.
Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
It's entirely possible to see WAL for a relation that doesn't exist in
the target anymore. That happens when the relation was dropped later.
The refactoring in commit eb00f1d4b broke that case, by sanity-checking
the file type in the target before checking the flag for whether it
exists there at all.
I noticed this during manual testing. Modify the 001_basic.pl test so
that it covers this case.
If an interactive psql session used \gset when querying a compromised
server, the attacker could execute arbitrary code as the operating
system account running psql. Using a prefix not found among specially
treated variables, e.g. every lowercase string, precluded the attack.
Fix by issuing a warning and setting no variable for the column in
question. Users wanting the old behavior can use a prefix and then a
meta-command like "\set HISTSIZE :prefix_HISTSIZE". Back-patch to 9.5
(all supported versions).
Reviewed by Robert Haas. Reported by Nick Cleaton.
Security: CVE-2020-25696
Coverity complained about a defect in commit 257836a7:
Calling "parsePGArray" without checking return value (as is
done elsewhere 11 out of 13 times).
Fix, and also check explicitly for empty strings (which is how PQgetvalue()
represents NULL). That worked correctly before only because parsePGArray()
happens to set *nitems = 0 when it fails on an empty string. Also
convert a sanity check assertion to an error to be more paranoid, and
pgindent a nearby line.
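For illustration, the checked calling pattern looks roughly like this; the
signature shown for parsePGArray() is the usual frontend one, and the error
handling is simplified rather than being the actual pg_dump code:

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include "libpq-fe.h"

    /* frontend array parser; declaration repeated here for the example */
    extern bool parsePGArray(const char *atext, char ***itemarray, int *nitems);

    static void
    parse_array_column(PGresult *res, int row, int col,
                       char ***items, int *nitems)
    {
        const char *value = PQgetvalue(res, row, col);

        *items = NULL;
        *nitems = 0;

        /* PQgetvalue() represents SQL NULL as an empty string: check it first */
        if (value == NULL || value[0] == '\0')
            return;

        if (!parsePGArray(value, items, nitems))
        {
            fprintf(stderr, "could not parse array value\n");   /* simplified */
            exit(1);
        }
    }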
Reported-by: Michael Paquier <michael@paquier.xyz>
This should have been done in the initial commit that made
unix_socket_directories a list as of c9b0cbe. This change allows the case
of ALTER SYSTEM to be supported correctly, where it is possible to
specify multiple paths as a list, as in the following pattern where
flattening is applied to each item:
ALTER SYSTEM SET unix_socket_directories = '/path1', '/path2';
Any parameters specified in postgresql.conf are parsed the same way, so
there is no compatibility change. pg_dump has a hardcoded list of
parameters marked with GUC_LIST_QUOTE, which gets its routine update here;
the entries are reordered alphabetically for clarity.
Author: Ian Lawrence Barwick
Reviewed-by: Peter Eisentraut, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAB8KJ=iMOtNY6_sUwV=LQVCJ2zgYHBDyNzVfvE5GN3WQ3v9kQg@mail.gmail.com
Revert 403a3d91c, as well as the followup fix 7f4235032, in all
branches. We need to think a bit harder about what the behavior
of LOCK TABLE on views should be, and there's no time for that
before next week's releases. We'll take another crack at this
later.
Discussion: https://postgr.es/m/16703-e348f58aab3cf6cc@postgresql.org
This makes the abstraction of a "source" server more clear, by introducing
a common abstract class, borrowing the object-oriented programming term,
that represents all the operations that can be done on the source server.
There are two implementations of it, one for fetching via libpq, and
another to fetch from a local directory. This adds some code, but makes it
easier to understand what's going on.
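In C, such an abstract class is typically a struct of function pointers with
one constructor per implementation. The sketch below shows the shape of the
idea only; the member and function names are illustrative, not the actual
pg_rewind definitions:

    #include <stddef.h>
    #include <sys/types.h>

    /* Illustrative shape of a "source server" abstraction in C. */
    typedef struct source_ops
    {
        /* fetch a whole file from the source into memory */
        char       *(*fetch_file) (struct source_ops *src,
                                   const char *path, size_t *filesize);

        /* queue up a request for a byte range of a file */
        void        (*queue_fetch_range) (struct source_ops *src,
                                          const char *path,
                                          off_t off, size_t len);

        /* execute all queued requests in one batch */
        void        (*finish_fetch) (struct source_ops *src);

        /* close connections and release resources */
        void        (*destroy) (struct source_ops *src);
    } source_ops;

    /* one constructor per implementation */
    extern source_ops *open_libpq_source(const char *connstr);
    extern source_ops *open_local_source(const char *datadir);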
The copy_executeFileMap() and libpq_executeFileMap() functions contained
basically the same logic, just calling different functions to fetch the
source files. Refactor so that the common logic is in one place, in a new
function called perform_rewind().
Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
Deciding what to do with each file is now a separate step after all the
necessary information has been gathered. It is more clear that way.
Previously, the decision-making was divided between process_source_file()
and process_target_file(), and it was a bit hard to piece together what
the overall rules were.
Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index. When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.
Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
This model couldn't be extended to cover the default collation, and
didn't have any information about the affected database objects when the
version changed. Remove, in preparation for a follow-up commit that
will add a new mechanism.
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
Although error results received from the backend should always have
a SQLSTATE field, ones generated by libpq won't, making this code
vulnerable to a crash after, say, untimely loss of connection.
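A minimal sketch of the defensive pattern, using standard libpq calls; the
helper name is made up for the example:

    #include "libpq-fe.h"

    /* libpq-generated errors carry no SQLSTATE, so the field can be NULL. */
    static const char *
    result_sqlstate(const PGresult *res)
    {
        const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

        return sqlstate ? sqlstate : "";    /* treat a missing field as unknown */
    }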
Noted by Coverity.
Oversight in commit 403a3d91c. Back-patch to 9.5, as that was.
In almost all other places, we use plain "r" or "w" mode in popen()
calls (the exceptions being for COPY data). This one has been
overlooked (possibly because it's buried in a ".l" flex file?),
but it's using PG_BINARY_R.
Kensuke Okamura complained in bug #16688 that we fail to strip \r
when stripping the trailing newline from a backtick result string.
That's true enough, but we'd also fail to convert embedded \r\n
cleanly, which also seems undesirable. Fixing the popen() mode
seems like the best way to deal with this.
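A minimal sketch of the intended pattern, assuming a POSIX-style popen(); in
text mode ("r") the Windows C runtime translates \r\n to \n on read, so only
the trailing newline needs stripping. The helper below is illustrative, not
the actual psql code:

    #include <stdio.h>
    #include <string.h>

    /* Read one line of a command's output, stripping the trailing newline. */
    static int
    read_command_line(const char *cmd, char *buf, size_t buflen)
    {
        FILE       *fd = popen(cmd, "r");   /* plain "r", not PG_BINARY_R */

        if (fd == NULL)
            return -1;
        if (fgets(buf, (int) buflen, fd) == NULL)
            buf[0] = '\0';
        pclose(fd);

        /* text mode already turned \r\n into \n; drop the trailing newline */
        buf[strcspn(buf, "\n")] = '\0';
        return 0;
    }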
It's been like this for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/16688-c649c7b69cd7e6f8@postgresql.org
Now that LOCK TABLE can take any relation type, acquire lock on all
relations that are to be dumped. This prevents schema changes or
deadlock errors that could cause a dump to fail after expending much
effort. The server is tested for the capability, and the feature is
disabled if it is lacking, so that a patched pg_dump doesn't fail when
connecting to an unpatched server.
Backpatch to 9.5.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Wells Oliver <wells.oliver@gmail.com>
Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
Instead of immediately PQfinish'ing a dead connection, save it aside
so that we can still extract its parameters for \connect attempts.
(This works because PQconninfo doesn't care whether the PGconn is in
CONNECTION_BAD state.) This allows developers to reconnect with
just \c after a database crash and restart.
It's tempting to use the same approach instead of closing the old
connection after a failed non-interactive \connect command. However,
that would not be very safe: consider a script containing
\c db1 user1 live_server
\c db2 user2 dead_server
\c db3
The script would be expecting to connect to db3 at dead_server, but
if we re-use parameters from the first connection then it might
successfully connect to db3 at live_server. This'd defeat the goal
of not letting a script accidentally execute commands against the
wrong database.
Discussion: https://postgr.es/m/38464.1603394584@sss.pgh.pa.us
The check for whether to complain about not having an old connection
to get parameters from was seriously out of date: it had not been
rethought when we invented connstrings, nor when we invented the
-reuse-previous option. Replace it with a check that throws an
error if reuse-previous is active and we lack an old connection to
reuse. While that doesn't move the goalposts very far in terms of
easing reconnection after a server crash, at least it's consistent.
If the user specifies a connstring plus additional parameters
(which is invalid per the documentation), the extra parameters were
silently ignored. That seems like it could be really confusing,
so let's throw a syntax error instead.
Teach the connstring code path to re-use the old connection's password
in the same cases as the old-style-syntax code path would, ie if we
are reusing parameters and the values of username, host/hostaddr, and
port are not being changed. Document this behavior, too, since it was
unmentioned before. Also simplify the implementation a bit, giving
rise to two new and useful properties: if there's a "password=xxx" in
the connstring, we'll use it, not ignore it, and by default (i.e.,
except with --no-password) we will prompt for a password if the
re-used password or connstring password doesn't work. The previous
code just failed if the re-used password didn't work.
Given the paucity of field complaints about these issues, I don't
think that they rise to the level of back-patchable bug fixes,
and in any case they might represent undesirable behavior changes
in minor releases. So no back-patch.
Discussion: https://postgr.es/m/235210.1603321144@sss.pgh.pa.us
psql's \connect claims to be able to re-use previous connection
parameters, but in fact it only re-uses the database name, user name,
host name (and possibly hostaddr, depending on version), and port.
This is problematic for assorted use cases. Notably, pg_dump[all]
emits "\connect databasename" commands which we would like to have
re-use all other parameters. If such a script is loaded in a psql run
that initially had "-d connstring" with some non-default parameters,
those other parameters would be lost, potentially causing connection
failure. (Thus, this is the same kind of bug addressed in commits
a45bc8a4f and 8e5793ab6, although the details are much different.)
To fix, redesign do_connect() so that it pulls out all properties
of the old PGconn using PQconninfo(), and then replaces individual
properties in that array. In the case where we don't wish to re-use
anything, get libpq's default settings using PQconndefaults() and
replace entries in that, so that we don't need different code paths
for the two cases.
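A simplified sketch of that approach, using standard libpq calls; variable
handling and error paths are abbreviated relative to the real do_connect():

    #include <stdlib.h>
    #include <string.h>
    #include "libpq-fe.h"

    /*
     * Reconnect using all parameters of the old connection (or libpq's
     * defaults when there is no old connection), overriding only the
     * database name when one is given.
     */
    static PGconn *
    reconnect_with_override(PGconn *oldconn, const char *new_dbname)
    {
        PQconninfoOption *cinfo;
        PQconninfoOption *ci;
        const char **keywords;
        const char **values;
        int         n = 0;
        PGconn     *newconn;

        cinfo = oldconn ? PQconninfo(oldconn) : PQconndefaults();
        if (cinfo == NULL)
            return NULL;

        for (ci = cinfo; ci->keyword; ci++)
            n++;
        keywords = malloc((n + 1) * sizeof(char *));
        values = malloc((n + 1) * sizeof(char *));
        if (keywords == NULL || values == NULL)
        {
            free(keywords);
            free(values);
            PQconninfoFree(cinfo);
            return NULL;
        }

        n = 0;
        for (ci = cinfo; ci->keyword; ci++)
        {
            keywords[n] = ci->keyword;
            if (new_dbname && strcmp(ci->keyword, "dbname") == 0)
                values[n] = new_dbname;     /* replace this one property */
            else
                values[n] = ci->val;        /* keep the old value (may be NULL) */
            n++;
        }
        keywords[n] = values[n] = NULL;

        /* libpq copies the parameters, so the arrays can be freed afterwards */
        newconn = PQconnectdbParams(keywords, values, 0);

        free(keywords);
        free(values);
        PQconninfoFree(cinfo);
        return newconn;
    }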
This does result in an additional behavioral change for cases where
the original connection parameters allowed multiple hosts, say
"psql -h host1,host2", and the \connect request allows re-use of the
host setting. Because the previous coding relied on PQhost(), it
would only permit reconnection to the same host originally selected.
Although one can think of scenarios where that's a good thing, there
are others where it is not. Moreover, that behavior doesn't seem to
meet the principle of least surprise, nor was it documented; nor is
it even clear it was intended, since that coding long pre-dates the
addition of multi-host support to libpq. Hence, this patch is content
to drop it and re-use the host list as given.
Per Peter Eisentraut's comments on bug #16604. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
After de8feb1f3a, some warnings remained
that were only visible when using GCC on Windows. Fix those as well.
Note that the ecpg test source files don't use the full pg_config.h,
so we can't use pg_funcptr_t there but have to do it the long way.
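A generic illustration of the technique, not the actual ecpg test code: the
"long way" is a double cast through a plain function-pointer type, which is
roughly what pg_funcptr_t provides elsewhere. The names below are made up:

    /* The callback type we ultimately need. */
    typedef int (*add_func) (int, int);

    static int
    add_impl(int a, int b)
    {
        return a + b;
    }

    /* Stand-in for a loader returning a generic function pointer. */
    static void (*lookup_function(const char *name)) (void)
    {
        (void) name;
        return (void (*)(void)) add_impl;   /* cast to the generic type */
    }

    static int
    call_add(int a, int b)
    {
        /*
         * Casting between incompatible function-pointer types draws
         * -Wcast-function-type; going through void (*)(void), which is
         * roughly what pg_funcptr_t is, does not.
         */
        add_func    add = (add_func) lookup_function("add");

        return add(a, b);
    }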
When told to process all databases, clusterdb, reindexdb, and vacuumdb
would reconnect by replacing their --maintenance-db parameter with the
name of the target database. If that parameter is a connstring (which
has been allowed for a long time, though we failed to document that
before this patch), we'd lose any other options it might specify, for
example SSL or GSS parameters, possibly resulting in failure to connect.
Thus, this is the same bug as commit a45bc8a4f fixed in pg_dump and
pg_restore. We can fix it in the same way, by using libpq's rules for
handling multiple "dbname" parameters to add the target database name
separately. I chose to apply the same refactoring approach as in that
patch, with a struct to handle the command line parameters that need to
be passed through to connectDatabase. (Maybe someday we can unify the
very similar functions here and in pg_dump/pg_restore.)
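A simplified sketch of that rule, using standard libpq calls: with
expand_dbname enabled, a first "dbname" entry that is a connection string is
expanded, and a later "dbname" entry then overrides just the database name.
The helper name is made up for the example:

    #include "libpq-fe.h"

    /*
     * Connect using the caller's connection string (which may carry SSL, GSS,
     * or other options), but force the database name to dbname_override.
     */
    static PGconn *
    connect_with_dbname(const char *connstr, const char *dbname_override)
    {
        const char *keywords[3];
        const char *values[3];

        keywords[0] = "dbname";         /* expanded as a connection string */
        values[0] = connstr;
        keywords[1] = "dbname";         /* a later entry overrides the dbname */
        values[1] = dbname_override;
        keywords[2] = NULL;
        values[2] = NULL;

        return PQconnectdbParams(keywords, values, 1);  /* expand_dbname */
    }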
Per Peter Eisentraut's comments on bug #16604. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
Failure to write data to a large object during restore led to an ugly
and uninformative error message. To add insult to injury, it then
fatal'd out, where other SQL-level errors usually result in pressing on.
Report the underlying error condition, rather than just giving not-very-
useful byte counts, and use warn_or_exit_horribly() so as to adhere to
pg_restore's general policy about whether to continue or not.
Also recognize that lo_write() returns int not size_t.
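A minimal sketch of the checked call, using standard libpq large-object
functions; the error reporting here is simplified relative to pg_restore's
warn_or_exit_horribly():

    #include <stdio.h>
    #include "libpq-fe.h"

    /* Write one buffer to an open large object, reporting the real error. */
    static int
    write_lo_chunk(PGconn *conn, int lofd, const char *buf, size_t len)
    {
        int         written = lo_write(conn, lofd, buf, len);  /* returns int */

        if (written != (int) len)
        {
            /* simplified; pg_restore uses warn_or_exit_horribly() here */
            fprintf(stderr, "could not write to large object: %s",
                    PQerrorMessage(conn));
            return -1;
        }
        return 0;
    }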
Per report from Justin Pryzby, though I didn't use his patch.
Given the lack of comparable complaints, I'm not sure this is
worth back-patching.
Discussion: https://postgr.es/m/20201018010232.GF9241@telsasoft.com
The Windows documentation insists that every WSAStartup call should
have a matching WSACleanup call. However, if that ever had actual
relevance, it wasn't in this century. Every remotely-modern Windows
kernel is capable of cleaning up when a process exits without doing
that, and must be so to avoid resource leaks in case of a process
crash. Moreover, Postgres backends have done WSAStartup without
WSACleanup since commit 4cdf51e64 in 2004, and we've never seen any
indication of a problem with that.
libpq's habit of doing WSAStartup during connection start and
WSACleanup during shutdown is also rather inefficient, since a
series of non-overlapping connection requests leads to repeated,
quite expensive DLL unload/reload cycles. We document a workaround
for that (having the application call WSAStartup for itself), but
that's just a kluge. It's also worth noting that it's far from
uncommon for applications to exit without doing PQfinish, and
we've not heard reports of trouble from that either.
However, the real reason for acting on this is that recent
experiments by Alexander Lakhin suggest that calling WSACleanup
during PQfinish might be triggering the symptom we occasionally see
that a process using libpq fails to emit expected stdio output.
Therefore, let's change libpq so that it calls WSAStartup only
once per process, during the first connection attempt, and never
calls WSACleanup at all.
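A minimal sketch of the once-per-process pattern, using the standard Winsock
API; thread-safety considerations and the exact placement inside libpq's
connection-startup code are omitted here:

    #ifdef WIN32
    #include <winsock2.h>

    /* Initialize Winsock once per process; never call WSACleanup(). */
    static int
    ensure_winsock_started(void)
    {
        static int  wsastartup_done = 0;

        if (!wsastartup_done)
        {
            WSADATA     wsaData;

            if (WSAStartup(MAKEWORD(2, 2), &wsaData) != 0)
                return -1;          /* initialization failed */
            wsastartup_done = 1;
        }
        return 0;
    }
    #endif                          /* WIN32 */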
While at it, get rid of the only other WSACleanup call in our code
tree, in pg_dump/parallel.c; that presumably is equally useless.
If this proves to suppress the fairly-common ecpg test failures
we see on Windows, I'll back-patch, but for now let's just do it
in HEAD and see what happens.
Discussion: https://postgr.es/m/ac976d8c-03df-d6b8-025c-15a2de8d9af1@postgrespro.ru
This commit required support for inline variable definitions, which is
not a project requirement.
RELEASE NOTE AUTHOR: the author of commit 3c0471b5fd
(pg_upgrade/tablespaces) was Justin Pryzby, not me.
Reported-by: Andres Freund
Discussion: https://postgr.es/m/20201016001959.h24fkywfubkv2pc5@alap3.anarazel.de
Backpatch-through: 9.5
Previously, if pg_upgrade failed, and the user recreated the cluster but
did not remove the new cluster tablespace directory, a later pg_upgrade
would fail since the new tablespace directory would already exist.
This adds error reporting for this case during pg_upgrade's check phase.
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20200925005531.GJ23631@telsasoft.com
Backpatch-through: 9.5
A number of places were using appendStringInfo() when they could have been
using appendStringInfoString() instead. While there's no functionality
change there, it's just more efficient to use appendStringInfoString()
when no formatting is required. Likewise for some
appendStringInfoString() calls which were just appending a single char.
We can just use appendStringInfoChar() for that.
Additionally, many places were using appendPQExpBuffer() when they could
have used appendPQExpBufferStr(). Change those too.
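For illustration, the substitutions look like this (assuming the usual
StringInfo and PQExpBuffer APIs; the function is a made-up example):

    #include "postgres_fe.h"

    #include "lib/stringinfo.h"
    #include "pqexpbuffer.h"

    static void
    append_examples(StringInfo buf, PQExpBuffer pqbuf)
    {
        /* No % conversions needed: appendStringInfoString, not appendStringInfo */
        appendStringInfoString(buf, "WHERE ");

        /* A single character: appendStringInfoChar is cheaper still */
        appendStringInfoChar(buf, '(');

        /* Same idea for frontend PQExpBuffers */
        appendPQExpBufferStr(pqbuf, "ORDER BY 1");
    }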
Patch by Zhijie Hou, but further searching by me found significantly more
places that deserved the same treatment.
Author: Zhijie Hou, David Rowley
Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local
Commit fe27009cb tried to make parallel.c's Windows implementation of
piperead() translate Windows socket errors to Unix, but that didn't
actually work because TranslateSocketError() is backend-internal code
(and not even public there). But on closer inspection, the sole
caller of this function doesn't actually care whether the result is
zero or negative, much less inspect the errno. So the whole exercise
is totally useless, and has been since this code was introduced.
Rip it out and just call recv() directly.
Per buildfarm.
Discussion: https://postgr.es/m/2621622.1602184554@sss.pgh.pa.us
Up to now, only ECONNRESET (and EPIPE, in most but not quite all places)
received special treatment in our error handling logic. This patch
changes things so that related error codes such as ECONNABORTED are
also recognized as indicating that the connection's dead and unlikely
to come back.
We continue to think, however, that only ECONNRESET and EPIPE should be
reported as probable server crashes; the other cases indicate network
connectivity problems but prove little about the server's state. Thus,
there's no change in the error message texts that are output for such
cases. The key practical effect is that errcode_for_socket_access()
will report ERRCODE_CONNECTION_FAILURE rather than
ERRCODE_INTERNAL_ERROR for a network failure. It's expected that this
will fix buildfarm member lorikeet's failures since commit 32a9c0bdf,
as that seems to be due to not treating ECONNABORTED equivalently to
ECONNRESET.
The set of errnos treated this way now includes ECONNABORTED, EHOSTDOWN,
EHOSTUNREACH, ENETDOWN, ENETRESET, and ENETUNREACH. Several of these
were second-class citizens in terms of their handling in places like
get_errno_symbol(), so upgrade the infrastructure where necessary.
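Schematically, the classification amounts to something like this; it is
illustrative rather than the actual code in the tree, and it assumes all of
these errno symbols are defined, as the patch does:

    #include <errno.h>
    #include <stdbool.h>

    /* Does this errno say the connection is dead and unlikely to come back? */
    static bool
    errno_indicates_connection_loss(int e)
    {
        switch (e)
        {
            case EPIPE:
            case ECONNRESET:
            case ECONNABORTED:
            case EHOSTDOWN:
            case EHOSTUNREACH:
            case ENETDOWN:
            case ENETRESET:
            case ENETUNREACH:
                return true;
            default:
                return false;
        }
    }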
As committed, this patch assumes that all these symbols are defined
everywhere. POSIX specifies all of them except EHOSTDOWN, but that
seems to exist on all platforms of interest; we'll see what the
buildfarm says about that.
Probably this should be back-patched, but let's see what the buildfarm
thinks of it first.
Fujii Masao and Tom Lane
Discussion: https://postgr.es/m/2621622.1602184554@sss.pgh.pa.us
Commit 3eb3d3e78 was a few bricks shy of a load: while it correctly
set the table's "interesting" flag when deciding to dump the data of
an extension config table, it was not correct to clear that flag
if we concluded we shouldn't dump the data. This led to the crash
reported in bug #16655, because in fact we'll traverse dumpTableSchema
anyway for all extension tables (to see if they have user-added
seclabels or RLS policies).
The right thing to do is to force "interesting" true in makeTableDataInfo,
and otherwise leave the flag alone. (Doing it there is more future-proof
in case additional calls are added, and it also avoids setting the flag
unnecessarily if that function decides the table is non-dumpable.)
This investigation also showed that while only the --inserts code path
had an obvious failure in the case considered by 3eb3d3e78, the COPY
code path also has a problem with not having loaded table subsidiary
data. That causes fmtCopyColumnList to silently return an empty string
instead of the correct column list. That accidentally mostly works,
which perhaps is why we didn't notice this before. It would only fail
if the restore column order is different from the dump column order,
which only happens in weird inheritance cases, so it's not surprising
nobody had hit the case with an extension config table. Nonetheless,
it's a bug, and it goes a long way back, not just to v12 where the
--inserts code path started to have a problem with this.
In hopes of catching such cases a bit sooner in future, add some
Asserts that "interesting" has been set in both dumpTableData and
dumpTableSchema. Adjust the test case added by 3eb3d3e78 so that it
checks the COPY rather than INSERT form of that bug, allowing it to
detect the longer-standing symptom.
Per bug #16655 from Cameron Daniel. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/16655-5c92d6b3a9438137@postgresql.org
Discussion: https://postgr.es/m/18048b44-3414-b983-8c7c-9165b177900d@2ndQuadrant.com
The BKI file's string quoting conventions were previously quite weird,
perhaps as a result of repurposing a function built to scan
single-quoted strings to scan double-quoted ones. Change to use the
same rules as we use in GUC files, allowing some simplifications in
genbki.pl and initdb.c.
While at it, completely remove the backend's scanstr() function, which
was essentially a duplicate of the string dequoting code in guc-file.l.
Instead export that one (under a less generic name than it had) and let
bootscanner.l use it. Now we can clarify that scansup.c exists only to
support the main lexer. We could alternatively have removed GUC_scanstr,
but this way seems better since the previous arrangement could mislead
a reader into thinking that scanstr() had something to do with the main
lexer's handling of string literals. Maybe it did once, but if so it
was a long time ago.
This patch does not bump catversion, since the initially-installed
catalog contents don't change. Note however that successful initdb
after applying this patch will require up-to-date postgres.bki as well
as postgres and initdb executables.
In passing, remove a bunch of very-long-obsolete #include's in
bootparse.y and bootscanner.l.
John Naylor
Discussion: https://postgr.es/m/CACPNZCtDpd18T0KATTmCggO2GdVC4ow86ypiq5ENff1VnauL8g@mail.gmail.com
Neither tool had safeguard checks for the options provided, and it
was possible to make pg_test_fsync run for an infinite amount of time or
to pass bogus values down to pg_test_timing.
These behaviors have existed for a long time, with no actual complaints,
so no backpatch is done. Basic TAP tests are introduced for both tools.
Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20200806062759.GE16470@paquier.xyz
Parallel pg_dump failed if its -d parameter was a connection string
containing any essential information other than host, port, or username.
The same was true for pg_restore with --create.
The reason is that these scenarios failed to preserve the connection
string from the command line; the code felt free to replace that with
just the database name when reconnecting from a pg_dump parallel worker
or after creating the target database. By chance, parallel pg_restore
did not suffer this defect, as long as you didn't say --create.
In practice it seems that the error would be obvious only if the
connstring included essential, non-default SSL or GSS parameters.
This may explain why it took us so long to notice. (It also makes
it very difficult to craft a regression test case illustrating the
problem, since the test would fail in builds without those options.)
Fix by refactoring so that ConnectDatabase always receives all the
relevant options directly from the command line, rather than
reconstructed values. Inject a different database name, when necessary,
by relying on libpq's rules for handling multiple "dbname" parameters.
While here, let's get rid of the essentially duplicate _connectDB
function, as well as some obsolete nearby cruft.
Per bug #16604 from Zsolt Ero. Back-patch to all supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org
Existing code used various inconsistent ways to printf struct stat's
st_size member. The type of that is off_t, which is in most cases a
signed 64-bit integer, so use the long long int format for it.
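For illustration, the consistent form is:

    #include <stdio.h>
    #include <sys/stat.h>

    static void
    print_file_size(const struct stat *st)
    {
        /* st_size is off_t; print it through a long long int cast */
        printf("size: %lld bytes\n", (long long int) st->st_size);
    }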
pg_restore previously coped with overlength TOC-file lines using some
complicated logic to ignore additional bufferloads. While this isn't
wrong, since we don't expect that the interesting part of a line would
run to more than a dozen or so bytes, it's more complex than it needs
to be. Use a StringInfo instead of a fixed-size buffer so that we can
process long lines as single entities and thus not need the extra
logic.
Daniel Gustafsson
Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
Further experience says that the appending behavior offered by
pg_get_line_append is useful to only a very small minority of callers.
For most, the requirement to reset the buffer after each line is just
an error-prone nuisance. Hence, invent another alternative call
pg_get_line_buf, which takes care of that detail.
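A minimal usage sketch, assuming the new function's shape (reset the
caller-provided StringInfo and read one line into it, returning false at
EOF); the cleanup shown is simplified:

    #include "postgres_fe.h"

    #include "common/string.h"
    #include "lib/stringinfo.h"

    static void
    process_lines(FILE *fh)
    {
        StringInfoData linebuf;

        initStringInfo(&linebuf);

        /* each call resets the buffer, so no per-line cleanup is needed */
        while (pg_get_line_buf(fh, &linebuf))
        {
            /* linebuf.data holds the current line, however long it is */
        }

        pfree(linebuf.data);
    }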
Noted while reviewing a patch from Daniel Gustafsson.
Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
This feature has been a thorn in our sides for a long time, causing
many grammatical ambiguity problems. It doesn't seem worth the
pain to continue to support it, so remove it.
There are some follow-on improvements we can make in the grammar,
but this commit only removes the bare minimum number of productions,
plus assorted backend support code.
Note that pg_dump and psql continue to have full support, since
they may be used against older servers. However, pg_dump warns
about postfix operators. There is also a check in pg_upgrade.
Documentation-wise, I (tgl) largely removed the "left unary"
terminology in favor of saying "prefix operator", which is
a more standard and IMO less confusing term.
I included a catversion bump, although no initial catalog data
changes here, to mark the boundary at which oprkind = 'r'
stopped being valid in pg_operator.
Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor
Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com