Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.
Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
are not one of the query's defined result relations, but nonetheless have
triggers fired against them while the query is active. This was formerly
impossible but can now occur because of my recent patch to fix the firing
order for RI triggers. Caching a ResultRelInfo avoids duplicating work by
repeatedly opening and closing the same relation, and also allows EXPLAIN
ANALYZE to "see" and report on these extra triggers. Use the same mechanism
to cache open relations when firing deferred triggers at transaction shutdown;
this replaces the former one-element-cache strategy used in that case, and
should improve performance a bit when there are deferred triggers on a number
of relations.
row within one query: we were firing check triggers before all the updates
were done, leading to bogus failures. Fix by making the triggers queued by
an RI update go at the end of the outer query's trigger event list, thereby
effectively making the processing "breadth-first". This was indeed how it
worked pre-8.0, so the bug does not occur in the 7.x branches.
Per report from Pavel Stehule.
that still thought they could set HEAP_XMAX_COMMITTED immediately after
seeing the other transaction commit. Make them use the same logic as
tqual.c does to determine if the hint bit can be set yet.
child memory contexts is indented two spaces to the right of its
parent context. This should make it easier to deduce the memory
context hierarchy from the output of MemoryContextStats().
between the setting of log_line_prefix and the setting of log_timezone. We
can't realistically set log_timezone any earlier than we do now, so the best
behavior seems to be to use GMT zone if any timestamps are to be logged during
early startup. Create a dummy zone variable with a minimal definition of GMT
(in particular it will never know about leap seconds), so that we can set it
up without reference to any external files.
displayed in the postmaster log. This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.
This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows. We still need a
simpler patch for that problem in the back branches, however.
so that we will be able to create a cookie for all processes for CSVlogs.
It is set wherever MyProcPid is set. Take the opportunity to remove the now
unnecessary session-only restriction on the %s and %c escapes in log_line_prefix.
before reporting a transaction committed. Data consistency is still
guaranteed (unlike setting fsync = off), but a crash may lose the effects
of the last few transactions. Patch by Simon, some editorialization by Tom.
with the recent patch to log temp file sizes at removal time. Doesn't seem
worth fixing since it's unused.
In passing, make a few elog messages conform to the message style guide.
named pg_toast_temp_nnn, alongside the pg_temp_nnn schemas used for the temp
tables themselves. This allows low-level code such as the relcache to
recognize that these tables are indeed temporary, which enables various
optimizations such as not WAL-logging changes and using local rather than
shared buffers for access. Aside from obvious performance benefits, this
provides a solution to bug #3483, in which other backends unexpectedly held
open file references to temporary tables. The scheme preserves the property
that TOAST tables are not in any schema that's normally in the search path,
so they don't conflict with user table names.
initdb forced because of changes in system view definitions.
and fsync WAL at convenient intervals. For the moment it just tries to
offload this work from backends, but soon it will be responsible for
guaranteeing a maximum delay before asynchronously-committed transactions
will be flushed to disk.
This is a portion of Simon Riggs' async-commit patch, committed to CVS
separately because a background WAL writer seems like it might be a good idea
independently of the async-commit feature. I rebased walwriter.c on
bgwriter.c because it seemed like a more appropriate way of handling signals;
while the startup/shutdown logic in postmaster.c is more like autovac because
we want walwriter to quit before we start the shutdown checkpoint.
against a Unix server, and Windows-specific server-side authentication
using SSPI "negotiate" method (Kerberos or NTLM).
Only builds properly with MSVC for now.
we don't know at that point which relation OID to tell pgstat to forget.
The code was passing the relfilenode, which is incorrect, and could possibly
cause some other relation's stats to be zeroed out. While we could try to
clean this up, it seems much simpler and more reliable to let the next
invocation of pgstat_vacuum_tabstat() fix things; which indeed is how it
worked before I introduced the buggy code into 8.1.3 and later :-(.
Problem noticed by Itagaki Takahiro, fix is per subsequent discussion.
PGconn. Invent a new libpq connection-status function,
PQconnectionUsedPassword() that returns true if the server
demanded a password during authentication, false otherwise.
This may be useful to clients in general, but is immediately
useful to help plug a privilege escalation path in dblink.
Per list discussion and design proposed by Tom Lane.
unwarranted liberties with int8 vs float8 values for these types.
Specifically, be sure to apply either hashint8 or hashfloat8 depending
on HAVE_INT64_TIMESTAMP. Per my gripe of even date.
Sequences and views could previously be renamed using ALTER TABLE, but
this was a repeated source of confusion for users. Update the docs,
and psql tab completion. Patch from David Fetter; various minor fixes
by myself.
This is a Linux kernel bug that apparently exists in every extant kernel
version: sometimes shmctl() will fail with EIDRM when EINVAL is correct.
We were assuming that EIDRM indicates a possible conflict with pre-existing
backends, and refusing to start the postmaster when this happens. Fortunately,
there does not seem to be any case where Linux can legitimately return EIDRM
(it doesn't track shmem segments in a way that would allow that), so we can
get away with just assuming that EIDRM means EINVAL on this platform.
Per reports from Michael Fuhr and Jon Lapham --- it's a bit surprising
we have not seen more reports, actually.
over a fairly long period of time, rather than being spat out in a burst.
This happens only for background checkpoints carried out by the bgwriter;
other cases, such as a shutdown checkpoint, are still done at full speed.
Remove the "all buffers" scan in the bgwriter, and associated stats
infrastructure, since this seems no longer very useful when the checkpoint
itself is properly throttled.
Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas,
and some minor API editorialization by me.
by having the postmaster signal it when certain failures occur. This requires
the postmaster setting a flag in shared memory, but should be as safe as the
pmsignal.c code is.
Also make sure the launcher honor's a postgresql.conf change turning it off
on SIGHUP.
(which now deals only in optimizable statements), and put that code
into a new file parser/parse_utilcmd.c. This helps clarify and enforce
the design rule that utility statements shouldn't be processed during
the regular parse analysis phase; all interpretation of their meaning
should happen after they are given to ProcessUtility to execute.
(We need this because we don't retain any locks for a utility statement
that's in a plan cache, nor have any way to detect that it's stale.)
We are also able to simplify the API for parse_analyze() and related
routines, because they will now always return exactly one Query structure.
In passing, fix bug #3403 concerning trying to add a serial column to
an existing temp table (this is largely Heikki's work, but we needed
all that restructuring to make it safe).
within a signal handler (this might be safe given the relatively narrow code
range in which the interrupt is enabled, but it seems awfully risky); do issue
more informative log messages that tell what is being waited for and the exact
length of the wait; minor other code cleanup. Greg Stark and Tom Lane
unreserved according to the grammar. The list of unreserved words has gotten
extensive enough that the unnecessary quoting is becoming a bit of an eyesore.
To do this, add knowledge of the keyword category to keywords.c's table.
(Someday we might be able to generate keywords.c's table and the keyword lists
in gram.y from a common source.) For the moment, lie about WITH's status in
the table so it will still get quoted --- this is because of the expectation
that WITH will become reserved when the SQL recursive-queries patch gets done.
I didn't force initdb because this affects nothing on-disk; but note that a
few regression tests have changed expected output.
an array of strings rather than an array of integers, and allow any simple
constant or identifier to be used in typmods; for example
create table foo (f1 widget(42,'23skidoo',point));
Of course the typmodin function has still got to pack this info into a
non-negative int32 for storage, but it's still a useful improvement in
flexibility, especially considering that you can do nearly anything if you
are willing to keep the info in a side table. We can get away with this
change since we have not yet released a version providing user-definable
typmods. Per discussion.
reassembled in the syslogger before writing to the log file. This prevents
partial messages from being written, which mucks up log rotation, and
messages from different backends being interleaved, which causes garbled
logs. Backport as far as 8.0, where the syslogger was introduced.
Tom Lane and Andrew Dunstan
with a plpgsql-defined cursor. The underlying mechanism for this is that the
main SQL engine will now take "WHERE CURRENT OF $n" where $n is a refcursor
parameter. Not sure if we should document that fact or consider it an
implementation detail. Per discussion with Pavel Stehule.
Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once
have been a reason to disallow that, but it seems to work now, and it's
really rather necessary if you want to select a row via a cursor and then
update it in a concurrent-safe fashion.
Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.
pseudo HeapScanDesc created for a bitmap heap scan. This avoids some useless
overhead during a bitmap scan startup, in particular invoking the syncscan
code. (We might someday want to do that, but right now it's merely useless
contention for shared memory, to say nothing of possibly pushing useful
entries out of syncscan's small LRU list.) This also allows elimination of
ugly pgstat_discount_heap_scan() kluge.
for each temp file, rather than once per sort or hashjoin; this allows
spreading the data of a large sort or join across multiple tablespaces.
(I remain dubious that this will make any difference in practice, but certain
people insisted.) Arrange to cache the results of parsing the GUC variable
instead of recomputing from scratch on every demand, and push usage of the
cache down to the bottommost fd.c level.