not being consulted anywhere, so remove it and remove the _mdnblocks()
calls that were used to set it. Change smgrextend interface to pass in
the target block number (ie, current file length) --- the caller always
knows this already, having already done smgrnblocks(), so it's silly to
do it over again inside mdextend. Net result: extension of a file now
takes one lseek(SEEK_END) and a write(), not three lseeks and a write.
collected by ANALYZE. Also, add some modest amount of intelligence to
guesses that are used for varlena columns in the absence of any ANALYZE
statistics. The 'width' reported by EXPLAIN is finally something less
than totally bogus for varlena columns ... and, in consequence, hashjoin
estimating should be a little better ...
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored. ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values). Random
sampling is used to make the process reasonably fast even on very large
tables. The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.
There is more still to do; the new stats are not being used everywhere
they could be in the planner. But the remaining changes for this project
should be localized, and the behavior is already better than before.
A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
routine DetermineLocalTimeZone(). In that routine, be more wary of
broken mktime() implementations than the original code was: don't allow
mktime to change the already-set y/m/d/h/m/s information, and don't
use tm_gmtoff if mktime failed. Possibly this will resolve some of
the complaints we've been hearing from users of Middle Eastern timezones
on RedHat.
give consistent results for all datatypes. Types float4, float8, and
numeric were broken for NaN values; abstime, timestamp, and interval
were broken for INVALID values; timetz was just plain broken (some
possible pairs of values were neither < nor = nor >). Also clean up
text, bpchar, varchar, and bit/varbit to eliminate duplicate code and
thereby reduce the probability of similar inconsistencies arising in
the future.
as six bytes not eight. This fixes a regression test failure but more
importantly avoids wasting four bytes of pad space in every tuple header.
Also add some commentary about what's going on.
accepts nnnLL syntax for long long constants. If so, decorate the CRC64
constants with LL to avoid warnings and/or erroneous results from certain
non-standards-compliant compilers.
O_SYNC, or O_DSYNC (as available on a given platform). Add GUC parameter
to control sync method.
Also, add defense to XLogWrite to prevent it from going nuts if passed
a target write position that's past the end of the buffers so far filled
by XLogInsert.
detect case that next page in log came from an older run than the prior
page. This avoids the necessity to re-zero the log after recovery from
a crash, which is good because we need not risk destroying valuable log
information.
This forces another initdb since yesterday :-(. Need to get that log
reset utility done...
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.
when user does another FETCH after reaching end of data, or another
FETCH backwards after reaching start. This is needed because some plan
nodes are not very robust about being called again after they've already
returned NULL; for example, MergeJoin will crash in some states but not
others. While the ideal approach would be for them all to handle this
correctly, it seems foolish to assume that no such bugs would creep in
again once cleaned up. Therefore, the most robust answer is to prevent
the situation from arising at all.
noncachable, so that CURRENT_DATE and CURRENT_TIME work as functions
again, rather than being collapsed to constants immediately. Marking the
reverse conversions noncachable might be overkill, but I'm not sure;
do these datatypes have the notion of a CURRENT value? Better safe than
sorry, for now.
only if at least N other backends currently have open transactions. This
is not a great deal of intelligence about whether a delay might be
profitable ... but it beats no intelligence at all. Note that the default
COMMIT_DELAY is still zero --- this new code does nothing unless that
setting is changed.
Also, mark ENABLEFSYNC as a system-wide setting. It's no longer safe to
allow that to be set per-backend, since we may be relying on some other
backend's fsync to have synced the WAL log.
> Is there one LOCKMETHODCTL for every backend? I thought there was only
> one of them.
>>
>> You're right, that line is erroneous; it should read
>>
>> size += MAX_LOCK_METHODS * MAXALIGN(sizeof(LOCKMETHODCTL));
>>
>> Not a significant error but it should be changed for clarity ...
be lurking in the install target directory. But don't zap up-to-date
headers (so install-all-headers before regular install will work).
Per suggestion from Larry Rosenman.