Regression Tests

Introduction

The regression tests are a comprehensive set of tests for the SQL
implementation in PostgreSQL. They test standard SQL operations as well as the
extended capabilities of PostgreSQL. From PostgreSQL 6.1 onward, the regression
tests are current for every official release.

-------------------------------------------------------------------------------

Running the Tests

The regression test can be run against an already installed and running server,
or using a temporary installation within the build tree. Furthermore, there is
a "parallel" and a "sequential" mode for running the tests. The sequential
method runs each test script in turn, whereas the parallel method starts up
multiple server processes to run groups of tests in parallel. Parallel testing
gives confidence that interprocess communication and locking are working
correctly. For historical reasons, the sequential test is usually run against
an existing installation and the parallel method against a temporary
installation, but there are no technical reasons for this.

To run the regression tests after building but before installation, type

    gmake check

in the top-level directory. (Or you can change to "src/test/regress" and run
the command there.) This will first build several auxiliary files, such as some
sample user-defined trigger functions, and then run the test driver script. At
the end you should see something like

    ======================
     All 93 tests passed.
    ======================

or otherwise a note about which tests failed. See the Section called Test
Evaluation below for more.

Because this test method runs a temporary server, it will not work when you are
the root user (since the server will not start as root). If you already did the
build as root, you do not have to start all over. Instead, make the regression
test directory writable by some other user, log in as that user, and restart
the tests. For example:

    root# chmod -R a+w src/test/regress
    root# chmod -R a+w contrib/spi
    root# su - joeuser
    joeuser$ cd <top-level build directory>
    joeuser$ gmake check

(The only possible "security risk" here is that other users might be able to
alter the regression test results behind your back. Use common sense when
managing user permissions.)

Alternatively, run the tests after installation.

The parallel regression test starts quite a few processes under your user ID.
Presently, the maximum concurrency is twenty parallel test scripts, which means
sixty processes: there's a server process, a psql, and usually a shell parent
process for the psql for each test script. So if your system enforces a
per-user limit on the number of processes, make sure this limit is at least
seventy-five or so, else you may get random-seeming failures in the parallel
test. If you are not in a position to raise the limit, you can cut down the
degree of parallelism by setting the MAX_CONNECTIONS parameter. For example,

    gmake MAX_CONNECTIONS=10 check

runs no more than ten tests concurrently.
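
Whether and how you can inspect the per-user process limit is
platform-specific; as a rough sketch, many Bourne-compatible shells will report
the current limit with

    ulimit -u

(not every shell supports the -u flag).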

On some systems, the default Bourne-compatible shell ("/bin/sh") gets confused
when it has to manage too many child processes in parallel. This may cause the
parallel test run to lock up or fail. In such cases, specify a different
Bourne-compatible shell on the command line, for example:

    gmake SHELL=/bin/ksh check

If no non-broken shell is available, you may be able to work around the problem
by limiting the number of connections, as shown above.

To run the tests after installation, initialize a data area and start the
server, then type

    gmake installcheck

The tests will expect to contact the server at the local host and the default
port number, unless directed otherwise by PGHOST and PGPORT environment
variables.
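
For instance, to point the tests at a server listening on a non-default host
and port, something along these lines should work (the host name and port shown
here are only placeholders):

    PGHOST=myhost.example.com PGPORT=5433 gmake installcheck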

-------------------------------------------------------------------------------

Test Evaluation

Some properly installed and fully functional PostgreSQL installations can
"fail" some of these regression tests due to platform-specific artifacts such
as varying floating-point representation and time zone support. The tests are
currently evaluated using a simple "diff" comparison against the outputs
generated on a reference system, so the results are sensitive to small system
differences. When a test is reported as "failed", always examine the
differences between expected and actual results; you may well find that the
differences are not significant. Nonetheless, we still strive to maintain
accurate reference files across all supported platforms, so it can be expected
that all tests pass.

The actual outputs of the regression tests are in files in the
"src/test/regress/results" directory. The test script uses "diff" to compare
each output file against the reference outputs stored in the
"src/test/regress/expected" directory. Any differences are saved for your
inspection in "src/test/regress/regression.diffs". (Or you can run "diff"
yourself, if you prefer.)
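
For example, if the float8 test were reported as failed, you could look at the
differences by hand from the "src/test/regress" directory (the test name here
is only an illustration; substitute whichever test failed):

    diff results/float8.out expected/float8.out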

-------------------------------------------------------------------------------

Error message differences

Some of the regression tests involve intentional invalid input values. Error
messages can come from either the PostgreSQL code or from the host platform
system routines. In the latter case, the messages may vary between platforms,
but should reflect similar information. These differences in messages will
result in a "failed" regression test that can be validated by inspection.

-------------------------------------------------------------------------------

Locale differences

If you run the tests against an already-installed server that was initialized
with a collation-order locale other than C, then there may be differences due
to sort order and follow-up failures. The regression test suite is set up to
handle this problem by providing alternative result files that together are
known to handle a large number of locales. For example, for the char test, the
expected file "char.out" handles the C and POSIX locales, and the file
"char_1.out" handles many other locales. The regression test driver will
automatically pick the best file to match against when checking for success and
for computing failure differences. (This means that the regression tests cannot
detect whether the results are appropriate for the configured locale. The tests
will simply pick the one result file that works best.)
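
As an illustration, you can list the alternative expected files provided for a
given test; for the char test this would be something like (the exact set of
variant files may differ between releases):

    ls src/test/regress/expected/char*.out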

If for some reason the existing expected files do not cover some locale, you
can add a new file. The naming scheme is "testname_digit.out". The actual digit
is not significant. Remember that the regression test driver will consider all
such files to be equally valid test results. If the test results are
platform-specific, the technique described in the Section called
Platform-specific comparison files should be used instead.

-------------------------------------------------------------------------------

Date and time differences

A few of the queries in the "horology" test will fail if you run the test on
the day of a daylight-saving time changeover, or the day after one. These
queries expect that the intervals between midnight yesterday, midnight today
and midnight tomorrow are exactly twenty-four hours --- which is wrong if
daylight-saving time went into or out of effect meanwhile.

    Note: Because USA daylight-saving time rules are used, this problem
    always occurs on the first Sunday of April, the last Sunday of
    October, and their following Mondays, regardless of when
    daylight-saving time is in effect where you live. Also note that the
    problem appears or disappears at midnight Pacific time (UTC-7 or
    UTC-8), not midnight your local time. Thus the failure may appear
    late on Saturday or persist through much of Tuesday, depending on
    where you live.

Most of the date and time results are dependent on the time zone environment.
The reference files are generated for time zone PST8PDT (Berkeley, California),
and there will be apparent failures if the tests are not run with that time
zone setting. The regression test driver sets environment variable PGTZ to
PST8PDT, which normally ensures proper results. However, your operating system
must provide support for the PST8PDT time zone, or the time zone-dependent
tests will fail. To verify that your machine does have this support, type the
following:

    env TZ=PST8PDT date

The command above should have returned the current system time in the PST8PDT
time zone. If the PST8PDT time zone is not available, then your system may have
returned the time in UTC. If the PST8PDT time zone is missing, you can set the
time zone rules explicitly:

    PGTZ='PST8PDT7,M04.01.0,M10.05.03'; export PGTZ

There appear to be some systems that do not accept the recommended syntax for
explicitly setting the local time zone rules; you may need to use a different
PGTZ setting on such machines.

Some systems using older time-zone libraries fail to apply daylight-saving
corrections to dates before 1970, causing pre-1970 PDT times to be displayed in
PST instead. This will result in localized differences in the test results.

-------------------------------------------------------------------------------

Floating-point differences

Some of the tests involve computing 64-bit floating-point numbers (double
precision) from table columns. Differences in results involving mathematical
functions of double precision columns have been observed. The float8 and
geometry tests are particularly prone to small differences across platforms, or
even with different compiler optimization options. Human eyeball comparison is
needed to determine the real significance of these differences, which are
usually 10 places to the right of the decimal point.
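
As a rough illustration (this query is not taken from the test suite itself),
results of double precision math like the following can differ in their last
few digits depending on the platform's math library and compiler options:

    SELECT exp(ln(1004.3::float8));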

Some systems display minus zero as -0, while others just show 0.

Some systems signal errors from pow() and exp() differently from the mechanism
expected by the current PostgreSQL code.

-------------------------------------------------------------------------------

Row ordering differences

You might see differences in which the same rows are output in a different
order than what appears in the expected file. In most cases this is not,
strictly speaking, a bug. Most of the regression test scripts are not so
pedantic as to use an ORDER BY for every single SELECT, and so their result row
orderings are not well-defined according to the letter of the SQL
specification. In practice, since we are looking at the same queries being
executed on the same data by the same software, we usually get the same result
ordering on all platforms, and so the lack of ORDER BY isn't a problem. Some
queries do exhibit cross-platform ordering differences, however. (Ordering
differences can also be triggered by non-C locale settings.)
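
As a sketch (the table and column names here are made up, not taken from the
test suite), the unconstrained form of such a query leaves the row order up to
the chosen plan, while adding ORDER BY pins it down:

    SELECT * FROM foo;              -- rows may come back in any order
    SELECT * FROM foo ORDER BY f1;  -- row order is well-defined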

Therefore, if you see an ordering difference, it's not something to worry
about, unless the query does have an ORDER BY that your result is violating.
But please report it anyway, so that we can add an ORDER BY to that particular
query and thereby eliminate the bogus "failure" in future releases.

You might wonder why we don't order all the regression test queries explicitly
to get rid of this issue once and for all. The reason is that doing so would
make the regression tests less useful, not more, since they'd tend to exercise
query plan types that produce ordered results to the exclusion of those that
don't.

-------------------------------------------------------------------------------

The "random" test

There is at least one case in the random test script that is intended to
produce random results. This causes random to fail the regression test once in
a while (perhaps once in every five to ten trials). Typing

    diff results/random.out expected/random.out

should produce only one or a few lines of differences. You need not worry
unless the random test always fails in repeated attempts. (On the other hand,
if the random test is *never* reported to fail even in many trials of the
regression tests, you probably *should* worry.)

-------------------------------------------------------------------------------

Platform-specific comparison files

Since some of the tests inherently produce platform-specific results, we have
provided a way to supply platform-specific result comparison files. Frequently,
the same variation applies to multiple platforms; rather than supplying a
separate comparison file for every platform, there is a mapping file that
defines which comparison file to use. So, to eliminate bogus test "failures"
for a particular platform, you must choose or make a variant result file, and
then add a line to the mapping file, which is "src/test/regress/resultmap".

Each line in the mapping file is of the form

    testname/platformpattern=comparisonfilename

The test name is just the name of the particular regression test module. The
platform pattern is a pattern in the style of the Unix tool "expr" (that is, a
regular expression with an implicit ^ anchor at the start). It is matched
against the platform name as printed by "config.guess" followed by :gcc or :cc,
depending on whether you use the GNU compiler or the system's native compiler
(on systems where there is a difference). The comparison file name is the name
of the substitute result comparison file.
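
To see the platform name that will be matched against on your machine, you can
run the config.guess script shipped with the source tree and append :gcc or :cc
as appropriate (the script's exact location within the tree may vary between
releases):

    sh config.guess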

For example: some systems using older time zone libraries fail to apply
daylight-saving corrections to dates before 1970, causing pre-1970 PDT times to
be displayed in PST instead. This causes a few differences in the "horology"
regression test. Therefore, we provide a variant comparison file,
"horology-no-DST-before-1970.out", which includes the results to be expected on
these systems. To silence the bogus "failure" message on HPUX platforms,
"resultmap" includes

    horology/.*-hpux=horology-no-DST-before-1970

which will trigger on any machine for which the output of "config.guess"
includes -hpux. Other lines in "resultmap" select the variant comparison file
for other platforms where it's appropriate.