mirror of https://github.com/postgres/postgres

Tree: 98edd617f3

Branches:
REL2_0B
REL6_4
REL6_5_PATCHES
REL7_0_PATCHES
REL7_1_STABLE
REL7_2_STABLE
REL7_3_STABLE
REL7_4_STABLE
REL8_0_STABLE
REL8_1_STABLE
REL8_2_STABLE
REL8_3_STABLE
REL8_4_STABLE
REL8_5_ALPHA1_BRANCH
REL8_5_ALPHA2_BRANCH
REL8_5_ALPHA3_BRANCH
REL9_0_ALPHA4_BRANCH
REL9_0_ALPHA5_BRANCH
REL9_0_STABLE
REL9_1_STABLE
REL9_2_STABLE
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
REL_12_STABLE
REL_13_STABLE
REL_14_STABLE
REL_15_STABLE
REL_16_STABLE
REL_17_STABLE
REL_18_STABLE
Release_1_0_3
WIN32_DEV
ecpg_big_bison
master

Tags:
PG95-1_01
PG95-1_08
PG95-1_09
REL2_0
REL6_1
REL6_1_1
REL6_2
REL6_2_1
REL6_3
REL6_3_2
REL6_4_2
REL6_5
REL6_5_1
REL6_5_2
REL6_5_3
REL7_0
REL7_0_2
REL7_0_3
REL7_1
REL7_1_1
REL7_1_2
REL7_1_3
REL7_1_BETA
REL7_1_BETA2
REL7_1_BETA3
REL7_2
REL7_2_1
REL7_2_2
REL7_2_3
REL7_2_4
REL7_2_5
REL7_2_6
REL7_2_7
REL7_2_8
REL7_2_BETA1
REL7_2_BETA2
REL7_2_BETA3
REL7_2_BETA4
REL7_2_BETA5
REL7_2_RC1
REL7_2_RC2
REL7_3
REL7_3_1
REL7_3_10
REL7_3_11
REL7_3_12
REL7_3_13
REL7_3_14
REL7_3_15
REL7_3_16
REL7_3_17
REL7_3_18
REL7_3_19
REL7_3_2
REL7_3_20
REL7_3_21
REL7_3_3
REL7_3_4
REL7_3_5
REL7_3_6
REL7_3_7
REL7_3_8
REL7_3_9
REL7_4
REL7_4_1
REL7_4_10
REL7_4_11
REL7_4_12
REL7_4_13
REL7_4_14
REL7_4_15
REL7_4_16
REL7_4_17
REL7_4_18
REL7_4_19
REL7_4_2
REL7_4_20
REL7_4_21
REL7_4_22
REL7_4_23
REL7_4_24
REL7_4_25
REL7_4_26
REL7_4_27
REL7_4_28
REL7_4_29
REL7_4_3
REL7_4_30
REL7_4_4
REL7_4_5
REL7_4_6
REL7_4_7
REL7_4_8
REL7_4_9
REL7_4_BETA1
REL7_4_BETA2
REL7_4_BETA3
REL7_4_BETA4
REL7_4_BETA5
REL7_4_RC1
REL7_4_RC2
REL8_0_0
REL8_0_0BETA1
REL8_0_0BETA2
REL8_0_0BETA3
REL8_0_0BETA4
REL8_0_0BETA5
REL8_0_0RC1
REL8_0_0RC2
REL8_0_0RC3
REL8_0_0RC4
REL8_0_0RC5
REL8_0_1
REL8_0_10
REL8_0_11
REL8_0_12
REL8_0_13
REL8_0_14
REL8_0_15
REL8_0_16
REL8_0_17
REL8_0_18
REL8_0_19
REL8_0_2
REL8_0_20
REL8_0_21
REL8_0_22
REL8_0_23
REL8_0_24
REL8_0_25
REL8_0_26
REL8_0_3
REL8_0_4
REL8_0_5
REL8_0_6
REL8_0_7
REL8_0_8
REL8_0_9
REL8_1_0
REL8_1_0BETA1
REL8_1_0BETA2
REL8_1_0BETA3
REL8_1_0BETA4
REL8_1_0RC1
REL8_1_1
REL8_1_10
REL8_1_11
REL8_1_12
REL8_1_13
REL8_1_14
REL8_1_15
REL8_1_16
REL8_1_17
REL8_1_18
REL8_1_19
REL8_1_2
REL8_1_20
REL8_1_21
REL8_1_22
REL8_1_23
REL8_1_3
REL8_1_4
REL8_1_5
REL8_1_6
REL8_1_7
REL8_1_8
REL8_1_9
REL8_2_0
REL8_2_1
REL8_2_10
REL8_2_11
REL8_2_12
REL8_2_13
REL8_2_14
REL8_2_15
REL8_2_16
REL8_2_17
REL8_2_18
REL8_2_19
REL8_2_2
REL8_2_20
REL8_2_21
REL8_2_22
REL8_2_23
REL8_2_3
REL8_2_4
REL8_2_5
REL8_2_6
REL8_2_7
REL8_2_8
REL8_2_9
REL8_2_BETA1
REL8_2_BETA2
REL8_2_BETA3
REL8_2_RC1
REL8_3_0
REL8_3_1
REL8_3_10
REL8_3_11
REL8_3_12
REL8_3_13
REL8_3_14
REL8_3_15
REL8_3_16
REL8_3_17
REL8_3_18
REL8_3_19
REL8_3_2
REL8_3_20
REL8_3_21
REL8_3_22
REL8_3_23
REL8_3_3
REL8_3_4
REL8_3_5
REL8_3_6
REL8_3_7
REL8_3_8
REL8_3_9
REL8_3_BETA1
REL8_3_BETA2
REL8_3_BETA3
REL8_3_BETA4
REL8_3_RC1
REL8_3_RC2
REL8_4_0
REL8_4_1
REL8_4_10
REL8_4_11
REL8_4_12
REL8_4_13
REL8_4_14
REL8_4_15
REL8_4_16
REL8_4_17
REL8_4_18
REL8_4_19
REL8_4_2
REL8_4_20
REL8_4_21
REL8_4_22
REL8_4_3
REL8_4_4
REL8_4_5
REL8_4_6
REL8_4_7
REL8_4_8
REL8_4_9
REL8_4_BETA1
REL8_4_BETA2
REL8_4_RC1
REL8_4_RC2
REL8_5_ALPHA1
REL8_5_ALPHA2
REL8_5_ALPHA3
REL9_0_0
REL9_0_1
REL9_0_10
REL9_0_11
REL9_0_12
REL9_0_13
REL9_0_14
REL9_0_15
REL9_0_16
REL9_0_17
REL9_0_18
REL9_0_19
REL9_0_2
REL9_0_20
REL9_0_21
REL9_0_22
REL9_0_23
REL9_0_3
REL9_0_4
REL9_0_5
REL9_0_6
REL9_0_7
REL9_0_8
REL9_0_9
REL9_0_ALPHA4
REL9_0_ALPHA5
REL9_0_BETA1
REL9_0_BETA2
REL9_0_BETA3
REL9_0_BETA4
REL9_0_RC1
REL9_1_0
REL9_1_1
REL9_1_10
REL9_1_11
REL9_1_12
REL9_1_13
REL9_1_14
REL9_1_15
REL9_1_16
REL9_1_17
REL9_1_18
REL9_1_19
REL9_1_2
REL9_1_20
REL9_1_21
REL9_1_22
REL9_1_23
REL9_1_24
REL9_1_3
REL9_1_4
REL9_1_5
REL9_1_6
REL9_1_7
REL9_1_8
REL9_1_9
REL9_1_ALPHA1
REL9_1_ALPHA2
REL9_1_ALPHA3
REL9_1_ALPHA4
REL9_1_ALPHA5
REL9_1_BETA1
REL9_1_BETA2
REL9_1_BETA3
REL9_1_RC1
REL9_2_0
REL9_2_1
REL9_2_10
REL9_2_11
REL9_2_12
REL9_2_13
REL9_2_14
REL9_2_15
REL9_2_16
REL9_2_17
REL9_2_18
REL9_2_19
REL9_2_2
REL9_2_20
REL9_2_21
REL9_2_22
REL9_2_23
REL9_2_24
REL9_2_3
REL9_2_4
REL9_2_5
REL9_2_6
REL9_2_7
REL9_2_8
REL9_2_9
REL9_2_BETA1
REL9_2_BETA2
REL9_2_BETA3
REL9_2_BETA4
REL9_2_RC1
REL9_3_0
REL9_3_1
REL9_3_10
REL9_3_11
REL9_3_12
REL9_3_13
REL9_3_14
REL9_3_15
REL9_3_16
REL9_3_17
REL9_3_18
REL9_3_19
REL9_3_2
REL9_3_20
REL9_3_21
REL9_3_22
REL9_3_23
REL9_3_24
REL9_3_25
REL9_3_3
REL9_3_4
REL9_3_5
REL9_3_6
REL9_3_7
REL9_3_8
REL9_3_9
REL9_3_BETA1
REL9_3_BETA2
REL9_3_RC1
REL9_4_0
REL9_4_1
REL9_4_10
REL9_4_11
REL9_4_12
REL9_4_13
REL9_4_14
REL9_4_15
REL9_4_16
REL9_4_17
REL9_4_18
REL9_4_19
REL9_4_2
REL9_4_20
REL9_4_21
REL9_4_22
REL9_4_23
REL9_4_24
REL9_4_25
REL9_4_26
REL9_4_3
REL9_4_4
REL9_4_5
REL9_4_6
REL9_4_7
REL9_4_8
REL9_4_9
REL9_4_BETA1
REL9_4_BETA2
REL9_4_BETA3
REL9_4_RC1
REL9_5_0
REL9_5_1
REL9_5_10
REL9_5_11
REL9_5_12
REL9_5_13
REL9_5_14
REL9_5_15
REL9_5_16
REL9_5_17
REL9_5_18
REL9_5_19
REL9_5_2
REL9_5_20
REL9_5_21
REL9_5_22
REL9_5_23
REL9_5_24
REL9_5_25
REL9_5_3
REL9_5_4
REL9_5_5
REL9_5_6
REL9_5_7
REL9_5_8
REL9_5_9
REL9_5_ALPHA1
REL9_5_ALPHA2
REL9_5_BETA1
REL9_5_BETA2
REL9_5_RC1
REL9_6_0
REL9_6_1
REL9_6_10
REL9_6_11
REL9_6_12
REL9_6_13
REL9_6_14
REL9_6_15
REL9_6_16
REL9_6_17
REL9_6_18
REL9_6_19
REL9_6_2
REL9_6_20
REL9_6_21
REL9_6_22
REL9_6_23
REL9_6_24
REL9_6_3
REL9_6_4
REL9_6_5
REL9_6_6
REL9_6_7
REL9_6_8
REL9_6_9
REL9_6_BETA1
REL9_6_BETA2
REL9_6_BETA3
REL9_6_BETA4
REL9_6_RC1
REL_10_0
REL_10_1
REL_10_10
REL_10_11
REL_10_12
REL_10_13
REL_10_14
REL_10_15
REL_10_16
REL_10_17
REL_10_18
REL_10_19
REL_10_2
REL_10_20
REL_10_21
REL_10_22
REL_10_23
REL_10_3
REL_10_4
REL_10_5
REL_10_6
REL_10_7
REL_10_8
REL_10_9
REL_10_BETA1
REL_10_BETA2
REL_10_BETA3
REL_10_BETA4
REL_10_RC1
REL_11_0
REL_11_1
REL_11_10
REL_11_11
REL_11_12
REL_11_13
REL_11_14
REL_11_15
REL_11_16
REL_11_17
REL_11_18
REL_11_19
REL_11_2
REL_11_20
REL_11_21
REL_11_22
REL_11_3
REL_11_4
REL_11_5
REL_11_6
REL_11_7
REL_11_8
REL_11_9
REL_11_BETA1
REL_11_BETA2
REL_11_BETA3
REL_11_BETA4
REL_11_RC1
REL_12_0
REL_12_1
REL_12_10
REL_12_11
REL_12_12
REL_12_13
REL_12_14
REL_12_15
REL_12_16
REL_12_17
REL_12_18
REL_12_19
REL_12_2
REL_12_20
REL_12_21
REL_12_22
REL_12_3
REL_12_4
REL_12_5
REL_12_6
REL_12_7
REL_12_8
REL_12_9
REL_12_BETA1
REL_12_BETA2
REL_12_BETA3
REL_12_BETA4
REL_12_RC1
REL_13_0
REL_13_1
REL_13_10
REL_13_11
REL_13_12
REL_13_13
REL_13_14
REL_13_15
REL_13_16
REL_13_17
REL_13_18
REL_13_19
REL_13_2
REL_13_20
REL_13_21
REL_13_22
REL_13_23
REL_13_3
REL_13_4
REL_13_5
REL_13_6
REL_13_7
REL_13_8
REL_13_9
REL_13_BETA1
REL_13_BETA2
REL_13_BETA3
REL_13_RC1
REL_14_0
REL_14_1
REL_14_10
REL_14_11
REL_14_12
REL_14_13
REL_14_14
REL_14_15
REL_14_16
REL_14_17
REL_14_18
REL_14_19
REL_14_2
REL_14_20
REL_14_3
REL_14_4
REL_14_5
REL_14_6
REL_14_7
REL_14_8
REL_14_9
REL_14_BETA1
REL_14_BETA2
REL_14_BETA3
REL_14_RC1
REL_15_0
REL_15_1
REL_15_10
REL_15_11
REL_15_12
REL_15_13
REL_15_14
REL_15_15
REL_15_2
REL_15_3
REL_15_4
REL_15_5
REL_15_6
REL_15_7
REL_15_8
REL_15_9
REL_15_BETA1
REL_15_BETA2
REL_15_BETA3
REL_15_BETA4
REL_15_RC1
REL_15_RC2
REL_16_0
REL_16_1
REL_16_10
REL_16_11
REL_16_2
REL_16_3
REL_16_4
REL_16_5
REL_16_6
REL_16_7
REL_16_8
REL_16_9
REL_16_BETA1
REL_16_BETA2
REL_16_BETA3
REL_16_RC1
REL_17_0
REL_17_1
REL_17_2
REL_17_3
REL_17_4
REL_17_5
REL_17_6
REL_17_7
REL_17_BETA1
REL_17_BETA2
REL_17_BETA3
REL_17_RC1
REL_18_0
REL_18_1
REL_18_BETA1
REL_18_BETA2
REL_18_BETA3
REL_18_RC1
Release_1_0_2
Release_2_0
Release_2_0_0
release-6-3
1350 Commits (98edd617f3b62a02cb2df9b418fcc4ece45c7ec0)

98edd617f3 (11 years ago)
Fix datatype confusion with the new lossy GiST distance functions.

We can only support a lossy distance function when the distance function's datatype is comparable with the original ordering operator's datatype. The distance function always returns a float8, so we are limited to float8, and float4 (by a hard-coded cast of the float8 to float4).

In light of this limitation, it seems like a good idea to have a separate 'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy distance function, it still works with lossy quals. There are no cases like that with the built-in or contrib opclasses, but it's plausible.

There was a hidden assumption that the ORDER BY values returned by GiST match the original ordering operator's return type, but there are plenty of examples where that's not true, e.g. in btree_gist and pg_trgm. As long as the distance function is not lossy, we can tolerate that and just not return the distance to the executor (or rather, always return NULL). The executor doesn't need the distances if there are no lossy results.

There was another little bug: the recheck variable was not initialized before calling the distance function. That revealed the bigger issue, as the executor tried to reorder tuples that didn't need reordering, and that failed because of the datatype mismatch.

35fcb1b3d0 (11 years ago)
Allow GiST distance function to return merely a lower-bound.

The distance function can now set *recheck = false, like index quals. The executor will then re-check the ORDER BY expressions, and use a queue to reorder the results on the fly. This makes it possible to do kNN-searches on polygons and circles, which don't store the exact value in the index, but just a bounding box.

Alexander Korotkov and me
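
A minimal sketch of the kind of query this enables, assuming the built-in circle type, its GiST opclass, and the circle-to-point `<->` distance operator from the 9.5-era kNN work; table and column names are hypothetical:

```sql
CREATE TABLE zones (id serial PRIMARY KEY, area circle);
CREATE INDEX zones_area_gist ON zones USING gist (area);

-- The index stores only bounding boxes, so the distance it computes is a
-- lower bound; the executor re-sorts candidates by the exact distance.
SELECT id
FROM zones
ORDER BY area <-> point '(0,0)'
LIMIT 5;
```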

1dc5ebc907 (11 years ago)
Support "expanded" objects, particularly arrays, for better performance.

This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
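
A sketch of the plpgsql pattern the commit message says benefits most: element-by-element update of an array variable (the function name is hypothetical):

```sql
CREATE OR REPLACE FUNCTION fill_array(n int) RETURNS float8[] AS $$
DECLARE
  a float8[] := '{}';
BEGIN
  -- Each assignment used to copy the whole array; with expanded arrays
  -- the plpgsql variable can be updated in place.
  FOR i IN 1..n LOOP
    a[i] := i * 0.5;
  END LOOP;
  RETURN a;
END
$$ LANGUAGE plpgsql;

SELECT array_length(fill_array(100000), 1);
```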

78efd5c1ed (11 years ago)
Extend abbreviated key infrastructure to datum tuplesorts.

Andrew Gierth, reviewed by Peter Geoghegan and by me.

0bb8528b5c (11 years ago)
Fix postgres_fdw to return the right ctid value in EvalPlanQual cases.

If a postgres_fdw foreign table is a non-locked source relation in an UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its ctid column, the wrong value would be returned if an EvalPlanQual recheck occurred. This happened because the foreign table's result row was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks just unconditionally set the reconstructed tuple's t_self to "invalid".

To fix that, we can have EvalPlanQualFetchRowMarks copy the composite datum's t_ctid field, and be sure to initialize that along with t_self when postgres_fdw constructs a tuple to return.

If we just did that much then EvalPlanQualFetchRowMarks would start returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps does not matter much, but then again maybe it might. The cause of that is that heap_form_tuple, which is the ultimate source of all composite datums, simply leaves t_ctid as zeroes in newly constructed tuples. That seems like a bad idea on general principles: a field that's really not been initialized shouldn't appear to have a valid value. So let's eat the trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))" in heap_form_tuple.

This closes out our handling of Etsuro Fujita's report that tableoid and ctid weren't correctly set in postgres_fdw EvalPlanQual cases. Along the way we did a great deal of work to improve FDWs' ability to control row locking behavior; which was not wasted effort by any means, but it didn't end up being a fix for this problem because that feature would be too expensive for postgres_fdw to use all the time.

Although the fix for the tableoid misbehavior was back-patched, I'm hesitant to do so here; it seems far less likely that people would care about remote ctid than tableoid, and even such a minor behavioral change as this in heap_form_tuple is perhaps best not back-patched. So commit to HEAD only, at least for the moment.

Etsuro Fujita, with some adjustments by me

4af6e61a36 (11 years ago)
Fix ON CONFLICT bugs that manifest when used in rules.

Specifically the tlist and rti of the pseudo "excluded" relation weren't properly treated by expression_tree_walker, which led to errors when excluded was referenced inside a rule because the varnos were not properly adjusted. Similar omissions in OffsetVarNodes and expression_tree_mutator had less impact, but should obviously be fixed nonetheless.

A couple of tests for ON CONFLICT UPDATE into INSERT-rule-bearing relations have been added.

In passing I updated a couple comments.

afb9249d06 (11 years ago)
Add support for doing late row locking in FDWs.

Previously, FDWs could only do "early row locking", that is lock a row as soon as it's fetched, even though local restriction/join conditions might discard the row later. This patch adds callbacks that allow FDWs to do late locking in the same way that it's done for regular tables.

To make use of this feature, an FDW must support the "ctid" column as a unique row identifier. Currently, since ctid has to be of type TID, the feature is of limited use, though in principle it could be used by postgres_fdw. We may eventually allow FDWs to specify another data type for ctid, which would make it possible for more FDWs to use this feature.

This commit does not modify postgres_fdw to use late locking. We've tested some prototype code for that, but it's not in committable shape, and besides it's quite unclear whether it actually makes sense to do late locking against a remote server. The extra round trips required are likely to outweigh any benefit from improved concurrency.

Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me

1a8a4e5cde (11 years ago)
Code review for foreign/custom join pushdown patch.

Commit

168d5805e4 (11 years ago)
Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.

The newly added ON CONFLICT clause allows specifying an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using an inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted.

To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
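
A minimal upsert sketch using the syntax this commit introduces (table and column names hypothetical):

```sql
CREATE TABLE counters (name text PRIMARY KEY, hits bigint NOT NULL);

-- On a primary-key conflict, bump the existing row instead of erroring.
-- EXCLUDED is the tuple proposed for insertion; "c" aliases the target.
INSERT INTO counters AS c (name, hits)
VALUES ('home', 1)
ON CONFLICT (name)
DO UPDATE SET hits = c.hits + EXCLUDED.hits;

-- Or silently skip the insert when the row already exists:
INSERT INTO counters (name, hits)
VALUES ('home', 1)
ON CONFLICT DO NOTHING;
```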

2c8f4836db (11 years ago)
Represent columns requiring insert and update privileges independently.

Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run.

The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore.

Bumps catversion as stored rules change.

Author: Peter Geoghegan
Reviewed-By: Andres Freund
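
An illustration of why the split matters: with ON CONFLICT DO UPDATE, a single statement can need column-level INSERT and UPDATE rights at once. Role name is hypothetical, reusing the counters table sketched above:

```sql
-- An INSERT ... ON CONFLICT (name) DO UPDATE SET hits = ... issued by
-- app_writer must pass both of these independent column-level checks.
GRANT INSERT (name, hits) ON counters TO app_writer;
GRANT UPDATE (hits) ON counters TO app_writer;
```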

40f42d2a34 (11 years ago)
Use outerPlanState macro instead of referring to lefttree.

This makes the executor code more consistent. It also removes an apparently superfluous NULL test in nodeGroup.c.

Qingqing Zhou, reviewed by Tom Lane, and further revised by me.

e7cb7ee145 (11 years ago)
Allow FDWs and custom scan providers to replace joins with scans.

Foreign data wrappers can use this capability for so-called "join pushdown"; that is, instead of executing two separate foreign scans and then joining the results locally, they can generate a path which performs the join on the remote server and then is scanned locally. This commit does not extend postgres_fdw to take advantage of this capability; it just provides the infrastructure.

Custom scan providers can use this in a similar way. Previously, it was only possible for a custom scan provider to scan a single relation. Now, it can scan an entire join tree, provided of course that it knows how to produce the same results that the join would have produced if executed normally.

KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.

924bcf4f16 (11 years ago)
Create an infrastructure for parallel computation in PostgreSQL.

This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client.

Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.

6aab1f45ac (11 years ago)
Fix various typos and grammar errors in comments.

Author: Dmitriy Olshevskiy
Discussion: 553D00A6.4090205@bk.ru

e89bd02f58 (11 years ago)
Perform RLS WITH CHECK before constraints, etc

The RLS capability is built on top of the WITH CHECK OPTION system which was added for auto-updatable views, however, unlike WCOs on views (which are mandated by the SQL spec to not fire until after all other constraints and checks are done), it makes much more sense for RLS checks to happen earlier than constraint and uniqueness checks.

This patch reworks the structure which holds the WCOs a bit to be explicitly either VIEW or RLS checks and the RLS-related checks are done prior to the constraint and uniqueness checks. This also allows better error reporting as we are now reporting when a violation is due to a WITH CHECK OPTION and when it's due to an RLS policy violation, which was independently noted by Craig Ringer as being confusing.

The documentation is also updated to include a paragraph about when RLS WITH CHECK handling is performed, as there have been a number of questions regarding that and the documentation was previously silent on the matter.

Author: Dean Rasheed, with some kibitzing and comment changes by me.
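
A small sketch of the behavior being reordered, with hypothetical table and policy names; the policy's WITH CHECK expression now fires before the table's own constraints:

```sql
CREATE TABLE docs (id int PRIMARY KEY, owner text NOT NULL);
ALTER TABLE docs ENABLE ROW LEVEL SECURITY;

CREATE POLICY docs_owner ON docs
  USING (owner = current_user)        -- filters rows on read
  WITH CHECK (owner = current_user);  -- checked before constraints on write

-- A row that violates both the policy and the primary key now reports
-- the policy violation first.
```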

61a553a091 (11 years ago)
Add comments explaining how unique and exclusion constraints are enforced.

62420ae7d6 (11 years ago)
Move functions related to index maintenance to separate source file.

There is enough code here to deserve a file of its own, rather than being buried in the middle of execUtils.c.

feeb526cfe (11 years ago)
Fix ExecOpenScanRelation to take a lock on a ROW_MARK_COPY relation.

ExecOpenScanRelation assumed that any relation listed in the ExecRowMark list has been locked by InitPlan; but this is not true if the rel's markType is ROW_MARK_COPY, which is possible if it's a foreign table.

In most (possibly all) cases, failure to acquire a lock here isn't really problematic because the parser, planner, or plancache would have taken the appropriate lock already. In principle though it might leave us vulnerable to working with a relation that we hold no lock on, and in any case if the executor isn't depending on previously-taken locks otherwise then it should not do so for ROW_MARK_COPY relations.

Noted by Etsuro Fujita. Back-patch to all active versions, since the inconsistency has been there a long time. (It's almost certainly irrelevant in 9.0, since that predates foreign tables, but the code's still wrong on its own terms.)

cb1ca4d800 (11 years ago)
Allow foreign tables to participate in inheritance.

Foreign tables can now be inheritance children, or parents. Much of the system was already ready for this, but we had to fix a few things of course, mostly in the area of planner and executor handling of row locks.

As side effects of this, allow foreign tables to have NOT VALID CHECK constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS. Continuing to disallow these things would've required bizarre and inconsistent special cases in inheritance behavior. Since foreign tables don't enforce CHECK constraints anyway, a NOT VALID one is a complete no-op, but that doesn't mean we shouldn't allow it. And it's possible that some FDWs might have use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops for most.

An additional change in support of this is that when a ModifyTable node has multiple target tables, they will all now be explicitly identified in EXPLAIN output, for example:

    Update on pt1  (cost=0.00..321.05 rows=3541 width=46)
      Update on pt1
      Foreign Update on ft1
      Foreign Update on ft2
      Update on child3
      ->  Seq Scan on pt1  (cost=0.00..0.00 rows=1 width=46)
      ->  Foreign Scan on ft1  (cost=100.00..148.03 rows=1170 width=46)
      ->  Foreign Scan on ft2  (cost=100.00..148.03 rows=1170 width=46)
      ->  Seq Scan on child3  (cost=0.00..25.00 rows=1200 width=46)

This was done mainly to provide an unambiguous place to attach "Remote SQL" fields, but it is useful for inherited updates even when no foreign tables are involved.

Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro Horiguchi, some additional hacking by me
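
A sketch of the setup this enables, assuming a postgres_fdw server is already configured; server, table, and column names are hypothetical:

```sql
CREATE TABLE measurements (ts timestamptz NOT NULL, val float8);

-- A foreign table can now be an inheritance child of a local parent.
CREATE FOREIGN TABLE measurements_2014 ()
  INHERITS (measurements)
  SERVER archive_srv
  OPTIONS (table_name 'measurements_2014');

-- Queries on the parent now include the foreign child; EXPLAIN lists
-- each target table, as in the plan shown above.
SELECT count(*) FROM measurements;
```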

443fd0540e (11 years ago)
Ensure tableoid reads correctly in EvalPlanQual-manufactured tuples.

The ROW_MARK_COPY path in EvalPlanQualFetchRowMarks() was just setting tableoid to InvalidOid, I think on the assumption that the referenced RTE must be a subquery or other case without a meaningful OID. However, foreign tables also use this code path, and they do have meaningful table OIDs; so failure to set the tuple field can lead to user-visible misbehavior. Fix that by fetching the appropriate OID from the range table.

There's still an issue about whether CTID can ever have a meaningful value in this case; at least with postgres_fdw foreign tables, it does. But that is a different problem that seems to require a significantly different patch --- it's debatable whether postgres_fdw really wants to use this code path at all.

Simplified version of a patch by Etsuro Fujita, who also noted the problem to begin with. The issue can be demonstrated in all versions having FDWs, so back-patch to 9.1.

8abb3cda0d (11 years ago)
Use the typcache to cache constraints for domain types.

Previously, we cached domain constraints for the life of a query, or really for the life of the FmgrInfo struct that was used to invoke domain_in() or domain_check(). But plpgsql (and probably other places) are set up to cache such FmgrInfos for the whole lifespan of a session, which meant they could be enforcing really stale sets of constraints. On the other hand, searching pg_constraint once per query gets kind of expensive too: testing says that as much as half the runtime of a trivial query such as "SELECT 0::domaintype" went into that.

To fix this, delegate the responsibility for tracking a domain's constraints to the typcache, which has the infrastructure needed to detect syscache invalidation events that signal possible changes. This not only removes unnecessary repeat reads of pg_constraint, but ensures that we never apply stale constraint data: whatever we use is the current data according to syscache rules.

Unfortunately, the current configuration of the system catalogs means we have to flush cached domain-constraint data whenever either pg_type or pg_constraint changes, which happens rather a lot (eg, creation or deletion of a temp table will do it). It might be worth rearranging things to split pg_constraint into two catalogs, of which the domain constraint one would probably be very low-traffic. That's a job for another patch though, and in any case this patch should improve matters materially even with that handicap.

This patch makes use of the recently-added memory context reset callback feature to manage the lifespan of domain constraint caches, so that we don't risk deleting a cache that might be in the midst of evaluation.

Although this is a bug fix as well as a performance improvement, no back-patch. There haven't been many if any field complaints about stale domain constraint checks, so it doesn't seem worth taking the risk of modifying data structures as basic as MemoryContexts in back branches.
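
A small sketch of the staleness problem this removes (domain name hypothetical): a constraint added mid-session is now seen immediately, even by long-cached callers:

```sql
CREATE DOMAIN posint AS int;

SELECT (-1)::posint;   -- accepted: no constraints yet

ALTER DOMAIN posint ADD CHECK (VALUE > 0);

SELECT (-1)::posint;   -- now rejected; the typcache saw the invalidation
```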

b419865a81 (11 years ago)
In array_agg(), don't create a new context for every group.

Previously, each new array created a new memory context that started out at 8kB. This is incredibly wasteful when there are lots of small groups of just a few elements each. Change initArrayResult() and friends to accept a "subcontext" argument to indicate whether the caller wants the ArrayBuildState allocated in a new subcontext or not. If not, it can no longer be released separately from the rest of the memory context.

Fixes bug report by Frank van Vugt on 2013-10-19.

Tomas Vondra. Reviewed by Ali Akbar, Tom Lane, and me.

e1a11d9311 (11 years ago)
Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[].

This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best.

Michael Paquier, some adjustments by me

33a3b03d63 (11 years ago)
Use FLEXIBLE_ARRAY_MEMBER in some more places.

Fix a batch of structs that are only visible within individual .c files.

Michael Paquier

09d8d110a6 (11 years ago)
Use FLEXIBLE_ARRAY_MEMBER in a bunch more places.

Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations.

This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases.

Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that as offsetof(struct, lastfield[0]).

Michael Paquier, review and additional fixes by me.

e983c4d1aa (11 years ago)
Rationalize the APIs of array element/slice access functions.

The four functions array_ref, array_set, array_get_slice, array_set_slice have traditionally declared their array inputs and results as being of type "ArrayType *". This is a lie, and has been since Berkeley days, because they actually also support "fixed-length array" types such as "name" and "point"; not to mention that the inputs could be toasted. These values should be declared Datum instead to avoid confusion. The current coding already risks possible misoptimization by compilers, and it'll get worse when "expanded" array representations become a valid alternative.

However, there's a fair amount of code using array_ref and array_set with arrays that *are* known to be ArrayType structures, and there might be more such places in third-party code. Rather than cluttering those call sites with PointerGetDatum/DatumGetArrayTypeP cruft, what I did was to rename the existing functions to array_get_element/array_set_element, fix their signatures, then reincarnate array_ref/array_set as backwards compatibility wrappers.

array_get_slice/array_set_slice have no such constituency in the core code, and probably not in third-party code either, so I just changed their APIs.

08361cea2b (11 years ago)
Fix null-pointer-deref crash while doing COPY IN with check constraints.

In commit

57fe246890 (11 years ago)
Fix reference-after-free when waiting for another xact due to constraint.

If an insertion or update had to wait for another transaction to finish, because there was another insertion with conflicting key in progress, we would pass a just-free'd item pointer to XactLockTableWait().

All calls to XactLockTableWait() and MultiXactIdWait() had similar issues. Some passed a pointer to a buffer in the buffer cache, after already releasing the lock. The call in EvalPlanQualFetch had already released the pin too. All but the call in execUtils.c would merely lead to reporting a bogus ctid, however (or an assertion failure, if enabled).

All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus anyway: if the tuple was updated (again) in the same transaction, its ctid field would point to the next tuple in the chain, not the tuple itself.

Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added (in commit

804b6b6db4 (11 years ago)
Fix column-privilege leak in error-message paths

While building error messages to return to the user, BuildIndexValueDescription, ExecBuildSlotValueDescription and ri_ReportViolation would happily include the entire key or entire row in the result returned to the user, even if the user didn't have access to view all of the columns being included.

Instead, include only those columns which the user is providing or which the user has select rights on. If the user does not have any rights to view the table or any of the columns involved then no detail is provided and a NULL value is returned from BuildIndexValueDescription and ExecBuildSlotValueDescription. Note that, for key cases, the user must have access to all of the columns for the key to be shown; a partial key will not be returned.

Further, in master only, do not return any data for cases where row security is enabled on the relation and row security should be applied for the user. This required a bit of refactoring and moving of things around related to RLS; note the addition of utils/misc/rls.c.

Back-patch all the way, as column-level privileges are now in all supported versions.

This has been assigned CVE-2014-8161, but since the issue and the patch have already been publicized on pgsql-hackers, there's no point in trying to hide this commit.

4ea51cdfe8 (11 years ago)
Use abbreviated keys for faster sorting of text datums.

This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficient to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison.

There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text.

Peter Geoghegan, reviewed by me.

1605291b6c (11 years ago)
Typo fix.

Etsuro Fujita

c480cb9d24 (11 years ago)
Fix use-of-already-freed-memory problem in EvalPlanQual processing.

Up to now, the "child" executor state trees generated for EvalPlanQual rechecks have simply shared the ResultRelInfo arrays used for the original execution tree. However, this leads to dangling-pointer problems, because ExecInitModifyTable() is all too willing to scribble on some fields of the ResultRelInfo(s) even when it's being run in one of those child trees. This trashes those fields from the perspective of the parent tree, because even if the generated subtree is logically identical to what was in use in the parent, it's in a memory context that will go away when we're done with the child state tree.

We do however want to share information in the direction from the parent down to the children; in particular, fields such as es_instrument *must* be shared or we'll lose the stats arising from execution of the children. So the simplest fix is to make a copy of the parent's ResultRelInfo array, but not copy any fields back at end of child execution.

Per report from Manuel Kniep. The added isolation test is based on his example. In an unpatched memory-clobber-enabled build it will reliably fail with "ctid is NULL" errors in all branches back to 9.1, as a consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f. This test cannot be run as-is before that for lack of WITH syntax; but I have no doubt that some variant of this problem can arise in older branches, so apply the code change all the way back.

c4fda14845 (11 years ago)
Fix typo in execMain.c

Wee -> We. Pointed out by Etsuro Fujita.

4baaf863ec (11 years ago)
Update copyright for 2015

Backpatch certain files through 9.0

2db576ba8c (11 years ago)
Fix corner case where SELECT FOR UPDATE could return a row twice.

In READ COMMITTED mode, if a SELECT FOR UPDATE discovers it has to redo WHERE-clause checking on rows that have been updated since the SELECT's snapshot, it invokes EvalPlanQual processing to do that. If this first occurs within a non-first child table of an inheritance tree, the previous coding could accidentally re-return a matching row from an earlier, already-scanned child table. (And, to add insult to injury, I think this could make it miss returning a row that should have been returned, if the updated row that this happens on should still have passed the WHERE qual.)

Per report from Kyotaro Horiguchi; the added isolation test is based on his test case. This has been broken for quite awhile, so back-patch to all supported branches.

f4e031c662 (11 years ago)
Add bms_next_member(), and use it where appropriate.

This patch adds a way of iterating through the members of a bitmapset nondestructively, unlike the old way with bms_first_member(). While bms_next_member() is very slightly slower than bms_first_member() (at least for typical-size bitmapsets), eliminating the need to palloc and pfree a temporary copy of the target bitmapset is a significant win. So this method should be preferred in all cases where a temporary copy would be necessary.

Tom Lane, with suggestions from Dean Rasheed and David Rowley

143b39c185 (11 years ago)
Rename pg_rowsecurity -> pg_policy and other fixes

As pointed out by Robert, we should really have named pg_rowsecurity pg_policy, as the objects stored in that catalog are policies. This patch fixes that and updates the column names to start with 'pol' to match the new catalog name.

The security consideration for COPY with row level security, also pointed out by Robert, has also been addressed by remembering and re-checking the OID of the relation initially referenced during COPY processing, to make sure it hasn't changed under us by the time we finish planning out the query which has been built.

Robert and Alvaro also commented on missing OCLASS and OBJECT entries for POLICY (formerly ROWSECURITY or POLICY, depending) in various places. This patch fixes that too, which also happens to add the ability to COMMENT on policies.

In passing, attempt to improve the consistency of messages, comments, and documentation as well. This removes various incarnations of 'row-security', 'row-level security', 'Row-security', etc, in favor of 'policy', 'row level security' or 'row_security' as appropriate.

Happy Thanksgiving!
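
A one-line sketch of the newly possible COMMENT, reusing the hypothetical docs_owner policy shown earlier in this log:

```sql
-- Policies can now carry comments like other database objects.
COMMENT ON POLICY docs_owner ON docs IS 'Owners see and modify only their own rows.';
```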

bac27394a1 (11 years ago)
Support arrays as input to array_agg() and ARRAY(SELECT ...).

These cases formerly failed with errors about "could not find array type for data type". Now they yield arrays of the same element type and one higher dimension.

The implementation involves creating functions with API similar to the existing accumArrayResult() family. I (tgl) also extended the base family by adding an initArrayResult() function, which allows callers to avoid special-casing the zero-inputs case if they just want an empty array as result. (Not all do, so the previous calling convention remains valid.) This allowed simplifying some existing code in xml.c and plperl.c.

Ali Akbar, reviewed by Pavel Stehule, significantly modified by me
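
A quick sketch of the newly supported cases; expected outputs are shown as comments:

```sql
-- Aggregating arrays now yields an array of one higher dimension.
SELECT array_agg(a)
FROM (VALUES (ARRAY[1,2]), (ARRAY[3,4])) AS t(a);
-- => {{1,2},{3,4}}

-- Likewise for ARRAY(SELECT ...) over an array-valued subquery.
SELECT ARRAY(SELECT ARRAY[i, i * 10] FROM generate_series(1, 3) AS i);
-- => {{1,10},{2,20},{3,30}}
```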

adbfab119b (11 years ago)
Remove dead code supporting mark/restore in SeqScan, TidScan, ValuesScan.

There seems no prospect that any of this will ever be useful, and indeed it's questionable whether some of it would work if it ever got called; it's certainly not been exercised in a very long time, if ever. So let's get rid of it, and make the comments about mark/restore in execAmi.c less wishy-washy.

The mark/restore support for Result nodes is also currently dead code, but that's due to planner limitations not because it's impossible that it could be useful. So I left it in.

a34fa8ee7c (11 years ago)
Initial code review for CustomScan patch.

Get rid of the pernicious entanglement between planner and executor headers introduced by commit

081a6048cf (11 years ago)
Fix another oversight in CustomScan patch.

execCurrent.c's search_plan_tree() must recognize a CustomScan on the target relation. This would only be helpful for custom providers that support CurrentOfExpr quals, which is probably a bit far-fetched, but it's not impossible I think. But even without assuming that, we need to recognize a scanned-relation match so that we will properly throw error if the desired relation is being scanned with both a CustomScan and a regular scan (ie, self-join).

Also recognize ForeignScanState for similar reasons. Supporting WHERE CURRENT OF on a foreign table is probably even more far-fetched than it is for custom scans, but I think in principle you could do it with postgres_fdw (or another FDW that supports the ctid column).

This would be a back-patchable bug fix if existing FDWs handled CurrentOfExpr, but I doubt any do so I won't bother back-patching.
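
For context, a plain-table sketch of the WHERE CURRENT OF machinery that search_plan_tree() serves (table name hypothetical); the commit extends the same plan-tree lookup to custom and foreign scans:

```sql
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM counters FOR UPDATE;
FETCH NEXT FROM c;
-- The executor locates the cursor's scan node to find the current tuple.
UPDATE counters SET hits = hits + 1 WHERE CURRENT OF c;
COMMIT;
```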

677708032c (11 years ago)
Explicitly support the case that a plancache's raw_parse_tree is NULL.

This only happens if a client issues a Parse message with an empty query string, which is a bit odd; but since it is explicitly called out as legal by our FE/BE protocol spec, we'd probably better continue to allow it.

Fix by adding tests everywhere that the raw_parse_tree field is passed to functions that don't or shouldn't accept NULL. Also make it clear in the relevant comments that NULL is an expected case.

This reverts commits

bf7ca15875 (11 years ago)
Ensure that RowExprs and whole-row Vars produce the expected column names.

At one time it wasn't terribly important what column names were associated with the fields of a composite Datum, but since the introduction of operations like row_to_json(), it's important that looking up the rowtype ID embedded in the Datum returns the column names that users would expect. That did not work terribly well before this patch: you could get the column names of the underlying table, or column aliases from any level of the query, depending on minor details of the plan tree. You could even get totally empty field names, which is disastrous for cases like row_to_json().

To fix this for whole-row Vars, look to the RTE referenced by the Var, and make sure its column aliases are applied to the rowtype associated with the result Datums. This is a tad scary because we might have to return a transient RECORD type even though the Var is declared as having some named rowtype. In principle it should be all right because the record type will still be physically compatible with the named rowtype; but I had to weaken one Assert in ExecEvalConvertRowtype, and there might be third-party code containing similar assumptions.

Similarly, RowExprs have to be willing to override the column names coming from a named composite result type and produce a RECORD when the column aliases visible at the site of the RowExpr differ from the underlying table's column names.

In passing, revert the decision made in commit
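
A tiny illustration of the guarantee (expected output shown as a comment): the aliases visible at the whole-row reference are the field names row_to_json() reports:

```sql
SELECT row_to_json(t)
FROM (SELECT 1 AS a, 'x'::text AS b) AS t;
-- => {"a":1,"b":"x"}
```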

0b03e5951b (11 years ago)
Introduce custom path and scan providers.

This allows extension modules to define their own methods for scanning a relation, and get the core code to use them. It's unclear as yet how much use this capability will find, but we won't find out if we never commit it.

KaiGai Kohei, reviewed at various times and in various levels of detail by Shigeru Hanada, Tom Lane, Andres Freund, Álvaro Herrera, and myself.

fe65280fea (11 years ago)
C comments: adjust execTuples.c for new structure

Report by Peter Geoghegan

30d7ae3c76 (11 years ago)
Increase number of hash join buckets for underestimate.

If we expect batching at the very beginning, we size nbuckets for "full work_mem" (see how many tuples we can get into work_mem, while not breaking NTUP_PER_BUCKET threshold).

If we expect to be fine without batching, we start with the 'right' nbuckets and track the optimal nbuckets as we go (without actually resizing the hash table). Once we hit work_mem (considering the optimal nbuckets value), we keep the value. At the end of the first batch, we check whether (nbuckets != nbuckets_optimal) and resize the hash table if needed. Also, we keep this value for all batches (it's OK because it assumes full work_mem, and it makes the batchno evaluation trivial). So the resize happens only once.

There could be cases where it would improve performance to allow the NTUP_PER_BUCKET threshold to be exceeded to keep everything in one batch rather than spilling to a second batch, but attempts to generate such a case have so far been unsuccessful; that issue may be addressed with a follow-on patch after further investigation.

Tomas Vondra with minor format and comment cleanup by me
Reviewed by Robert Haas, Heikki Linnakangas, and Kevin Grittner
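
One way to observe the resize, sketched with hypothetical tables and illustrative (not exact) output; EXPLAIN ANALYZE reports the hash node's bucket count, and with this change the run-time value can differ from the initial estimate:

```sql
EXPLAIN (ANALYZE, COSTS OFF)
SELECT *
FROM fact f
JOIN dim d USING (id);
-- Hash Join
--   ...
--   ->  Hash
--         Buckets: 131072 (originally 65536)  Batches: 1  Memory Usage: ...
```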

df630b0dd5 (11 years ago)
Implement SKIP LOCKED for row-level locks

This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the new reader skip over such rows. While this is not appropriate behavior for general purposes, there are some cases in which it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an implementation by Simon Riggs, who also provided input on the syntax used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
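
The canonical queue-worker sketch this enables (table and column names hypothetical):

```sql
BEGIN;

-- Claim one job nobody else has locked; concurrent workers skip this
-- row instead of blocking behind it.
SELECT id, payload
FROM jobs
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ...process the job, mark it done, then...
COMMIT;
```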

491c029dbc (11 years ago)
Row-Level Security Policies (RLS)

Building on the updatable security-barrier views work, add the ability to define policies on tables to limit the set of rows which are returned from a query and which are allowed to be added to a table. Expressions defined by the policy for filtering are added to the security barrier quals of the query, while expressions defined to check records being added to a table are added to the with-check options of the query.

New top-level commands are CREATE/ALTER/DROP POLICY and are controlled by the table owner. Row Security is able to be enabled and disabled by the owner on a per-table basis using ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.

Per discussion, ROW SECURITY is disabled on tables by default and must be enabled for policies on the table to be used. If no policies exist on a table with ROW SECURITY enabled, a default-deny policy is used and no records will be visible.

By default, row security is applied at all times except for the table owner and the superuser. A new GUC, row_security, is added which can be set to ON, OFF, or FORCE. When set to FORCE, row security will be applied even for the table owner and superusers. When set to OFF, row security will be disabled when allowed and an error will be thrown if the user does not have rights to bypass row security.

Per discussion, pg_dump sets row_security = OFF by default to ensure that exports and backups will have all data in the table or will error if there are insufficient privileges to bypass row security. A new option has been added to pg_dump, --enable-row-security, to ask pg_dump to export with row security enabled.

A new role capability, BYPASSRLS, which can only be set by the superuser, is added to allow other users to be able to bypass row security using row_security = OFF.

Many thanks to the various individuals who have helped with the design, particularly Robert Haas for his feedback.

Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean Rasheed, with additional changes and rework by me. Reviewers have included all of the above, Greg Smith, Jeff McCormick, and Robert Haas.
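
A minimal end-to-end sketch of the feature as described above (names hypothetical):

```sql
CREATE TABLE accounts (id int PRIMARY KEY, owner text NOT NULL, balance numeric);

-- Enabling row security with no policies yet means default-deny.
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- Non-owner users now see (and can modify) only their own rows.
CREATE POLICY accounts_owner ON accounts
  USING (owner = current_user);
```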

2df465e696 (11 years ago)
Fix pointer type in size passed to memset.

Pointers are all the same size, so it makes no practical difference, but let's be tidy.

Found by Coverity, noted off-list by Tom Lane.

8cce08f168 (11 years ago)
Change NTUP_PER_BUCKET to 1 to improve hash join lookup speed.

Since this makes the bucket headers use ~10x as much memory, properly account for that memory when we figure out whether everything fits in work_mem. This might result in some cases that previously used only a single batch getting split into multiple batches, but it's unclear as yet whether we need defenses against that case, and if so, what the shape of those defenses should be.

It's worth noting that even in these edge cases, users should still be no worse off than they would have been last week, because commit