mirror of https://github.com/postgres/postgres
Tag:
Branch:
Tree:
8a300fc3af
REL2_0B
REL6_4
REL6_5_PATCHES
REL7_0_PATCHES
REL7_1_STABLE
REL7_2_STABLE
REL7_3_STABLE
REL7_4_STABLE
REL8_0_STABLE
REL8_1_STABLE
REL8_2_STABLE
REL8_3_STABLE
REL8_4_STABLE
REL8_5_ALPHA1_BRANCH
REL8_5_ALPHA2_BRANCH
REL8_5_ALPHA3_BRANCH
REL9_0_ALPHA4_BRANCH
REL9_0_ALPHA5_BRANCH
REL9_0_STABLE
REL9_1_STABLE
REL9_2_STABLE
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
REL_12_STABLE
REL_13_STABLE
REL_14_STABLE
REL_15_STABLE
REL_16_STABLE
REL_17_STABLE
REL_18_STABLE
Release_1_0_3
WIN32_DEV
ecpg_big_bison
master
PG95-1_01
PG95-1_08
PG95-1_09
REL2_0
REL6_1
REL6_1_1
REL6_2
REL6_2_1
REL6_3
REL6_3_2
REL6_4_2
REL6_5
REL6_5_1
REL6_5_2
REL6_5_3
REL7_0
REL7_0_2
REL7_0_3
REL7_1
REL7_1_1
REL7_1_2
REL7_1_3
REL7_1_BETA
REL7_1_BETA2
REL7_1_BETA3
REL7_2
REL7_2_1
REL7_2_2
REL7_2_3
REL7_2_4
REL7_2_5
REL7_2_6
REL7_2_7
REL7_2_8
REL7_2_BETA1
REL7_2_BETA2
REL7_2_BETA3
REL7_2_BETA4
REL7_2_BETA5
REL7_2_RC1
REL7_2_RC2
REL7_3
REL7_3_1
REL7_3_10
REL7_3_11
REL7_3_12
REL7_3_13
REL7_3_14
REL7_3_15
REL7_3_16
REL7_3_17
REL7_3_18
REL7_3_19
REL7_3_2
REL7_3_20
REL7_3_21
REL7_3_3
REL7_3_4
REL7_3_5
REL7_3_6
REL7_3_7
REL7_3_8
REL7_3_9
REL7_4
REL7_4_1
REL7_4_10
REL7_4_11
REL7_4_12
REL7_4_13
REL7_4_14
REL7_4_15
REL7_4_16
REL7_4_17
REL7_4_18
REL7_4_19
REL7_4_2
REL7_4_20
REL7_4_21
REL7_4_22
REL7_4_23
REL7_4_24
REL7_4_25
REL7_4_26
REL7_4_27
REL7_4_28
REL7_4_29
REL7_4_3
REL7_4_30
REL7_4_4
REL7_4_5
REL7_4_6
REL7_4_7
REL7_4_8
REL7_4_9
REL7_4_BETA1
REL7_4_BETA2
REL7_4_BETA3
REL7_4_BETA4
REL7_4_BETA5
REL7_4_RC1
REL7_4_RC2
REL8_0_0
REL8_0_0BETA1
REL8_0_0BETA2
REL8_0_0BETA3
REL8_0_0BETA4
REL8_0_0BETA5
REL8_0_0RC1
REL8_0_0RC2
REL8_0_0RC3
REL8_0_0RC4
REL8_0_0RC5
REL8_0_1
REL8_0_10
REL8_0_11
REL8_0_12
REL8_0_13
REL8_0_14
REL8_0_15
REL8_0_16
REL8_0_17
REL8_0_18
REL8_0_19
REL8_0_2
REL8_0_20
REL8_0_21
REL8_0_22
REL8_0_23
REL8_0_24
REL8_0_25
REL8_0_26
REL8_0_3
REL8_0_4
REL8_0_5
REL8_0_6
REL8_0_7
REL8_0_8
REL8_0_9
REL8_1_0
REL8_1_0BETA1
REL8_1_0BETA2
REL8_1_0BETA3
REL8_1_0BETA4
REL8_1_0RC1
REL8_1_1
REL8_1_10
REL8_1_11
REL8_1_12
REL8_1_13
REL8_1_14
REL8_1_15
REL8_1_16
REL8_1_17
REL8_1_18
REL8_1_19
REL8_1_2
REL8_1_20
REL8_1_21
REL8_1_22
REL8_1_23
REL8_1_3
REL8_1_4
REL8_1_5
REL8_1_6
REL8_1_7
REL8_1_8
REL8_1_9
REL8_2_0
REL8_2_1
REL8_2_10
REL8_2_11
REL8_2_12
REL8_2_13
REL8_2_14
REL8_2_15
REL8_2_16
REL8_2_17
REL8_2_18
REL8_2_19
REL8_2_2
REL8_2_20
REL8_2_21
REL8_2_22
REL8_2_23
REL8_2_3
REL8_2_4
REL8_2_5
REL8_2_6
REL8_2_7
REL8_2_8
REL8_2_9
REL8_2_BETA1
REL8_2_BETA2
REL8_2_BETA3
REL8_2_RC1
REL8_3_0
REL8_3_1
REL8_3_10
REL8_3_11
REL8_3_12
REL8_3_13
REL8_3_14
REL8_3_15
REL8_3_16
REL8_3_17
REL8_3_18
REL8_3_19
REL8_3_2
REL8_3_20
REL8_3_21
REL8_3_22
REL8_3_23
REL8_3_3
REL8_3_4
REL8_3_5
REL8_3_6
REL8_3_7
REL8_3_8
REL8_3_9
REL8_3_BETA1
REL8_3_BETA2
REL8_3_BETA3
REL8_3_BETA4
REL8_3_RC1
REL8_3_RC2
REL8_4_0
REL8_4_1
REL8_4_10
REL8_4_11
REL8_4_12
REL8_4_13
REL8_4_14
REL8_4_15
REL8_4_16
REL8_4_17
REL8_4_18
REL8_4_19
REL8_4_2
REL8_4_20
REL8_4_21
REL8_4_22
REL8_4_3
REL8_4_4
REL8_4_5
REL8_4_6
REL8_4_7
REL8_4_8
REL8_4_9
REL8_4_BETA1
REL8_4_BETA2
REL8_4_RC1
REL8_4_RC2
REL8_5_ALPHA1
REL8_5_ALPHA2
REL8_5_ALPHA3
REL9_0_0
REL9_0_1
REL9_0_10
REL9_0_11
REL9_0_12
REL9_0_13
REL9_0_14
REL9_0_15
REL9_0_16
REL9_0_17
REL9_0_18
REL9_0_19
REL9_0_2
REL9_0_20
REL9_0_21
REL9_0_22
REL9_0_23
REL9_0_3
REL9_0_4
REL9_0_5
REL9_0_6
REL9_0_7
REL9_0_8
REL9_0_9
REL9_0_ALPHA4
REL9_0_ALPHA5
REL9_0_BETA1
REL9_0_BETA2
REL9_0_BETA3
REL9_0_BETA4
REL9_0_RC1
REL9_1_0
REL9_1_1
REL9_1_10
REL9_1_11
REL9_1_12
REL9_1_13
REL9_1_14
REL9_1_15
REL9_1_16
REL9_1_17
REL9_1_18
REL9_1_19
REL9_1_2
REL9_1_20
REL9_1_21
REL9_1_22
REL9_1_23
REL9_1_24
REL9_1_3
REL9_1_4
REL9_1_5
REL9_1_6
REL9_1_7
REL9_1_8
REL9_1_9
REL9_1_ALPHA1
REL9_1_ALPHA2
REL9_1_ALPHA3
REL9_1_ALPHA4
REL9_1_ALPHA5
REL9_1_BETA1
REL9_1_BETA2
REL9_1_BETA3
REL9_1_RC1
REL9_2_0
REL9_2_1
REL9_2_10
REL9_2_11
REL9_2_12
REL9_2_13
REL9_2_14
REL9_2_15
REL9_2_16
REL9_2_17
REL9_2_18
REL9_2_19
REL9_2_2
REL9_2_20
REL9_2_21
REL9_2_22
REL9_2_23
REL9_2_24
REL9_2_3
REL9_2_4
REL9_2_5
REL9_2_6
REL9_2_7
REL9_2_8
REL9_2_9
REL9_2_BETA1
REL9_2_BETA2
REL9_2_BETA3
REL9_2_BETA4
REL9_2_RC1
REL9_3_0
REL9_3_1
REL9_3_10
REL9_3_11
REL9_3_12
REL9_3_13
REL9_3_14
REL9_3_15
REL9_3_16
REL9_3_17
REL9_3_18
REL9_3_19
REL9_3_2
REL9_3_20
REL9_3_21
REL9_3_22
REL9_3_23
REL9_3_24
REL9_3_25
REL9_3_3
REL9_3_4
REL9_3_5
REL9_3_6
REL9_3_7
REL9_3_8
REL9_3_9
REL9_3_BETA1
REL9_3_BETA2
REL9_3_RC1
REL9_4_0
REL9_4_1
REL9_4_10
REL9_4_11
REL9_4_12
REL9_4_13
REL9_4_14
REL9_4_15
REL9_4_16
REL9_4_17
REL9_4_18
REL9_4_19
REL9_4_2
REL9_4_20
REL9_4_21
REL9_4_22
REL9_4_23
REL9_4_24
REL9_4_25
REL9_4_26
REL9_4_3
REL9_4_4
REL9_4_5
REL9_4_6
REL9_4_7
REL9_4_8
REL9_4_9
REL9_4_BETA1
REL9_4_BETA2
REL9_4_BETA3
REL9_4_RC1
REL9_5_0
REL9_5_1
REL9_5_10
REL9_5_11
REL9_5_12
REL9_5_13
REL9_5_14
REL9_5_15
REL9_5_16
REL9_5_17
REL9_5_18
REL9_5_19
REL9_5_2
REL9_5_20
REL9_5_21
REL9_5_22
REL9_5_23
REL9_5_24
REL9_5_25
REL9_5_3
REL9_5_4
REL9_5_5
REL9_5_6
REL9_5_7
REL9_5_8
REL9_5_9
REL9_5_ALPHA1
REL9_5_ALPHA2
REL9_5_BETA1
REL9_5_BETA2
REL9_5_RC1
REL9_6_0
REL9_6_1
REL9_6_10
REL9_6_11
REL9_6_12
REL9_6_13
REL9_6_14
REL9_6_15
REL9_6_16
REL9_6_17
REL9_6_18
REL9_6_19
REL9_6_2
REL9_6_20
REL9_6_21
REL9_6_22
REL9_6_23
REL9_6_24
REL9_6_3
REL9_6_4
REL9_6_5
REL9_6_6
REL9_6_7
REL9_6_8
REL9_6_9
REL9_6_BETA1
REL9_6_BETA2
REL9_6_BETA3
REL9_6_BETA4
REL9_6_RC1
REL_10_0
REL_10_1
REL_10_10
REL_10_11
REL_10_12
REL_10_13
REL_10_14
REL_10_15
REL_10_16
REL_10_17
REL_10_18
REL_10_19
REL_10_2
REL_10_20
REL_10_21
REL_10_22
REL_10_23
REL_10_3
REL_10_4
REL_10_5
REL_10_6
REL_10_7
REL_10_8
REL_10_9
REL_10_BETA1
REL_10_BETA2
REL_10_BETA3
REL_10_BETA4
REL_10_RC1
REL_11_0
REL_11_1
REL_11_10
REL_11_11
REL_11_12
REL_11_13
REL_11_14
REL_11_15
REL_11_16
REL_11_17
REL_11_18
REL_11_19
REL_11_2
REL_11_20
REL_11_21
REL_11_22
REL_11_3
REL_11_4
REL_11_5
REL_11_6
REL_11_7
REL_11_8
REL_11_9
REL_11_BETA1
REL_11_BETA2
REL_11_BETA3
REL_11_BETA4
REL_11_RC1
REL_12_0
REL_12_1
REL_12_10
REL_12_11
REL_12_12
REL_12_13
REL_12_14
REL_12_15
REL_12_16
REL_12_17
REL_12_18
REL_12_19
REL_12_2
REL_12_20
REL_12_21
REL_12_22
REL_12_3
REL_12_4
REL_12_5
REL_12_6
REL_12_7
REL_12_8
REL_12_9
REL_12_BETA1
REL_12_BETA2
REL_12_BETA3
REL_12_BETA4
REL_12_RC1
REL_13_0
REL_13_1
REL_13_10
REL_13_11
REL_13_12
REL_13_13
REL_13_14
REL_13_15
REL_13_16
REL_13_17
REL_13_18
REL_13_19
REL_13_2
REL_13_20
REL_13_21
REL_13_22
REL_13_23
REL_13_3
REL_13_4
REL_13_5
REL_13_6
REL_13_7
REL_13_8
REL_13_9
REL_13_BETA1
REL_13_BETA2
REL_13_BETA3
REL_13_RC1
REL_14_0
REL_14_1
REL_14_10
REL_14_11
REL_14_12
REL_14_13
REL_14_14
REL_14_15
REL_14_16
REL_14_17
REL_14_18
REL_14_19
REL_14_2
REL_14_20
REL_14_3
REL_14_4
REL_14_5
REL_14_6
REL_14_7
REL_14_8
REL_14_9
REL_14_BETA1
REL_14_BETA2
REL_14_BETA3
REL_14_RC1
REL_15_0
REL_15_1
REL_15_10
REL_15_11
REL_15_12
REL_15_13
REL_15_14
REL_15_15
REL_15_2
REL_15_3
REL_15_4
REL_15_5
REL_15_6
REL_15_7
REL_15_8
REL_15_9
REL_15_BETA1
REL_15_BETA2
REL_15_BETA3
REL_15_BETA4
REL_15_RC1
REL_15_RC2
REL_16_0
REL_16_1
REL_16_10
REL_16_11
REL_16_2
REL_16_3
REL_16_4
REL_16_5
REL_16_6
REL_16_7
REL_16_8
REL_16_9
REL_16_BETA1
REL_16_BETA2
REL_16_BETA3
REL_16_RC1
REL_17_0
REL_17_1
REL_17_2
REL_17_3
REL_17_4
REL_17_5
REL_17_6
REL_17_7
REL_17_BETA1
REL_17_BETA2
REL_17_BETA3
REL_17_RC1
REL_18_0
REL_18_1
REL_18_BETA1
REL_18_BETA2
REL_18_BETA3
REL_18_RC1
Release_1_0_2
Release_2_0
Release_2_0_0
release-6-3
${ noResults }
179 Commits (8a300fc3afce4a0fe8a4c55bc22b2aeec092cf23)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
3af87736bf |
Fix another cause of "wrong varnullingrels" planner failures.
I removed the delay_upper_joins mechanism in commit
|
3 years ago |
|
|
efeb12ef0b |
Don't include outer join relids in lateral_relids bitmapsets.
This avoids an assertion failure when outer joins are rearranged per identity 3. Listing only the baserels from a PlaceHolderVar's ph_lateral set should be enough to ensure that the required values are available when we need to compute the PHV --- it's what we did before inventing nullingrel sets, after all. It's a bit unsatisfying; but with beta2 hard upon us, there's not time to look for an aesthetically cleaner fix. Richard Guo and Tom Lane Discussion: https://postgr.es/m/CAMbWs48Jcw-NvnxT23WiHP324wG44DvzcH1j4hc0Zn+3sR9cfg@mail.gmail.com |
3 years ago |
|
|
0655c03ef9 |
Centralize fixups for mismatched nullingrels in nestloop params.
It turns out that the fixes we applied in commits |
3 years ago |
|
|
7fcd7ef2a9 |
Don't use partial unique indexes for unique proofs in the planner
Here we adjust relation_has_unique_index_for() so that it no longer makes use of partial unique indexes as uniqueness proofs. It is incorrect to use these as the predicates used by check_index_predicates() to set predOK makes use of not only baserestrictinfo quals as proofs, but also qual from join conditions. For relation_has_unique_index_for()'s case, we need to know the relation is unique for a given set of columns before any joins are evaluated, so if predOK was only set to true due to some join qual, then it's unsafe to use such indexes in relation_has_unique_index_for(). The final plan may not even make use of that index, which could result in reading tuples that are not as unique as the planner previously expected them to be. Bug: #17975 Reported-by: Tor Erik Linnerud Backpatch-through: 11, all supported versions Discussion: https://postgr.es/m/17975-98a90c156f25c952%40postgresql.org |
3 years ago |
|
|
f4c00d138f |
When removing a left join, clean out references in EquivalenceClasses.
Since commit
|
3 years ago |
|
|
63e4f13d2a |
Fix "wrong varnullingrels" for Memoize's lateral references, too.
The issue fixed in commit
|
3 years ago |
|
|
bfd332b3fd |
Fix "wrong varnullingrels" for subquery nestloop parameters.
If we apply outer join identity 3 when relation C is a subquery having lateral references to relation B, then the lateral references within C continue to bear the original syntactically-correct varnullingrels marks, but that won't match what is available from the outer side of the nestloop. Compensate for that in process_subquery_nestloop_params(). This is a slightly hacky fix, but we certainly don't want to re-plan C in toto for each possible outer join order, so there's not a lot of better alternatives. Richard Guo and Tom Lane, per report from Markus Winand Discussion: https://postgr.es/m/DFBB2D25-DE97-49CA-A60E-07C881EA59A7@winand.at |
3 years ago |
|
|
9a2dbc614e |
Fix oversight in outer join removal.
A placeholder that references the outer join's relid in ph_eval_at is logically "above" the join, and therefore we can't remove its PlaceHolderInfo: it might still be used somewhere in the query. This was not an issue pre-v16 because we failed to remove the join at all in such cases. The new outer-join-aware-Var infrastructure permits deducing that it's okay to remove the join, but then we have to clean up correctly afterwards. Report and fix by Richard Guo Discussion: https://postgr.es/m/CAMbWs4_tuVn9EwwMcggGiZJWWstdXX_ci8FeEU17vs+4nLgw3w@mail.gmail.com |
3 years ago |
|
|
7a844c77ec |
Fix joinclause removal logic to cope with cloned clauses.
When we're deleting a no-op LEFT JOIN from the query, we must remove
the join's joinclauses from surviving relations' joininfo lists.
The invention of "cloned" clauses in
|
3 years ago |
|
|
991a3df227 |
Fix filtering of "cloned" outer-join quals some more.
We've had multiple issues with the clause_is_computable_at logic that I introduced in 2489d76c4: it's been known to accept more than one clone of the same qual at the same plan node, and also to accept no clones at all. It's looking impractical to get it 100% right on the basis of the currently-stored information, so fix it by introducing a new RestrictInfo field "incompatible_relids" that explicitly shows which outer joins a given clone mustn't be pushed above. In principle we could populate this field in every RestrictInfo, but that would cost space and there doesn't presently seem to be a need for it in general. Also, while deconstruct_distribute_oj_quals can easily fill the field with the remaining members of the commutative join set that it's considering, computing it in the general case seems again pretty complicated. So for now, just fill it for clone quals. Along the way, fix a bug that may or may not be only latent: equivclass.c was generating replacement clauses with is_pushed_down and has_clone/is_clone markings that didn't match their required_relids. This led me to conclude that leaving the clone flags out of make_restrictinfo's purview wasn't such a great idea after all, so add them. Per report from Richard Guo. Discussion: https://postgr.es/m/CAMbWs48EYi_9-pSd0ORes1kTmTeAjT4Q3gu49hJtYCbSn2JyeA@mail.gmail.com |
3 years ago |
|
|
b9c755a2f6 |
In clause_is_computable_at(), test required_relids for clone clauses.
Use the clause's required_relids not clause_relids for testing whether it is computable at the current join level, if it is a clone clause generated by deconstruct_distribute_oj_quals(). Arguably, this is more correct and we should do it for all clauses; that would at least remove the handwavy claim that we are doing it to save cycles compared to inspecting Vars individually. However, attempting to do that exposes that we are not being careful to compute an accurate value for required_relids in all cases. I'm unsure whether it's a good idea to attempt to do that for v16, or leave it as future clean-up. In the meantime, this quick hack demonstrably fixes some cases, so let's squeeze it in for beta1. Patch by me, but great thanks to Richard Guo for investigation and testing. The new test cases are all modeled on his examples. Discussion: https://postgr.es/m/CAMbWs4-_vwkBij4XOQ5ukxUvLgwTm0kS5_DO9CicUeKbEfKjUw@mail.gmail.com |
3 years ago |
|
|
d0f952691f |
Fix thinko in join removal.
In commit
|
3 years ago |
|
|
9df8f903eb |
Fix some issues with improper placement of outer join clauses.
After applying outer-join identity 3 in the forward direction,
it was possible for the planner to mistakenly apply a qual clause
from above the two outer joins at the now-lower join level.
This can give the wrong answer, since a value that would get nulled
by the now-upper join might not yet be null.
To fix, when we perform such a transformation, consider that the
now-lower join hasn't really completed the outer join it's nominally
responsible for and thus its relid set should not include that OJ's
relid (nor should its output Vars have that nullingrel bit set).
Instead we add those bits when the now-upper join is performed.
The existing rules for qual placement then suffice to prevent
higher qual clauses from dropping below the now-upper join.
There are a few complications from needing to consider transitive
closures in case multiple pushdowns have happened, but all in all
it's not a very complex patch.
This is all new logic (from
|
3 years ago |
|
|
c8b881d21f |
Undo faulty attempt at not relying on RINFO_IS_PUSHED_DOWN.
I've had a bee in my bonnet for some time about getting rid of RestrictInfo.is_pushed_down, because it's squishily defined and requires not-inexpensive extra tests to use (cf RINFO_IS_PUSHED_DOWN). In commit |
3 years ago |
|
|
767c598954 |
Work around implementation restriction in adjust_appendrel_attrs.
adjust_appendrel_attrs can't transfer nullingrel labeling to a non-Var
translation expression (mainly because it's too late to wrap such an
expression in a PlaceHolderVar). I'd supposed in commit
|
3 years ago |
|
|
739f1d6218 |
Fix mis-handling of outer join quals generated by EquivalenceClasses.
It's possible, in admittedly-rather-contrived cases, for an eclass to generate a derived "join" qual that constrains the post-outer-join value(s) of some RHS variable(s) without mentioning the LHS at all. While the mechanisms were set up to work for this, we fell foul of the "get_common_eclass_indexes" filter installed by commit 3373c7155: it could decide that such an eclass wasn't relevant to the join, so that the required qual clause wouldn't get emitted there or anywhere else. To fix, apply get_common_eclass_indexes only at inner joins, where its rule is still valid. At an outer join, fall back to examining all eclasses that mention either input (or the OJ relid, though it should be impossible for an eclass to mention that without mentioning either input). Perhaps we can improve on that later, but the cost/benefit of adding more complexity to skip some irrelevant eclasses is dubious. To allow cheaply distinguishing outer from inner joins, pass the ojrelid to generate_join_implied_equalities as a separate argument. This also allows cleaning up some sloppiness that had crept into the definition of its join_relids argument, and it allows accurate calculation of nominal_join_relids for a child outer join. (The latter oversight seems not to have been a live bug, but it certainly could have caused problems in future.) Also fix what might be a live bug in check_index_predicates: it was being sloppy about what it passed to generate_join_implied_equalities. Per report from Richard Guo. Discussion: https://postgr.es/m/CAMbWs4-DsTBfOvXuw64GdFss2=M5cwtEhY=0DCS7t2gT7P6hSA@mail.gmail.com |
3 years ago |
|
|
a75ff55c83 |
Fix some issues with wrong placement of pseudo-constant quals.
initsplan.c figured that it could push Var-free qual clauses to the top of the current JoinDomain, which is okay in the abstract. But if the current domain is inside some outer join, and we later commute an inside-the-domain outer join with one outside it, we end up placing the pushed-up qual clause incorrectly. In distribute_qual_to_rels, avoid this by using the syntactic scope of the qual clause; with the exception that if we're in the top-level join domain we can still use the full query relid set, ensuring the resulting gating Result node goes to the top of the plan. (This is approximately as smart as the pre-v16 code was. Perhaps we can do better later, but it's not clear that such cases are worth a lot of sweat.) In process_implied_equality, we don't have a clear notion of syntactic scope, but we do have the results of SpecialJoinInfo construction. Thumb through those and remove any lower outer joins that might get commuted to above the join domain. Again, we can make an exception for the top-level join domain. It'd be possible to work harder here (for example, by keeping outer joins that aren't shown as potentially commutable), but I'm going to stop here for the moment. This issue has convinced me that the current representation of join domains probably needs further refinement, so I'm disinclined to write inessential dependent logic just yet. In passing, tighten the qualscope passed to process_implied_equality by generate_base_implied_equalities_no_const; there's no need for it to be larger than the rel we are currently considering. Tom Lane and Richard Guo, per report from Tender Wang. Discussion: https://postgr.es/m/CAHewXNk9eJ35ru5xATWioTV4+xZPHptjy9etdcNPjUfY9RQ+uQ@mail.gmail.com |
3 years ago |
|
|
c7468c73f7 |
Fix buggy recursion in flatten_rtes_walker().
Must save-and-restore the context we are modifying.
Oversight in commit
|
3 years ago |
|
|
f50f029c49 |
Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.
Late in the development of commit
|
3 years ago |
|
|
44e56baa80 |
Fix join removal logic to clean up sub-RestrictInfos of OR clauses.
analyzejoins.c took care to clean out removed relids from the clause_relids and required_relids of RestrictInfos associated with the doomed rel ... but it paid no attention to the fact that if such a RestrictInfo contains an OR clause, there will be sub-RestrictInfos containing similar fields. I'm more than a bit surprised that this oversight hasn't caused visible problems before. In any case, it's certainly broken now, so add logic to clean out the sub-RestrictInfos recursively. We might need to back-patch this someday. Per bug #17786 from Robins Tharakan. Discussion: https://postgr.es/m/17786-f1ea7fbdab97daec@postgresql.org |
3 years ago |
|
|
acc5821e4d |
Further fixes in qual nullingrel adjustment for outer join commutation.
One of the add_nulling_relids calls in deconstruct_distribute_oj_quals added an OJ relid to too few Vars, while the other added it to too many. We should consider the syntactic structure not min_left/righthand while deciding which Vars to decorate, and when considering pushing up a lower outer join pursuant to transforming the second form of OJ identity 3 to the first form, we only want to decorate Vars coming from its LHS. In a related bug, I realized that make_outerjoininfo was failing to check a very basic property that's needed to apply OJ identity 3: the syntactically-upper outer join clause can't refer to the lower join's LHS. This didn't break the join order restriction logic, but it led to setting bogus commute_xxx bits, possibly resulting in bogus nullingrel markings in modified quals. Richard Guo and Tom Lane Discussion: https://postgr.es/m/CAMbWs497CmBruMx1SOjepWEz+T5NWa4scqbdE9v7ZzSXqH_gQw@mail.gmail.com Discussion: https://postgr.es/m/CAEP4nAx9C5gXNBfEA0JBfz7B+5f1Bawt-RWQWyhev-wdps8BZA@mail.gmail.com |
3 years ago |
|
|
798c017634 |
remove_rel_from_query() must clean up PlaceHolderVar.phrels fields.
While we got away with this sloppiness before, it's not okay now
that
|
3 years ago |
|
|
fee7b77b90 |
Rethink nullingrel marking rules in build_joinrel_tlist().
The logic for when to add the current outer join's own relid to the nullingrels sets of output Vars and PHVs was overly complicated and underly correct. Not sure why I didn't think of this before, but since what we want is marking per the syntactic structure, we can just consult our records about the syntactic structure, ie syn_righthand/syn_lefthand. Also, tighten the rule about when to add the commute_above_r bits, in hopes of eliminating some squishy reasoning. I do not know of a reason to think that that's broken as-is, but this way seems better. Per bug #17781 from Robins Tharakan. Discussion: https://postgr.es/m/17781-c0405c8b3cd5e072@postgresql.org |
3 years ago |
|
|
2cbbffff05 |
Remove leftover code in deconstruct_distribute_oj_quals().
The initial "put back OJ relids" adjustment of ojscope was incorrect and unnecessary; it seems to be a leftover from when I (tgl) was trying to get this function to work at all. Richard Guo Discussion: https://postgr.es/m/CAMbWs4-L2C47ZGZPabBAi5oDZsKmsbvhYcGCy5o=gCjsaG_ZQA@mail.gmail.com |
3 years ago |
|
|
cad5692051 |
Fix up join removal's interaction with PlaceHolderVars.
The portion of join_is_removable() that checks PlaceHolderVars
can be made a little more accurate and intelligible than it was.
The key point is that we can allow join removal even if a PHV
mentions the target rel in ph_eval_at, if that mention was only
added as a consequence of forcing the PHV up to a join level
that's at/above the outer join we're trying to get rid of.
We can check that by testing for the OJ's relid appearing in
ph_eval_at, indicating that it's supposed to be evaluated after
the outer join, plus the existing test that the contained
expression doesn't actually mention the target rel.
While here, add an explicit check that there'll be something left
in ph_eval_at after we remove the target rel and OJ relid. There
is an Assert later on about that, and I'm not too sure that the
case could happen for a PHV satisfying the other constraints,
but let's just check. (There was previously a bms_is_subset test
that meant to cover this risk, but it's broken now because it
doesn't account for the fact that we'll also remove the OJ relid.)
The real reason for revisiting this code though is that the
Assert I left behind in
|
3 years ago |
|
|
b2d0e13a0a |
Fix over-optimistic updating of info about commutable outer joins.
make_outerjoininfo was set up to update SpecialJoinInfo's commute_below, commute_above_l, commute_above_r fields as soon as it found a pair of outer joins that look like they can commute. However, this decision could be negated later in the same loop due to finding an intermediate outer join that prevents commutation. That left us with commute_xxx fields that were contradictory to the join order restrictions expressed in min_lefthand/min_righthand. The latter fields would keep us from actually choosing a bad join order; but the inconsistent commute_xxx fields could bollix details such as the varnullingrels values created for intermediate join relation targetlists, ending in an assertion failure in setrefs.c. To fix, wait till the end of make_outerjoininfo where we have accurate values for min_lefthand/min_righthand, and then insert only relids not present in those sets into the commute_xxx fields. Per SQLSmith testing by Robins Tharakan. Note that while Robins bisected the failure to commit |
3 years ago |
|
|
9f452feeeb |
Fix thinko in qual distribution.
deconstruct_distribute tweaks the outer join scope (ojscope) it passes to distribute_qual_to_rels when considering an outer join qual that's above potentially-commutable outer joins. However, if the current join is *not* potentially commutable, we shouldn't do that. The argument that distribute_qual_to_rels will not do something wrong with the bogus ojscope falls flat if we don't pass it non-null postponed_oj_qual_list. Moreover, there's no need to play games in this case since we aren't going to commute anything. Per SQLSmith testing by Robins Tharakan. Discussion: https://postgr.es/m/CAEP4nAw74k4b-=93gmfCNX3MOY3y4uPxqbk_MnCVEpdsqHJVsg@mail.gmail.com |
3 years ago |
|
|
8538519db1 |
Fix thinko in outer-join removal.
If we have a RestrictInfo that mentions both the removal-candidate relation and the outer join's relid, then that is a pushed-down condition not a join condition, so it should be grounds for deciding that we can't remove the outer join. In commit |
3 years ago |
|
|
5840c20272 |
Rethink treatment of "postponed" quals in deconstruct_jointree().
After pulling up LATERAL subqueries, we may have qual clauses that
refer to relations outside their syntactic scope. Before doing any
such pullup, prepjointree.c checks to make sure that it wouldn't
create a semantically-invalid situation; but we leave it to
deconstruct_jointree() to actually move these quals up the join
tree to a place where they can be evaluated. In commit
|
3 years ago |
|
|
eae0e20def |
Remove over-optimistic Assert.
In commit
|
3 years ago |
|
|
b448f1c8d8 |
Do assorted mop-up in the planner.
Remove RestrictInfo.nullable_relids, along with a good deal of infrastructure that calculated it. One use-case for it was in join_clause_is_movable_to, but we can now replace that usage with a check to see if the clause's relids include any outer join that can null the target relation. The other use-case was in join_clause_is_movable_into, but that test can just be dropped entirely now that the clause's relids include outer joins. Furthermore, join_clause_is_movable_into should now be accurate enough that it will accept anything returned by generate_join_implied_equalities, so we can restore the Assert that was diked out in commit |
3 years ago |
|
|
2489d76c49 |
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They *must* follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us |
3 years ago |
|
|
3c569049b7 |
Allow left join removals and unique joins on partitioned tables
This allows left join removals and unique joins to work with partitioned tables. The planner just lacked sufficient proofs that a given join would not cause any row duplication. Unique indexes currently serve as that proof, so have get_relation_info() populate the indexlist for partitioned tables too. Author: Arne Roland Reviewed-by: Alvaro Herrera, Zhihong Yu, Amit Langote, David Rowley Discussion: https://postgr.es/m/c3b2408b7a39433b8230bbcd02e9f302@index.de |
3 years ago |
|
|
7122f9d543 |
Fix bit-rotted planner test case.
While fooling with my pet outer-join-variables patch, I discovered
that the test case I added in commit
|
3 years ago |
|
|
51dfaa0b01 |
Remove bogus Assert and dead code in remove_useless_results_recurse().
The JOIN_SEMI case Assert'ed that there are no PlaceHolderVars that need to be evaluated at the semijoin's RHS, which is wrong because there could be some in the semijoin's qual condition. However, there could not be any references further up than that, and within the qual there is not any way that such a PHV could have gone to null yet, so we don't really need the PHV and there is no need to avoid making the RHS-removal optimization. The upshot is that there's no actual bug in production code, and we ought to just remove this misguided Assert. While we're here, also drop the JOIN_RIGHT case, which is dead code because reduce_outer_joins() already got rid of JOIN_RIGHT. Per bug #17700 from Xin Wen. Uselessness of the JOIN_RIGHT case pointed out by Richard Guo. Back-patch to v12 where this code was added. Discussion: https://postgr.es/m/17700-2b5c10d917c30687@postgresql.org |
3 years ago |
|
|
56d0ed3b75 |
Give better hints for ambiguous or unreferenceable columns.
Examine ParseNamespaceItem flags to detect whether a column name is unreferenceable for lack of LATERAL, or could be referenced if a qualified name were used, and give better hints for such cases. Also, don't phrase the message to imply that there's only one matching column when there is really more than one. Many of the regression test output changes are not very interesting, but just reflect reclassifying the "There is a column ... but it cannot be referenced from this part of the query" messages as DETAIL rather than HINT. They are details per our style guide, in the sense of being factual rather than offering advice; and this change provides room to offer actual HINTs about what to do. While here, adjust the fuzzy-name-matching code to be a shade less impenetrable. It was overloading the meanings of FuzzyAttrMatchState fields way too much IMO, so splitting them into multiple fields seems to make it clearer. It's not like we need to shave bytes in that struct. Per discussion of bug #17233 from Alexander Korolev. Discussion: https://postgr.es/m/17233-afb9d806aaa64b17@postgresql.org |
3 years ago |
|
|
ff8fa0bf7e |
Handle SubPlan cases in find_nonnullable_rels/vars.
We can use some variants of SubPlan to deduce that Vars appearing in the testexpr must be non-null. Richard Guo Discussion: https://postgr.es/m/CAMbWs4-jV=199A2Y_6==99dYnpnmaO_Wz_RGkRTTaCB=Pihw2w@mail.gmail.com |
3 years ago |
|
|
0043aa6b85 |
Add basic regression tests for semi/antijoin recognition.
Add some simple tests that the planner recognizes all the standard idioms for SEMI and ANTI joins. Failure to optimize in this way won't necessarily cause any visible change in query results, so check the plans. We had no similar coverage before, at least for some variants of antijoin, as noted by Richard Guo. Discussion: https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com |
3 years ago |
|
|
b592422095 |
Relax overly strict rules in select_outer_pathkeys_for_merge()
The select_outer_pathkeys_for_merge function made an attempt to build the merge join pathkeys in the same order as query_pathkeys. This was done as it may have led to no sort being required for an ORDER BY or GROUP BY clause in the upper planner. However, this restriction seems overly strict as it required that we match the query_pathkeys entirely or we don't bother putting the merge join pathkeys in that order. Here we relax this rule so that we use a prefix of the query_pathkeys providing that prefix matches all of the join quals. This may provide the upper planner with partially sorted input which will allow the use of incremental sorts instead of full sorts. Author: David Rowley Reviewed-by: Richard Guo Discussion: https://postgr.es/m/CAApHDvrtZu0PHVfDPFM4Yx3jNR2Wuwosv+T2zqa7LrhhBr2rRg@mail.gmail.com |
4 years ago |
|
|
8d367a44d3 |
Fix alias matching in transformLockingClause().
When locking a specific named relation for a FOR [KEY] UPDATE/SHARE clause, transformLockingClause() finds the relation to lock by scanning the rangetable for an RTE with a matching eref->aliasname. However, it failed to account for the visibility rules of a join RTE. If a join RTE doesn't have a user-supplied alias, it will have a generated eref->aliasname of "unnamed_join" that is not visible as a relation name in the parse namespace. Such an RTE needs to be skipped, otherwise it might be found in preference to a regular base relation with a user-supplied alias of "unnamed_join", preventing it from being locked. In addition, if a join RTE doesn't have a user-supplied alias, but does have a join_using_alias, then the RTE needs to be matched using that alias rather than the generated eref->aliasname, otherwise a misleading "relation not found" error will be reported rather than a "join cannot be locked" error. Backpatch all the way, except for the second part which only goes back to 14, where JOIN USING aliases were added. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com |
4 years ago |
|
|
0fbf011200
|
Check column list length in XMLTABLE/JSON_TABLE alias
We weren't checking the length of the column list in the alias clause of an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was possible to make the server crash by passing an overly long one. Fix it by throwing an error in that case, like the other places that deal with alias lists. In passing, modify the equivalent test used for join RTEs to look like the other ones, which was different for no apparent reason. This bug came in when XMLTABLE was born in version 10; backpatch to all stable versions. Reported-by: Wang Ke <krking@zju.edu.cn> Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org |
4 years ago |
|
|
cc50080a82 |
Rearrange core regression tests to reduce cross-script dependencies.
The idea behind this patch is to make it possible to run individual test scripts without running the entire core test suite. Making all the scripts completely independent would involve a massive rewrite, and would probably be worse for coverage of things like concurrent DDL. So this patch just does what seems practical with limited changes. The net effect is that any test script can be run after running limited earlier dependencies: * all scripts depend on test_setup * many scripts depend on create_index * other dependencies are few in number, and are documented in the parallel_schedule file. To accomplish this, I chose a small number of commonly-used tables and moved their creation and filling into test_setup. Later scripts are expected not to modify these tables' data contents, for fear of affecting other scripts' results. Also, our former habit of declaring all C functions in one place is now gone in favor of declaring them where they're used, if that's just one script, or in test_setup if necessary. There's more that could be done to remove some of the remaining inter-script dependencies, but significantly more-invasive changes would be needed, and at least for now it doesn't seem worth it. Discussion: https://postgr.es/m/1114748.1640383217@sss.pgh.pa.us |
4 years ago |
|
|
a21049fd3f |
Fix pull_varnos to cope with translated PlaceHolderVars.
Commit
|
4 years ago |
|
|
83f4fcc655 |
Change the name of the Result Cache node to Memoize
"Result Cache" was never a great name for this node, but nobody managed to come up with another name that anyone liked enough. That was until David Johnston mentioned "Node Memoization", which Tom Lane revised to just "Memoize". People seem to like "Memoize", so let's do the rename. Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us Backpatch-through: 14, where Result Cache was introduced |
5 years ago |
|
|
d23ac62afa |
Avoid creating a RESULT RTE that's marked LATERAL.
Commit
|
5 years ago |
|
|
9eacee2e62 |
Add Result Cache executor node (take 2)
Here we add a new executor node type named "Result Cache". The planner can include this node type in the plan to have the executor cache the results from the inner side of parameterized nested loop joins. This allows caching of tuples for sets of parameters so that in the event that the node sees the same parameter values again, it can just return the cached tuples instead of rescanning the inner side of the join all over again. Internally, result cache uses a hash table in order to quickly find tuples that have been previously cached. For certain data sets, this can significantly improve the performance of joins. The best cases for using this new node type are for join problems where a large portion of the tuples from the inner side of the join have no join partner on the outer side of the join. In such cases, hash join would have to hash values that are never looked up, thus bloating the hash table and possibly causing it to multi-batch. Merge joins would have to skip over all of the unmatched rows. If we use a nested loop join with a result cache, then we only cache tuples that have at least one join partner on the outer side of the join. The benefits of using a parameterized nested loop with a result cache increase when there are fewer distinct values being looked up and the number of lookups of each value is large. Also, hash probes to lookup the cache can be much faster than the hash probe in a hash join as it's common that the result cache's hash table is much smaller than the hash join's due to result cache only caching useful tuples rather than all tuples from the inner side of the join. This variation in hash probe performance is more significant when the hash join's hash table no longer fits into the CPU's L3 cache, but the result cache's hash table does. The apparent "random" access of hash buckets with each hash probe can cause a poor L3 cache hit ratio for large hash tables. Smaller hash tables generally perform better. The hash table used for the cache limits itself to not exceeding work_mem * hash_mem_multiplier in size. We maintain a dlist of keys for this cache and when we're adding new tuples and realize we've exceeded the memory budget, we evict cache entries starting with the least recently used ones until we have enough memory to add the new tuples to the cache. For parameterized nested loop joins, we now consider using one of these result cache nodes in between the nested loop node and its inner node. We determine when this might be useful based on cost, which is primarily driven off of what the expected cache hit ratio will be. Estimating the cache hit ratio relies on having good distinct estimates on the nested loop's parameters. For now, the planner will only consider using a result cache for parameterized nested loop joins. This works for both normal joins and also for LATERAL type joins to subqueries. It is possible to use this new node for other uses in the future. For example, to cache results from correlated subqueries. However, that's not done here due to some difficulties obtaining a distinct estimation on the outer plan to calculate the estimated cache hit ratio. Currently we plan the inner plan before planning the outer plan so there is no good way to know if a result cache would be useful or not since we can't estimate the number of times the subplan will be called until the outer plan is generated. The functionality being added here is newly introducing a dependency on the return value of estimate_num_groups() during the join search. Previously, during the join search, we only ever needed to perform selectivity estimations. With this commit, we need to use estimate_num_groups() in order to estimate what the hit ratio on the result cache will be. In simple terms, if we expect 10 distinct values and we expect 1000 outer rows, then we'll estimate the hit ratio to be 99%. Since cache hits are very cheap compared to scanning the underlying nodes on the inner side of the nested loop join, then this will significantly reduce the planner's cost for the join. However, it's fairly easy to see here that things will go bad when estimate_num_groups() incorrectly returns a value that's significantly lower than the actual number of distinct values. If this happens then that may cause us to make use of a nested loop join with a result cache instead of some other join type, such as a merge or hash join. Our distinct estimations have been known to be a source of trouble in the past, so the extra reliance on them here could cause the planner to choose slower plans than it did previous to having this feature. Distinct estimations are also fairly hard to estimate accurately when several tables have been joined already or when a WHERE clause filters out a set of values that are correlated to the expressions we're estimating the number of distinct value for. For now, the costing we perform during query planning for result caches does put quite a bit of faith in the distinct estimations being accurate. When these are accurate then we should generally see faster execution times for plans containing a result cache. However, in the real world, we may find that we need to either change the costings to put less trust in the distinct estimations being accurate or perhaps even disable this feature by default. There's always an element of risk when we teach the query planner to do new tricks that it decides to use that new trick at the wrong time and causes a regression. Users may opt to get the old behavior by turning the feature off using the enable_resultcache GUC. Currently, this is enabled by default. It remains to be seen if we'll maintain that setting for the release. Additionally, the name "Result Cache" is the best name I could think of for this new node at the time I started writing the patch. Nobody seems to strongly dislike the name. A few people did suggest other names but no other name seemed to dominate in the brief discussion that there was about names. Let's allow the beta period to see if the current name pleases enough people. If there's some consensus on a better name, then we can change it before the release. Please see the 2nd discussion link below for the discussion on the "Result Cache" name. Author: David Rowley Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu, Hou Zhijie Tested-By: Konstantin Knizhnik Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com |
5 years ago |
|
|
28b3e3905c |
Revert b6002a796
This removes "Add Result Cache executor node". It seems that something weird is going on with the tracking of cache hits and misses as highlighted by many buildfarm animals. It's not yet clear what the problem is as other parts of the plan indicate that the cache did work correctly, it's just the hits and misses that were being reported as 0. This is especially a bad time to have the buildfarm so broken, so reverting before too many more animals go red. Discussion: https://postgr.es/m/CAApHDvq_hydhfovm4=izgWs+C5HqEeRScjMbOgbpC-jRAeK3Yw@mail.gmail.com |
5 years ago |
|
|
b6002a796d |
Add Result Cache executor node
Here we add a new executor node type named "Result Cache". The planner can include this node type in the plan to have the executor cache the results from the inner side of parameterized nested loop joins. This allows caching of tuples for sets of parameters so that in the event that the node sees the same parameter values again, it can just return the cached tuples instead of rescanning the inner side of the join all over again. Internally, result cache uses a hash table in order to quickly find tuples that have been previously cached. For certain data sets, this can significantly improve the performance of joins. The best cases for using this new node type are for join problems where a large portion of the tuples from the inner side of the join have no join partner on the outer side of the join. In such cases, hash join would have to hash values that are never looked up, thus bloating the hash table and possibly causing it to multi-batch. Merge joins would have to skip over all of the unmatched rows. If we use a nested loop join with a result cache, then we only cache tuples that have at least one join partner on the outer side of the join. The benefits of using a parameterized nested loop with a result cache increase when there are fewer distinct values being looked up and the number of lookups of each value is large. Also, hash probes to lookup the cache can be much faster than the hash probe in a hash join as it's common that the result cache's hash table is much smaller than the hash join's due to result cache only caching useful tuples rather than all tuples from the inner side of the join. This variation in hash probe performance is more significant when the hash join's hash table no longer fits into the CPU's L3 cache, but the result cache's hash table does. The apparent "random" access of hash buckets with each hash probe can cause a poor L3 cache hit ratio for large hash tables. Smaller hash tables generally perform better. The hash table used for the cache limits itself to not exceeding work_mem * hash_mem_multiplier in size. We maintain a dlist of keys for this cache and when we're adding new tuples and realize we've exceeded the memory budget, we evict cache entries starting with the least recently used ones until we have enough memory to add the new tuples to the cache. For parameterized nested loop joins, we now consider using one of these result cache nodes in between the nested loop node and its inner node. We determine when this might be useful based on cost, which is primarily driven off of what the expected cache hit ratio will be. Estimating the cache hit ratio relies on having good distinct estimates on the nested loop's parameters. For now, the planner will only consider using a result cache for parameterized nested loop joins. This works for both normal joins and also for LATERAL type joins to subqueries. It is possible to use this new node for other uses in the future. For example, to cache results from correlated subqueries. However, that's not done here due to some difficulties obtaining a distinct estimation on the outer plan to calculate the estimated cache hit ratio. Currently we plan the inner plan before planning the outer plan so there is no good way to know if a result cache would be useful or not since we can't estimate the number of times the subplan will be called until the outer plan is generated. The functionality being added here is newly introducing a dependency on the return value of estimate_num_groups() during the join search. Previously, during the join search, we only ever needed to perform selectivity estimations. With this commit, we need to use estimate_num_groups() in order to estimate what the hit ratio on the result cache will be. In simple terms, if we expect 10 distinct values and we expect 1000 outer rows, then we'll estimate the hit ratio to be 99%. Since cache hits are very cheap compared to scanning the underlying nodes on the inner side of the nested loop join, then this will significantly reduce the planner's cost for the join. However, it's fairly easy to see here that things will go bad when estimate_num_groups() incorrectly returns a value that's significantly lower than the actual number of distinct values. If this happens then that may cause us to make use of a nested loop join with a result cache instead of some other join type, such as a merge or hash join. Our distinct estimations have been known to be a source of trouble in the past, so the extra reliance on them here could cause the planner to choose slower plans than it did previous to having this feature. Distinct estimations are also fairly hard to estimate accurately when several tables have been joined already or when a WHERE clause filters out a set of values that are correlated to the expressions we're estimating the number of distinct value for. For now, the costing we perform during query planning for result caches does put quite a bit of faith in the distinct estimations being accurate. When these are accurate then we should generally see faster execution times for plans containing a result cache. However, in the real world, we may find that we need to either change the costings to put less trust in the distinct estimations being accurate or perhaps even disable this feature by default. There's always an element of risk when we teach the query planner to do new tricks that it decides to use that new trick at the wrong time and causes a regression. Users may opt to get the old behavior by turning the feature off using the enable_resultcache GUC. Currently, this is enabled by default. It remains to be seen if we'll maintain that setting for the release. Additionally, the name "Result Cache" is the best name I could think of for this new node at the time I started writing the patch. Nobody seems to strongly dislike the name. A few people did suggest other names but no other name seemed to dominate in the brief discussion that there was about names. Let's allow the beta period to see if the current name pleases enough people. If there's some consensus on a better name, then we can change it before the release. Please see the 2nd discussion link below for the discussion on the "Result Cache" name. Author: David Rowley Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu Tested-By: Konstantin Knizhnik Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com |
5 years ago |
|
|
055fee7eb4 |
Allow an alias to be attached to a JOIN ... USING
This allows something like
SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x
where x has the columns a, b, c and unlike a regular alias it does not
hide the range variables of the tables being joined t1 and t2.
Per SQL:2016 feature F404 "Range variable for common column names".
Reviewed-by: Vik Fearing <vik.fearing@2ndquadrant.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-2564428062af@2ndquadrant.com
|
5 years ago |
|
|
55dc86eca7 |
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit
|
5 years ago |