/*-------------------------------------------------------------------------
 *
 * planmain.c
 *	  Routines to plan a single query
 *
 * What's in a name, anyway? The top-level entry point of the planner/
 * optimizer is over in planner.c, not here as you might think from the
 * file name. But this is the main code for planning a basic join operation,
 * shorn of features like subselects, inheritance, aggregates, grouping,
 * and so on. (Those are the things planner.c deals with.)
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/optimizer/plan/planmain.c
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "optimizer/orclauses.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/planmain.h"


/*
 * query_planner
 *	  Generate a path (that is, a simplified plan) for a basic query,
 *	  which may involve joins but not any fancier features.
 *
 * Since query_planner does not handle the toplevel processing (grouping,
 * sorting, etc) it cannot select the best path by itself. Instead, it
 * returns the RelOptInfo for the top level of joining, and the caller
 * (grouping_planner) can choose one of the surviving paths for the rel.
 * Normally it would choose either the rel's cheapest path, or the cheapest
 * path for the desired sort order.
 *
 * root describes the query to plan
 * tlist is the target list the query should produce
 *		(this is NOT necessarily root->parse->targetList!)
 * qp_callback is a function to compute query_pathkeys once it's safe to do so
 * qp_extra is optional extra data to pass to qp_callback
 *
 * Note: the PlannerInfo node also includes a query_pathkeys field, which
 * tells query_planner the sort order that is desired in the final output
 * plan. This value is *not* available at call time, but is computed by
 * qp_callback once we have completed merging the query's equivalence classes.
 * (We cannot construct canonical pathkeys until that's done.)
 */
RelOptInfo *
query_planner(PlannerInfo *root, List *tlist,
			  query_pathkeys_callback qp_callback, void *qp_extra)
{
	Query	   *parse = root->parse;
	List	   *joinlist;
	RelOptInfo *final_rel;
	Index		rti;
	double		total_pages;

	/*
	 * If the query has an empty join tree, then it's something easy like
	 * "SELECT 2+2;" or "INSERT ... VALUES()". Fall through quickly.
	 */
	if (parse->jointree->fromlist == NIL)
	{
		/* We need a dummy joinrel to describe the empty set of baserels */
		final_rel = build_empty_join_rel(root);

		/* The only path for it is a trivial Result path */
		add_path(final_rel, (Path *)
				 create_result_path((List *) parse->jointree->quals));

		/* Select cheapest path (pretty easy in this case...) */
		set_cheapest(final_rel);

		/*
		 * We still are required to call qp_callback, in case it's something
		 * like "SELECT 2+2 ORDER BY 1".
		 */
		root->canon_pathkeys = NIL;
		(*qp_callback) (root, qp_extra);

		return final_rel;
	}

	/*
	 * Init planner lists to empty.
	 *
	 * NOTE: append_rel_list was set up by subquery_planner, so do not touch
	 * here; eq_classes and minmax_aggs may contain data already, too.
	 */
	root->join_rel_list = NIL;
	root->join_rel_hash = NULL;
	root->join_rel_level = NULL;
	root->join_cur_level = 0;
	root->canon_pathkeys = NIL;
	root->left_join_clauses = NIL;
	root->right_join_clauses = NIL;
	root->full_join_clauses = NIL;
	root->join_info_list = NIL;
	root->lateral_info_list = NIL;
	root->placeholder_list = NIL;
	root->initial_rels = NIL;

	/*
	 * Make a flattened version of the rangetable for faster access (this is
	 * OK because the rangetable won't change any more), and set up an empty
	 * array for indexing base relations.
	 */
	setup_simple_rel_arrays(root);

	/*
	 * Construct RelOptInfo nodes for all base relations in query, and
	 * indirectly for all appendrel member relations ("other rels"). This
	 * will give us a RelOptInfo for every "simple" (non-join) rel involved in
	 * the query.
	 *
	 * Note: the reason we find the rels by searching the jointree and
	 * appendrel list, rather than just scanning the rangetable, is that the
	 * rangetable may contain RTEs for rels not actively part of the query,
	 * for example views. We don't want to make RelOptInfos for them.
	 */
	add_base_rels_to_query(root, (Node *) parse->jointree);

	/*
	 * Examine the targetlist and join tree, adding entries to baserel
	 * targetlists for all referenced Vars, and generating PlaceHolderInfo
	 * entries for all referenced PlaceHolderVars. Restrict and join clauses
	 * are added to appropriate lists belonging to the mentioned relations. We
	 * also build EquivalenceClasses for provably equivalent expressions. The
	 * SpecialJoinInfo list is also built to hold information about join order
	 * restrictions. Finally, we form a target joinlist for make_one_rel() to
	 * work from.
	 */
	build_base_rel_tlists(root, tlist);

	find_placeholders_in_jointree(root);

	find_lateral_references(root);

	joinlist = deconstruct_jointree(root);

	/*
	 * Reconsider any postponed outer-join quals now that we have built up
	 * equivalence classes. (This could result in further additions or
	 * mergings of classes.)
	 */
	reconsider_outer_join_clauses(root);

	/*
	 * If we formed any equivalence classes, generate additional restriction
	 * clauses as appropriate. (Implied join clauses are formed on-the-fly
	 * later.)
	 */
	generate_base_implied_equalities(root);

	/*
	 * We have completed merging equivalence sets, so it's now possible to
	 * generate pathkeys in canonical form; so compute query_pathkeys and
	 * other pathkeys fields in PlannerInfo.
	 */
|
Postpone creation of pathkeys lists to fix bug #8049.
This patch gets rid of the concept of, and infrastructure for,
non-canonical PathKeys; we now only ever create canonical pathkey lists.
The need for non-canonical pathkeys came from the desire to have
grouping_planner initialize query_pathkeys and related pathkey lists before
calling query_planner. However, since query_planner didn't actually *do*
anything with those lists before they'd been made canonical, we can get rid
of the whole mess by just not creating the lists at all until the point
where we formerly canonicalized them.
There are several ways in which we could implement that without making
query_planner itself deal with grouping/sorting features (which are
supposed to be the province of grouping_planner). I chose to add a
callback function to query_planner's API; other alternatives would have
required adding more fields to PlannerInfo, which while not bad in itself
would create an ABI break for planner-related plugins in the 9.2 release
series. This still breaks ABI for anything that calls query_planner
directly, but it seems somewhat unlikely that there are any such plugins.
I had originally conceived of this change as merely a step on the way to
fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes
that bug all by itself, as per the added regression test. The reason is
that now get_eclass_for_sort_expr is adding the ORDER BY expression at the
end of EquivalenceClass creation not the start, and so anything that is in
a multi-member EquivalenceClass has already been created with correct
em_nullable_relids. I am suspicious that there are related scenarios in
which we still need to teach get_eclass_for_sort_expr to compute correct
nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3
to fix bugs that are only hypothetical. So for the moment, do this and
stop here.
Back-patch to 9.2 but not to earlier branches, since they don't exhibit
this bug for lack of join-clause-movement logic that depends on
em_nullable_relids being correct. (We might have to revisit that choice
if any related bugs turn up.) In 9.2, don't change the signature of
make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as
not to risk more plugin breakage than we have to.
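
The callback arrangement described above can be illustrated with a small standalone sketch. It is not the planner's code: `DemoRoot`, `DemoExtra`, `demo_compute_pathkeys` and `demo_query_planner` are hypothetical names invented here, and only the `(root, extra)` call shape is taken from the `(*qp_callback) (root, qp_extra)` call in this file. The point is that the planning routine never looks inside the opaque pointer; the caller's callback fills in the ordering data at the moment the routine chooses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PlannerInfo; only the fields this sketch needs. */
typedef struct DemoRoot
{
	int			pathkeys_ready; /* becomes 1 once the callback has run */
	int			sort_key;		/* stand-in for query_pathkeys */
} DemoRoot;

/* Hypothetical caller-side state, handed through as an opaque pointer. */
typedef struct DemoExtra
{
	int			requested_sort_key;
} DemoExtra;

/* Callback type modelled on the (root, extra) shape of qp_callback. */
typedef void (*demo_qp_callback) (DemoRoot *root, void *extra);

/* Caller-supplied callback: builds the ordering data only when invoked. */
static void
demo_compute_pathkeys(DemoRoot *root, void *extra)
{
	DemoExtra  *qp_extra = (DemoExtra *) extra;

	root->sort_key = qp_extra->requested_sort_key;
	root->pathkeys_ready = 1;
}

/* Stand-in for the planning routine: it decides when the callback fires. */
static void
demo_query_planner(DemoRoot *root, demo_qp_callback qp_callback,
				   void *qp_extra)
{
	/* Earlier setup work would go here; nothing is computed up front. */
	assert(!root->pathkeys_ready);

	(*qp_callback) (root, qp_extra);
}
```

A caller would zero-initialize a `DemoRoot`, fill in a `DemoExtra`, and call `demo_query_planner(&root, demo_compute_pathkeys, &extra)`. Everything about sorting stays on the caller's side, which is the separation of concerns the commit message argues for.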
	(*qp_callback) (root, qp_extra);

	/*
	 * Examine any "placeholder" expressions generated during subquery pullup.
	 * Make sure that the Vars they need are marked as needed at the relevant
	 * join level. This must be done before join removal because it might
	 * cause Vars or placeholders to be needed above a join when they weren't
	 * so marked before.
	 */
	fix_placeholder_input_needed_levels(root);

	/*
	 * Remove any useless outer joins. Ideally this would be done during
	 * jointree preprocessing, but the necessary information isn't available
	 * until we've built baserel data structures and classified qual clauses.
	 */
	joinlist = remove_useless_joins(root, joinlist);

	/*
	 * Now distribute "placeholders" to base rels as needed. This has to be
	 * done after join removal because removal could change whether a
	 * placeholder is evaluatable at a base rel.
	 */
	add_placeholders_to_base_rels(root);

	/*
	 * Create the LateralJoinInfo list now that we have finalized
	 * PlaceHolderVar eval levels and made any necessary additions to the
	 * lateral_vars lists for lateral references within PlaceHolderVars.
	 */
	create_lateral_join_info(root);

	/*
	 * Look for join OR clauses that we can extract single-relation
	 * restriction OR clauses from.
	 */
	extract_restriction_or_clauses(root);

	/*
	 * We should now have size estimates for every actual table involved in
	 * the query, and we also know which if any have been deleted from the
	 * query by join removal; so we can compute total_table_pages.
	 *
	 * Note that appendrels are not double-counted here, even though we don't
	 * bother to distinguish RelOptInfos for appendrel parents, because the
	 * parents will still have size zero.
	 *
	 * XXX if a table is self-joined, we will count it once per appearance,
	 * which perhaps is the wrong thing ... but that's not completely clear,
	 * and detecting self-joins here is difficult, so ignore it for now.
	 */
	total_pages = 0;
	for (rti = 1; rti < root->simple_rel_array_size; rti++)
	{
		RelOptInfo *brel = root->simple_rel_array[rti];

		if (brel == NULL)
			continue;

		Assert(brel->relid == rti);		/* sanity check on array */

		if (brel->reloptkind == RELOPT_BASEREL ||
			brel->reloptkind == RELOPT_OTHER_MEMBER_REL)
			total_pages += (double) brel->pages;
	}
	root->total_table_pages = total_pages;

	/*
	 * Ready to do the primary planning.
	 */
	final_rel = make_one_rel(root, joinlist);

Simplify query_planner's API by having it return the top-level RelOptInfo.
Formerly, query_planner returned one or possibly two Paths for the topmost
join relation, so that grouping_planner didn't see the join RelOptInfo
(at least not directly; it didn't have any hesitation about examining
cheapest_path->parent, though). However, correct selection of the Paths
involved a significant amount of coupling between query_planner and
grouping_planner, a problem which has gotten worse over time. It seems
best to give up on this API choice and instead return the topmost
RelOptInfo explicitly. Then grouping_planner can pull out the Paths it
wants from the rel's path list. In this way we can remove all knowledge
of grouping behaviors from query_planner.
The only real benefit of the old way is that in the case of an empty
FROM clause, we never made any RelOptInfos at all, just a Path. Now
we have to gin up a dummy RelOptInfo to represent the empty FROM clause.
That's not a very big deal though.
While at it, simplify query_planner's API a bit more by having the caller
set up root->tuple_fraction and root->limit_tuples, rather than passing
those values as separate parameters. Since query_planner no longer does
anything with either value, requiring it to fill the PlannerInfo fields
seemed pretty arbitrary.
This patch just rearranges code; it doesn't (intentionally) change any
behaviors. Followup patches will do more interesting things.
	/* Check that we got at least one usable path */
Adjust definition of cheapest_total_path to work better with LATERAL.
In the initial cut at LATERAL, I kept the rule that cheapest_total_path
was always unparameterized, which meant it had to be NULL if the relation
has no unparameterized paths. It turns out to work much more nicely if
we always have *some* path nominated as cheapest-total for each relation.
In particular, let's still say it's the cheapest unparameterized path if
there is one; if not, take the cheapest-total-cost path among those of
the minimum available parameterization. (The first rule is actually
a special case of the second.)
This allows reversion of some temporary lobotomizations I'd put in place.
In particular, the planner can now consider hash and merge joins for
joins below a parameter-supplying nestloop, even if there aren't any
unparameterized paths available. This should bring planning of
LATERAL-containing queries to the same level as queries not using that
feature.
Along the way, simplify management of parameterized paths in add_path()
and friends. In the original coding for parameterized paths in 9.2,
I tried to minimize the logic changes in add_path(), so it just treated
parameterization as yet another dimension of comparison for paths.
We later made it ignore pathkeys (sort ordering) of parameterized paths,
on the grounds that ordering isn't a useful property for the path on the
inside of a nestloop, so we might as well get rid of useless parameterized
paths as quickly as possible. But we didn't take that reasoning as far as
we should have. Startup cost isn't a useful property inside a nestloop
either, so add_path() ought to discount startup cost of parameterized paths
as well. Having done that, the secondary sorting I'd implemented (in
add_parameterized_path) is no longer needed --- any parameterized path that
survives add_path() at all is worth considering at higher levels. So this
should be a bit faster as well as simpler.
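
The selection rule stated in this commit message (prefer the cheapest unparameterized path, otherwise the cheapest path among those of the minimum available parameterization) can be sketched in isolation. This is only an illustration under stated assumptions: `DemoPath`, `demo_count_bits` and `demo_pick_cheapest_total` are hypothetical names, parameterization is modelled as a plain bitmask, and "minimum" is simplified here to "fewest required outer relations", which is a choice made for this sketch and not necessarily how the planner compares parameterizations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner Path. */
typedef struct DemoPath
{
	unsigned	required_outer; /* bitmask of rels supplying parameters;
								 * 0 means unparameterized */
	double		total_cost;
} DemoPath;

/* Count the outer relations a path depends on. */
static int
demo_count_bits(unsigned mask)
{
	int			n = 0;

	while (mask != 0)
	{
		n += (int) (mask & 1u);
		mask >>= 1;
	}
	return n;
}

/*
 * Pick the path with the fewest required outer rels, breaking ties by lower
 * total cost.  An unparameterized path has zero required rels, so "cheapest
 * unparameterized path" falls out as a special case of the same rule.
 * Returns NULL when there are no paths at all.
 */
static const DemoPath *
demo_pick_cheapest_total(const DemoPath *paths, size_t npaths)
{
	const DemoPath *best = NULL;
	int			best_nparams = 0;
	size_t		i;

	for (i = 0; i < npaths; i++)
	{
		int			nparams = demo_count_bits(paths[i].required_outer);

		if (best == NULL ||
			nparams < best_nparams ||
			(nparams == best_nparams &&
			 paths[i].total_cost < best->total_cost))
		{
			best = &paths[i];
			best_nparams = nparams;
		}
	}
	return best;
}
```

Under this rule a relation with at least one path always has some path nominated. The check in query_planner that follows still insists that the topmost relation's choice be unparameterized (`param_info` is NULL there), since nothing above the final join could supply parameters.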
	if (!final_rel || !final_rel->cheapest_total_path ||
		final_rel->cheapest_total_path->param_info != NULL)
		elog(ERROR, "failed to construct the join relation");

	return final_rel;
}
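
As a rough standalone illustration of the total_table_pages loop in query_planner above, the sketch below sums page counts over a sparse, 1-based relation array. The types and names (`DemoRel`, `DemoRelKind`, `demo_total_table_pages`) are hypothetical simplifications made up for this example; only the loop shape, the NULL-slot skip, the array sanity check, and the two-kind filter mirror the code in this file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical relation kinds; only the first two are counted. */
typedef enum DemoRelKind
{
	DEMO_BASEREL,
	DEMO_OTHER_MEMBER_REL,
	DEMO_OTHER_KIND
} DemoRelKind;

/* Hypothetical, cut-down stand-in for RelOptInfo. */
typedef struct DemoRel
{
	int			relid;			/* must equal the rel's array index */
	DemoRelKind kind;
	unsigned	pages;			/* estimated size in disk pages */
} DemoRel;

/*
 * Sum the page counts of all counted relations.  Slot 0 is unused and
 * empty slots are NULL, matching the 1-based array convention in the loop
 * this sketch is modelled on.
 */
static double
demo_total_table_pages(DemoRel **rels, int array_size)
{
	double		total_pages = 0;
	int			rti;

	for (rti = 1; rti < array_size; rti++)
	{
		DemoRel    *brel = rels[rti];

		if (brel == NULL)
			continue;

		assert(brel->relid == rti); /* sanity check on array */

		if (brel->kind == DEMO_BASEREL ||
			brel->kind == DEMO_OTHER_MEMBER_REL)
			total_pages += (double) brel->pages;
	}
	return total_pages;
}
```

With an appendrel-style parent of zero pages, one child of 30 pages, an unrelated 12-page table, and a relation of another kind, only the child and the plain table contribute. That is why the parent is not double-counted even though the loop makes no attempt to recognize it.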