|
|
|
|
|
|
|
|
|
|
|
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/planstats.sgml,v 1.9 2007/12/28 21:03:31 tgl Exp $ --> |
|
|
|
|
|
|
|
|
|
<chapter id="planner-stats-details"> |
|
|
|
|
<title>How the Planner Uses Statistics</title> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
This chapter builds on the material covered in <xref |
|
|
|
|
linkend="using-explain"> and <xref linkend="planner-stats"> to show some |
|
|
|
|
additional details about how the planner uses the |
|
|
|
|
system statistics to estimate the number of rows each part of a query might |
|
|
|
|
return. This is a significant part of the planning process, |
|
|
|
|
providing much of the raw material for cost calculation. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
The intent of this chapter is not to document the code in detail, |
|
|
|
|
but to present an overview of how it works. |
|
|
|
|
This will perhaps ease the learning curve for someone who subsequently |
|
|
|
|
wishes to read the code. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<sect1 id="row-estimation-examples"> |
|
|
|
|
<title>Row Estimation Examples</title>

<indexterm zone="row-estimation-examples">
 <primary>row estimation</primary>
 <secondary>planner</secondary>
|
|
|
|
</indexterm> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
The examples shown below use tables in the <productname>PostgreSQL</> |
|
|
|
|
regression test database. |
|
|
|
|
The outputs shown are taken from version 8.3. |
|
|
|
|
The behavior of earlier (or later) versions might vary. |
|
|
|
|
Note also that since <command>ANALYZE</> uses random sampling |
|
|
|
|
while producing statistics, the results will change slightly after |
|
|
|
|
any new <command>ANALYZE</>. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
Let's start with a very simple query: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
EXPLAIN SELECT * FROM tenk1; |
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
------------------------------------------------------------- |
|
|
|
|
|
|
|
|
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
How the planner determines the cardinality of <structname>tenk1</structname> |
|
|
|
|
is covered in <xref linkend="planner-stats">, but is repeated here for |
|
|
|
|
completeness. The number of pages and rows is looked up in |
|
|
|
|
<structname>pg_class</structname>: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; |
|
|
|
|
|
|
|
|
|
relpages | reltuples |
|
|
|
|
----------+----------- |
|
|
|
|
|
|
|
|
358 | 10000 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
These numbers are current as of the last <command>VACUUM</> or |
|
|
|
|
<command>ANALYZE</> on the table. The planner then fetches the |
|
|
|
|
actual current number of pages in the table (this is a cheap operation, |
|
|
|
|
not requiring a table scan). If that is different from |
|
|
|
|
<structfield>relpages</structfield> then |
|
|
|
|
<structfield>reltuples</structfield> is scaled accordingly to |
|
|
|
|
arrive at a current number-of-rows estimate. In this case the values |
|
|
|
|
are correct so the rows estimate is the same as |
|
|
|
|
<structfield>reltuples</structfield>. |
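Schematically, the adjustment amounts to the following (a simplified sketch; the actual calculation in <filename>src/backend/optimizer/util/plancat.c</filename> also handles corner cases such as a table that has never been vacuumed or analyzed):

<programlisting>
rows = reltuples * (current_pages / relpages)
     = 10000 * (358 / 358)
     = 10000
</programlisting>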
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
Let's move on to an example with a range condition in its |
|
|
|
|
<literal>WHERE</literal> clause: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000; |
|
|
|
|
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
-------------------------------------------------------------------------------- |
|
|
|
|
Bitmap Heap Scan on tenk1 (cost=24.06..394.64 rows=1007 width=244) |
|
|
|
|
Recheck Cond: (unique1 < 1000) |
|
|
|
|
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0) |
|
|
|
|
Index Cond: (unique1 < 1000) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
The planner examines the <literal>WHERE</literal> clause condition |
|
|
|
|
and looks up the selectivity function for the operator |
|
|
|
|
<literal><</literal> in <structname>pg_operator</structname>. |
|
|
|
|
This is held in the column <structfield>oprrest</structfield>, |
|
|
|
|
and the entry in this case is <function>scalarltsel</function>. |
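If you want to see this entry for yourself, a catalog query along the following lines should report <function>scalarltsel</function>; it is shown purely for illustration, using the <type>integer</type> form of the operator, and is not part of what the planner does:

<programlisting>
SELECT oprname, oprrest
  FROM pg_operator
 WHERE oprname = '<'
   AND oprleft = 'integer'::regtype
   AND oprright = 'integer'::regtype;
</programlisting>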
|
|
|
|
The <function>scalarltsel</function> function retrieves the histogram for |
|
|
|
|
<structfield>unique1</structfield> from |
|
|
|
|
<structname>pg_statistic</structname>. For manual queries it is more
|
|
|
|
convenient to look in the simpler <structname>pg_stats</structname> |
|
|
|
|
view: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
SELECT histogram_bounds FROM pg_stats |
|
|
|
|
WHERE tablename='tenk1' AND attname='unique1'; |
|
|
|
|
|
|
|
|
|
histogram_bounds |
|
|
|
|
------------------------------------------------------ |
|
|
|
|
|
|
|
|
{0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995} |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
Next the fraction of the histogram occupied by <quote>< 1000</quote> |
|
|
|
|
is worked out. This is the selectivity. The histogram divides the range |
|
|
|
|
into equal frequency buckets, so all we have to do is locate the bucket |
|
|
|
|
that our value is in and count <emphasis>part</emphasis> of it and |
|
|
|
|
<emphasis>all</emphasis> of the ones before. The value 1000 is clearly in |
|
|
|
|
the second bucket (993-1997). Assuming a linear distribution of |
|
|
|
|
values inside each bucket, we can calculate the selectivity as: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets |
|
|
|
|
= (1 + (1000 - 993)/(1997 - 993))/10 |
|
|
|
|
= 0.100697 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
that is, one whole bucket plus a linear fraction of the second, divided by |
|
|
|
|
the number of buckets. The estimated number of rows can now be calculated as |
|
|
|
|
|
|
|
|
the product of the selectivity and the cardinality of |
|
|
|
|
<structname>tenk1</structname>: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
rows = rel_cardinality * selectivity |
|
|
|
|
|
|
|
|
= 10000 * 0.100697 |
|
|
|
|
= 1007 (rounding off) |
|
|
|
|
</programlisting> |
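To compare the estimate with reality, you can count the matching rows directly (an illustrative check, not part of the planning process); since <structfield>unique1</structfield> contains each integer from 0 to 9999 exactly once, the true count should be 1000:

<programlisting>
SELECT count(*) FROM tenk1 WHERE unique1 < 1000;
</programlisting>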
|
|
|
|
|
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
Next let's consider an example with an equality condition in its |
|
|
|
|
<literal>WHERE</literal> clause: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA'; |
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
---------------------------------------------------------- |
|
|
|
|
|
|
|
|
Seq Scan on tenk1 (cost=0.00..483.00 rows=30 width=244) |
|
|
|
|
Filter: (stringu1 = 'CRAAAA'::name) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
Again the planner examines the <literal>WHERE</literal> clause condition |
|
|
|
|
and looks up the selectivity function for <literal>=</literal>, which is |
|
|
|
|
<function>eqsel</function>. For equality estimation the histogram is |
|
|
|
|
not useful; instead the list of <firstterm>most |
|
|
|
|
common values</> (<acronym>MCV</acronym>s) is used to determine the |
|
|
|
|
selectivity. Let's have a look at the MCVs, with some additional columns |
|
|
|
|
that will be useful later: |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats |
|
|
|
|
WHERE tablename='tenk1' AND attname='stringu1'; |
|
|
|
|
|
|
|
|
|
null_frac | 0 |
|
|
|
|
|
|
|
|
n_distinct | 676 |
|
|
|
|
most_common_vals | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,JOAAAA,MCAAAA,NAAAAA,WGAAAA} |
|
|
|
|
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003} |
|
|
|
|
|
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
Since <literal>CRAAAA</> appears in the list of MCVs, the selectivity is |
|
|
|
|
merely the corresponding entry in the list of most common frequencies |
|
|
|
|
(<acronym>MCF</acronym>s): |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
selectivity = mcf[3] |
|
|
|
|
= 0.003 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
As before, the estimated number of rows is just the product of this with the |
|
|
|
|
cardinality of <structname>tenk1</structname>: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
rows = 10000 * 0.003 |
|
|
|
|
= 30 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
Now consider the same query, but with a constant that is not in the |
|
|
|
|
<acronym>MCV</acronym> list: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
---------------------------------------------------------- |
|
|
|
|
|
|
|
|
Seq Scan on tenk1 (cost=0.00..483.00 rows=15 width=244) |
|
|
|
|
Filter: (stringu1 = 'xxx'::name) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
This is quite a different problem: how to estimate the selectivity when the |
|
|
|
|
value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list. |
|
|
|
|
The approach is to use the fact that the value is not in the list, |
|
|
|
|
combined with the knowledge of the frequencies for all of the |
|
|
|
|
<acronym>MCV</acronym>s: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
selectivity = (1 - sum(mcf))/(num_distinct - num_mcv)
|
|
|
|
|
|
|
|
= (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 + |
|
|
|
|
0.003 + 0.003 + 0.003 + 0.003))/(676 - 10) |
|
|
|
|
= 0.0014559 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
That is, add up all the frequencies for the <acronym>MCV</acronym>s and |
|
|
|
|
subtract them from one, then |
|
|
|
|
divide by the number of <emphasis>other</emphasis> distinct values. |
|
|
|
|
This amounts to assuming that the fraction of the column that is not any |
|
|
|
|
of the MCVs is evenly distributed among all the other distinct values. |
|
|
|
|
Notice that there are no null values so we don't have to worry about those |
|
|
|
|
(otherwise we'd subtract the null fraction from the numerator as well). |
|
|
|
|
The estimated number of rows is then calculated as usual: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
rows = 10000 * 0.0014559 |
|
|
|
|
= 15 (rounding off) |
|
|
|
|
</programlisting> |
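Here too the estimate can be checked directly, again purely as an illustration; since no row of <structname>tenk1</structname> actually contains the value <literal>xxx</literal>, the true count should be zero, which the statistics alone cannot tell the planner:

<programlisting>
SELECT count(*) FROM tenk1 WHERE stringu1 = 'xxx';
</programlisting>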
|
|
|
|
|
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
The previous example with <literal>unique1 < 1000</> was an |
|
|
|
|
oversimplification of what <function>scalarltsel</function> really does; |
|
|
|
|
now that we have seen an example of the use of MCVs, we can fill in some |
|
|
|
|
more detail. The example was correct as far as it went, because since |
|
|
|
|
<structfield>unique1</> is a unique column it has no MCVs (obviously, no |
|
|
|
|
value is any more common than any other value). For a non-unique |
|
|
|
|
column, there will normally be both a histogram and an MCV list, and |
|
|
|
|
<emphasis>the histogram does not include the portion of the column |
|
|
|
|
population represented by the MCVs</>. We do things this way because |
|
|
|
|
it allows more precise estimation. In this situation |
|
|
|
|
<function>scalarltsel</function> directly applies the condition (e.g., |
|
|
|
|
<quote>< 1000</>) to each value of the MCV list, and adds up the |
|
|
|
|
frequencies of the MCVs for which the condition is true. This gives |
|
|
|
|
an exact estimate of the selectivity within the portion of the table |
|
|
|
|
that is MCVs. The histogram is then used in the same way as above |
|
|
|
|
to estimate the selectivity in the portion of the table that is not |
|
|
|
|
MCVs, and then the two numbers are combined to estimate the overall |
|
|
|
|
selectivity. For example, consider |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 < 'IAAAAA'; |
|
|
|
|
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
------------------------------------------------------------ |
|
|
|
|
Seq Scan on tenk1 (cost=0.00..483.00 rows=3077 width=244) |
|
|
|
|
Filter: (stringu1 < 'IAAAAA'::name) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
We already saw the MCV information for <structfield>stringu1</>, |
|
|
|
|
and here is its histogram: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
SELECT histogram_bounds FROM pg_stats |
|
|
|
|
WHERE tablename='tenk1' AND attname='stringu1'; |
|
|
|
|
|
|
|
|
|
histogram_bounds |
|
|
|
|
-------------------------------------------------------------------------------- |
|
|
|
|
{AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,XLAAAA,ZZAAAA} |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
Checking the MCV list, we find that the condition <literal>stringu1 < |
|
|
|
|
'IAAAAA'</> is satisfied by the first six entries and not the last four, |
|
|
|
|
so the selectivity within the MCV part of the population is |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
selectivity = sum(relevant mcfs)
|
|
|
|
= 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 |
|
|
|
|
= 0.01833333 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
Summing all the MCFs also tells us that the total fraction of the |
|
|
|
|
population represented by MCVs is 0.03033333, and therefore the |
|
|
|
|
fraction represented by the histogram is 0.96966667 (again, there |
|
|
|
|
are no nulls, else we'd have to exclude them here). We can see |
|
|
|
|
that the value <literal>IAAAAA</> falls nearly at the end of the |
|
|
|
|
third histogram bucket. Using some rather cheesy assumptions |
|
|
|
|
about the frequency of different characters, the planner arrives |
|
|
|
|
at the estimate 0.298387 for the portion of the histogram population |
|
|
|
|
that is less than <literal>IAAAAA</>. We then combine the estimates |
|
|
|
|
for the MCV and non-MCV populations: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction |
|
|
|
|
= 0.01833333 + 0.298387 * 0.96966667 |
|
|
|
|
= 0.307669 |
|
|
|
|
|
|
|
|
|
rows = 10000 * 0.307669 |
|
|
|
|
= 3077 (rounding off) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
In this particular example, the correction from the MCV list is fairly |
|
|
|
|
small, because the column distribution is actually quite flat (the |
|
|
|
|
statistics showing these particular values as being more common than |
|
|
|
|
others are mostly due to sampling error). In a more typical case where |
|
|
|
|
some values are significantly more common than others, this complicated |
|
|
|
|
process gives a useful improvement in accuracy because the selectivity |
|
|
|
|
for the most common values is found exactly. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
Now let's consider a case with more than one |
|
|
|
|
condition in the <literal>WHERE</literal> clause: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000 AND stringu1 = 'xxx'; |
|
|
|
|
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
-------------------------------------------------------------------------------- |
|
|
|
|
Bitmap Heap Scan on tenk1 (cost=23.80..396.91 rows=1 width=244) |
|
|
|
|
Recheck Cond: (unique1 < 1000) |
|
|
|
|
Filter: (stringu1 = 'xxx'::name) |
|
|
|
|
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0) |
|
|
|
|
Index Cond: (unique1 < 1000) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
The planner assumes that the two conditions are independent, so that |
|
|
|
|
the individual selectivities of the clauses can be multiplied together: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
selectivity = selectivity(unique1 < 1000) * selectivity(stringu1 = 'xxx') |
|
|
|
|
= 0.100697 * 0.0014559 |
|
|
|
|
= 0.0001466 |
|
|
|
|
|
|
|
|
|
rows = 10000 * 0.0001466 |
|
|
|
|
= 1 (rounding off) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
Notice that the number of rows estimated to be returned from the bitmap |
|
|
|
|
index scan reflects only the condition used with the index; this is |
|
|
|
|
important since it affects the cost estimate for the subsequent heap |
|
|
|
|
fetches. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
Finally we will examine a query that involves a join: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 |
|
|
|
|
WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2; |
|
|
|
|
|
|
|
|
|
|
|
|
|
QUERY PLAN |
|
|
|
|
-------------------------------------------------------------------------------------- |
|
|
|
|
Nested Loop (cost=4.64..456.23 rows=50 width=488) |
|
|
|
|
-> Bitmap Heap Scan on tenk1 t1 (cost=4.64..142.17 rows=50 width=244) |
|
|
|
|
Recheck Cond: (unique1 < 50) |
|
|
|
|
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.63 rows=50 width=0) |
|
|
|
|
Index Cond: (unique1 < 50) |
|
|
|
|
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..6.27 rows=1 width=244) |
|
|
|
|
Index Cond: (t2.unique2 = t1.unique2) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
The restriction on <structname>tenk1</structname>, |
|
|
|
|
<literal>unique1 < 50</literal>, |
|
|
|
|
is evaluated before the nested-loop join. |
|
|
|
|
This is handled analogously to the previous range example. This time the |
|
|
|
|
value 50 falls into the first bucket of the |
|
|
|
|
<structfield>unique1</structfield> histogram: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets |
|
|
|
|
= (0 + (50 - 0)/(993 - 0))/10 |
|
|
|
|
= 0.005035 |
|
|
|
|
|
|
|
|
|
rows = 10000 * 0.005035 |
|
|
|
|
= 50 (rounding off) |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
The restriction for the join is <literal>t2.unique2 = t1.unique2</>. |
|
|
|
|
The operator is just |
|
|
|
|
our familiar <literal>=</literal>, however the selectivity function is |
|
|
|
|
obtained from the <structfield>oprjoin</structfield> column of |
|
|
|
|
<structname>pg_operator</structname>, and is <function>eqjoinsel</function>. |
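As with <structfield>oprrest</structfield> earlier, this entry can be inspected with an illustrative catalog query (again using the <type>integer</type> form of the operator; it should report <function>eqjoinsel</function>):

<programlisting>
SELECT oprname, oprjoin
  FROM pg_operator
 WHERE oprname = '='
   AND oprleft = 'integer'::regtype
   AND oprright = 'integer'::regtype;
</programlisting>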
|
|
|
|
<function>eqjoinsel</function> looks up the statistical information for both |
|
|
|
|
<structname>tenk2</structname> and <structname>tenk1</structname>: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
|
|
|
|
SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
|
|
|
|
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2'; |
|
|
|
|
|
|
|
|
|
tablename | null_frac | n_distinct | most_common_vals |
|
|
|
|
-----------+-----------+------------+------------------ |
|
|
|
|
tenk1 | 0 | -1 |
|
|
|
|
tenk2 | 0 | -1 | |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
|
|
|
|
In this case there is no <acronym>MCV</acronym> information for |
|
|
|
|
<structfield>unique2</structfield> because all the values appear to be |
|
|
|
|
unique, so we use an algorithm that relies only on the number of |
|
|
|
|
distinct values for both relations together with their null fractions: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2) |
|
|
|
|
|
|
|
|
= (1 - 0) * (1 - 0) / max(10000, 10000) |
|
|
|
|
= 0.0001 |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
That is, subtract the null fraction from one for each of the relations,
|
|
|
|
and divide by the maximum of the numbers of distinct values. |
|
|
|
|
The number of rows |
|
|
|
|
that the join is likely to emit is calculated as the cardinality of the |
|
|
|
|
Cartesian product of the two inputs, multiplied by the |
|
|
|
|
selectivity: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
rows = (outer_cardinality * inner_cardinality) * selectivity |
|
|
|
|
|
|
|
|
= (50 * 10000) * 0.0001 |
|
|
|
|
= 50 |
|
|
|
|
</programlisting> |
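As a final illustrative check, not something the planner itself performs, the actual join result can be counted; since <structfield>unique2</structfield> contains each integer from 0 to 9999 exactly once in both tables, each of the 50 selected <structname>tenk1</structname> rows should match exactly one <structname>tenk2</structname> row, for a true count of 50:

<programlisting>
SELECT count(*)
  FROM tenk1 t1, tenk2 t2
 WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2;
</programlisting>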
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
|
|
|
|
Had there been MCV lists for the two columns, |
|
|
|
|
<function>eqjoinsel</function> would have used direct comparison of the MCV |
|
|
|
|
lists to determine the join selectivity within the part of the column |
|
|
|
|
populations represented by the MCVs. The estimate for the remainder of the |
|
|
|
|
populations follows the same approach shown here. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
Notice that we showed <literal>inner_cardinality</> as 10000, that is, |
|
|
|
|
the unmodified size of <structname>tenk2</>. It might appear from |
|
|
|
|
inspection of the <command>EXPLAIN</> output that the estimate of |
|
|
|
|
join rows comes from 50 * 1, that is, the number of outer rows times |
|
|
|
|
the estimated number of rows obtained by each inner indexscan on |
|
|
|
|
<structname>tenk2</>. But this is not the case: the join relation size |
|
|
|
|
is estimated before any particular join plan has been considered. If |
|
|
|
|
everything is working well then the two ways of estimating the join |
|
|
|
|
size will produce about the same answer, but due to roundoff error and |
|
|
|
|
other factors they sometimes diverge significantly. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
For those interested in further details, estimation of the size of |
|
|
|
|
a table (before any <literal>WHERE</> clauses) is done in |
|
|
|
|
<filename>src/backend/optimizer/util/plancat.c</filename>. The generic |
|
|
|
|
logic for clause selectivities is in |
|
|
|
|
<filename>src/backend/optimizer/path/clausesel.c</filename>. The |
|
|
|
|
operator-specific selectivity functions are mostly found |
|
|
|
|
in <filename>src/backend/utils/adt/selfuncs.c</filename>. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
</sect1> |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</chapter> |
|
|
|
|
|