|
|
|
@@ -1,5 +1,5 @@ |
|
|
|
<!-- |
|
|
|
<!-- |
|
|
|
$Header: /cvsroot/pgsql/doc/src/sgml/ref/analyze.sgml,v 1.14 2003/09/09 18:28:52 tgl Exp $ |
|
|
|
$Header: /cvsroot/pgsql/doc/src/sgml/ref/analyze.sgml,v 1.15 2003/09/11 17:31:45 momjian Exp $ |
|
|
|
PostgreSQL documentation |
|
|
|
PostgreSQL documentation |
|
|
|
--> |
|
|
|
--> |
|
|
|
|
|
|
|
|
|
|
|
@@ -28,10 +28,10 @@ ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ (<rep |
|
|
|
<title>Description</title> |
|
|
|
<title>Description</title> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
<command>ANALYZE</command> collects statistics about the contents of |
|
|
|
<command>ANALYZE</command> collects statistics about the contents |
|
|
|
tables in the database, and stores the results in |
|
|
|
of tables in the database, and stores the results in the system |
|
|
|
the system table <literal>pg_statistic</literal>. Subsequently, |
|
|
|
table <literal>pg_statistic</literal>. Subsequently, the query |
|
|
|
the query planner uses the statistics to help determine the most efficient |
|
|
|
planner uses these statistics to help determine the most efficient |
|
|
|
execution plans for queries. |
|
|
|
execution plans for queries. |
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
@@ -90,49 +90,56 @@ ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ (<rep |
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
Unlike <command>VACUUM FULL</command>, |
|
|
|
Unlike <command>VACUUM FULL</command>, <command>ANALYZE</command> |
|
|
|
<command>ANALYZE</command> requires |
|
|
|
requires only a read lock on the target table, so it can run in |
|
|
|
only a read lock on the target table, so it can run in parallel with |
|
|
|
parallel with other activity on the table. |
|
|
|
other activity on the table. |
|
|
|
|
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
For large tables, <command>ANALYZE</command> takes a random sample of the |
|
|
|
The statistics collected by <command>ANALYZE</command> usually |
|
|
|
table contents, rather than examining every row. This allows even very |
|
|
|
include a list of some of the most common values in each column and |
|
|
|
large tables to be analyzed in a small amount of time. Note, however, |
|
|
|
a histogram showing the approximate data distribution in each |
|
|
|
that the statistics are only approximate, and will change slightly each |
|
|
|
column. One or both of these may be omitted if |
|
|
|
time <command>ANALYZE</command> is run, even if the actual table contents |
|
|
|
<command>ANALYZE</command> deems them uninteresting (for example, |
|
|
|
did not change. This may result in small changes in the planner's |
|
|
|
in a unique-key column, there are no common values) or if the |
|
|
|
estimated costs shown by <command>EXPLAIN</command>. |
|
|
|
column data type does not support the appropriate operators. There |
|
|
|
|
|
|
|
is more information about the statistics in <xref |
|
|
|
|
|
|
|
linkend="maintenance">. |
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
The collected statistics usually include a list of some of the most common |
|
|
|
For large tables, <command>ANALYZE</command> takes a random sample |
|
|
|
values in each column and a histogram showing the approximate data |
|
|
|
of the table contents, rather than examining every row. This |
|
|
|
distribution in each column. One or both of these may be omitted if |
|
|
|
allows even very large tables to be analyzed in a small amount of |
|
|
|
<command>ANALYZE</command> deems them uninteresting (for example, in |
|
|
|
time. Note, however, that the statistics are only approximate, and |
|
|
|
a unique-key column, there are no common values) or if the column |
|
|
|
will change slightly each time <command>ANALYZE</command> is run, |
|
|
|
data type does not support the appropriate operators. There is more |
|
|
|
even if the actual table contents did not change. This may result |
|
|
|
information about the statistics in <xref linkend="maintenance">. |
|
|
|
in small changes in the planner's estimated costs shown by |
|
|
|
|
|
|
|
<command>EXPLAIN</command>. In rare situations, this |
|
|
|
|
|
|
|
non-determinism will cause the query optimizer to choose a |
|
|
|
|
|
|
|
different query plan between runs of <command>ANALYZE</command>. To |
|
|
|
|
|
|
|
avoid this, raise the amount of statistics collected by |
|
|
|
|
|
|
|
<command>ANALYZE</command>, as described below. |
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
The extent of analysis can be controlled by adjusting the |
|
|
|
The extent of analysis can be controlled by adjusting the |
|
|
|
<literal>default_statistics_target</> parameter variable, or on a |
|
|
|
<varname>DEFAULT_STATISTICS_TARGET</varname> parameter variable, or |
|
|
|
column-by-column basis by setting the per-column |
|
|
|
on a column-by-column basis by setting the per-column statistics |
|
|
|
statistics target with <command>ALTER TABLE ... ALTER COLUMN ... SET |
|
|
|
target with <command>ALTER TABLE ... ALTER COLUMN ... SET |
|
|
|
STATISTICS</command> (see |
|
|
|
STATISTICS</command> (see <xref linkend="sql-altertable" |
|
|
|
<xref linkend="sql-altertable" endterm="sql-altertable-title">). The |
|
|
|
endterm="sql-altertable-title">). The target value sets the |
|
|
|
target value sets the maximum number of entries in the most-common-value |
|
|
|
maximum number of entries in the most-common-value list and the |
|
|
|
list and the maximum number of bins in the histogram. The default |
|
|
|
maximum number of bins in the histogram. The default target value |
|
|
|
target value is 10, but this can be adjusted up or down to trade off |
|
|
|
is 10, but this can be adjusted up or down to trade off accuracy of |
|
|
|
accuracy of planner estimates against the time taken for |
|
|
|
planner estimates against the time taken for |
|
|
|
<command>ANALYZE</command> and the amount of space occupied |
|
|
|
<command>ANALYZE</command> and the amount of space occupied in |
|
|
|
in <literal>pg_statistic</literal>. |
|
|
|
<literal>pg_statistic</literal>. In particular, setting the |
|
|
|
In particular, setting the statistics target to zero disables collection of |
|
|
|
statistics target to zero disables collection of statistics for |
|
|
|
statistics for that column. It may be useful to do that for columns that |
|
|
|
that column. It may be useful to do that for columns that are |
|
|
|
are never used as part of the <literal>WHERE</>, <literal>GROUP BY</>, or <literal>ORDER BY</> clauses of |
|
|
|
never used as part of the <literal>WHERE</>, <literal>GROUP BY</>, |
|
|
|
queries, since the planner will have no use for statistics on such columns. |
|
|
|
or <literal>ORDER BY</> clauses of queries, since the planner will |
|
|
|
|
|
|
|
have no use for statistics on such columns. |
|
|
|
</para> |
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
<para> |
|
|
|
|