Make an editorial pass over the newly SGML-ified contrib documentation.

Fix lots of bad markup, bad English, bad explanations. Second round of commits. pgcrypto and pgstandby still to go...
18 years ago · 8828689ae9
parent a3102ce1ef
commit 8828689ae9
12 changed files with 2285 additions and 1937 deletions
--- a/doc/src/sgml/intagg.sgml
+++ b/doc/src/sgml/intagg.sgml
@ -1,82 +1,145 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/intagg.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->

 <sect1 id="intagg">
 <title>intagg</title>
- 
+
 <indexterm zone="intagg">
  <primary>intagg</primary>
 </indexterm>

 <para>
-  This section describes the <literal>intagg</literal> module which provides an integer aggregator and an enumerator.
+  The <filename>intagg</filename> module provides an integer aggregator and an
+  enumerator.
+ </para>
+
+ <sect2>
+  <title>Functions</title>
+
+ <para>
+  The aggregator is an aggregate function
+  <function>int_array_aggregate(integer)</>
+  that produces an integer array
+  containing exactly the integers it is fed.
+  Here is a not-tremendously-useful example:
 </para>
+
+ <programlisting>
+test=# select int_array_aggregate(i) from
+test-#   generate_series(1,10,2) i;
+ int_array_aggregate
+---------------------
+ {1,3,5,7,9}
+(1 row)
+ </programlisting>
+
 <para>
-  Many database systems have the notion of a one to many table. Such a table usually sits between two indexed tables, as: 
+  The enumerator is a function
+  <function>int_array_enum(integer[])</>
+  that returns <type>setof integer</>.  It is essentially the reverse
+  operation of the aggregator: given an array of integers, expand it
+  into a set of rows.  For example,
 </para>
+
+ <programlisting>
+test=# select * from int_array_enum(array[1,3,5,7,9]);
+ int_array_enum
+----------------
+              1
+              3
+              5
+              7
+              9
+(5 rows)
+ </programlisting>
+
+ </sect2>
+
+ <sect2>
+  <title>Sample Uses</title>
+
+  <para>
+   Many database systems have the notion of a one to many table. Such a table
+   usually sits between two indexed tables, for example:
+  </para>
+
 <programlisting>
-CREATE TABLE one_to_many(left INT, right INT) ;
+CREATE TABLE left (id INT PRIMARY KEY, ...);
+CREATE TABLE right (id INT PRIMARY KEY, ...);
+CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
 </programlisting>

 <para>
-  And it is used like this:
+  It is typically used like this:
 </para>

 <programlisting>
-  SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right) 
-        WHERE  one_to_many.left = item;
+  SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
+    WHERE one_to_many.left = <replaceable>item</>;
 </programlisting>

 <para>
-  This will return all the items in the right hand table for an entry 
+  This will return all the items in the right hand table for an entry
  in the left hand table. This is a very common construct in SQL.
 </para>

 <para>
  Now, this methodology can be cumbersome with a very large number of
-  entries in the one_to_many table. Depending on the order in which
-  data was entered, a join like this could result in an index scan
+  entries in the <structname>one_to_many</> table.  Often,
+  a join like this would result in an index scan
  and a fetch for each right hand entry in the table for a particular
-  left hand entry. If you have a very dynamic system, there is not much you 
+  left hand entry. If you have a very dynamic system, there is not much you
  can do. However, if you have some data which is fairly static, you can
  create a summary table with the aggregator.
 </para>

 <programlisting>
-CREATE TABLE summary as SELECT left, int_array_aggregate(right) 
-        AS right FROM one_to_many GROUP BY left;
+CREATE TABLE summary as
+  SELECT left, int_array_aggregate(right) AS right
+  FROM one_to_many
+  GROUP BY left;
 </programlisting>

 <para>
  This will create a table with one row per left item, and an array
  of right items. Now this is pretty useless without some way of using
-  the array, thats why there is an array enumerator.
+  the array; that's why there is an array enumerator.  You can do
 </para>
+
 <programlisting>
-SELECT left, int_array_enum(right) FROM summary WHERE left = item;
+SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</>;
 </programlisting>

 <para>
-  The above query using int_array_enum, produces the same results as:
+  The above query using <function>int_array_enum</> produces the same results
+  as
 </para>
+
 <programlisting>
-SELECT left, right FROM one_to_many WHERE left = item;
+SELECT left, right FROM one_to_many WHERE left = <replaceable>item</>;
 </programlisting>
- 
+
 <para>
  The difference is that the query against the summary table has to get
-  only one row from the table, where as the query against "one_to_many"
-  must index scan and fetch a row for each entry.
+  only one row from the table, whereas the direct query against
+  <structname>one_to_many</> must index scan and fetch a row for each entry.
 </para>
+
 <para>
-  On our system, an EXPLAIN shows a query with a cost of 8488 gets reduced
-  to a cost of 329. The query is a join between the one_to_many table,
+  On one system, an <command>EXPLAIN</> showed a query with a cost of 8488 was
+  reduced to a cost of 329.  The original query was a join involving the
+  <structname>one_to_many</> table, which was replaced by:
 </para>
+
 <programlisting>
-SELECT right, count(right) FROM 
-(
-        SELECT left, int_array_enum(right) AS right FROM summary JOIN
-                (SELECT left FROM left_table WHERE left = item) AS lefts
-                 ON (summary.left = lefts.left ) 
-) AS list GROUP BY right ORDER BY count DESC ;
+SELECT right, count(right) FROM
+  ( SELECT left, int_array_enum(right) AS right
+    FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</>) AS lefts
+         ON (summary.left = lefts.left)
+  ) AS list
+  GROUP BY right
+  ORDER BY count DESC;
 </programlisting>
-</sect1>

+ </sect2>
+
+</sect1>
--- a/doc/src/sgml/intarray.sgml
+++ b/doc/src/sgml/intarray.sgml
@ -1,196 +1,207 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/intarray.sgml,v 1.5 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="intarray">
 <title>intarray</title>
- 
+
 <indexterm zone="intarray">
  <primary>intarray</primary>
 </indexterm>

 <para>
-  This is an implementation of RD-tree data structure using GiST interface
-  of PostgreSQL. It has built-in lossy compression.
- </para>
-
- <para>
-  Current implementation provides index support for one-dimensional array of
-  integers: gist__int_ops, suitable for small and medium size of arrays (used by 
-  default), and gist__intbig_ops for indexing large arrays (we use superimposed
-  signature with length of 4096 bits to represent sets). There is also a 
-  non-default gin__int_ops for GIN indexes on integer arrays.
+  The <filename>intarray</> module provides a number of useful functions
+  and operators for manipulating one-dimensional arrays of integers.
+  There is also support for indexed searches using some of the operators.
 </para>

 <sect2>
-  <title>Functions</title>
-
-  <itemizedlist>
-
-   <listitem>
-    <para>
-     <literal>int   icount(int[])</literal> - the number of elements in intarray
-    </para>
-    <programlisting>
-test=# select icount('{1,2,3}'::int[]);  
- icount
--------
-      3
-(1 row)
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] sort(int[], 'asc' | 'desc')</literal> - sort intarray
-    </para>
-    <programlisting>
-test=# select sort('{1,2,3}'::int[],'desc');
-  sort  
---------
- {3,2,1}
-(1 row)
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] sort(int[])</literal> - sort in ascending order
-    </para>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] sort_asc(int[]),sort_desc(int[])</literal> - shortcuts for sort
-    </para>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] uniq(int[])</literal> - returns unique elements
-    </para>
-    <programlisting>
-test=# select uniq(sort('{1,2,3,2,1}'::int[]));
-  uniq  
---------
- {1,2,3}
-(1 row)
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int   idx(int[], int item)</literal> - returns index of first
-      intarray matching element to item, or '0' if matching failed.
-    </para>
-    <programlisting>
-test=# select idx('{1,2,3,2,1}'::int[],2);
- idx
-----
-   2
-(1 row)
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] subarray(int[],int START [, int LEN])</literal> - returns
-    part of intarray starting from element number START (from 1) and length LEN.
-    </para>
-    <programlisting>
-test=# select subarray('{1,2,3,2,1}'::int[],2,3);
- subarray
----------
- {2,3,2}
-(1 row)
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>int[] intset(int4)</literal> - casting int4 to int[]
-    </para>
-    <programlisting>
-test=# select intset(1);
- intset
--------
- {1}
-(1 row)
-    </programlisting>
-   </listitem>
-
-  </itemizedlist>
- </sect2>
+  <title><filename>intarray</> Functions and Operators</title>

- <sect2>
-  <title>Operations</title>
-  <table>
-   <title>Operations</title>
-   <tgroup cols="2">
+  <table id="intarray-func-table">
+   <title><filename>intarray</> Functions</title>
+
+   <tgroup cols="5">
+    <thead>
+     <row>
+      <entry>Function</entry>
+      <entry>Return Type</entry>
+      <entry>Description</entry>
+      <entry>Example</entry>
+      <entry>Result</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry><function>icount(int[])</function></entry>
+      <entry><type>int</type></entry>
+      <entry>number of elements in array</entry>
+      <entry><literal>icount('{1,2,3}'::int[])</literal></entry>
+      <entry><literal>3</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>sort(int[], text dir)</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>sort array &mdash; <parameter>dir</> must be <literal>asc</> or <literal>desc</></entry>
+      <entry><literal>sort('{1,2,3}'::int[], 'desc')</literal></entry>
+      <entry><literal>{3,2,1}</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>sort(int[])</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>sort in ascending order</entry>
+      <entry><literal>sort(array[11,77,44])</literal></entry>
+      <entry><literal>{11,44,77}</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>sort_asc(int[])</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>sort in ascending order</entry>
+      <entry><literal></literal></entry>
+      <entry><literal></literal></entry>
+     </row>
+
+     <row>
+      <entry><function>sort_desc(int[])</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>sort in descending order</entry>
+      <entry><literal></literal></entry>
+      <entry><literal></literal></entry>
+     </row>
+
+     <row>
+      <entry><function>uniq(int[])</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>remove adjacent duplicates</entry>
+      <entry><literal>uniq(sort('{1,2,3,2,1}'::int[]))</literal></entry>
+      <entry><literal>{1,2,3}</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>idx(int[], int item)</function></entry>
+      <entry><type>int</type></entry>
+      <entry>index of first element matching <parameter>item</> (0 if none)</entry>
+      <entry><literal>idx(array[11,22,33,22,11], 22)</literal></entry>
+      <entry><literal>2</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>subarray(int[], int start, int len)</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>portion of array starting at position <parameter>start</>, len <parameter>elements</></entry>
+      <entry><literal>subarray('{1,2,3,2,1}'::int[], 2, 3)</literal></entry>
+      <entry><literal>{2,3,2}</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>subarray(int[], int start)</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>portion of array starting at position <parameter>start</></entry>
+      <entry><literal>subarray('{1,2,3,2,1}'::int[], 2)</literal></entry>
+      <entry><literal>{2,3,2,1}</literal></entry>
+     </row>
+
+     <row>
+      <entry><function>intset(int)</function></entry>
+      <entry><type>int[]</type></entry>
+      <entry>make single-element array</entry>
+      <entry><literal>intset(42)</literal></entry>
+      <entry><literal>{42}</literal></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="intarray-op-table">
+   <title><filename>intarray</> Operators</title>
+
+   <tgroup cols="3">
    <thead>
     <row>
      <entry>Operator</entry>
+      <entry>Returns</entry>
      <entry>Description</entry>
     </row>
    </thead>
+
    <tbody>
     <row>
      <entry><literal>int[] &amp;&amp; int[]</literal></entry>
-      <entry>overlap - returns TRUE if arrays have at least one common element</entry>
+      <entry><type>boolean</type></entry>
+      <entry>overlap &mdash; <literal>true</> if arrays have at least one common element</entry>
     </row>
     <row>
      <entry><literal>int[] @&gt; int[]</literal></entry>
-      <entry>contains - returns TRUE if left array contains right array</entry>
+      <entry><type>boolean</type></entry>
+      <entry>contains &mdash; <literal>true</> if left array contains right array</entry>
     </row>
     <row>
      <entry><literal>int[] &lt;@ int[]</literal></entry>
-      <entry>contained - returns TRUE if left array is contained in right array</entry>
+      <entry><type>boolean</type></entry>
+      <entry>contained &mdash; <literal>true</> if left array is contained in right array</entry>
     </row>
     <row>
      <entry><literal># int[]</literal></entry>
-      <entry>returns the number of elements in array</entry>
+      <entry><type>int</type></entry>
+      <entry>number of elements in array</entry>
+     </row>
+     <row>
+      <entry><literal>int[] # int</literal></entry>
+      <entry><type>int</type></entry>
+      <entry>index (same as <function>idx</> function)</entry>
     </row>
     <row>
      <entry><literal>int[] + int</literal></entry>
-      <entry>push element to array ( add to end of array)</entry>
+      <entry><type>int[]</type></entry>
+      <entry>push element onto array (add it to end of array)</entry>
     </row>
     <row>
      <entry><literal>int[] + int[]  </literal></entry>
-      <entry>merge of arrays (right array added to the end of left one)</entry>
+      <entry><type>int[]</type></entry>
+      <entry>array concatenation (right array added to the end of left one)</entry>
     </row>
     <row>
      <entry><literal>int[] - int</literal></entry>
-      <entry>remove entries matched by right argument from array</entry>
+      <entry><type>int[]</type></entry>
+      <entry>remove entries matching right argument from array</entry>
     </row>
     <row>
      <entry><literal>int[] - int[]</literal></entry>
-      <entry>remove right array from left</entry>
+      <entry><type>int[]</type></entry>
+      <entry>remove elements of right array from left</entry>
     </row>
     <row>
      <entry><literal>int[] | int</literal></entry>
-      <entry>returns intarray - union of arguments</entry>
+      <entry><type>int[]</type></entry>
+      <entry>union of arguments</entry>
     </row>
     <row>
      <entry><literal>int[] | int[]</literal></entry>
-      <entry>returns intarray as a union of two arrays</entry>
+      <entry><type>int[]</type></entry>
+      <entry>union of arrays</entry>
     </row>
-
     <row>
      <entry><literal>int[] &amp; int[]</literal></entry>
-      <entry>returns intersection of arrays</entry>
+      <entry><type>int[]</type></entry>
+      <entry>intersection of arrays</entry>
     </row>
-
     <row>
      <entry><literal>int[] @@ query_int</literal></entry>
-      <entry>
-       returns TRUE if array satisfies query (like
-       <literal>'1&amp;(2|3)'</literal>)
-      </entry>
+      <entry><type>boolean</type></entry>
+      <entry><literal>true</> if array satisfies query (see below)</entry>
     </row>
-
     <row>
      <entry><literal>query_int ~~ int[]</literal></entry>
-      <entry>returns TRUE if array satisfies query (commutator of @@)</entry>
+      <entry><type>boolean</type></entry>
+      <entry><literal>true</> if array satisfies query (commutator of <literal>@@</>)</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
+
  <para>
   (Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
   respectively called @ and ~.  These names are still available, but are
@ -198,85 +209,102 @@ test=# select intset(1);
   are reversed from the convention formerly followed by the core geometric
   datatypes!)
  </para>
+
+  <para>
+   The <literal>@@</> and <literal>~~</> operators test whether an array
+   satisfies a <firstterm>query</>, which is expressed as a value of a
+   specialized data type <type>query_int</>.  A <firstterm>query</>
+   consists of integer values that are checked against the elements of
+   the array, possibly combined using the operators <literal>&amp;</>
+   (AND), <literal>|</> (OR), and <literal>!</> (NOT).  Parentheses
+   can be used as needed.  For example,
+   the query <literal>1&amp;(2|3)</> matches arrays that contain 1
+   and also contain either 2 or 3.
+  </para>
+ </sect2>
+
+ <sect2>
+  <title>Index Support</title>
+
+  <para>
+   <filename>intarray</> provides index support for the
+   <literal>&amp;&amp;</>, <literal>@&gt;</>, <literal>&lt;@</>,
+   and <literal>@@</> operators, as well as regular array equality.
+   The implementation uses an RD-tree data structure with
+   built-in lossy compression.
+  </para>
+
+  <para>
+   Two GiST index operator classes are provided:
+   <literal>gist__int_ops</> (used by default) is suitable for
+   small and medium-size arrays, while
+   <literal>gist__intbig_ops</> uses a larger signature and is more
+   suitable for indexing large arrays.
+  </para>
+
+  <para>
+   There is also a non-default GIN operator class
+   <literal>gin__int_ops</>.
+  </para>
+
+  <para>
+   The choice between GiST and GIN indexing depends on the relative
+   performance characteristics of GiST and GIN, which are discussed elsewhere.
+   As a rule of thumb, a GIN index is faster to search than a GiST index, but
+   slower to build or update; so GIN is better suited for static data and GiST
+   for often-updated data.
+  </para>
 </sect2>

 <sect2>
  <title>Example</title>

  <programlisting>
-CREATE TABLE message (mid INT NOT NULL,sections INT[]);
-CREATE TABLE message_section_map (mid INT NOT NULL,sid INT NOT NULL);
+-- a message can be in one or more <quote>sections</>
+CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);

-- create indices
-CREATE unique index message_key ON message ( mid );
-CREATE unique index message_section_map_key2 ON message_section_map (sid, mid );
-CREATE INDEX message_rdtree_idx ON message USING GIST ( sections gist__int_ops);
+-- create specialized index
+CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops);

-- select some messages with section in 1 OR 2 - OVERLAP operator
-SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}'; 
+-- select messages in section 1 OR 2 - OVERLAP operator
+SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';

-- select messages contains in sections 1 AND 2 - CONTAINS operator
+-- select messages in sections 1 AND 2 - CONTAINS operator
 SELECT message.mid FROM message WHERE message.sections @&gt; '{1,2}';
-- the same, CONTAINED operator
-SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
+
+-- the same, using QUERY operator
+SELECT message.mid FROM message WHERE message.sections @@ '1&amp;2'::query_int;
  </programlisting>
 </sect2>

 <sect2>
  <title>Benchmark</title>
+
  <para>
-  subdirectory bench contains benchmark suite.
+   The source directory <filename>contrib/intarray/bench</> contains a
+   benchmark test suite.  To run:
  </para>
+
  <programlisting>
-  cd ./bench
-  1. createdb TEST
-  2. psql TEST &lt; ../_int.sql
-  3. ./create_test.pl | psql TEST
-  4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
-                  with/without RD-Tree. Run script without arguments to
-                  see availbale options.
-
-     a)test without RD-Tree (OR)
-       ./bench.pl -d TEST -c -s 1,2 -v
-     b)test with RD-Tree
-       ./bench.pl -d TEST -c -s 1,2 -v -r
-
-     BENCHMARKS:
-    
-     Size of table &lt;message>: 200000
-     Size of table &lt;message_section_map>: 269133
-    
-     Distribution of messages by sections:
-    
-     section 0: 74377 messages
-     section 1: 16284 messages
-     section 50: 1229 messages
-     section 99: 683 messages
-    
-     old - without RD-Tree support,
-     new - with RD-Tree
-    
-     +----------+---------------+----------------+
-     |Search set|OR, time in sec|AND, time in sec|
-     |          +-------+-------+--------+-------+
-     |          |  old  |  new  |   old  |  new  |
-     +----------+-------+-------+--------+-------+
-     |         1|  0.625|  0.101|       -|      -|
-     +----------+-------+-------+--------+-------+
-     |        99|  0.018|  0.017|       -|      -|
-     +----------+-------+-------+--------+-------+
-     |       1,2|  0.766|  0.133|   0.628|  0.045|
-     +----------+-------+-------+--------+-------+
-     | 1,2,50,65|  0.794|  0.141|   0.030|  0.006|
-     +----------+-------+-------+--------+-------+
+   cd .../bench
+   createdb TEST
+   psql TEST &lt; ../_int.sql
+   ./create_test.pl | psql TEST
+   ./bench.pl
  </programlisting>
+
+  <para>
+   The <filename>bench.pl</> script has numerous options, which
+   are displayed when it is run without any arguments.
+  </para>
 </sect2>

 <sect2>
  <title>Authors</title>
+
  <para>
-   All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and Oleg
-   Bartunov (<email>oleg@sai.msu.su</email>). See
+   All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and
+   Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
   <ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for
   additional information. Andrey Oktyabrski did a great work on adding new
   functions and operations.
@ -284,4 +312,3 @@ SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
 </sect2>

 </sect1>
-
--- a/doc/src/sgml/isn.sgml
+++ b/doc/src/sgml/isn.sgml
@ -1,429 +1,299 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/isn.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="isn">
 <title>isn</title>
- 
+
 <indexterm zone="isn">
  <primary>isn</primary>
 </indexterm>

 <para>
-  The <literal>isn</literal> module adds data types for the following 
-  international-standard namespaces: EAN13, UPC, ISBN (books), ISMN (music), 
-  and ISSN (serials). This module is inspired by Garrett A. Wollman's 
-  isbn_issn code.
- </para>
- <para>
-  This module validates, and automatically adds the correct 
-  hyphenations to the numbers. Also, it supports the new ISBN-13 
-  numbers to be used starting in January 2007.
+  The <filename>isn</filename> module provides data types for the following
+  international product numbering standards: EAN13, UPC, ISBN (books), ISMN
+  (music), and ISSN (serials).  Numbers are validated on input, and correctly
+  hyphenated on output.
 </para>

- <para>
-  Premises:
- </para>
-
- <orderedlist>
-  <listitem>
-   <para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers</para>
-  </listitem>
-  <listitem>
-   <para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)</para>
-  </listitem>
-  <listitem>
-   <para>some ISBN13 numbers can be displayed as ISBN</para>
-  </listitem>
-  <listitem>
-   <para>some ISMN13 numbers can be displayed as ISMN</para>
-  </listitem>
-  <listitem>
-   <para>some ISSN13 numbers can be displayed as ISSN</para>
-  </listitem>
-  <listitem>
-   <para>all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers</para>
-  </listitem>
- </orderedlist>
-
- <note>
-  <para>
-   All types are internally represented as 64 bit integers, 
-   and internally all are consistently interchangeable.
-  </para>
- </note>
- <note>
-  <para>
-   We have two operator classes (for btree and for hash) so each data type
-   can be indexed for faster access.
-  </para>
- </note>
-
 <sect2>
  <title>Data types</title>
-  
-  <para>
-   We have the following data types:
-  </para>

  <table>
-   <title>Data types</title>
+   <title><filename>isn</filename> data types</title>
   <tgroup cols="2">
    <thead>
     <row>
-      <entry><para>Data type</para></entry>
-      <entry><para>Description</para></entry>
+      <entry>Data type</entry>
+      <entry>Description</entry>
     </row>
    </thead>
+
    <tbody>
     <row>
-      <entry><para><literal>EAN13</literal></para></entry>
+      <entry><type>EAN13</type></entry>
      <entry>
-       <para>
-        European Article Numbers. This type will always show the EAN13-display 
-        format. Te output function for this is <literal>ean13_out()</literal>
-       </para>
+       European Article Numbers, always displayed in the EAN13 display format
      </entry>
     </row>

     <row>
-      <entry><para><literal>ISBN13</literal></para></entry>
+      <entry><type>ISBN13</type></entry>
      <entry>
-       <para>
-        For International Standard Book Numbers to be displayed in 
-          the new EAN13-display format.
-       </para>
+       International Standard Book Numbers to be displayed in
+       the new EAN13 display format
      </entry>
     </row>

     <row>
-      <entry><para><literal>ISMN13</literal></para></entry>
+      <entry><type>ISMN13</type></entry>
      <entry>
-       <para>
-        For International Standard Music Numbers to be displayed in
-        the new EAN13-display format.
-       </para>
+       International Standard Music Numbers to be displayed in
+       the new EAN13 display format
      </entry>
     </row>
     <row>
-      <entry><para><literal>ISSN13</literal></para></entry>
+      <entry><type>ISSN13</type></entry>
      <entry>
-       <para>
-        For International Standard Serial Numbers to be displayed in the new 
-        EAN13-display format.
-       </para>
+       International Standard Serial Numbers to be displayed in the new
+       EAN13 display format
      </entry>
     </row>
     <row>
-      <entry><para><literal>ISBN</literal></para></entry>
+      <entry><type>ISBN</type></entry>
      <entry>
-       <para>
-        For International Standard Book Numbers to be displayed in the current 
-        short-display format.
-       </para>
+       International Standard Book Numbers to be displayed in the old
+       short display format
      </entry>
     </row>
     <row>
-      <entry><para><literal>ISMN</literal></para></entry>
+      <entry><type>ISMN</type></entry>
      <entry>
-       <para>
-        For International Standard Music Numbers to be displayed in the 
-        current short-display format.
-       </para>
+       International Standard Music Numbers to be displayed in the
+       old short display format
      </entry>
     </row>
     <row>
-      <entry><para><literal>ISSN</literal></para></entry>
+      <entry><type>ISSN</type></entry>
      <entry>
-       <para>
-        For International Standard Serial Numbers to be displayed in the 
-        current short-display format. These types will display the short 
-        version of the ISxN (ISxN 10) whenever it's possible, and it will 
-        show ISxN 13 when it's impossible to show the short version.  The 
-        output function to do this is <literal>isn_out()</literal>
-       </para>
+       International Standard Serial Numbers to be displayed in the
+       old short display format
      </entry>
     </row>
     <row>
-      <entry><para><literal>UPC</literal></para></entry>
+      <entry><type>UPC</type></entry>
      <entry>
-       <para>
-        For Universal Product Codes. UPC numbers are a subset of the EAN13 
-        numbers (they are basically EAN13 without the first '0' digit.)
-        The output function to do this is also <literal>isn_out()</literal>
-       </para>
+       Universal Product Codes
      </entry>
     </row>
    </tbody>
   </tgroup>
  </table>

-  <note>
-   <para>
-    <literal>EAN13</literal>, <literal>ISBN13</literal>, 
-    <literal>ISMN13</literal> and <literal>ISSN13</literal> types will always 
-    display the long version of the ISxN (EAN13). The output function to do 
-    this is <literal>ean13_out()</literal>.
-   </para>
-   <para>
-    The need for these types is just for displaying in different ways the same 
-    data: <literal>ISBN13</literal> is actually the same as 
-    <literal>ISBN</literal>, <literal>ISMN13=ISMN</literal> and 
-    <literal>ISSN13=ISSN</literal>.
-   </para>
-  </note>
- </sect2>
-
- <sect2>
-  <title>Input functions</title>
- 
  <para>
-   We have the following input functions:
+   Some notes:
  </para>

-  <table>
-   <title>Input functions</title>
-   <tgroup cols="2">
-    <thead>
-     <row>
-      <entry>Function</entry>
-      <entry>Description</entry>
-     </row>
-    </thead>
-    <tbody>
-     <row>
-      <entry><para><literal>ean13_in()</literal></para></entry>
-      <entry>
-       <para>
-        To take a string and return an EAN13.
-       </para>
-      </entry>
-     </row>
-
-     <row>
-      <entry><para><literal>isbn_in()</literal></para></entry>
-      <entry>
-       <para>
-        To take a string and return valid ISBN or ISBN13 numbers.
-       </para>
-      </entry>
-     </row>
-
-     <row>
-      <entry><para><literal>ismn_in()</literal></para></entry>
-      <entry>
-       <para>
-        To take a string and return valid ISMN or ISMN13 numbers.
-       </para>
-      </entry>
-     </row>
-
-     <row>
-      <entry><para><literal>issn_in()</literal></para></entry>
-      <entry>
-       <para>
-        To take a string and return valid ISSN or ISSN13 numbers.
-       </para>
-      </entry>
-     </row>
-     <row>
-      <entry><para><literal>upc_in()</literal></para></entry>
-      <entry>
-       <para>
-        To take a string and return an UPC codes.
-       </para>
-      </entry>
-     </row>
-    </tbody>
-   </tgroup>
-  </table>
- </sect2>
-
- <sect2>
-  <title>Casts</title>
-
-  <para>
-   We are able to cast from:
-  </para>
-  <itemizedlist>
+  <orderedlist>
   <listitem>
-    <para>
-     ISBN13 -&gt; EAN13
-    </para>
+    <para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers.</para>
   </listitem>
   <listitem>
-    <para>
-     ISMN13 -&gt; EAN13
-    </para>
+    <para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some
+    are).</para>
   </listitem>
   <listitem>
-    <para>
-     ISSN13 -&gt; EAN13
-    </para>
+    <para>Some ISBN13 numbers can be displayed as ISBN.</para>
   </listitem>
   <listitem>
-    <para>
-     ISBN -&gt; EAN13
-    </para>
+    <para>Some ISMN13 numbers can be displayed as ISMN.</para>
   </listitem>
   <listitem>
-    <para>
-     ISMN -&gt; EAN13
-    </para>
+    <para>Some ISSN13 numbers can be displayed as ISSN.</para>
   </listitem>
   <listitem>
-    <para>
-     ISSN -&gt; EAN13
-    </para>
+    <para>UPC numbers are a subset of the EAN13 numbers (they are basically
+    EAN13 without the first <literal>0</> digit).</para>
   </listitem>
   <listitem>
-    <para>
-     UPC  -&gt; EAN13
-    </para>
+    <para>All UPC, ISBN, ISMN and ISSN numbers can be represented as EAN13
+    numbers.</para>
   </listitem>
+  </orderedlist>
+
+  <para>
+   Internally, all these types use the same representation (a 64-bit
+   integer), and all are interchangeable.  Multiple types are provided
+   to control display formatting and to permit tighter validity checking
+   of input that is supposed to denote one particular type of number.
+  </para>
+
+  <para>
+   The <type>ISBN</>, <type>ISMN</>, and <type>ISSN</> types will display the
+   short version of the number (ISxN 10) whenever it's possible, and will show
+   ISxN 13 format for numbers that do not fit in the short version.
+   The <type>EAN13</type>, <type>ISBN13</type>, <type>ISMN13</type> and
+   <type>ISSN13</type> types will always display the long version of the ISxN
+   (EAN13).
+  </para>
+ </sect2>
+
+ <sect2>
+  <title>Casts</title>
+
+  <para>
+   The <filename>isn</> module provides the following pairs of type casts:
+  </para>
+
+  <itemizedlist>
   <listitem>
    <para>
-     ISBN &lt;-&gt; ISBN13
+     ISBN13 &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     ISMN &lt;-&gt; ISMN13
+     ISMN13 &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     ISSN &lt;-&gt; ISSN13
+     ISSN13 &lt;=&gt; EAN13
    </para>
   </listitem>
-  </itemizedlist>
- </sect2>
-
- <sect2>
-  <title>C API</title>
-  <para>
-   The C API is implemented as:
-  </para>
-  <programlisting>
-   extern Datum isn_out(PG_FUNCTION_ARGS);
-   extern Datum ean13_out(PG_FUNCTION_ARGS);
-   extern Datum ean13_in(PG_FUNCTION_ARGS);
-   extern Datum isbn_in(PG_FUNCTION_ARGS);
-   extern Datum ismn_in(PG_FUNCTION_ARGS);
-   extern Datum issn_in(PG_FUNCTION_ARGS);
-   extern Datum upc_in(PG_FUNCTION_ARGS);
-  </programlisting>
-
-  <para>
-   On success:
-  </para>
-  <itemizedlist>
   <listitem>
    <para>
-     <literal>isn_out()</literal> takes any of our types and returns a string containing 
-  the shortes possible representation of the number.
+     ISBN &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>ean13_out()</literal> takes any of our types and returns the 
-  EAN13 (long) representation of the number.
+     ISMN &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>ean13_in()</literal> takes a string and return a EAN13. Which, as stated in (2) 
-  could or could not be any of our types, but it certainly is an EAN13 
-  number. Only if the string is a valid EAN13 number, otherwise it fails.
+     ISSN &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>isbn_in()</literal> takes a string and return an ISBN/ISBN13. Only if the string
-  is really a ISBN/ISBN13, otherwise it fails.
+     UPC  &lt;=&gt; EAN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>ismn_in()</literal> takes a string and return an ISMN/ISMN13. Only if the string
-  is really a ISMN/ISMN13, otherwise it fails.
+     ISBN &lt;=&gt; ISBN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>issn_in()</literal> takes a string and return an ISSN/ISSN13. Only if the string
-  is really a ISSN/ISSN13, otherwise it fails.
+     ISMN &lt;=&gt; ISMN13
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>upc_in()</literal> takes a string and return an UPC. Only if the string is
-  really a UPC, otherwise it fails.
+     ISSN &lt;=&gt; ISSN13
    </para>
   </listitem>
  </itemizedlist>

  <para>
-   (on failure, the functions 'ereport' the error)
+   When casting from <type>EAN13</> to another type, there is a run-time
+   check that the value is within the domain of the other type, and an error
+   is thrown if not.  The other casts are simply relabelings that will
+   always succeed.
  </para>
 </sect2>

 <sect2>
-  <title>Testing functions</title>
+  <title>Functions and Operators</title>
+
+  <para>
+   The <filename>isn</> module provides the standard comparison operators,
+   plus btree and hash indexing support for all these datatypes.  In
+   addition there are several specialized functions.  In this table,
+   <type>isn</> means any one of the module's data types:
+  </para>
+
  <table>
-   <title>Testing functions</title>
-   <tgroup cols="2">
+   <title><filename>isn</> functions</title>
+   <tgroup cols="3">
    <thead>
     <row>
-      <entry><para>Function</para></entry>
-      <entry><para>Description</para></entry>
+      <entry>Function</entry>
+      <entry>Returns</entry>
+      <entry>Description</entry>
     </row>
    </thead>
+
    <tbody>
     <row>
-      <entry><para><literal>isn_weak(boolean)</literal></para></entry>
-      <entry><para>Sets the weak input mode.</para></entry>
+      <entry><function>isn_weak(boolean)</function></entry>
+      <entry><type>boolean</type></entry>
+      <entry>Sets the weak input mode (returns new setting)</entry>
     </row>
     <row>
-      <entry><para><literal>isn_weak()</literal></para></entry>
-      <entry><para>Gets the current status of the weak mode.</para></entry>
+      <entry><function>isn_weak()</function></entry>
+      <entry><type>boolean</type></entry>
+      <entry>Gets the current status of the weak mode</entry>
     </row>
     <row>
-      <entry><para><literal>make_valid()</literal></para></entry>
-      <entry><para>Validates an invalid number (deleting the invalid flag).</para></entry>
+      <entry><function>make_valid(isn)</function></entry>
+      <entry><type>isn</type></entry>
+      <entry>Validates an invalid number (clears the invalid flag)</entry>
     </row>
     <row>
-      <entry><para><literal>is_valid()</literal></para></entry>
-      <entry><para>Checks for the invalid flag prsence.</para></entry>
+      <entry><function>is_valid(isn)</function></entry>
+      <entry><type>boolean</type></entry>
+      <entry>Checks for the presence of the invalid flag</entry>
     </row>
    </tbody>
   </tgroup>
  </table>

  <para>
-   <literal>Weak</literal> mode is used to be able to insert invalid data to
-   a table. Invalid as in the check digit being wrong, not missing numbers.
+   <firstterm>Weak</firstterm> mode is used to be able to insert invalid data
+   into a table. Invalid means the check digit is wrong, not that there are
+   missing numbers.
  </para>
+
  <para>
   Why would you want to use the weak mode? Well, it could be that
   you have a huge collection of ISBN numbers, and that there are so many of
   them that for weird reasons some have the wrong check digit (perhaps the
-   numbers where scanned from a printed list and the OCR got the numbers wrong,
-   perhaps the numbers were manually captured... who knows.) Anyway, the thing
-   is you might want to clean the mess up, but you still want to be able to have
-   all the numbers in your database and maybe use an external tool to access
-   the invalid numbers in the database so you can verify the information and
-   validate it more easily; as selecting all the invalid numbers in the table.
+   numbers were scanned from a printed list and the OCR got the numbers wrong,
+   perhaps the numbers were manually captured... who knows). Anyway, the point
+   is you might want to clean the mess up, but you still want to be able to
+   have all the numbers in your database and maybe use an external tool to
+   locate the invalid numbers in the database so you can verify the
+   information and validate it more easily; so for example you'd want to
+   select all the invalid numbers in the table.
  </para>
+
  <para>
   When you insert invalid numbers in a table using the weak mode, the number
-   will be inserted with the corrected check digit, but it will be flagged 
-   with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
+   will be inserted with the corrected check digit, but it will be displayed
+   with an exclamation mark (<literal>!</>) at the end, for example
+   <literal>0-11-000322-5!</>.  This invalid marker can be checked with
+   the <function>is_valid</> function and cleared with the
+   <function>make_valid</> function.
  </para>
+
+  <para>
+   You can also force the insertion of invalid numbers even when not in the
+   weak mode, by appending the <literal>!</> character at the end of the
+   number.
+  </para>
+
  <para>
-   You can also force the insertion of invalid numbers even not in the weak mode,
-   appending the '!' character at the end of the number.
+   Another special feature is that during input, you can write
+   <literal>?</> in place of the check digit, and the correct check digit
+   will be inserted automatically.
  </para>
 </sect2>

 <sect2>
  <title>Examples</title>
+
  <programlisting>
 --Using the types directly:
 SELECT isbn('978-0-393-04002-9');
@ -431,18 +301,18 @@ SELECT isbn13('0901690546');
 SELECT issn('1436-4522');

 --Casting types:
-- note that you can only cast from ean13 to other type when the casted 
-- number would be valid in the realm of the casted type; 
-- thus, the following will NOT work: select isbn(ean13('0220356483481')); 
-- but these will: 
-SELECT upc(ean13('0220356483481')); 
-SELECT ean13(upc('220356483481')); 
+-- note that you can only cast from ean13 to another type when the
+-- number would be valid in the realm of the target type;
+-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
+-- but these will:
+SELECT upc(ean13('0220356483481'));
+SELECT ean13(upc('220356483481'));

 --Create a table with a single column to hold ISBN numbers:
-CREATE TABLE test ( id isbn );
+CREATE TABLE test (id isbn);
 INSERT INTO test VALUES('9780393040029');

--Automatically calculating check digits (observe the '?'):
+--Automatically calculate check digits (observe the '?'):
 INSERT INTO test VALUES('220500896?');
 INSERT INTO test VALUES('978055215372?');

@ -455,10 +325,10 @@ INSERT INTO test VALUES('978-0-11-000533-4');
 INSERT INTO test VALUES('9780141219307');
 INSERT INTO test VALUES('2-205-00876-X');
 SELECT isn_weak(false);
-  
+
 SELECT id FROM test WHERE NOT is_valid(id);
-UPDATE test SET id=make_valid(id) WHERE id = '2-205-00876-X!';
-  
+UPDATE test SET id = make_valid(id) WHERE id = '2-205-00876-X!';
+
 SELECT * FROM test;

 SELECT isbn13(id) FROM test;
@ -467,8 +337,9 @@ SELECT isbn13(id) FROM test;

 <sect2>
  <title>Bibliography</title>
+
  <para>
-   The information to implement this module was collected through
+   The information to implement this module was collected from
   several sites, including:
  </para>
  <programlisting>
@ -477,26 +348,33 @@ SELECT isbn13(id) FROM test;
   http://www.ismn-international.org/
   http://www.wikipedia.org/
  </programlisting>
+
  <para>
-   the prefixes used for hyphenation where also compiled from:
+   The prefixes used for hyphenation were also compiled from:
  </para>
  <programlisting>
   http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
   http://www.isbn-international.org/en/identifiers.html
   http://www.ismn-international.org/ranges.html
  </programlisting>
+
  <para>
   Care was taken during the creation of the algorithms and they
   were meticulously verified against the suggested algorithms
   in the official ISBN, ISMN, ISSN User Manuals.
  </para>
 </sect2>
- 
+
 <sect2>
  <title>Author</title>
  <para>
   Germán Méndez Bravo (Kronuz), 2004 - 2006
  </para>
+
+  <para>
+   This module was inspired by Garrett A. Wollman's
+   isbn_issn code.
+  </para>
 </sect2>
-</sect1>

+</sect1>
--- a/doc/src/sgml/ltree.sgml
+++ b/doc/src/sgml/ltree.sgml
--- a/doc/src/sgml/oid2name.sgml
+++ b/doc/src/sgml/oid2name.sgml
@ -1,70 +1,157 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/oid2name.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="oid2name">
 <title>oid2name</title>
- 
+
 <indexterm zone="oid2name">
  <primary>oid2name</primary>
 </indexterm>

 <para>
-  This utility allows administrators to examine the file structure used by
-  PostgreSQL.  To make use of it, you need to be familiar with the file
-  structure, which is described in <xref linkend="storage">.
+  <application>oid2name</> is a utility program that helps administrators to
+  examine the file structure used by PostgreSQL.  To make use of it, you need
+  to be familiar with the database file structure, which is described in
+  <xref linkend="storage">.
 </para>
- 
- <sect2>
-  <title>Overview</title>
-  <para>
-   <literal>oid2name</literal> connects to the database and extracts OID, 
-   filenode, and table name information.  You can also have it show database 
-   OIDs and tablespace OIDs.
-  </para>
+
+ <note>
  <para>
-   When displaying specific tables, you can select which tables to show by
-   using -o, -f and -t.  The first switch takes an OID, the second takes
-   a filenode, and the third takes a tablename (actually, it's a LIKE
-   pattern, so you can use things like "foo%").  Note that you can use as many
-   of these switches as you like, and the listing will include all objects
-   matched by any of the switches.  Also note that these switches can only
-   show objects in the database given in -d.
+   The name <quote>oid2name</> is historical, and is actually rather
+   misleading, since most of the time when you use it, you will really
+   be concerned with tables' filenode numbers (which are the file names
+   visible in the database directories).  Be sure you understand the
+   difference between table OIDs and table filenodes!
  </para>
+ </note>
+
+ <sect2>
+  <title>Overview</title>
+
  <para>
-   If you don't give any of -o, -f or -t it will dump all the tables in the
-   database given in -d.  If you don't give -d, it will show a database
-   listing.  Alternatively you can give -s to get a tablespace listing.
+   <application>oid2name</application> connects to a target database and
+   extracts OID, filenode, and/or table name information.  You can also have
+   it show database OIDs or tablespace OIDs.  The program is controlled by
+   a large number of command-line switches, as shown in
+   <xref linkend="oid2name-switches">.
  </para>
-  <table>
-   <title>Additional switches</title>
+
+  <table id="oid2name-switches">
+   <title><application>oid2name</> switches</title>
   <tgroup cols="2">
+    <thead>
+     <row>
+      <entry>Switch</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
    <tbody>
     <row>
-      <entry><literal>-i</literal></entry>
-      <entry>include indexes and sequences in the database listing.</entry>
+      <entry><literal>-o</literal> <replaceable>oid</></entry>
+      <entry>show info for table with OID <replaceable>oid</></entry>
     </row>
+
     <row>
-      <entry><literal>-x</literal></entry>
-      <entry>display more information about each object shown: tablespace name, 
-       schema name, OID.
-      </entry>
+      <entry><literal>-f</literal> <replaceable>filenode</></entry>
+      <entry>show info for table with filenode <replaceable>filenode</></entry>
+     </row>
+
+     <row>
+      <entry><literal>-t</literal> <replaceable>tablename_pattern</></entry>
+      <entry>show info for table(s) matching <replaceable>tablename_pattern</></entry>
+     </row>
+
+     <row>
+      <entry><literal>-s</literal></entry>
+      <entry>show tablespace OIDs</entry>
     </row>
+
     <row>
      <entry><literal>-S</literal></entry>
-      <entry>also show system objects (those in information_schema, pg_toast 
-       and pg_catalog schemas)
+      <entry>include system objects (those in
+       <literal>information_schema</literal>, <literal>pg_toast</literal>
+       and <literal>pg_catalog</literal> schemas)
      </entry>
     </row>
+
+     <row>
+      <entry><literal>-i</literal></entry>
+      <entry>include indexes and sequences in the listing</entry>
+     </row>
+
+     <row>
+      <entry><literal>-x</literal></entry>
+      <entry>display more information about each object shown: tablespace name,
+       schema name, and OID
+      </entry>
+     </row>
+
     <row>
      <entry><literal>-q</literal></entry>
-      <entry>don't display headers(useful for scripting)</entry>
+      <entry>omit headers (useful for scripting)</entry>
+     </row>
+
+     <row>
+      <entry><literal>-d</literal> <replaceable>database</></entry>
+      <entry>database to connect to</entry>
+     </row>
+
+     <row>
+      <entry><literal>-H</literal> <replaceable>host</></entry>
+      <entry>database server's host</entry>
+     </row>
+
+     <row>
+      <entry><literal>-p</literal> <replaceable>port</></entry>
+      <entry>database server's port</entry>
+     </row>
+
+     <row>
+      <entry><literal>-U</literal> <replaceable>username</></entry>
+      <entry>username to connect as</entry>
+     </row>
+
+     <row>
+      <entry><literal>-P</literal> <replaceable>password</></entry>
+      <entry>password (deprecated &mdash; putting this on the command line
+       is a security hazard)</entry>
     </row>
    </tbody>
   </tgroup>
  </table>
+
+  <para>
+   To display specific tables, select which tables to show by
+   using <literal>-o</>, <literal>-f</> and/or <literal>-t</>.
+   <literal>-o</> takes an OID,
+   <literal>-f</> takes a filenode,
+   and <literal>-t</> takes a tablename (actually, it's a LIKE
+   pattern, so you can use things like <literal>foo%</>).
+   You can use as many
+   of these switches as you like, and the listing will include all objects
+   matched by any of the switches.  But note that these switches can only
+   show objects in the database given by <literal>-d</>.
+  </para>
+
+  <para>
+   If you don't give any of <literal>-o</>, <literal>-f</> or <literal>-t</>,
+   but do give <literal>-d</>, it will list all tables in the database
+   named by <literal>-d</>.  In this mode, the <literal>-S</> and
+   <literal>-i</> switches control what gets listed.
+  </para>
+
+  <para>
+   If you don't give <literal>-d</> either, it will show a listing of database
+   OIDs.  Alternatively you can give <literal>-s</> to get a tablespace
+   listing.
+  </para>
 </sect2>
- 
+
 <sect2>
  <title>Examples</title>
- 
+
  <programlisting>
+$ # what's in this database server, anyway?
 $ oid2name
 All databases:
    Oid  Database Name  Tablespace
@ -83,7 +170,8 @@ All tablespaces:
  155151         fastdisk
  155152          bigdisk

-$ cd $PGDATA/17228
+$ # OK, let's look into database alvherre
+$ cd $PGDATA/base/17228

 $ # get top 10 db objects in the default tablespace, ordered by size
 $ ls -lS * | head -10
@ -98,6 +186,7 @@ $ ls -lS * | head -10
 -rw-------  1 alvherre alvherre    163840 sep 14 09:50 16699
 -rw-------  1 alvherre alvherre    122880 sep  6 17:51 16751

+$ # I wonder what file 155173 is ...
 $ oid2name -d alvherre -f 155173
 From database "alvherre":
  Filenode  Table Name
@ -112,7 +201,7 @@ From database "alvherre":
    155173       accounts
   1155291  accounts_pkey

-$ # you can also mix the options, and have more details
+$ # you can mix the options, and get more details with -x
 $ oid2name -d alvherre -t accounts -f 1155291 -x
 From database "alvherre":
  Filenode     Table Name      Oid  Schema  Tablespace
@ -157,7 +246,7 @@ $ ls -d 155151/*
 155151/17228/  155151/PG_VERSION

 $ # Oh, what was database 17228 again?
-$ oid2name   
+$ oid2name
 All databases:
    Oid  Database Name  Tablespace
 ----------------------------------
@ -178,28 +267,25 @@ From database "alvherre":
  Filenode  Table Name
 ----------------------
    155156         foo
-
-$ # end of sample session.
  </programlisting>
+ </sect2>
+
+ <sect2>
+  <title>Limitations</title>

  <para>
-   You can also get approximate size data for each object using psql.  For
-   example, 
-  </para>
-  <programlisting>
-   SELECT relpages, relfilenode, relname FROM pg_class ORDER BY relpages DESC;
-  </programlisting>
-  <para>
-   Each page is typically 8k.  Relpages is updated by VACUUM.
+   <application>oid2name</> requires a running database server with
+   non-corrupt system catalogs.  It is therefore of only limited use
+   for recovering from catastrophic database corruption situations.
  </para>
 </sect2>
- 
+
 <sect2>
  <title>Author</title>
+
  <para>
-   b. palmer, <email>bpalmer@crimelabs.net</email>
+   B. Palmer <email>bpalmer@crimelabs.net</email>
  </para>
 </sect2>

 </sect1>
-
--- a/doc/src/sgml/pageinspect.sgml
+++ b/doc/src/sgml/pageinspect.sgml
@ -1,124 +1,170 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pageinspect.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->

 <sect1 id="pageinspect">
 <title>pageinspect</title>
- 
+
 <indexterm zone="pageinspect">
  <primary>pageinspect</primary>
 </indexterm>

 <para>
-  The functions in this module allow you to inspect the contents of data pages
-  at a low level, for debugging purposes.
+  The <filename>pageinspect</> module provides functions that allow you to
+  inspect the contents of database pages at a low level, which is useful for
+  debugging purposes.  All of these functions may be used only by superusers.
 </para>

 <sect2>
-  <title>Functions included</title>
-
-  <itemizedlist>
-   <listitem>
-    <para>
-     <literal>get_raw_page</literal> reads one block of the named table and returns a copy as a
-     bytea field. This allows a single time-consistent copy of the block to be
-     made. Use of this functions is restricted to superusers.
-    </para>
-   </listitem>
+  <title>Functions</title>

-   <listitem>
-    <para>
-     <literal>page_header</literal> shows fields which are common to all PostgreSQL heap and index
-     pages. Use of this function is restricted to superusers.
-    </para>
-    <para>
-     A page image obtained with <literal>get_raw_page</literal> should be passed as argument:
-    </para>
-    <programlisting>
-regression=# SELECT * FROM page_header(get_raw_page('pg_class',0));
-     lsn    | tli | flags | lower | upper | special | pagesize | version | prune_xid
-      -----------+-----+-------+-------+-------+---------+----------+---------+-----------
-        0/24A1B50 |   1 |     1 |   232 |   368 |    8192 |     8192 |       4 |         0
-    </programlisting>
-    <para>
-     The returned columns correspond to the fields in the PageHeaderData struct.
-     See src/include/storage/bufpage.h for more details.
-    </para>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>heap_page_items</literal> shows all line pointers on a heap page.  For those line
-     pointers that are in use, tuple headers are also shown. All tuples are
-     shown, whether or not the tuples were visible to an MVCC snapshot at the
-     time the raw page was copied. Use of this function is restricted to
-     superusers.
-    </para>
-    <para>
-     A heap page image obtained with <literal>get_raw_page</literal> should be passed as argument:
-    </para>
-    <programlisting>
-        test=# SELECT * FROM heap_page_items(get_raw_page('pg_class',0));
-    </programlisting>
-    <para>
-     See src/include/storage/itemid.h and src/include/access/htup.h for
-     explanations of the fields returned.
-    </para>
-   </listitem>
+  <variablelist>
+   <varlistentry>
+    <term>
+     <function>get_raw_page(text, int) returns bytea</function>
+    </term>

-   <listitem>
-    <para>
-     <literal>bt_metap()</literal> returns information about the btree index metapage:
-    </para>
-    <programlisting>
-        test=> SELECT * FROM bt_metap('pg_cast_oid_index');
-        -[ RECORD 1 ]-----
-        magic     | 340322
-        version   | 2
-        root      | 1
-        level     | 0
-        fastroot  | 1
-        fastlevel | 0
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>bt_page_stats()</literal> shows information about single btree pages:
-    </para>
-    <programlisting>
-        test=> SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-        -[ RECORD 1 ]-+-----
-        blkno         | 1
-        type          | l
-        live_items    | 256
-        dead_items    | 0
-        avg_item_size | 12
-        page_size     | 8192
-        free_size     | 4056
-        btpo_prev     | 0
-        btpo_next     | 0
-        btpo          | 0
-        btpo_flags    | 3
-    </programlisting>
-   </listitem>
-
-   <listitem>
-    <para>
-     <literal>bt_page_items()</literal> returns information about specific items on btree pages:
+    <listitem>
+     <para>
+      <function>get_raw_page</function> reads the specified block of the named
+      table and returns a copy as a <type>bytea</> value.  This allows a
+      single time-consistent copy of the block to be obtained.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>page_header(bytea) returns record</function>
+    </term>
+
+    <listitem>
+     <para>
+      <function>page_header</function> shows fields that are common to all
+      <productname>PostgreSQL</> heap and index pages.
+     </para>
+
+     <para>
+      A page image obtained with <function>get_raw_page</function> should be
+      passed as argument.  For example:
+     </para>
+     <programlisting>
+test=# SELECT * FROM page_header(get_raw_page('pg_class', 0));
+    lsn    | tli | flags | lower | upper | special | pagesize | version | prune_xid
+-----------+-----+-------+-------+-------+---------+----------+---------+-----------
+ 0/24A1B50 |   1 |     1 |   232 |   368 |    8192 |     8192 |       4 |         0
+     </programlisting>
+
+     <para>
+      The returned columns correspond to the fields in the
+      <structname>PageHeaderData</> struct.
+      See <filename>src/include/storage/bufpage.h</> for details.
    </para>
-    <programlisting>
-        test=> SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
-         itemoffset |  ctid   | itemlen | nulls | vars |    data
-        ------------+---------+---------+-------+------+-------------
-                  1 | (0,1)   |      12 | f     | f    | 23 27 00 00
-                  2 | (0,2)   |      12 | f     | f    | 24 27 00 00
-                  3 | (0,3)   |      12 | f     | f    | 25 27 00 00
-                  4 | (0,4)   |      12 | f     | f    | 26 27 00 00
-                  5 | (0,5)   |      12 | f     | f    | 27 27 00 00
-                  6 | (0,6)   |      12 | f     | f    | 28 27 00 00
-                  7 | (0,7)   |      12 | f     | f    | 29 27 00 00
-                  8 | (0,8)   |      12 | f     | f    | 2a 27 00 00
-    </programlisting>
-   </listitem>
-  </itemizedlist>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>heap_page_items(bytea) returns setof record</function>
+    </term>
+
+    <listitem>
+     <para>
+      <function>heap_page_items</function> shows all line pointers on a heap
+      page.  For those line pointers that are in use, tuple headers are also
+      shown. All tuples are shown, whether or not the tuples were visible to
+      an MVCC snapshot at the time the raw page was copied.
+     </para>
+     <para>
+      A heap page image obtained with <function>get_raw_page</function> should
+      be passed as argument.  For example:
+     </para>
+     <programlisting>
+test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
+     </programlisting>
+     <para>
+      See <filename>src/include/storage/itemid.h</> and
+      <filename>src/include/access/htup.h</> for explanations of the fields
+      returned.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>bt_metap(text) returns record</function>
+    </term>
+
+    <listitem>
+     <para>
+      <function>bt_metap</function> returns information about a btree
+      index's metapage.  For example:
+     </para>
+     <programlisting>
+test=# SELECT * FROM bt_metap('pg_cast_oid_index');
+-[ RECORD 1 ]-----
+magic     | 340322
+version   | 2
+root      | 1
+level     | 0
+fastroot  | 1
+fastlevel | 0
+     </programlisting>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>bt_page_stats(text, int) returns record</function>
+    </term>
+
+    <listitem>
+     <para>
+      <function>bt_page_stats</function> returns summary information about
+      single pages of btree indexes.  For example:
+     </para>
+     <programlisting>
+test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
+-[ RECORD 1 ]-+-----
+blkno         | 1
+type          | l
+live_items    | 256
+dead_items    | 0
+avg_item_size | 12
+page_size     | 8192
+free_size     | 4056
+btpo_prev     | 0
+btpo_next     | 0
+btpo          | 0
+btpo_flags    | 3
+     </programlisting>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>bt_page_items(text, int) returns setof record</function>
+    </term>
+
+    <listitem>
+     <para>
+      <function>bt_page_items</function> returns detailed information about
+      all of the items on a btree index page.  For example:
+     </para>
+     <programlisting>
+test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
+ itemoffset |  ctid   | itemlen | nulls | vars |    data
+------------+---------+---------+-------+------+-------------
+          1 | (0,1)   |      12 | f     | f    | 23 27 00 00
+          2 | (0,2)   |      12 | f     | f    | 24 27 00 00
+          3 | (0,3)   |      12 | f     | f    | 25 27 00 00
+          4 | (0,4)   |      12 | f     | f    | 26 27 00 00
+          5 | (0,5)   |      12 | f     | f    | 27 27 00 00
+          6 | (0,6)   |      12 | f     | f    | 28 27 00 00
+          7 | (0,7)   |      12 | f     | f    | 29 27 00 00
+          8 | (0,8)   |      12 | f     | f    | 2a 27 00 00
+     </programlisting>
+    </listitem>
+   </varlistentry>
+  </variablelist>
 </sect2>
-</sect1>

+</sect1>
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@ -1,297 +1,291 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbench.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->

 <sect1 id="pgbench">
 <title>pgbench</title>
- 
+
 <indexterm zone="pgbench">
  <primary>pgbench</primary>
 </indexterm>

 <para>
-  <literal>pgbench</literal> is a simple program to run a benchmark test. 
-  <literal>pgbench</literal> is a client application of PostgreSQL and runs 
-  with PostgreSQL only. It performs lots of small and simple transactions 
-  including SELECT/UPDATE/INSERT operations then calculates number of
-  transactions successfully completed within a second (transactions
-  per second, tps). Targeting data includes a table with at least 100k
-  tuples.
+  <application>pgbench</application> is a simple program for running benchmark
+  tests on <productname>PostgreSQL</>.  It runs the same sequence of SQL
+  commands over and over, possibly in multiple concurrent database sessions,
+  and then calculates the average transaction rate (transactions per second).
+  By default, <application>pgbench</application> tests a scenario that is
+  loosely based on TPC-B, involving five <command>SELECT</>,
+  <command>UPDATE</>, and <command>INSERT</> commands per transaction.
+  However, it is easy to test other cases by writing your own transaction
+  script files.
 </para>
+
 <para>
-  Example outputs from pgbench look like:
- </para>
+  Typical output from pgbench looks like:
+
 <programlisting>
-number of clients: 4
-number of transactions per client: 100
-number of processed transactions: 400/400
-tps = 19.875015(including connections establishing)
-tps = 20.098827(excluding connections establishing)
+transaction type: TPC-B (sort of)
+scaling factor: 10
+number of clients: 10
+number of transactions per client: 1000
+number of transactions actually processed: 10000/10000
+tps = 85.184871 (including connections establishing)
+tps = 85.296346 (excluding connections establishing)
 </programlisting>
- <para> Similar program called "JDBCBench" already exists, but it requires
-  Java that may not be available on every platform. Moreover some
-  people concerned about the overhead of Java that might lead
-  inaccurate results. So I decided to write in pure C, and named
-  it "pgbench."
- </para>

- <para>
-  Features of pgbench:
+  The first four lines just report some of the most important parameter
+  settings.  The next line reports the number of transactions completed
+  and intended (the latter being just the product of number of clients
+  and number of transactions); these will be equal unless the run
+  failed before completion.  The last two lines report the TPS rate,
+  figured with and without counting the time to start database sessions.
 </para>
- <itemizedlist>
-  <listitem>
-   <para>
-    pgbench is written in C using libpq only. So it is very portable
-    and easy to install.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    pgbench can simulate concurrent connections using asynchronous
-    capability of libpq. No threading is required.
-   </para>
-  </listitem>
- </itemizedlist>

 <sect2>
  <title>Overview</title>
-  <orderedlist>
-   <listitem>
-    <para>(optional)Initialize database by:</para>
-    <programlisting>
-pgbench -i &lt;dbname&gt;
-    </programlisting>
-    <para>
-     where &lt;dbname&gt; is the name of database. pgbench uses four tables
-     accounts, branches, history and tellers. These tables will be
-     destroyed. Be very careful if you have tables having same
-     names. Default test data contains:
-    </para>
-    <programlisting>
-table           # of tuples
+
+  <para>
+   The default TPC-B-like transaction test requires specific tables to be
+   set up beforehand.  <application>pgbench</> should be invoked with
+   the <literal>-i</> (initialize) option to create and populate these
+   tables.  (When you are testing a custom script, you don't need this
+   step, but will instead need to do whatever setup your test needs.)
+   Initialization looks like:
+
+   <programlisting>
+pgbench -i <optional> <replaceable>other-options</> </optional> <replaceable>dbname</>
+   </programlisting>
+
+   where <replaceable>dbname</> is the name of the already-created
+   database to test in.  (You may also need <literal>-h</>,
+   <literal>-p</>, and/or <literal>-U</> options to specify how to
+   connect to the database server.)
+  </para>
+
+  <caution>
+   <para>
+    <literal>pgbench -i</> creates four tables <structname>accounts</>,
+    <structname>branches</>, <structname>history</>, and
+    <structname>tellers</>, destroying any existing tables of these names.
+    Be very careful to use another database if you have tables having these
+    names!
+   </para>
+  </caution>
+
+  <para>
+   At the default <quote>scale factor</> of 1, the tables initially
+   contain this many rows:
+  </para>
+  <programlisting>
+table           # of rows
 -------------------------
 branches        1
 tellers         10
 accounts        100000
 history         0
-    </programlisting>
-    <para>
-     You can increase the number of tuples by using -s option. branches,
-     tellers and accounts tables are created with a fillfactor which is
-     set using -F option. See below.
-    </para>
-   </listitem>
-   <listitem>
-    <para>Run the benchmark test</para>
-    <programlisting>
-pgbench &lt;dbname&gt;
-    </programlisting>
-    <para>
-     The default configuration is:
-    </para>
-    <programlisting>
-     number of clients: 1
-     number of transactions per client: 10
-    </programlisting>
-   </listitem>
-  </orderedlist>
+  </programlisting>
+  <para>
+   You can (and, for most purposes, probably should) increase the number
+   of rows by using the <literal>-s</> (scale factor) option.  The
+   <literal>-F</> (fillfactor) option might also be used at this point.
+  </para>
+
+  <para>
+   Once you have done the necessary setup, you can run your benchmark
+   with a command that doesn't include <literal>-i</>, that is
+
+   <programlisting>
+pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
+   </programlisting>

-  <table>
-   <title><literal>pgbench</literal> options</title>
+   In nearly all cases, you'll need some options to make a useful test.
+   The most important options are <literal>-c</> (number of clients),
+   <literal>-t</> (number of transactions), and <literal>-f</> (specify
+   a custom script file).  See below for a full list.
+  </para>
+
+  <para>
+   <xref linkend="pgbench-init-options"> shows options that are used
+   during database initialization, while
+   <xref linkend="pgbench-run-options"> shows options that are used
+   while running benchmarks, and
+   <xref linkend="pgbench-common-options"> shows options that are useful
+   in both cases.
+  </para>
+
+  <table id="pgbench-init-options">
+   <title><application>pgbench</application> initialization options</title>
   <tgroup cols="2">
    <thead>
     <row>
-      <entry>Parameter</entry>
+      <entry>Option</entry>
      <entry>Description</entry>
     </row>
    </thead>
+
    <tbody>
     <row>
-      <entry><literal>-h hostname</literal></entry>
+      <entry><literal>-i</literal></entry>
      <entry>
-       <para>
-        hostname where the backend is running. If this option
-        is omitted, pgbench will connect to the localhost via
-        Unix domain socket.
-       </para>
+       Required to invoke initialization mode.
      </entry>
     </row>
     <row>
-      <entry><literal>-p port</literal></entry>
+      <entry><literal>-s</literal> <replaceable>scale_factor</></entry>
      <entry>
-       <para>
-        the port number that the backend is accepting. default is
-        libpq's default, usually 5432.
-       </para>
+       Multiply the number of rows generated by the scale factor.
+       For example, <literal>-s 100</> will imply 10,000,000 rows
+       in the <structname>accounts</> table. Default is 1.
      </entry>
     </row>
     <row>
-      <entry><literal>-c number_of_clients</literal></entry>
+      <entry><literal>-F</literal> <replaceable>fillfactor</></entry>
      <entry>
-       <para>
-        Number of clients simulated. default is 1.
-       </para>
+       Create the <structname>accounts</>, <structname>tellers</> and
+       <structname>branches</> tables with the given fillfactor.
+       Default is 100.
      </entry>
     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="pgbench-run-options">
+   <title><application>pgbench</application> benchmarking options</title>
+   <tgroup cols="2">
+    <thead>
+     <row>
+      <entry>Option</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
     <row>
-      <entry><literal>-t number_of_transactions</literal></entry>
+      <entry><literal>-c</literal> <replaceable>clients</></entry>
      <entry>
-       <para>
-        Number of transactions each client runs. default is 10.
-       </para>
+       Number of clients simulated, that is, number of concurrent database
+       sessions.  Default is 1.
      </entry>
     </row>
     <row>
-      <entry><literal>-s scaling_factor</literal></entry>
+      <entry><literal>-t</literal> <replaceable>transactions</></entry>
      <entry>
-       <para>
-        this should be used with -i (initialize) option.
-        number of tuples generated will be multiple of the
-        scaling factor. For example, -s 100 will imply 10M
-        (10,000,000) tuples in the accounts table.
-        default is 1.  
-       </para>
-       <para>
-        NOTE: scaling factor should be at least
-        as large as the largest number of clients you intend
-        to test; else you'll mostly be measuring update contention.
-        Regular (not initializing) runs using one of the
-        built-in tests will detect scale based on the number of
-        branches in the database.  For custom (-f) runs it can
-        be manually specified with this parameter.
-       </para>
+       Number of transactions each client runs.  Default is 10.
      </entry>
     </row>
     <row>
-      <entry><literal>-D varname=value</literal></entry>
+      <entry><literal>-N</literal></entry>
      <entry>
-       <para>
-        Define a variable. It can be refered to by a script
-        provided by using -f option. Multiple -D options are allowed.
-       </para>
+       Do not update <structname>tellers</> and <structname>branches</>.
+       This will avoid update contention on these tables, but
+       it makes the test case even less like TPC-B.
      </entry>
     </row>
     <row>
-      <entry><literal>-U login</literal></entry>
+      <entry><literal>-S</literal></entry>
      <entry>
-       <para>
-        Specify db user's login name if it is different from
-        the Unix login name.
-       </para>
+       Perform select-only transactions instead of TPC-B-like test.
      </entry>
     </row>
     <row>
-      <entry><literal>-P password</literal></entry>
+      <entry><literal>-f</literal> <replaceable>filename</></entry>
      <entry>
-       <para>
-        Specify the db password. CAUTION: using this option
-        might be a security hole since ps command will
-        show the password. Use this for TESTING PURPOSE ONLY.
-       </para>
+       Read transaction script from <replaceable>filename</>.
+       See below for details.
+       <literal>-N</literal>, <literal>-S</literal>, and <literal>-f</literal>
+       are mutually exclusive.
      </entry>
     </row>
     <row>
      <entry><literal>-n</literal></entry>
      <entry>
-       <para>
-        No vacuuming and cleaning the history table prior to the
-        test is performed.
-       </para>
+       No vacuuming is performed before running the test.
+       This option is <emphasis>necessary</>
+       if you are running a custom test scenario that does not include
+       the standard tables <structname>accounts</>,
+       <structname>branches</>, <structname>history</>, and
+       <structname>tellers</>.
      </entry>
     </row>
     <row>
      <entry><literal>-v</literal></entry>
      <entry>
-       <para>
-        Do vacuuming before testing. This will take some time.
-        With neither -n nor -v, pgbench will vacuum tellers and
-        branches tables only.
-       </para>
+       Vacuum all four standard tables before running the test.
+       With neither <literal>-n</> nor <literal>-v</>, pgbench will vacuum
+       <structname>tellers</> and <structname>branches</> tables, and
+       will remove all entries in <structname>history</>.
      </entry>
     </row>
     <row>
-      <entry><literal>-S</literal></entry>
+      <entry><literal>-D</literal> <replaceable>varname</><literal>=</><replaceable>value</></entry>
      <entry>
-       <para>
-        Perform select only transactions instead of TPC-B.
-       </para>
+       Define a variable for use by a custom script (see below).
+       Multiple <literal>-D</> options are allowed.
      </entry>
     </row>
     <row>
-      <entry><literal>-N</literal></entry>
+      <entry><literal>-C</literal></entry>
      <entry>
-       <para>
-        Do not update "branches" and "tellers". This will
-        avoid heavy update contention on branches and tellers,
-        while it will not make pgbench supporting TPC-B like
-        transactions.
-       </para>
+       Establish a new connection for each transaction, rather than
+       doing it just once per client thread.
+       This is useful to measure the connection overhead.
      </entry>
     </row>
     <row>
-      <entry><literal>-f filename</literal></entry>
+      <entry><literal>-l</literal></entry>
      <entry>
-       <para>
-        Read transaction script from file. Detailed
-        explanation will appear later.
-       </para>
+       Write the time taken by each transaction to a logfile.
+       See below for details.
      </entry>
     </row>
     <row>
-      <entry><literal>-C</literal></entry>
+      <entry><literal>-s</literal> <replaceable>scale_factor</></entry>
      <entry>
-       <para>
-        Establish connection for each transaction, rather than
-        doing it just once at beginning of pgbench in the normal
-        mode. This is useful to measure the connection overhead.
-       </para>
+       Report the specified scale factor in <application>pgbench</>'s
+       output.  With the built-in tests, this is not necessary; the
+       correct scale factor will be detected by counting the number of
+       rows in the <structname>branches</> table.  However, when testing
+       custom benchmarks (<literal>-f</> option), the scale factor
+       will be reported as 1 unless this option is used.
      </entry>
     </row>
     <row>
-      <entry><literal>-l</literal></entry>
+      <entry><literal>-d</literal></entry>
      <entry>
-       <para>
-        Write the time taken by each transaction to a logfile,
-        with the name "pgbench_log.xxx", where xxx is the PID
-        of the pgbench process. The format of the log is:
-       </para>
-       <programlisting>
-        client_id transaction_no time file_no time-epoch time-us
-       </programlisting>
-       <para>
-        where time is measured in microseconds, , the file_no is
-        which test file was used (useful when multiple were
-        specified with -f), and time-epoch/time-us are a
-        UNIX epoch format timestamp followed by an offset
-        in microseconds (suitable for creating a ISO 8601
-        timestamp with a fraction of a second) of when
-        the transaction completed.
-       </para>
-       <para>
-        Here are example outputs:
-       </para>
-       <programlisting>
- 0 199 2241 0 1175850568 995598
- 0 200 2465 0 1175850568 998079
- 0 201 2513 0 1175850569 608
- 0 202 2038 0 1175850569 2663
-       </programlisting>
+       Print debugging output.
      </entry>
     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="pgbench-common-options">
+   <title><application>pgbench</application> common options</title>
+   <tgroup cols="2">
+    <thead>
     <row>
-      <entry><literal>-F fillfactor</literal></entry>
-      <entry>
-       <para>
-        Create tables(accounts, tellers and branches) with the given
-        fillfactor. Default is 100.  This should be used with -i
-        (initialize) option.
-       </para>
-      </entry>
+      <entry>Option</entry>
+      <entry>Description</entry>
     </row>
+    </thead>
+
+    <tbody>
     <row>
-      <entry><literal>-d</literal></entry>
-      <entry>
-       <para>
-        debug option.
-       </para>
-      </entry>
+      <entry><literal>-h</literal> <replaceable>hostname</></entry>
+      <entry>database server's host</entry>
+     </row>
+     <row>
+      <entry><literal>-p</literal> <replaceable>port</></entry>
+      <entry>database server's port</entry>
+     </row>
+     <row>
+      <entry><literal>-U</literal> <replaceable>login</></entry>
+      <entry>username to connect as</entry>
+     </row>
+     <row>
+      <entry><literal>-P</literal> <replaceable>password</></entry>
+      <entry>password (deprecated &mdash; putting this on the command line
+       is a security hazard)</entry>
     </row>
    </tbody>
   </tgroup>
@ -299,138 +293,252 @@ pgbench &lt;dbname&gt;
 </sect2>

 <sect2>
-  <title>What is the "transaction" actually performed in pgbench?</title>
-  <orderedlist>
-   <listitem><para><literal>begin;</literal></para></listitem>
-
-   <listitem><para><literal>update accounts set abalance = abalance + :delta where aid = :aid;</literal></para></listitem>
+  <title>What is the <quote>transaction</> actually performed in pgbench?</title>

-   <listitem><para><literal>select abalance from accounts where aid = :aid;</literal></para></listitem>
-
-   <listitem><para><literal>update tellers set tbalance = tbalance + :delta where tid = :tid;</literal></para></listitem>
-
-   <listitem><para><literal>update branches set bbalance = bbalance + :delta where bid = :bid;</literal></para></listitem>
-
-   <listitem><para><literal>insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta);</literal></para></listitem>
+  <para>
+   The default transaction script issues seven commands per transaction:
+  </para>

-   <listitem><para><literal>end;</literal></para></listitem>
+  <orderedlist>
+   <listitem><para><literal>BEGIN;</literal></para></listitem>
+   <listitem><para><literal>UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;</literal></para></listitem>
+   <listitem><para><literal>SELECT abalance FROM accounts WHERE aid = :aid;</literal></para></listitem>
+   <listitem><para><literal>UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;</literal></para></listitem>
+   <listitem><para><literal>UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;</literal></para></listitem>
+   <listitem><para><literal>INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);</literal></para></listitem>
+   <listitem><para><literal>END;</literal></para></listitem>
  </orderedlist>
+
  <para>
-   If you specify -N, (4) and (5) aren't included in the transaction.
+   If you specify <literal>-N</>, steps 4 and 5 aren't included in the
+   transaction.  If you specify <literal>-S</>, only the <command>SELECT</> is
+   issued.
  </para>
 </sect2>

 <sect2>
-  <title>Script file</title>
+  <title>Custom Scripts</title>
+
  <para>
-   <literal>pgbench</literal> has support for reading a transaction script 
-   from a specified file (<literal>-f</literal> option). This file should 
-   include SQL commands in each line. SQL command consists of multiple lines 
-   are not supported. Empty lines and lines begging with "--" will be ignored.
+   <application>pgbench</application> has support for running custom
+   benchmark scenarios by replacing the default transaction script
+   (described above) with a transaction script read from a file
+   (<literal>-f</literal> option).  In this case a <quote>transaction</>
+   counts as one execution of a script file.  You can even specify
+   multiple scripts (multiple <literal>-f</literal> options), in which
+   case a random one of the scripts is chosen each time a client session
+   starts a new transaction.
  </para>
+
  <para>
-   Multiple <literal>-f</literal> options are allowed. In this case each 
-   transaction is assigned randomly chosen script.
+   The format of a script file is one SQL command per line; multi-line
+   SQL commands are not supported.  Empty lines and lines beginning with
+   <literal>--</> are ignored.  Script file lines can also be
+   <quote>meta commands</>, which are interpreted by <application>pgbench</>
+   itself, as described below.
  </para>
+
  <para>
-   SQL commands can include "meta command" which begins with "\" (back
-   slash). A meta command takes some arguments separted by white
-   spaces. Currently following meta command is supported:
+   There is a simple variable-substitution facility for script files.
+   Variables can be set by the command-line <literal>-D</> option,
+   explained above, or by the meta commands explained below.
+   In addition to any variables preset by <literal>-D</> command-line options,
+   the variable <literal>scale</> is preset to the current scale factor.
+   Once set, a variable's
+   value can be inserted into a SQL command by writing
+   <literal>:</><replaceable>variablename</>.  When running more than
+   one client session, each session has its own set of variables.
  </para>

-  <itemizedlist>
-   <listitem>
-    <para>
-     <literal>\set name operand1 [ operator operand2 ]</literal>
-     - Sets the calculated value using "operand1" "operator"
-     "operand2" to variable "name". If "operator" and "operand2"
-     are omitted, the value of operand1 is set to variable "name". 
-    </para>
-    <para>
-     Example:
-    </para>
-    <programlisting>
+  <para>
+   Script file meta commands begin with a backslash (<literal>\</>).
+   Arguments to a meta command are separated by white space.
+   These meta commands are supported:
+  </para>
+
+  <variablelist>
+   <varlistentry>
+    <term>
+     <literal>\set <replaceable>varname</> <replaceable>operand1</> [ <replaceable>operator</> <replaceable>operand2</> ]</literal>
+    </term>
+
+    <listitem>
+     <para>
+      Sets variable <replaceable>varname</> to a calculated integer value.
+      Each <replaceable>operand</> is either an integer constant or a
+      <literal>:</><replaceable>variablename</> reference to a variable
+      having an integer value.  The <replaceable>operator</> can be
+      <literal>+</>, <literal>-</>, <literal>*</>, or <literal>/</>.
+     </para>
+
+     <para>
+      Example:
+      <programlisting>
 \set ntellers 10 * :scale
-    </programlisting>
-   </listitem>
-   <listitem>
-    <para>
-     <literal>\setrandom name min max</literal>
-     - Assigns random integer to name between min and max
-    </para>
-    <para>
-     Example:
-    </para>
-    <programlisting>
-\setrandom aid 1 100000
-    </programlisting>
-   </listitem>
-   <listitem>
-    <para>
-     Variables can be referred to in SQL comands by adding ":" in front
-     of the varible name.
-    </para>
-    <para>
-     Example:
-    </para>
-    <programlisting>
-SELECT abalance FROM accounts WHERE aid = :aid
-    </programlisting>
-    <para>
-     Variables can also be defined by using -D option.
-    </para>
-   </listitem>
-   <listitem>
-    <para>
-     <literal>\sleep num [us|ms|s]</> - Causes script execution to sleep for the
-     specified duration of microseconds (us), milliseconds (ms) or the default
-     seconds (s).
-    </para>
-    <para>
-     Example:
-    </para>
-    <programlisting>
-\setrandom millisec 1000 2500
-\sleep : millisec ms
-    </programlisting>
-   </listitem>
-  </itemizedlist>
- </sect2>
+      </programlisting>
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal>
+    </term>
+
+    <listitem>
+     <para>
+      Sets variable <replaceable>varname</> to a random integer value
+      between the limits <replaceable>min</> and <replaceable>max</>.
+      Each limit can be either an integer constant or a
+      <literal>:</><replaceable>variablename</> reference to a variable
+      having an integer value.
+     </para>
+
+     <para>
+      Example:
+      <programlisting>
+\setrandom aid 1 :naccounts
+      </programlisting>
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <literal>\sleep <replaceable>number</> [ us | ms | s ]</literal>
+    </term>
+
+    <listitem>
+     <para>
+      Causes script execution to sleep for the specified duration in
+      microseconds (<literal>us</>), milliseconds (<literal>ms</>) or seconds
+      (<literal>s</>).  If the unit is omitted then seconds are the default.
+      <replaceable>number</> can be either an integer constant or a
+      <literal>:</><replaceable>variablename</> reference to a variable
+      having an integer value.
+     </para>
+
+     <para>
+      Example:
+      <programlisting>
+\sleep 10 ms
+      </programlisting>
+     </para>
+    </listitem>
+   </varlistentry>
+  </variablelist>

- <sect2>
-  <title>Examples</title>
  <para>
-   Example, TPC-B like benchmark can be defined as follows(scaling
-   factor = 1):
-  </para>
-  <programlisting>
+   As an example, the full definition of the built-in TPC-B-like
+   transaction is:
+
+   <programlisting>
 \set nbranches :scale
 \set ntellers 10 * :scale
 \set naccounts 100000 * :scale
 \setrandom aid 1 :naccounts
 \setrandom bid 1 :nbranches
 \setrandom tid 1 :ntellers
-\setrandom delta 1 10000
-BEGIN
-UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
-SELECT abalance FROM accounts WHERE aid = :aid
-UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid
-UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid
-INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now')
-END
-  </programlisting>
+\setrandom delta -5000 5000
+BEGIN;
+UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
+SELECT abalance FROM accounts WHERE aid = :aid;
+UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
+UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
+INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+END;
+   </programlisting>
+
+   This script allows each iteration of the transaction to reference
+   different, randomly-chosen rows.  (This example also shows why it's
+   important for each client session to have its own variables &mdash;
+   otherwise they'd not be independently touching different rows.)
+  </para>
+
+ </sect2>
+
+ <sect2>
+  <title>Per-transaction logging</title>
+
  <para>
-   If you want to automatically set the scaling factor from the number of
-   tuples in branches table, use -s option and shell command like this:
+   With the <literal>-l</> option, <application>pgbench</> writes the time
+   taken by each transaction to a logfile.  The logfile will be named
+   <filename>pgbench_log.<replaceable>nnn</></filename>, where
+   <replaceable>nnn</> is the PID of the pgbench process.
+   The format of the log is:
+
+   <programlisting>
+    <replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>file_no</> <replaceable>time_epoch</> <replaceable>time_us</>
+   </programlisting>
+
+   where <replaceable>time</> is the elapsed transaction time in microseconds,
+   <replaceable>file_no</> identifies which script file was used
+   (useful when multiple scripts were specified with <literal>-f</>),
+   and <replaceable>time_epoch</>/<replaceable>time_us</> are a
+   UNIX epoch format timestamp and an offset
+   in microseconds (suitable for creating a ISO 8601
+   timestamp with fractional seconds) showing when
+   the transaction completed.
  </para>
-  <programlisting>
-pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql
-  </programlisting>
+
  <para>
-   Notice that -f option does not execute vacuum and clearing history
-   table before starting benchmark.
+   Here are example outputs:
+   <programlisting>
+ 0 199 2241 0 1175850568 995598
+ 0 200 2465 0 1175850568 998079
+ 0 201 2513 0 1175850569 608
+ 0 202 2038 0 1175850569 2663
+   </programlisting>
  </para>
 </sect2>

-</sect1>
+ <sect2>
+  <title>Good Practices</title>
+
+  <para>
+   It is very easy to use <application>pgbench</> to produce completely
+   meaningless numbers.  Here are some guidelines to help you get useful
+   results.
+  </para>
+
+  <para>
+   In the first place, <emphasis>never</> believe any test that runs
+   for only a few seconds.  Increase the <literal>-t</> setting enough
+   to make the run last at least a few minutes, so as to average out noise.
+   In some cases you could need hours to get numbers that are reproducible.
+   It's a good idea to try the test run a few times, to find out if your
+   numbers are reproducible or not.
+  </para>
+
+  <para>
+   For the default TPC-B-like test scenario, the initialization scale factor
+   (<literal>-s</>) should be at least as large as the largest number of
+   clients you intend to test (<literal>-c</>); else you'll mostly be
+   measuring update contention.  There are only <literal>-s</> rows in
+   the <structname>branches</> table, and every transaction wants to
+   update one of them, so <literal>-c</> values in excess of <literal>-s</>
+   will undoubtedly result in lots of transactions blocked waiting for
+   other transactions.
+  </para>

+  <para>
+   The default test scenario is also quite sensitive to how long it's been
+   since the tables were initialized: accumulation of dead rows and dead space
+   in the tables changes the results.  To understand the results you must keep
+   track of the total number of updates and when vacuuming happens.  If
+   autovacuum is enabled it can result in unpredictable changes in measured
+   performance.
+  </para>
+
+  <para>
+   A limitation of <application>pgbench</> is that it can itself become
+   the bottleneck when trying to test a large number of client sessions.
+   This can be alleviated by running <application>pgbench</> on a different
+   machine from the database server, although low network latency will be
+   essential.  It might even be useful to run several <application>pgbench</>
+   instances concurrently, on several client machines, against the same
+   database server.
+  </para>
+ </sect2>
+
+</sect1>
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@ -1,85 +1,138 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbuffercache.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="pgbuffercache">
 <title>pg_buffercache</title>
- 
+
 <indexterm zone="pgbuffercache">
  <primary>pg_buffercache</primary>
 </indexterm>

 <para>
-  The <literal>pg_buffercache</literal> module provides a means for examining
-  what's happening to the buffercache at any given time without having to
-  restart or rebuild the server with debugging code added. The intent is to 
-  do for the buffercache what pg_locks does for locks.
+  The <filename>pg_buffercache</filename> module provides a means for
+  examining what's happening in the shared buffer cache in real time.
 </para>
+
 <para>
-  This module consists of a C function <literal>pg_buffercache_pages()</literal> 
-  that returns a set of records, plus a view <literal>pg_buffercache</literal> 
-  to wrapper the function.
+  The module provides a C function <function>pg_buffercache_pages</function>
+  that returns a set of records, plus a view
+  <structname>pg_buffercache</structname> that wraps the function for
+  convenient use.
 </para>
+
 <para>
-  By default public access is REVOKED from both of these, just in case there
+  By default public access is revoked from both of these, just in case there
  are security issues lurking.
 </para>

 <sect2>
-  <title>Notes</title>
+  <title>The <structname>pg_buffercache</structname> view</title>
+
  <para>
-   The definition of the columns exposed in the view is:
+   The definitions of the columns exposed by the view are:
  </para>
-  <programlisting>
-       Column     |  references          | Description
-  ----------------+----------------------+------------------------------------
-   bufferid       |                      | Id, 1..shared_buffers.
-   relfilenode    | pg_class.relfilenode | Refilenode of the relation.
-   reltablespace  | pg_tablespace.oid    | Tablespace oid of the relation.
-   reldatabase    | pg_database.oid      | Database for the relation.
-   relblocknumber |                      | Offset of the page in the relation.
-   isdirty        |                      | Is the page dirty?
-   usagecount     |                      | Page LRU count
-  </programlisting>
+
+  <table>
+   <title><structname>pg_buffercache</> Columns</title>
+
+   <tgroup cols="4">
+    <thead>
+     <row>
+      <entry>Name</entry>
+      <entry>Type</entry>
+      <entry>References</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+
+     <row>
+      <entry><structfield>bufferid</structfield></entry>
+      <entry><type>integer</type></entry>
+      <entry></entry>
+      <entry>ID, in the range 1..<varname>shared_buffers</></entry>
+     </row>
+
+     <row>
+      <entry><structfield>relfilenode</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_class.relfilenode</literal></entry>
+      <entry>Relfilenode of the relation</entry>
+     </row>
+
+     <row>
+      <entry><structfield>reltablespace</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_tablespace.oid</literal></entry>
+      <entry>Tablespace OID of the relation</entry>
+     </row>
+
+     <row>
+      <entry><structfield>reldatabase</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_database.oid</literal></entry>
+      <entry>Database OID of the relation</entry>
+     </row>
+
+     <row>
+      <entry><structfield>relblocknumber</structfield></entry>
+      <entry><type>bigint</type></entry>
+      <entry></entry>
+      <entry>Page number within the relation</entry>
+     </row>
+
+     <row>
+      <entry><structfield>isdirty</structfield></entry>
+      <entry><type>boolean</type></entry>
+      <entry></entry>
+      <entry>Is the page dirty?</entry>
+     </row>
+
+     <row>
+      <entry><structfield>usagecount</structfield></entry>
+      <entry><type>smallint</type></entry>
+      <entry></entry>
+      <entry>Page LRU count</entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
  <para>
   There is one row for each buffer in the shared cache. Unused buffers are
-   shown with all fields null except bufferid.
+   shown with all fields null except <structfield>bufferid</>.  Shared system
+   catalogs are shown as belonging to database zero.
  </para>
+
  <para>
-   Because the cache is shared by all the databases, there are pages from
-   relations not belonging to the current database.
+   Because the cache is shared by all the databases, there will normally be
+   pages from relations not belonging to the current database.  This means
+   that there may not be matching join rows in <structname>pg_class</> for
+   some rows, or that there could even be incorrect joins.  If you are
+   trying to join against <structname>pg_class</>, it's a good idea to
+   restrict the join to rows having <structfield>reldatabase</> equal to
+   the current database's OID or zero.
  </para>
+
  <para>
-   When the pg_buffercache view is accessed, internal buffer manager locks are 
-   taken, and a copy of the buffer cache data is made for the view to display. 
-   This ensures that the view produces a consistent set of results, while not 
-   blocking normal buffer activity longer than necessary.  Nonetheless there 
+   When the <structname>pg_buffercache</> view is accessed, internal buffer
+   manager locks are taken for long enough to copy all the buffer state
+   data that the view will display.
+   This ensures that the view produces a consistent set of results, while not
+   blocking normal buffer activity longer than necessary.  Nonetheless there
   could be some impact on database performance if this view is read often.
  </para>
 </sect2>

 <sect2>
  <title>Sample output</title>
-  <programlisting>
-  regression=# \d pg_buffercache;
-       View "public.pg_buffercache"
-       Column     |  Type    | Modifiers
-  ----------------+----------+-----------
-   bufferid       | integer  |
-   relfilenode    | oid      |
-   reltablespace  | oid      |
-   reldatabase    | oid      |
-   relblocknumber | bigint   |
-   isdirty        | boolean  |
-   usagecount     | smallint |
-
-  View definition:
-   SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase, 
-          p.relblocknumber, p.isdirty, p.usagecount
-     FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid, 
-     reltablespace oid, reldatabase oid, relblocknumber bigint, 
-     isdirty boolean, usagecount smallint);

+  <programlisting>
  regression=# SELECT c.relname, count(*) AS buffers
-               FROM pg_class c INNER JOIN pg_buffercache b
-               ON b.relfilenode = c.relfilenode INNER JOIN pg_database d
-               ON (b.reldatabase = d.oid AND d.datname = current_database())
+               FROM pg_buffercache b INNER JOIN pg_class c
+               ON b.relfilenode = c.relfilenode AND
+                  b.reldatabase IN (0, (SELECT oid FROM pg_database
+                                        WHERE datname = current_database()))
               GROUP BY c.relname
               ORDER BY 2 DESC LIMIT 10;
               relname             | buffers
@ -95,26 +148,23 @@
   pg_depend                       |      22
   pg_depend_reference_index       |      20
  (10 rows)
-
-  regression=# 
  </programlisting>
 </sect2>

 <sect2>
  <title>Authors</title>
-  <itemizedlist>
-   <listitem>
-    <para>
-     Mark Kirkwood <email>markir@paradise.net.nz</email>
-    </para>
-   </listitem>
-   <listitem>
-    <para>Design suggestions: Neil Conway <email>neilc@samurai.com</email></para>
-   </listitem>
-   <listitem>
-    <para>Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email></para>
-   </listitem>
-  </itemizedlist>
+
+  <para>
+   Mark Kirkwood <email>markir@paradise.net.nz</email>
+  </para>
+
+  <para>
+   Design suggestions: Neil Conway <email>neilc@samurai.com</email>
+  </para>
+
+  <para>
+   Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email>
+  </para>
 </sect2>

 </sect1>
--- a/doc/src/sgml/pgfreespacemap.sgml
+++ b/doc/src/sgml/pgfreespacemap.sgml
@ -1,184 +1,203 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgfreespacemap.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="pgfreespacemap">
 <title>pg_freespacemap</title>
- 
+
 <indexterm zone="pgfreespacemap">
  <primary>pg_freespacemap</primary>
 </indexterm>

 <para>
-  This module provides a means for examining the free space map (FSM). It
-  consists of two C functions: <literal>pg_freespacemap_relations()</literal> 
-  and <literal>pg_freespacemap_pages()</literal> that return a set 
-  of records, plus two views <literal>pg_freespacemap_relations</literal> and 
-  <literal>pg_freespacemap_pages</literal> for more user-friendly access to 
-  the functions.
- </para>
- <para>
-  The module provides the ability to examine the contents of the free space
-  map, without having to restart or rebuild the server with additional
-  debugging code.
+  The <filename>pg_freespacemap</> module provides a means for examining the
+  free space map (FSM). It provides two C functions:
+  <function>pg_freespacemap_relations</function> and
+  <function>pg_freespacemap_pages</function> that each return a set of
+  records, plus two views <structname>pg_freespacemap_relations</structname>
+  and <structname>pg_freespacemap_pages</structname> that wrap the functions
+  for convenient use.
 </para>
+
 <para>
-  By default public access is REVOKED from the functions and views, just in 
-  case there are security issues present in the code.
+  By default public access is revoked from the functions and views, just in
+  case there are security issues lurking.
 </para>

 <sect2>
-  <title>Notes</title>
+  <title>The <filename>pg_freespacemap</> views</title>
+
  <para>
-   The definitions for the columns exposed in the views are:
+   The definitions of the columns exposed by the views are:
  </para>

  <table>
-   <title>pg_freespacemap_relations</title>
-   <tgroup cols="3">
+   <title><structname>pg_freespacemap_relations</> Columns</title>
+
+   <tgroup cols="4">
    <thead>
     <row>
-      <entry>Column</entry>
-      <entry>references</entry>
+      <entry>Name</entry>
+      <entry>Type</entry>
+      <entry>References</entry>
      <entry>Description</entry>
     </row>
    </thead>
    <tbody>
+
     <row>
-      <entry>reltablespace</entry>
-      <entry>pg_tablespace.oid</entry>
-      <entry>Tablespace oid of the relation.</entry>
+      <entry><structfield>reltablespace</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_tablespace.oid</literal></entry>
+      <entry>Tablespace OID of the relation</entry>
     </row>
     <row>
-      <entry>reldatabase</entry>
-      <entry>pg_database.oid</entry>
-      <entry>Database oid of the relation.</entry>
+      <entry><structfield>reldatabase</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_database.oid</literal></entry>
+      <entry>Database OID of the relation</entry>
     </row>
     <row>
-      <entry>relfilenode</entry>
-      <entry>pg_class.relfilenode</entry>
-      <entry>Relfilenode of the relation.</entry>
+      <entry><structfield>relfilenode</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_class.relfilenode</literal></entry>
+      <entry>Relfilenode of the relation</entry>
     </row>
     <row>
-      <entry>avgrequest</entry>
+      <entry><structfield>avgrequest</structfield></entry>
+      <entry><type>integer</type></entry>
      <entry></entry>
      <entry>Moving average of free space requests (NULL for indexes)</entry>
     </row>
     <row>
-      <entry>interestingpages</entry>
+      <entry><structfield>interestingpages</structfield></entry>
+      <entry><type>integer</type></entry>
      <entry></entry>
-      <entry>Count of pages last reported as containing useful free space.</entry>
+      <entry>Count of pages last reported as containing useful free space</entry>
     </row>
     <row>
-      <entry>storedpages</entry>
+      <entry><structfield>storedpages</structfield></entry>
+      <entry><type>integer</type></entry>
      <entry></entry>
-      <entry>Count of pages actually stored in free space map.</entry>
+      <entry>Count of pages actually stored in free space map</entry>
     </row>
     <row>
-      <entry>nextpage</entry>
+      <entry><structfield>nextpage</structfield></entry>
+      <entry><type>integer</type></entry>
      <entry></entry>
-      <entry>Page index (from 0) to start next search at.</entry>
+      <entry>Page index (from 0) to start next search at</entry>
     </row>
+
    </tbody>
   </tgroup>
  </table>

  <table>
-   <title>pg_freespacemap_pages</title>
-   <tgroup cols="3">
+   <title><structname>pg_freespacemap_pages</> Columns</title>
+
+   <tgroup cols="4">
    <thead>
     <row>
-      <entry>Column</entry>
-      <entry> references</entry>
+      <entry>Name</entry>
+      <entry>Type</entry>
+      <entry>References</entry>
      <entry>Description</entry>
     </row>
    </thead>
    <tbody>
+
     <row>
-      <entry>reltablespace</entry>
-      <entry>pg_tablespace.oid</entry>
-      <entry>Tablespace oid of the relation.</entry>
+      <entry><structfield>reltablespace</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_tablespace.oid</literal></entry>
+      <entry>Tablespace OID of the relation</entry>
     </row>
     <row>
-      <entry>reldatabase</entry>
-      <entry>pg_database.oid</entry>
-      <entry>Database oid of the relation.</entry>
+      <entry><structfield>reldatabase</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_database.oid</literal></entry>
+      <entry>Database OID of the relation</entry>
     </row>
     <row>
-      <entry>relfilenode</entry>
-      <entry>pg_class.relfilenode</entry>
-      <entry>Relfilenode of the relation.</entry>
+      <entry><structfield>relfilenode</structfield></entry>
+      <entry><type>oid</type></entry>
+      <entry><literal>pg_class.relfilenode</literal></entry>
+      <entry>Relfilenode of the relation</entry>
     </row>
     <row>
-      <entry>relblocknumber</entry>
+      <entry><structfield>relblocknumber</structfield></entry>
+      <entry><type>bigint</type></entry>
      <entry></entry>
-      <entry>Page number in the relation.</entry>
+      <entry>Page number within the relation</entry>
     </row>
     <row>
-      <entry>bytes</entry>
+      <entry><structfield>bytes</structfield></entry>
+      <entry><type>integer</type></entry>
      <entry></entry>
-      <entry>Free bytes in the page, or NULL for an index page (see below).</entry>
+      <entry>Free bytes in the page, or NULL for an index page (see below)</entry>
     </row>
+
    </tbody>
   </tgroup>
  </table>

  <para>
-   For <literal>pg_freespacemap_relations</literal>, there is one row for each 
-   relation in the free space map. <literal>storedpages</literal> is the 
-   number of pages actually stored in the map, while 
-   <literal>interestingpages</literal> is the number of pages the last VACUUM 
-   thought had useful amounts of free space.
+   For <structname>pg_freespacemap_relations</structname>, there is one row
+   for each relation in the free space map.
+   <structfield>storedpages</structfield> is the number of pages actually
+   stored in the map, while <structfield>interestingpages</structfield> is the
+   number of pages the last <command>VACUUM</> thought had useful amounts of
+   free space.
  </para>
+
  <para>
-   If <literal>storedpages</literal> is consistently less than interestingpages 
-   then it'd be a good idea to increase <literal>max_fsm_pages</literal>.  Also, 
-   if the number of rows in <literal>pg_freespacemap_relations</literal> is 
-   close to <literal>max_fsm_relations</literal>, then you should consider 
-   increasing <literal>max_fsm_relations</literal>.
+   If <structfield>storedpages</structfield> is consistently less than
+   <structfield>interestingpages</> then it'd be a good idea to increase
+   <varname>max_fsm_pages</varname>.  Also, if the number of rows in
+   <structname>pg_freespacemap_relations</structname> is close to
+   <varname>max_fsm_relations</varname>, then you should consider increasing
+   <varname>max_fsm_relations</varname>.
  </para>
+
  <para>
-   For <literal>pg_freespacemap_pages</literal>, there is one row for each page 
-   in the free space map. The number of rows for a relation will match the 
-   <literal>storedpages</literal> column in 
-   <literal>pg_freespacemap_relations</literal>.
+   For <structname>pg_freespacemap_pages</structname>, there is one row for
+   each page in the free space map. The number of rows for a relation will
+   match the <structfield>storedpages</structfield> column in
+   <structname>pg_freespacemap_relations</structname>.
  </para>
+
  <para>
   For indexes, what is tracked is entirely-unused pages, rather than free
   space within pages.  Therefore, the average request size and free bytes
   within a page are not meaningful, and are shown as NULL.
  </para>
+
  <para>
-   Because the map is shared by all the databases, it will include relations
-   not belonging to the current database.
+   Because the map is shared by all the databases, there will normally be
+   entries for relations not belonging to the current database.  This means
+   that there may not be matching join rows in <structname>pg_class</> for
+   some rows, or that there could even be incorrect joins.  If you are
+   trying to join against <structname>pg_class</>, it's a good idea to
+   restrict the join to rows having <structfield>reldatabase</> equal to
+   the current database's OID or zero.
  </para>
+
  <para>
-   When either of the views are accessed, internal free space map locks are
-   taken, and a copy of the map data is made for them to display. 
-   This ensures that the views produce a consistent set of results, while not 
-   blocking normal activity longer than necessary.  Nonetheless there 
+   When either of the views is accessed, internal free space map locks are
+   taken for long enough to copy all the state data that the view will display.
+   This ensures that the views produce a consistent set of results, while not
+   blocking normal activity longer than necessary.  Nonetheless there
   could be some impact on database performance if they are read often.
  </para>
 </sect2>

 <sect2>
-  <title>Sample output - pg_freespacemap_relations</title>
-  <programlisting>
-regression=# \d pg_freespacemap_relations
-View "public.pg_freespacemap_relations"
-    Column        |  Type   | Modifiers
------------------+---------+-----------
- reltablespace    | oid     |
- reldatabase      | oid     |
- relfilenode      | oid     |
- avgrequest       | integer |
- interestingpages | integer |
- storedpages      | integer |
- nextpage         | integer |
-View definition:
- SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.avgrequest, p.interestingpages, p.storedpages, p.nextpage
-   FROM pg_freespacemap_relations() p(reltablespace oid, reldatabase oid, relfilenode oid, avgrequest integer, interestingpages integer, storedpages integer, nextpage integer);
+  <title>Sample output</title>

+  <programlisting>
 regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
             FROM pg_freespacemap_relations r INNER JOIN pg_class c
-             ON c.relfilenode = r.relfilenode INNER JOIN pg_database d
-             ON r.reldatabase = d.oid AND (d.datname = current_database()) 
+             ON r.relfilenode = c.relfilenode AND
+                r.reldatabase IN (0, (SELECT oid FROM pg_database
+                                      WHERE datname = current_database()))
             ORDER BY r.storedpages DESC LIMIT 10;
             relname             | avgrequest | interestingpages | storedpages
 ---------------------------------+------------+------------------+-------------
@ -193,31 +212,14 @@ regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
 pg_class_relname_nsp_index      |            |               10 |          10
 pg_proc                         |        302 |                8 |           8
 (10 rows)
-  </programlisting>
- </sect2>
-
- <sect2>
-  <title>Sample output - pg_freespacemap_pages</title>
-  <programlisting>
-regression=# \d pg_freespacemap_pages
- View "public.pg_freespacemap_pages"
-     Column     |  Type   | Modifiers 
----------------+---------+-----------
- reltablespace  | oid     | 
- reldatabase    | oid     | 
- relfilenode    | oid     | 
- relblocknumber | bigint  | 
- bytes          | integer | 
-View definition:
- SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.relblocknumber, p.bytes
-   FROM pg_freespacemap_pages() p(reltablespace oid, reldatabase oid, relfilenode oid, relblocknumber bigint, bytes integer);

 regression=# SELECT c.relname, p.relblocknumber, p.bytes
             FROM pg_freespacemap_pages p INNER JOIN pg_class c
-             ON c.relfilenode = p.relfilenode INNER JOIN pg_database d
-             ON (p.reldatabase = d.oid AND d.datname = current_database()) 
+             ON p.relfilenode = c.relfilenode AND
+                p.reldatabase IN (0, (SELECT oid FROM pg_database
+                                      WHERE datname = current_database()))
             ORDER BY c.relname LIMIT 10;
-   relname    | relblocknumber | bytes 
+   relname    | relblocknumber | bytes
 --------------+----------------+-------
 a_star       |              0 |  8040
 abstime_tbl  |              0 |  7908
@ -235,8 +237,10 @@ regression=# SELECT c.relname, p.relblocknumber, p.bytes

 <sect2>
  <title>Author</title>
+
  <para>
   Mark Kirkwood <email>markir@paradise.net.nz</email>
  </para>
 </sect2>
+
 </sect1>
--- a/doc/src/sgml/pgrowlocks.sgml
+++ b/doc/src/sgml/pgrowlocks.sgml
@ -1,115 +1,122 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgrowlocks.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->

 <sect1 id="pgrowlocks">
 <title>pgrowlocks</title>
- 
+
 <indexterm zone="pgrowlocks">
  <primary>pgrowlocks</primary>
 </indexterm>

 <para>
-  The <literal>pgrowlocks</literal> module provides a function to show row 
+  The <filename>pgrowlocks</filename> module provides a function to show row
  locking information for a specified table.
 </para>

 <sect2>
  <title>Overview</title>
-  <programlisting>
-pgrowlocks(text) RETURNS pgrowlocks_type
-  </programlisting>
+
+  <synopsis>
+pgrowlocks(text) returns setof record
+  </synopsis>
+
  <para>
-   The parameter is a name of table. And <literal>pgrowlocks_type</literal> is 
-   defined as:
+   The parameter is the name of a table.  The result is a set of records,
+   with one row for each locked row within the table.  The output columns
+   are:
  </para>
-  <programlisting>
-CREATE TYPE pgrowlocks_type AS (
-        locked_row TID,         -- row TID
-        lock_type TEXT,         -- lock type
-        locker XID,             -- locking XID
-        multi bool,             -- multi XID?
-        xids xid[],             -- multi XIDs
-        pids INTEGER[]          -- locker's process id
-);
-  </programlisting>

  <table>
-   <title>pgrowlocks_type</title>
-   <tgroup cols="2">
+   <title><function>pgrowlocks</> output columns</title>
+
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Name</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
    <tbody>
+
     <row>
-      <entry>locked_row</entry>
-      <entry>tuple ID(TID) of each locked rows</entry>
+      <entry><structfield>locked_row</structfield></entry>
+      <entry><type>tid</type></entry>
+      <entry>Tuple ID (TID) of locked row</entry>
     </row>
     <row>
-      <entry>lock_type</entry>
-      <entry>"Shared" for shared lock, "Exclusive" for exclusive lock</entry>
+      <entry><structfield>lock_type</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry><literal>Shared</> for shared lock, or
+             <literal>Exclusive</> for exclusive lock</entry>
     </row>
     <row>
-      <entry>locker</entry>
-      <entry>transaction ID of locker (Note 1)</entry>
+      <entry><structfield>locker</structfield></entry>
+      <entry><type>xid</type></entry>
+      <entry>Transaction ID of locker, or multixact ID if multi-transaction</entry>
     </row>
     <row>
-      <entry>multi</entry>
-      <entry>"t" if locker is a multi transaction, otherwise "f"</entry>
+      <entry><structfield>multi</structfield></entry>
+      <entry><type>boolean</type></entry>
+      <entry>True if locker is a multi-transaction</entry>
     </row>
     <row>
-      <entry>xids</entry>
-      <entry>XIDs of lockers (Note 2)</entry>
+      <entry><structfield>xids</structfield></entry>
+      <entry><type>xid[]</type></entry>
+      <entry>Transaction IDs of lockers (more than one if multi-transaction)</entry>
     </row>
     <row>
-      <entry>pids</entry>
-      <entry>process ids of locking backends</entry>
+      <entry><structfield>pids</structfield></entry>
+      <entry><type>integer[]</type></entry>
+      <entry>Process IDs of locking backends (more than one if multi-transaction)</entry>
     </row>
+
    </tbody>
   </tgroup>
  </table>
-  <para>
-   Note1: If the locker is multi transaction, it represents the multi ID.
-  </para>
-  <para>
-   Note2: If the locker is multi, multiple data are shown.
-  </para>

  <para>
-   The calling sequence for <literal>pgrowlocks</literal> is as follows:
-   <literal>pgrowlocks</literal> grabs AccessShareLock for the target table and 
-   reads each row one by one to get the row locking information. You should
-   notice that:
+   <function>pgrowlocks</function> takes <literal>AccessShareLock</> for the
+   target table and reads each row one by one to collect the row locking
+   information.  This is not very speedy for a large table.  Note that:
  </para>
+
  <orderedlist>
   <listitem>
    <para>
-    if the table is exclusive locked by someone else, 
-    <literal>pgrowlocks</literal> will be blocked.
+    If the table as a whole is exclusive-locked by someone else,
+    <function>pgrowlocks</function> will be blocked.
    </para>
   </listitem>
   <listitem>
    <para>
-     <literal>pgrowlocks</literal> may show incorrect information if there's a 
-     new lock or a lock is freeed while its execution.
+     <function>pgrowlocks</function> is not guaranteed to produce a
+     self-consistent snapshot.  It is possible that a new row lock is taken,
+     or an old lock is freed, during its execution.
    </para>
   </listitem>
  </orderedlist>
+
  <para>
-   <literal>pgrowlocks</literal> does not show the contents of locked rows. If 
-   you want to take a look at the row contents at the same time, you could do
-   something like this:
-  </para>
+   <function>pgrowlocks</function> does not show the contents of locked
+   rows. If you want to take a look at the row contents at the same time, you
+   could do something like this:
+
  <programlisting>
-SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_ row = a.ctid;
+SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p
+  WHERE p.locked_row = a.ctid;
  </programlisting>
+
+   Be aware however that (as of <productname>PostgreSQL</> 8.3) such a
+   query will be very inefficient.
+  </para>
 </sect2>

 <sect2>
-  <title>Example</title>
-  <para>
-   <literal>pgrowlocks</literal> returns the following columns:
-  </para>
-  <para>
-   Here is a sample execution of pgrowlocks:
-  </para>
+  <title>Sample output</title>
+
  <programlisting>
 test=# SELECT * FROM pgrowlocks('t1');
- locked_row | lock_type | locker | multi |   xids    |     pids      
+ locked_row | lock_type | locker | multi |   xids    |     pids
 ------------+-----------+--------+-------+-----------+---------------
      (0,1) | Shared    |     19 | t     | {804,805} | {29066,29068}
      (0,2) | Shared    |     19 | t     | {804,805} | {29066,29068}
@ -117,7 +124,14 @@ test=# SELECT * FROM pgrowlocks('t1');
      (0,4) | Exclusive |    804 | f     | {804}     | {29066}
 (4 rows)
  </programlisting>
+ </sect2>
+
+ <sect2>
+  <title>Author</title>

+  <para>
+   Tatsuo Ishii
+  </para>
 </sect2>
-</sect1>

+</sect1>
--- a/doc/src/sgml/pgstattuple.sgml
+++ b/doc/src/sgml/pgstattuple.sgml
@ -1,29 +1,35 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgstattuple.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->

 <sect1 id="pgstattuple">
 <title>pgstattuple</title>
- 
+
 <indexterm zone="pgstattuple">
  <primary>pgstattuple</primary>
 </indexterm>

 <para>
-  <literal>pgstattuple</literal> modules provides various functions to obtain 
-  tuple statistics.
+  The <filename>pgstattuple</filename> module provides various functions to
+  obtain tuple-level statistics.
 </para>

 <sect2>
  <title>Functions</title>

-  <itemizedlist>
-   <listitem>
-    <para>
-     <literal>pgstattuple()</literal> returns the relation length, percentage 
-     of the "dead" tuples of a relation and other info. This may help users to 
-     determine whether vacuum is necessary or not. Here is an example session:
-    </para>
-    <programlisting>
-test=> \x
-Expanded display is on.
+  <variablelist>
+   <varlistentry>
+    <term>
+     <function>pgstattuple(text) returns record</>
+    </term>
+
+    <listitem>
+     <para>
+      <function>pgstattuple</function> returns a relation's physical length,
+      percentage of <quote>dead</> tuples, and other info. This may help users
+      to determine whether vacuum is necessary or not.  The argument is the
+      target relation's name (optionally schema-qualified).
+      For example:
+     </para>
+     <programlisting>
 test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc');
 -[ RECORD 1 ]------+-------
 table_len          | 458752
@ -35,86 +41,111 @@ dead_tuple_len     | 3157
 dead_tuple_percent | 0.69
 free_space         | 8932
 free_percent       | 1.95
-    </programlisting>
+     </programlisting>
+
    <para>
-     Here are explanations for each column:
+     The output columns are:
    </para>
-    
+
    <table>
-     <title><literal>pgstattuple()</literal> column descriptions</title>
-     <tgroup cols="2">
+     <title><function>pgstattuple</function> output columns</title>
+     <tgroup cols="3">
      <thead>
       <row>
        <entry>Column</entry>
+        <entry>Type</entry>
        <entry>Description</entry>
       </row>
      </thead>
+
      <tbody>
       <row>
-        <entry>table_len</entry>
-        <entry>physical relation length in bytes</entry>
+        <entry><structfield>table_len</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Physical relation length in bytes</entry>
+       </row>
+       <row>
+        <entry><structfield>tuple_count</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Number of live tuples</entry>
       </row>
       <row>
-        <entry>tuple_count</entry>
-        <entry>number of live tuples</entry>
+        <entry><structfield>tuple_len</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Total length of live tuples in bytes</entry>
       </row>
       <row>
-        <entry>tuple_len</entry>
-        <entry>total tuples length in bytes</entry>
+        <entry><structfield>tuple_percent</structfield></entry>
+        <entry><type>float8</type></entry>
+        <entry>Percentage of live tuples</entry>
       </row>
       <row>
-        <entry>tuple_percent</entry>
-        <entry>live tuples in %</entry>
+        <entry><structfield>dead_tuple_count</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Number of dead tuples</entry>
       </row>
       <row>
-        <entry>dead_tuple_len</entry>
-        <entry>total dead tuples length in bytes</entry>
+        <entry><structfield>dead_tuple_len</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Total length of dead tuples in bytes</entry>
       </row>
       <row>
-        <entry>dead_tuple_percent</entry>
-        <entry>dead tuples in %</entry>
+        <entry><structfield>dead_tuple_percent</structfield></entry>
+        <entry><type>float8</type></entry>
+        <entry>Percentage of dead tuples</entry>
       </row>
       <row>
-        <entry>free_space</entry>
-        <entry>free space in bytes</entry>
+        <entry><structfield>free_space</structfield></entry>
+        <entry><type>bigint</type></entry>
+        <entry>Total free space in bytes</entry>
       </row>
       <row>
-        <entry>free_percent</entry>
-        <entry>free space in %</entry>
+        <entry><structfield>free_percent</structfield></entry>
+        <entry><type>float8</type></entry>
+        <entry>Percentage of free space</entry>
       </row>
+
      </tbody>
     </tgroup>
    </table>
-    <para>
-     <note>
-      <para>
-       <literal>pgstattuple</literal> acquires only a read lock on the relation. So 
-       concurrent update may affect the result.
-      </para>
-     </note>
-     <note>
-      <para>
-       <literal>pgstattuple</literal> judges a tuple is "dead" if HeapTupleSatisfiesNow()
-       returns false.
-      </para>
-     </note>
-    </para>
-   </listitem>

-
-   <listitem>
    <para>
-     <literal>pg_relpages()</literal> returns the number of pages in the relation.
+     <function>pgstattuple</function> acquires only a read lock on the
+     relation. So the results do not reflect an instantaneous snapshot;
+     concurrent updates will affect them.
    </para>
-   </listitem>

-   <listitem>
    <para>
-     <literal>pgstatindex()</literal> returns an array showing the information about an index:
+     <function>pgstattuple</function> judges a tuple is <quote>dead</> if
+     <function>HeapTupleSatisfiesNow</> returns false.
    </para>
-    <programlisting>
-test=> \x
-Expanded display is on.
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>pgstattuple(oid) returns record</>
+    </term>
+
+    <listitem>
+     <para>
+      This is the same as <function>pgstattuple(text)</function>, except
+      that the target relation is specified by OID.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>pgstatindex(text) returns record</>
+    </term>
+
+    <listitem>
+     <para>
+      <function>pgstatindex</function> returns a record showing information
+      about a btree index.  For example:
+     </para>
+     <programlisting>
 test=> SELECT * FROM pgstatindex('pg_cast_oid_index');
 -[ RECORD 1 ]------+------
 version            | 2
@ -128,31 +159,116 @@ deleted_pages      | 0
 avg_leaf_density   | 50.27
 leaf_fragmentation | 0
    </programlisting>
-   </listitem>
-  </itemizedlist>
+
+    <para>
+     The output columns are:
+    </para>
+
+    <table>
+     <title><function>pgstatindex</function> output columns</title>
+     <tgroup cols="3">
+      <thead>
+       <row>
+        <entry>Column</entry>
+        <entry>Type</entry>
+        <entry>Description</entry>
+       </row>
+      </thead>
+
+      <tbody>
+       <row>
+        <entry><structfield>version</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Btree version number</entry>
+       </row>
+
+       <row>
+        <entry><structfield>tree_level</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Tree level of the root page</entry>
+       </row>
+
+       <row>
+        <entry><structfield>index_size</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Total number of pages in index</entry>
+       </row>
+
+       <row>
+        <entry><structfield>root_block_no</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Location of root block</entry>
+       </row>
+
+       <row>
+        <entry><structfield>internal_pages</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Number of <quote>internal</> (upper-level) pages</entry>
+       </row>
+
+       <row>
+        <entry><structfield>leaf_pages</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Number of leaf pages</entry>
+       </row>
+
+       <row>
+        <entry><structfield>empty_pages</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Number of empty pages</entry>
+       </row>
+
+       <row>
+        <entry><structfield>deleted_pages</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>Number of deleted pages</entry>
+       </row>
+
+       <row>
+        <entry><structfield>avg_leaf_density</structfield></entry>
+        <entry><type>float8</type></entry>
+        <entry>Average density of leaf pages</entry>
+       </row>
+
+       <row>
+        <entry><structfield>leaf_fragmentation</structfield></entry>
+        <entry><type>float8</type></entry>
+        <entry>Leaf page fragmentation</entry>
+       </row>
+
+      </tbody>
+     </tgroup>
+    </table>
+
+    <para>
+     As with <function>pgstattuple</>, the results are accumulated
+     page-by-page, and should not be expected to represent an
+     instantaneous snapshot of the whole index.
+    </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>pg_relpages(text) returns integer</>
+    </term>
+
+    <listitem>
+     <para>
+      <function>pg_relpages</function> returns the number of pages in the
+      relation.
+     </para>
+    </listitem>
+   </varlistentry>
+  </variablelist>
 </sect2>

 <sect2>
-  <title>Usage</title>
-  <para>
-   <literal>pgstattuple</literal> may be called as a relation function and is
-   defined as follows:
-  </para>
-  <programlisting>
-    CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type
-     AS 'MODULE_PATHNAME', 'pgstattuple'
-     LANGUAGE C STRICT;
-
-    CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type
-     AS 'MODULE_PATHNAME', 'pgstattuplebyid'
-     LANGUAGE C STRICT;
-  </programlisting>
+  <title>Author</title>
+
  <para>
-   The argument is the relation name (optionally it may be qualified)
-   or the OID of the relation.  Note that pgstattuple only returns
-   one row.
+   Tatsuo Ishii
  </para>
 </sect2>

 </sect1>
-
--- a/doc/src/sgml/pgtrgm.sgml
+++ b/doc/src/sgml/pgtrgm.sgml
@ -1,90 +1,120 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/pgtrgm.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
+
 <sect1 id="pgtrgm">
 <title>pg_trgm</title>
- 
+
 <indexterm zone="pgtrgm">
  <primary>pg_trgm</primary>
 </indexterm>

 <para>
-  The <literal>pg_trgm</literal> module provides functions and index classes
-  for determining the similarity of text based on trigram matching.
+  The <filename>pg_trgm</filename> module provides functions and operators
+  for determining the similarity of text based on trigram matching, as
+  well as index operator classes that support fast searching for similar
+  strings.
 </para>

 <sect2>
-  <title>Trigram (or Trigraph)</title>
-  <para>
-   A trigram is a set of three consecutive characters taken
-   from a string.  A string is considered to have two spaces
-   prefixed and one space suffixed when determining the set
-   of trigrams that comprise the string.
-  </para>
+  <title>Trigram (or Trigraph) Concepts</title>
+
  <para>
-   eg. The set of trigrams in the word "cat" is "  c", " ca", 
-   "at " and "cat".
+   A trigram is a group of three consecutive characters taken
+   from a string.  We can measure the similarity of two strings by
+   counting the number of trigrams they share.  This simple idea
+   turns out to be very effective for measuring the similarity of
+   words in many natural languages.
  </para>
+
+  <note>
+   <para>
+    A string is considered to have two spaces
+    prefixed and one space suffixed when determining the set
+    of trigrams contained in the string.
+    For example, the set of trigrams in the string
+    <quote><literal>cat</literal></quote> is
+    <quote><literal>  c</literal></quote>,
+    <quote><literal> ca</literal></quote>,
+    <quote><literal>cat</literal></quote>, and
+    <quote><literal>at </literal></quote>.
+   </para>
+  </note>
 </sect2>

 <sect2>
-  <title>Public Functions</title>
-  <table>
-   <title><literal>pg_trgm</literal> functions</title>
-   <tgroup cols="2">
+  <title>Functions and Operators</title>
+
+  <table id="pgtrgm-func-table">
+   <title><filename>pg_trgm</filename> functions</title>
+   <tgroup cols="3">
    <thead>
     <row>
      <entry>Function</entry>
+      <entry>Returns</entry>
      <entry>Description</entry>
     </row>
    </thead>
+
    <tbody>
     <row>
-      <entry><literal>real similarity(text, text)</literal></entry>
+      <entry><function>similarity(text, text)</function></entry>
+      <entry><type>real</type></entry>
      <entry>
-       <para>
-        Returns a number that indicates how closely matches the two
-        arguments are.  A zero result indicates that the two words
-        are completely dissimilar, and a result of one indicates that
-        the two words are identical.
-       </para>
+       Returns a number that indicates how similar the two arguments are.
+       The range of the result is zero (indicating that the two strings are
+       completely dissimilar) to one (indicating that the two strings are
+       identical).
      </entry>
     </row>
     <row>
-      <entry><literal>real show_limit()</literal></entry>
+      <entry><function>show_trgm(text)</function></entry>
+      <entry><type>text[]</type></entry>
      <entry>
-       <para>
-        Returns the current similarity threshold used by the '%'
-        operator.  This in effect sets the minimum similarity between
-        two words in order that they be considered similar enough to
-        be misspellings of each other, for example.
-       </para>
+       Returns an array of all the trigrams in the given string.
+       (In practice this is seldom useful except for debugging.)
      </entry>
     </row>
     <row>
-      <entry><literal>real set_limit(real)</literal></entry>
+      <entry><function>show_limit()</function></entry>
+      <entry><type>real</type></entry>
      <entry>
-       <para>
-        Sets the current similarity threshold that is used by the '%'
-        operator, and is returned by the show_limit() function.
-       </para>
+       Returns the current similarity threshold used by the <literal>%</>
+       operator.  This sets the minimum similarity between
+       two words for them to be considered similar enough to
+       be misspellings of each other, for example.
      </entry>
     </row>
     <row>
-      <entry><literal>text[] show_trgm(text)</literal></entry>
+      <entry><function>set_limit(real)</function></entry>
+      <entry><type>real</type></entry>
      <entry>
-       <para>
-        Returns an array of all the trigrams of the supplied text
-        parameter.
-       </para>
+       Sets the current similarity threshold that is used by the <literal>%</>
+       operator.  The threshold must be between 0 and 1 (default is 0.3).
+       Returns the same value passed in.
      </entry>
     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="pgtrgm-op-table">
+   <title><filename>pg_trgm</filename> operators</title>
+   <tgroup cols="3">
+    <thead>
     <row>
-      <entry>Operator: <literal>text % text (returns boolean)</literal></entry> 
+      <entry>Operator</entry>
+      <entry>Returns</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry><type>text</> <literal>%</literal> <type>text</></entry>
+      <entry><type>boolean</type></entry>
      <entry>
-       <para>
-        The '%' operator returns TRUE if its two arguments have a similarity
-        that is greater than the similarity threshold set by set_limit(). It
-        will return FALSE if the similarity is less than the current
-        threshold.
-       </para>
+       Returns <literal>true</> if its arguments have a similarity that is
+       greater than the current similarity threshold set by
+       <function>set_limit</>.
      </entry>
     </row>
    </tbody>
@ -93,97 +123,111 @@
 </sect2>

 <sect2>
-  <title>Public Index Operator Class</title>
+  <title>Index Support</title>
+
  <para>
-   The <literal>pg_trgm</literal> module comes with the 
-   <literal>gist_trgm_ops</literal> index operator class that allows a
-   developer to create an index over a text column for the purpose
-   of very fast similarity searches.
+   The <filename>pg_trgm</filename> module provides GiST and GIN index
+   operator classes that allow you to create an index over a text column for
+   the purpose of very fast similarity searches.  These index types support
+   the <literal>%</> similarity operator (and no other operators, so you may
+   want a regular btree index too).
  </para>
+
  <para>
-   To use this index, the '%' operator must be used and an appropriate
-   similarity threshold for the application must be set. Example:
-  </para>
+   Example:
+
  <programlisting>
 CREATE TABLE test_trgm (t text);
 CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops);
  </programlisting>
+or
+  <programlisting>
+CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops);
+  </programlisting>
+  </para>
+
  <para>
-   At this point, you will have an index on the t text column that you
-   can use for similarity searching. Example:
+   At this point, you will have an index on the <structfield>t</> column that
+   you can use for similarity searching.  A typical query is
  </para>
  <programlisting>
-SELECT
-        t,
-        similarity(t, 'word') AS sml
-FROM
-        test_trgm
-WHERE
-        t % 'word'
-ORDER BY
-        sml DESC, t;
+SELECT t, similarity(t, '<replaceable>word</>') AS sml
+  FROM test_trgm
+  WHERE t % '<replaceable>word</>'
+  ORDER BY sml DESC, t;
  </programlisting>
  <para>
   This will return all values in the text column that are sufficiently
-   similar to 'word', sorted from best match to worst.  The index will
-   be used to make this a fast operation over very large data sets.
+   similar to <replaceable>word</>, sorted from best match to worst.  The
+   index will be used to make this a fast operation even over very large data
+   sets.
+  </para>
+
+  <para>
+   The choice between GiST and GIN indexing depends on the relative
+   performance characteristics of GiST and GIN, which are discussed elsewhere.
+   As a rule of thumb, a GIN index is faster to search than a GiST index, but
+   slower to build or update; so GIN is better suited for static data and GiST
+   for often-updated data.
  </para>
 </sect2>

 <sect2>
  <title>Text Search Integration</title>
+
  <para>
   Trigram matching is a very useful tool when used in conjunction
-   with a full text index.
+   with a full text index.  In particular it can help to recognize
+   misspelled input words that will not be matched directly by the
+   full text search mechanism.
  </para>
+
  <para>
   The first step is to generate an auxiliary table containing all
   the unique words in the documents:
  </para>
+
  <programlisting>
-CREATE TABLE words AS SELECT word FROM 
-        stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
+CREATE TABLE words AS SELECT word FROM
+        ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
  </programlisting>
+
  <para>
   where <structname>documents</> is a table that has a text field
-   <structfield>bodytext</> that we wish to search.  The use of the
-   <literal>simple</> configuration with the <function>to_tsvector</>
-   function, instead of just using the already
-   existing vector is to avoid creating a list of already stemmed
-   words.  This way, only the original, unstemmed words are added
-   to the word list.
+   <structfield>bodytext</> that we wish to search.  The reason for using
+   the <literal>simple</> configuration with the <function>to_tsvector</>
+   function, instead of using a language-specific configuration,
+   is that we want a list of the original (unstemmed) words.
  </para>
+
  <para>
   Next, create a trigram index on the word column:
  </para>
+
  <programlisting>
-CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops);
-  </programlisting>
-  <para>
-   or
-  </para>
-  <programlisting>
-CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
+CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops);
  </programlisting>
+
  <para>
-   Now, a <literal>SELECT</literal> query similar to the example above can be 
-   used to suggest spellings for misspelled words in user search terms. A
-   useful extra clause is to ensure that the similar words are also
-   of similar length to the misspelled word.
-  </para>
-  <para>
-   <note>
-    <para>
-     Since the <structname>words</> table has been generated as a separate,
-     static table, it will need to be periodically regenerated so that
-     it remains up to date with the document collection.
-    </para>
-   </note>
+   Now, a <command>SELECT</command> query similar to the previous example can
+   be used to suggest spellings for misspelled words in user search terms.
+   A useful extra test is to require that the selected words are also of
+   similar length to the misspelled word.
  </para>
+
+  <note>
+   <para>
+    Since the <structname>words</> table has been generated as a separate,
+    static table, it will need to be periodically regenerated so that
+    it remains reasonably up-to-date with the document collection.
+    Keeping it exactly current is usually unnecessary.
+   </para>
+  </note>
 </sect2>

 <sect2>
  <title>References</title>
+
  <para>
   GiST Development Site
   <ulink url="http://www.sai.msu.su/~megera/postgres/gist/"></ulink>
@ -196,6 +240,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);

 <sect2>
  <title>Authors</title>
+
  <para>
   Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
  </para>
@ -203,7 +248,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
   Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia
  </para>
  <para>
-   Documentation: Christopher Kings-Lynne 
+   Documentation: Christopher Kings-Lynne
  </para>
  <para>
   This module is sponsored by Delta-Soft Ltd., Moscow, Russia.