pg_dump patch from Philip Warner

26 years ago · 500b62b057
parent 20c01ef130
commit 500b62b057
14 changed files with 3882 additions and 388 deletions
--- a/doc/TODO.detail/function
+++ b/doc/TODO.detail/function
@@ -0,0 +1,519 @@
+From owner-pgsql-hackers@hub.org Wed Sep 22 20:31:02 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA15611
+	for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 20:31:01 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id UAA02926 for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 20:21:24 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id UAA75413;
+	Wed, 22 Sep 1999 20:09:35 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 20:08:50 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id UAA75058
+	for pgsql-hackers-outgoing; Wed, 22 Sep 1999 20:06:58 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
+	by hub.org (8.9.3/8.9.3) with ESMTP id UAA74982
+	for <pgsql-hackers@postgreSQL.org>; Wed, 22 Sep 1999 20:06:25 -0400 (EDT)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id UAA06411
+	for <pgsql-hackers@postgreSQL.org>; Wed, 22 Sep 1999 20:05:40 -0400 (EDT)
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] Progress report: buffer refcount bugs and SQL functions
+Date: Wed, 22 Sep 1999 20:05:39 -0400
+Message-ID: <6408.938045139@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+I have been finding a lot of interesting stuff while looking into
+the buffer reference count/leakage issue.
+
+It turns out that there were two specific things that were camouflaging
+the existence of bugs in this area:
+
+1. The BufferLeakCheck routine that's run at transaction commit was
+only looking for nonzero PrivateRefCount to indicate a missing unpin.
+It failed to notice nonzero LastRefCount --- which meant that an
+error in refcount save/restore usage could leave a buffer pinned,
+and BufferLeakCheck wouldn't notice.
+
+2. The BufferIsValid macro, which you'd think just checks whether
+it's handed a valid buffer identifier or not, actually did more:
+it only returned true if the buffer ID was valid *and* the buffer
+had positive PrivateRefCount.  That meant that the common pattern
+	if (BufferIsValid(buf))
+		ReleaseBuffer(buf);
+wouldn't complain if it were handed a valid but already unpinned buffer.
+And that behavior masks bugs that result in buffers being unpinned too
+early.  For example, consider a sequence like
+
+1. LockBuffer (buffer now has refcount 1).  Store reference to
+   a tuple on that buffer page in a tuple table slot.
+2. Copy buffer reference to a second tuple-table slot, but forget to
+   increment buffer's refcount.
+3. Release second tuple table slot.  Buffer refcount drops to 0,
+   so it's unpinned.
+4. Release original tuple slot.  Because of BufferIsValid behavior,
+   no assert happens here; in fact nothing at all happens.
+
+This is, of course, buggy code: during the interval from 3 to 4 you
+still have an apparently valid tuple reference in the original slot,
+which someone might try to use; but the buffer it points to is unpinned
+and could be replaced at any time by another backend.
+
+In short, we had errors that would mask both missing-pin bugs and
+missing-unpin bugs.  And naturally there were a few such bugs lurking
+behind them...
+
+3. The buffer refcount save/restore stuff, which I had suspected
+was useless, is not only useless but also buggy.  The reason it's
+buggy is that it only works if used in a nested fashion.  You could
+save state A, pin some buffers, save state B, pin some more
+buffers, restore state B (thereby unpinning what you pinned since
+the save), and finally restore state A (unpinning the earlier stuff).
+What you could not do is save state A, pin, save B, pin more, then
+restore state A --- that might unpin some of A's buffers, or some
+of B's buffers, or some unforeseen combination thereof.  If you
+restore A and then restore B, you do not necessarily return to a zero-
+pins state, either.  And it turns out the actual usage pattern was a
+nearly random sequence of saves and restores, compounded by a failure to
+do all of the restores reliably (which was masked by the oversight in
+BufferLeakCheck).
+
+
+What I have done so far is to rip out the buffer refcount save/restore
+support (including LastRefCount), change BufferIsValid to a simple
+validity check (so that you get an assert if you unpin something that
+was pinned), change ExecStoreTuple so that it increments the refcount
+when it is handed a buffer reference (for symmetry with ExecClearTuple's
+decrement of the refcount), and fix about a dozen bugs exposed by these
+changes.
+
+I am still getting Buffer Leak notices in the "misc" regression test,
+specifically in the queries that invoke more than one SQL function.
+What I find there is that SQL functions are not always run to
+completion.  Apparently, when a function can return multiple tuples,
+it won't necessarily be asked to produce them all.  And when it isn't,
+postquel_end() isn't invoked for the function's current query, so its
+tuple table isn't cleared, so we have dangling refcounts if any of the
+tuples involved are in disk buffers.
+
+It may be that the save/restore code was a misguided attempt to fix
+this problem.  I can't tell.  But I think what we really need to do is
+find some way of ensuring that Postquel function execution contexts
+always get shut down by the end of the query, so that they don't leak
+resources.
+
+I suppose a straightforward approach would be to keep a list of open
+function contexts somewhere (attached to the outer execution context,
+perhaps), and clean them up at outer-plan shutdown.
+
+What I am wondering, though, is whether this addition is actually
+necessary, or is it a bug that the functions aren't run to completion
+in the first place?  I don't really understand the semantics of this
+"nested dot notation".  I suppose it is a Berkeleyism; I can't find
+anything about it in the SQL92 document.  The test cases shown in the
+misc regress test seem peculiar, not to say wrong.  For example:
+
+regression=> SELECT p.hobbies.equipment.name, p.hobbies.name, p.name FROM person p;
+name         |name       |name
+-------------+-----------+-----
+advil        |posthacking|mike
+peet's coffee|basketball |joe
+hightops     |basketball |sally
+(3 rows)
+
+which doesn't appear to agree with the contents of the underlying
+relations:
+
+regression=> SELECT * FROM hobbies_r;
+name       |person
+-----------+------
+posthacking|mike
+posthacking|jeff
+basketball |joe
+basketball |sally
+skywalking |
+(5 rows)
+
+regression=> SELECT * FROM equipment_r;
+name         |hobby
+-------------+-----------
+advil        |posthacking
+peet's coffee|posthacking
+hightops     |basketball
+guts         |skywalking
+(4 rows)
+
+I'd have expected an output along the lines of
+
+advil        |posthacking|mike
+peet's coffee|posthacking|mike
+hightops     |basketball |joe
+hightops     |basketball |sally
+
+Is the regression test's expected output wrong, or am I misunderstanding
+what this query is supposed to do?  Is there any documentation anywhere
+about how SQL functions returning multiple tuples are supposed to
+behave?
+
+			regards, tom lane
+
+************
+
+
+From owner-pgsql-hackers@hub.org Thu Sep 23 11:03:19 1999
+Received: from hub.org (hub.org [216.126.84.1])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16211
+	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 11:03:17 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id KAA58151;
+	Thu, 23 Sep 1999 10:53:46 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:53:05 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id KAA57948
+	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:52:23 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
+	by hub.org (8.9.3/8.9.3) with ESMTP id KAA57841
+	for <hackers@postgreSQL.org>; Thu, 23 Sep 1999 10:51:50 -0400 (EDT)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA14211;
+	Thu, 23 Sep 1999 10:51:10 -0400 (EDT)
+To: Andreas Zeugswetter <andreas.zeugswetter@telecom.at>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions 
+In-reply-to: Your message of Thu, 23 Sep 1999 10:07:24 +0200 
+             <37E9DFBC.5C0978F@telecom.at> 
+Date: Thu, 23 Sep 1999 10:51:10 -0400
+Message-ID: <14209.938098270@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Andreas Zeugswetter <andreas.zeugswetter@telecom.at> writes:
+> That is what I use it for. I have never used it with a 
+> returns setof function, but reading the comments in the regression test,
+> -- mike needs advil and peet's coffee,
+> -- joe and sally need hightops, and
+> -- everyone else is fine.
+> it looks like the results you expected are correct, and currently the 
+> wrong result is given.
+
+Yes, I have concluded the same (and partially fixed it, per my previous
+message).
+
+> Those that don't have a hobbie should return name|NULL|NULL. A hobbie
+> that does'nt need equipment name|hobbie|NULL.
+
+That's a good point.  Currently (both with and without my uncommitted
+fix) you get *no* rows out from ExecTargetList if there are any Iters
+that return empty result sets.  It might be more reasonable to treat an
+empty result set as if it were NULL, which would give the behavior you
+suggest.
+
+This would be an easy change to my current patch, and I'm prepared to
+make it before committing what I have, if people agree that that's a
+more reasonable definition.  Comments?
+
+			regards, tom lane
+
+************
+
+
+From owner-pgsql-hackers@hub.org Thu Sep 23 04:31:15 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA11344
+	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 04:31:15 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id EAA05350 for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 04:24:29 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id EAA85679;
+	Thu, 23 Sep 1999 04:16:26 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 04:09:52 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id EAA84708
+	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 04:08:57 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from gandalf.telecom.at (gandalf.telecom.at [194.118.26.84])
+	by hub.org (8.9.3/8.9.3) with ESMTP id EAA84632
+	for <hackers@postgresql.org>; Thu, 23 Sep 1999 04:08:03 -0400 (EDT)
+	(envelope-from andreas.zeugswetter@telecom.at)
+Received: from telecom.at (w0188000580.f000.d0188.sd.spardat.at [172.18.65.249])
+	by gandalf.telecom.at (xxx/xxx) with ESMTP id KAA195294
+	for <hackers@postgresql.org>; Thu, 23 Sep 1999 10:07:27 +0200
+Message-ID: <37E9DFBC.5C0978F@telecom.at>
+Date: Thu, 23 Sep 1999 10:07:24 +0200
+From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at>
+X-Mailer: Mozilla 4.61 [en] (Win95; I)
+X-Accept-Language: en
+MIME-Version: 1.0
+To: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+> Is the regression test's expected output wrong, or am I 
+> misunderstanding
+> what this query is supposed to do?  Is there any 
+> documentation anywhere
+> about how SQL functions returning multiple tuples are supposed to
+> behave?
+
+They are supposed to behave somewhat like a view.
+Not all rows are necessarily fetched.
+If used in a context that needs a single row answer,
+and the answer has multiple rows it is supposed to 
+runtime elog. Like in:
+
+select * from tbl where col=funcreturningmultipleresults();
+-- this must elog
+
+while this is ok:
+select * from tbl where col in (select funcreturningmultipleresults());
+
+But the caller could only fetch the first row if he wanted.
+
+The nested notation is supposed to call the function passing it the tuple
+as the first argument. This is what can be used to "fake" a column
+onto a table (computed column). 
+That is what I use it for. I have never used it with a 
+returns setof function, but reading the comments in the regression test,
+-- mike needs advil and peet's coffee,
+-- joe and sally need hightops, and
+-- everyone else is fine.
+it looks like the results you expected are correct, and currently the 
+wrong result is given.
+
+But I think this query could also elog whithout removing substantial
+functionality. 
+
+SELECT p.name, p.hobbies.name, p.hobbies.equipment.name FROM person p;
+
+Actually for me it would be intuitive, that this query return one row per 
+person, but elog on those that have more than one hobbie or a hobbie that 
+needs more than one equipment. Those that don't have a hobbie should 
+return name|NULL|NULL. A hobbie that does'nt need equipment name|hobbie|NULL.
+
+Andreas
+
+************
+
+
+From owner-pgsql-hackers@hub.org Wed Sep 22 22:01:07 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA16360
+	for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 22:01:05 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id VAA08386 for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 21:37:24 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id VAA88083;
+	Wed, 22 Sep 1999 21:28:11 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 21:27:48 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id VAA87938
+	for pgsql-hackers-outgoing; Wed, 22 Sep 1999 21:26:52 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.3/8.9.3) with SMTP id VAA87909
+	for <pgsql-hackers@postgresql.org>; Wed, 22 Sep 1999 21:26:36 -0400 (EDT)
+	(envelope-from wieck@debis.com)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgresql.org 
+	id m11TxXw-0003kLC; Thu, 23 Sep 99 03:19 MET DST
+Message-Id: <m11TxXw-0003kLC@orion.SAPserv.Hamburg.dsh.de>
+From: wieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions
+To: tgl@sss.pgh.pa.us (Tom Lane)
+Date: Thu, 23 Sep 1999 03:19:39 +0200 (MET DST)
+Cc: pgsql-hackers@postgreSQL.org
+Reply-To: wieck@debis.com (Jan Wieck)
+In-Reply-To: <6408.938045139@sss.pgh.pa.us> from "Tom Lane" at Sep 22, 99 08:05:39 pm
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Tom Lane wrote:
+
+> [...]
+>
+> What I am wondering, though, is whether this addition is actually
+> necessary, or is it a bug that the functions aren't run to completion
+> in the first place?  I don't really understand the semantics of this
+> "nested dot notation".  I suppose it is a Berkeleyism; I can't find
+> anything about it in the SQL92 document.  The test cases shown in the
+> misc regress test seem peculiar, not to say wrong.  For example:
+>
+> [...]
+>
+> Is the regression test's expected output wrong, or am I misunderstanding
+> what this query is supposed to do?  Is there any documentation anywhere
+> about how SQL functions returning multiple tuples are supposed to
+> behave?
+
+    I've  said some time (maybe too long) ago, that SQL functions
+    returning tuple sets are broken in general. This  nested  dot
+    notation  (which  I  think  is  an artefact from the postquel
+    querylanguage) is implemented via set functions.
+
+    Set functions have total different semantics from  all  other
+    functions.   First  they  don't  really return a tuple set as
+    someone might think  -  all  that  screwed  up  code  instead
+    simulates  that  they  return  something you could consider a
+    scan of the last SQL statement in  the  function.   Then,  on
+    each  subsequent call inside of the same command, they return
+    a "tupletable slot" containing the next found  tuple  (that's
+    why their Func node is mangled up after the first call).
+
+    Second  they  have  a  targetlist what I think was originally
+    intended to extract attributes out  of  the  tuples  returned
+    when  the above scan is asked to get the next tuple. But as I
+    read the code it invokes the function again  and  this  might
+    cause the resource leakage you see.
+
+    Third,   all  this  seems  to  never  have  been  implemented
+    (thought?) to the end. A targetlist  doesn't  make  sense  at
+    this place because it could at max contain a single attribute
+    - so a single attno would have the same  power.  And  if  set
+    functions  could appear in the rangetable (FROM clause), than
+    they would be treated as that and regular Var  nodes  in  the
+    query would do it.
+
+    I  think  you  shouldn't really care for that regression test
+    and maybe we should disable set  functions  until  we  really
+    implement stored procedures returning sets in the rangetable.
+
+    Set  functions  where  planned  by  Stonebraker's   team   as
+    something  that  today is called stored procedures. But AFAIK
+    they never reached the useful state because even in  Postgres
+    4.2  you haven't been able to get more than one attribute out
+    of a  set  function.   It  was  a  feature  of  the  postquel
+    querylanguage  that  you  could  get one attribute from a set
+    function via
+
+        RETRIEVE (attributename(setfuncname()))
+
+    While working on the constraint  triggers  I've  came  across
+    another  regression test (triggers :-) that's errorneous too.
+    The funny_dup17 trigger proc executes an INSERT into the same
+    relation  where it get fired for by a previous INSERT. And it
+    stops this recursion only if it reaches a  nesting  level  of
+    17,  which  could  only  occur  if  it  is  fired  DURING the
+    execution of it's own SPI_exec(). After  Vadim  quouted  some
+    SQL92  definitions  about when constraint checks and triggers
+    are to be executed, I decided to fire regular triggers at the
+    end  of  a  query  too.  Thus, there is absolutely no nesting
+    possible for AFTER triggers resulting in an endless loop.
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#========================================= wieck@debis.com (Jan Wieck) #
+
+
+
+************
+
+
+From owner-pgsql-hackers@hub.org Thu Sep 23 11:01:06 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16162
+	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 11:01:04 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id KAA28544 for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 10:45:54 -0400 (EDT)
+Received: from hub.org (hub.org [216.126.84.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id KAA52943;
+	Thu, 23 Sep 1999 10:20:51 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:19:58 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id KAA52472
+	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:19:03 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
+	by hub.org (8.9.3/8.9.3) with ESMTP id KAA52431
+	for <pgsql-hackers@postgresql.org>; Thu, 23 Sep 1999 10:18:47 -0400 (EDT)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA13253;
+	Thu, 23 Sep 1999 10:18:02 -0400 (EDT)
+To: wieck@debis.com (Jan Wieck)
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions 
+In-reply-to: Your message of Thu, 23 Sep 1999 03:19:39 +0200 (MET DST) 
+             <m11TxXw-0003kLC@orion.SAPserv.Hamburg.dsh.de> 
+Date: Thu, 23 Sep 1999 10:18:01 -0400
+Message-ID: <13251.938096281@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+wieck@debis.com (Jan Wieck) writes:
+> Tom Lane wrote:
+>> What I am wondering, though, is whether this addition is actually
+>> necessary, or is it a bug that the functions aren't run to completion
+>> in the first place?
+
+>     I've  said some time (maybe too long) ago, that SQL functions
+>     returning tuple sets are broken in general.
+
+Indeed they are.  Try this on for size (using the regression database):
+
+	SELECT p.name, p.hobbies.equipment.name FROM person p;
+	SELECT p.hobbies.equipment.name, p.name FROM person p;
+
+You get different result sets!?
+
+The problem in this example is that ExecTargetList returns the isDone
+flag from the last targetlist entry, regardless of whether there are
+incomplete iterations in previous entries.  More generally, the buffer
+leak problem that I started with only occurs if some Iter nodes are not
+run to completion --- but execQual.c has no mechanism to make sure that
+they have all reached completion simultaneously.
+
+What we really need to make functions-returning-sets work properly is
+an implementation somewhat like aggregate functions.  We need to make
+a list of all the Iter nodes present in a targetlist and cycle through
+the values returned by each in a methodical fashion (run the rightmost
+through its full cycle, then advance the next-to-rightmost one value,
+run the rightmost through its cycle again, etc etc).  Also there needs
+to be an understanding of the hierarchy when an Iter appears in the
+arguments of another Iter's function.  (You cycle the upper one for
+*each* set of arguments created by cycling its sub-Iters.)
+
+I am not particularly interested in working on this feature right now,
+since AFAIK it's a Berkeleyism not found in SQL92.  What I've done
+is to hack ExecTargetList so that it behaves semi-sanely when there's
+more than one Iter at the top level of the target list --- it still
+doesn't really give the right answer, but at least it will keep
+generating tuples until all the Iters are done at the same time.
+It happens that that's enough to give correct answers for the examples
+shown in the misc regress test.  Even when it fails to generate all
+the possible combinations, there will be no buffer leaks.
+
+So, I'm going to declare victory and go home ;-).  We ought to add a
+TODO item along the lines of
+ * Functions returning sets don't really work right
+in hopes that someone will feel like tackling this someday.
+
+			regards, tom lane
+
+************
+
+
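Note on the thread above: the last message describes cycling set-returning targetlist entries "in a methodical fashion" (run the rightmost through its full cycle, then advance the next-to-rightmost one value). A minimal sketch of that odometer-style enumeration follows. It is not code from this commit or from execQual.c: SetIter, advance_iters and emit_all_combinations are hypothetical names, the sets are fixed arrays rather than function results, and the nested case (an Iter appearing in the arguments of another Iter) is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one Iter node: a fixed value list plus a cursor. */
typedef struct
{
	const char **values;
	size_t		count;
	size_t		pos;
} SetIter;

/*
 * Step the rightmost iterator; when it wraps, reset it and carry into its
 * left neighbour.  Returns 0 once every iterator has wrapped, which means
 * all combinations have been produced.
 */
static int
advance_iters(SetIter *iters, size_t n)
{
	size_t		i = n;

	while (i > 0)
	{
		i--;
		if (++iters[i].pos < iters[i].count)
			return 1;
		iters[i].pos = 0;
	}
	return 0;
}

/*
 * Produce one row per combination and return the row count.  Rows are
 * printed to 'out' when it is not NULL.  An empty set yields no rows,
 * matching the behaviour described in the thread (not the NULL-row
 * alternative that was proposed there).
 */
static size_t
emit_all_combinations(SetIter *iters, size_t n, FILE *out)
{
	size_t		rows = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (iters[i].count == 0)
			return 0;
		iters[i].pos = 0;
	}

	do
	{
		if (out != NULL)
		{
			for (size_t i = 0; i < n; i++)
				fprintf(out, "%s%s", i ? "|" : "", iters[i].values[iters[i].pos]);
			fputc('\n', out);
		}
		rows++;
	} while (advance_iters(iters, n));

	return rows;
}
```

Because no iterator is abandoned halfway through its cycle, every set is run to completion before the loop ends, which is the property the buffer-leak discussion depends on. Passing stdout as the last argument prints the rows instead of only counting them.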
--- a/src/bin/pg_dump/Makefile
+++ b/src/bin/pg_dump/Makefile
@@ -4,7 +4,7 @@
 #
 # Copyright (c) 1994, Regents of the University of California
 #
-# $Header: /cvsroot/pgsql/src/bin/pg_dump/Makefile,v 1.17 2000/07/03 16:35:39 petere Exp $
+# $Header: /cvsroot/pgsql/src/bin/pg_dump/Makefile,v 1.18 2000/07/04 14:25:26 momjian Exp $
 #
 #-------------------------------------------------------------------------

@@ -12,21 +12,19 @@ subdir = src/bin/pg_dump
 top_builddir = ../../..
 include ../../Makefile.global

-OBJS= pg_dump.o common.o $(STRDUP)
+OBJS= pg_backup_archiver.o pg_backup_custom.o pg_backup_files.o \
+       pg_backup_plain_text.o $(STRDUP)

 CFLAGS+= -I$(LIBPQDIR)
+LDFLAGS+= -lz

+all: submake pg_dump$(X) pg_restore$(X)

-all: submake pg_dump pg_dumpall
+pg_dump$(X): pg_dump.o common.o $(OBJS) $(LIBPQDIR)/libpq.a 
+	$(CC) $(CFLAGS) -o $@ pg_dump.o common.o $(OBJS) $(LIBPQ) $(LDFLAGS)

-pg_dump: $(OBJS) $(LIBPQDIR)/libpq.a
-	$(CC) $(CFLAGS) -o $@ $(OBJS) $(LIBPQ) $(LDFLAGS)
-
-pg_dumpall: pg_dumpall.sh
-	sed -e 's:__VERSION__:$(VERSION):g' \
-	    -e 's:__MULTIBYTE__:$(MULTIBYTE):g' \
-	    -e 's:__bindir__:$(bindir):g' \
-	  < $< > $@
+pg_restore$(X): pg_restore.o $(OBJS) $(LIBPQDIR)/libpq.a
+	$(CC) $(CFLAGS) -o $@ pg_restore.o $(OBJS) $(LIBPQ) $(LDFLAGS)

 ../../utils/strdup.o:
 	$(MAKE) -C ../../utils strdup.o
@@ -37,6 +35,7 @@ submake:

 install: all installdirs
 	$(INSTALL_PROGRAM) pg_dump$(X) $(bindir)/pg_dump$(X)
+	$(INSTALL_PROGRAM) pg_restore$(X) $(bindir)/pg_restore$(X)
 	$(INSTALL_SCRIPT) pg_dumpall $(bindir)/pg_dumpall
 	$(INSTALL_SCRIPT) pg_upgrade $(bindir)/pg_upgrade

@@ -50,7 +49,7 @@ depend dep:
 	$(CC) -MM $(CFLAGS) *.c >depend

 clean distclean maintainer-clean:
-	rm -f pg_dump$(X) $(OBJS) pg_dumpall
+	rm -f pg_dump$(X) pg_restore$(X) $(OBJS) pg_dump.o common.o pg_restore.o

 ifeq (depend,$(wildcard depend))
 include depend
--- a/src/bin/pg_dump/README
+++ b/src/bin/pg_dump/README
@@ -0,0 +1,60 @@
+Notes on pg_dump
+================
+
+pg_dump, by default, still outputs text files.
+
+pg_dumpall forces all pg_dump output to be text, since it also outputs text into the same output stream.
+
+The plain text output format cannot be used as input to pg_restore.
+
+
+To dump a database into the new custom format, type:
+
+    pg_dump <db-name> -Fc > <backup-file>
+
+To restore, try
+ 
+   To list contents:
+
+       pg_restore -l <backup-file> | less
+
+   or to list tables:
+
+       pg_restore <backup-file> --table | less
+
+   or to list in a different order:
+
+       pg_restore <backup-file> -l --oid --rearrange | less
+
+Once you are happy with the list, just remove the '-l', and an SQL script will be output.
+
+
+You can also dump a listing:
+
+       pg_restore -l <backup-file> > toc.lis
+  or
+       pg_restore -l <backup-file> -f toc.lis
+
+edit it, and rearrange the lines (or delete some):
+
+    vi toc.lis
+
+then use it to restore selected items:
+
+    pg_restore <backup-file> --use=toc.lis -l | less
+
+When you like the list, type
+
+    pg_restore backup.bck --use=toc.lis > script.sql
+
+or, simply:
+
+    createdb newdbname
+    pg_restore backup.bck --use=toc.lis | psql newdbname
+
+
+Philip Warner, 3-Jul-2000
+pjw@rhyme.com.au
+
+
+
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $Header: /cvsroot/pgsql/src/bin/pg_dump/common.c,v 1.43 2000/06/14 18:17:50 petere Exp $
+ *	  $Header: /cvsroot/pgsql/src/bin/pg_dump/common.c,v 1.44 2000/07/04 14:25:27 momjian Exp $
 *
 * Modifications - 6/12/96 - dave@bensoft.com - version 1.13.dhb.2
 *
@@ -232,10 +232,13 @@ strInArray(const char *pattern, char **arr, int arr_size)
 */

 TableInfo  *
-dumpSchema(FILE *fout,
-		   int *numTablesPtr,
-		   const char *tablename,
-		   const bool aclsSkip)
+dumpSchema(Archive  *fout,
+		    int *numTablesPtr,
+		    const char *tablename,
+		    const bool aclsSkip,
+		    const bool oids,
+		    const bool schemaOnly,
+		    const bool dataOnly)
 {
 	int			numTypes;
 	int			numFuncs;
@@ -290,7 +293,7 @@ dumpSchema(FILE *fout,
 				g_comment_start, g_comment_end);
 	flagInhAttrs(tblinfo, numTables, inhinfo, numInherits);

-	if (!tablename && fout)
+	if (!tablename && !dataOnly)
 	{
 		if (g_verbose)
 			fprintf(stderr, "%s dumping out database comment %s\n",
@@ -306,16 +309,13 @@ dumpSchema(FILE *fout,
 		dumpTypes(fout, finfo, numFuncs, tinfo, numTypes);
 	}

-	if (fout)
-	{
-		if (g_verbose)
-			fprintf(stderr, "%s dumping out tables %s\n",
-					g_comment_start, g_comment_end);
-		dumpTables(fout, tblinfo, numTables, inhinfo, numInherits,
-				   tinfo, numTypes, tablename, aclsSkip);
-	}
+	if (g_verbose)
+		fprintf(stderr, "%s dumping out tables %s\n",
+				g_comment_start, g_comment_end);
+	dumpTables(fout, tblinfo, numTables, inhinfo, numInherits,
+			   tinfo, numTypes, tablename, aclsSkip, oids, schemaOnly, dataOnly);

-	if (!tablename && fout)
+	if (!tablename && !dataOnly)
 	{
 		if (g_verbose)
 			fprintf(stderr, "%s dumping out user-defined procedural languages %s\n",
@@ -323,7 +323,7 @@ dumpSchema(FILE *fout,
 		dumpProcLangs(fout, finfo, numFuncs, tinfo, numTypes);
 	}

-	if (!tablename && fout)
+	if (!tablename && !dataOnly)
 	{
 		if (g_verbose)
 			fprintf(stderr, "%s dumping out user-defined functions %s\n",
@@ -331,7 +331,7 @@ dumpSchema(FILE *fout,
 		dumpFuncs(fout, finfo, numFuncs, tinfo, numTypes);
 	}

-	if (!tablename && fout)
+	if (!tablename && !dataOnly)
 	{
 		if (g_verbose)
 			fprintf(stderr, "%s dumping out user-defined aggregates %s\n",
@@ -339,7 +339,7 @@ dumpSchema(FILE *fout,
 		dumpAggs(fout, agginfo, numAggregates, tinfo, numTypes);
 	}

-	if (!tablename && fout)
+	if (!tablename && !dataOnly)
 	{
 		if (g_verbose)
 			fprintf(stderr, "%s dumping out user-defined operators %s\n",
@@ -363,7 +363,7 @@ dumpSchema(FILE *fout,
 */

 extern void
-dumpSchemaIdx(FILE *fout, const char *tablename,
+dumpSchemaIdx(Archive *fout, const char *tablename,
 			  TableInfo *tblinfo, int numTables)
 {
 	int			numIndices;
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -0,0 +1,125 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_backup.h
+ *
+ *	Public interface to the pg_dump archiver routines.
+ *
+ *	See the headers to pg_restore for more details.
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 28-Jun-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. 
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef PG_BACKUP__
+
+#include "config.h"
+#include "c.h"
+
+#define PG_BACKUP__
+
+typedef enum _archiveFormat {
+    archUnknown = 0,
+    archCustom = 1,
+    archFiles = 2,
+    archTar = 3,
+    archPlainText = 4
+} ArchiveFormat;
+
+/*
+ *  We may want to have some user-readable data, but in the
+ *  meantime this gives us some abstraction and type checking.
+ */
+typedef struct _Archive {
+    /* Nothing here */
+} Archive;
+
+typedef int     (*DataDumperPtr)(Archive* AH, char* oid, void* userArg);
+
+typedef struct _restoreOptions {
+	int			dataOnly;
+	int			dropSchema;
+	char		*filename;
+	int			schemaOnly;
+	int			verbose;
+	int			aclsSkip;
+	int			tocSummary;
+	char		*tocFile;
+	int			oidOrder;
+	int			origOrder;
+	int			rearrange;
+	int			format;
+	char		*formatName;
+
+	int			selTypes;
+	int		selIndex;
+	int		selFunction;
+	int		selTrigger;
+	int		selTable;
+	char		*indexNames;
+	char		*functionNames;
+	char		*tableNames;
+	char		*triggerNames;
+
+	int		*idWanted;
+	int		limitToList;
+	int		compression;
+
+} RestoreOptions;
+
+/*
+ * Main archiver interface.
+ */
+
+/* Called to add a TOC entry */
+extern void	ArchiveEntry(Archive* AH, const char* oid, const char* name,
+			const char* desc, const char* (deps[]), const char* defn,
+			const char* dropStmt, const char* owner, 
+			DataDumperPtr dumpFn, void* dumpArg);
+
+/* Called to write *data* to the archive */
+extern int	WriteData(Archive* AH, const void* data, int dLen);
+
+extern void	CloseArchive(Archive* AH);
+
+extern void	RestoreArchive(Archive* AH, RestoreOptions *ropt);
+
+/* Open an existing archive */
+extern Archive* OpenArchive(const char* FileSpec, ArchiveFormat fmt);
+
+/* Create a new archive */
+extern Archive* CreateArchive(const char* FileSpec, ArchiveFormat fmt, int compression);
+
+/* The --list option */
+extern void	PrintTOCSummary(Archive* AH, RestoreOptions *ropt);
+
+extern RestoreOptions*		NewRestoreOptions(void);
+
+/* Rearrange TOC entries */
+extern void	MoveToStart(Archive* AH, char *oType);
+extern void 	MoveToEnd(Archive* AH, char *oType); 
+extern void	SortTocByOID(Archive* AH);
+extern void	SortTocByID(Archive* AH);
+extern void	SortTocFromFile(Archive* AH, RestoreOptions *ropt);
+
+/* Convenience functions used only when writing DATA */
+extern int archputs(const char *s, Archive* AH);
+extern int archputc(const char c, Archive* AH);
+extern int archprintf(Archive* AH, const char *fmt, ...);
+
+#endif
+
+
+
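Note on the Archive type in pg_backup.h: the header exposes an empty struct so that callers get a distinct, type-checked handle while the real state stays in the private ArchiveHandle. The sketch below shows the same idea in a self-contained form. It is a variation, not the patch's method: it uses an incomplete type instead of an empty struct plus cast, and every name in it (DemoArchive, demo_create, demo_add_entry, demo_close) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Public side: an incomplete type, so callers cannot touch any fields. */
typedef struct DemoArchive DemoArchive;

/* Private side: the full definition, visible only to the archiver.
 * Field names are illustrative, loosely modelled on ArchiveHandle. */
struct DemoArchive {
    int compression;
    int tocCount;
};

/* Allocate a handle; returns NULL if allocation fails. */
DemoArchive *demo_create(int compression)
{
    DemoArchive *a = malloc(sizeof(*a));

    if (a == NULL)
        return NULL;
    a->compression = compression;
    a->tocCount = 0;
    return a;
}

/* Register one entry and return the new entry count. */
int demo_add_entry(DemoArchive *a)
{
    assert(a != NULL);
    return ++a->tocCount;
}

/* Release the handle. */
void demo_close(DemoArchive *a)
{
    free(a);
}
```

With an incomplete type the compiler rejects any attempt by a caller to read or write fields, and no cast is needed inside the implementation. The empty struct used in the patch achieves similar type checking but relies on a compiler extension, since ISO C does not allow a struct with no members.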
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@ -0,0 +1,193 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_backup_archiver.h
+ *
+ *	Private interface to the pg_dump archiver routines.
+ *	It is NOT intended that these routines be called by any 
+ *	dumper directly.
+ *
+ *	See the headers to pg_restore for more details.
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 28-Jun-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. 
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef __PG_BACKUP_ARCHIVE__
+#define __PG_BACKUP_ARCHIVE__
+
+#include <stdio.h>
+
+#ifdef HAVE_ZLIB
+#include <zlib.h>
+#define GZCLOSE(fh) gzclose(fh)
+#define GZWRITE(p, s, n, fh) gzwrite(fh, p, n * s)
+#define GZREAD(p, s, n, fh) gzread(fh, p, n * s)
+#else
+#define GZCLOSE(fh) fclose(fh)
+#define GZWRITE(p, s, n, fh) fwrite(p, s, n, fh)
+#define GZREAD(p, s, n, fh) fread(p, s, n, fh)
+#define Z_DEFAULT_COMPRESSION -1
+
+typedef struct _z_stream {
+    void	*next_in;
+    void	*next_out;
+    int		avail_in;
+    int		avail_out;
+} z_stream;
+typedef z_stream *z_streamp;
+#endif
+
+#include "pg_backup.h"
+
+#define K_VERS_MAJOR 1
+#define K_VERS_MINOR 2 
+#define K_VERS_REV 0 
+
+/* Some important version numbers (checked in code) */
+#define K_VERS_1_0 (( (1 * 256 + 0) * 256 + 0) * 256 + 0)
+#define K_VERS_1_2 (( (1 * 256 + 2) * 256 + 0) * 256 + 0)
+#define K_VERS_MAX (( (1 * 256 + 2) * 256 + 255) * 256 + 0)
+
+struct _archiveHandle;
+struct _tocEntry;
+struct _restoreList;
+
+typedef void    (*ClosePtr)		(struct _archiveHandle* AH);
+typedef void	(*ArchiveEntryPtr)	(struct _archiveHandle* AH, struct _tocEntry* te);
+ 
+typedef void	(*StartDataPtr)		(struct _archiveHandle* AH, struct _tocEntry* te);
+typedef int 	(*WriteDataPtr)		(struct _archiveHandle* AH, const void* data, int dLen);
+typedef void	(*EndDataPtr)		(struct _archiveHandle* AH, struct _tocEntry* te);
+
+typedef int	(*WriteBytePtr)		(struct _archiveHandle* AH, const int i);
+typedef int    	(*ReadBytePtr)		(struct _archiveHandle* AH);
+typedef int	(*WriteBufPtr)		(struct _archiveHandle* AH, const void* c, int len);
+typedef int	(*ReadBufPtr)		(struct _archiveHandle* AH, void* buf, int len);
+typedef void	(*SaveArchivePtr)	(struct _archiveHandle* AH);
+typedef void 	(*WriteExtraTocPtr)	(struct _archiveHandle* AH, struct _tocEntry* te);
+typedef void	(*ReadExtraTocPtr)	(struct _archiveHandle* AH, struct _tocEntry* te);
+typedef void	(*PrintExtraTocPtr)	(struct _archiveHandle* AH, struct _tocEntry* te);
+typedef void	(*PrintTocDataPtr)	(struct _archiveHandle* AH, struct _tocEntry* te, 
+						RestoreOptions *ropt);
+
+typedef int	(*TocSortCompareFn)	(const void* te1, const void *te2); 
+
+typedef enum _archiveMode {
+    archModeWrite,
+    archModeRead
+} ArchiveMode;
+
+typedef struct _outputContext {
+	void		*OF;
+	int		gzOut;
+} OutputContext;
+
+typedef struct _archiveHandle {
+	char				vmaj;				/* Version of file */
+	char				vmin;
+	char				vrev;
+	int					version;			/* Conveniently formatted version */
+
+	int					intSize;			/* Size of an integer in the archive */
+	ArchiveFormat		format;				/* Archive format */
+
+	int					readHeader;			/* Used if file header has been read already */
+
+	ArchiveEntryPtr		ArchiveEntryPtr;	/* Called for each metadata object */
+	StartDataPtr		StartDataPtr; 		/* Called when table data is about to be dumped */
+	WriteDataPtr		WriteDataPtr; 		/* Called to send some table data to the archive */
+	EndDataPtr			EndDataPtr; 		/* Called when table data dump is finished */
+	WriteBytePtr		WriteBytePtr;		/* Write a byte to output */
+	ReadBytePtr			ReadBytePtr;		/* Read a byte from input */
+	WriteBufPtr			WriteBufPtr;	
+	ReadBufPtr			ReadBufPtr;
+	ClosePtr			ClosePtr;			/* Close the archive */
+	WriteExtraTocPtr	WriteExtraTocPtr;	/* Write extra TOC entry data associated with */
+											/* the current archive format */
+	ReadExtraTocPtr		ReadExtraTocPtr;	/* Read extra info associated with archive format */
+	PrintExtraTocPtr	PrintExtraTocPtr;	/* Extra TOC info for format */
+	PrintTocDataPtr		PrintTocDataPtr;
+
+	int			lastID;						/* Last internal ID for a TOC entry */
+	char*		fSpec;						/* Archive File Spec */
+	FILE		*FH;						/* General purpose file handle */
+	void		*OF;
+	int		gzOut;						/* Output file */
+
+	struct _tocEntry*		toc;			/* List of TOC entries */
+	int						tocCount;		/* Number of TOC entries */
+	struct _tocEntry*		currToc; 		/* Used when dumping data */
+	char					*currUser;		/* Restore: current username in script */
+	int						compression;	/* Compression requested on open */
+	ArchiveMode				mode;			/* File mode - r or w */
+	void*					formatData;		/* Header data specific to file format */
+
+} ArchiveHandle;
+
+typedef struct _tocEntry {
+	struct _tocEntry* 	prev;
+	struct _tocEntry*	next;
+	int					id;
+	int					hadDumper;		/* Archiver was passed a dumper routine (used in restore) */
+	char*				oid;
+	int					oidVal;
+	char*				name;
+	char*				desc;
+	char*				defn;
+	char*				dropStmt;
+	char*				owner;
+	char**				depOid;
+	int					printed;		/* Indicates if entry defn has been dumped */
+	DataDumperPtr		dataDumper;		/* Routine to dump data for object */
+	void*				dataDumperArg;		/* Arg for above routine */
+	void*				formatData;		/* TOC Entry data specific to file format */
+
+	int					_moved;			/* Marker used when rearranging TOC */
+
+} TocEntry;
+
+extern void die_horribly(const char *fmt, ...);
+
+extern void WriteTOC(ArchiveHandle* AH);
+extern void ReadTOC(ArchiveHandle* AH);
+extern void WriteHead(ArchiveHandle* AH);
+extern void ReadHead(ArchiveHandle* AH);
+extern void WriteToc(ArchiveHandle* AH);
+extern void ReadToc(ArchiveHandle* AH);
+extern void WriteDataChunks(ArchiveHandle* AH);
+
+extern int TocIDRequired(ArchiveHandle* AH, int id, RestoreOptions *ropt);
+
+/*
+ * Mandatory routines for each supported format
+ */
+
+extern int WriteInt(ArchiveHandle* AH, int i);
+extern int ReadInt(ArchiveHandle* AH);
+extern char* ReadStr(ArchiveHandle* AH);
+extern int WriteStr(ArchiveHandle* AH, char* s);
+
+extern void InitArchiveFmt_Custom(ArchiveHandle* AH);
+extern void InitArchiveFmt_Files(ArchiveHandle* AH);
+extern void InitArchiveFmt_PlainText(ArchiveHandle* AH);
+
+extern OutputContext	SetOutput(ArchiveHandle* AH, char *filename, int compression);
+extern void 		ResetOutput(ArchiveHandle* AH, OutputContext savedContext);
+
+int ahwrite(const void *ptr, size_t size, size_t nmemb, ArchiveHandle* AH);
+int ahprintf(ArchiveHandle* AH, const char *fmt, ...);
+
+#endif
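Note on the K_VERS_* macros in pg_backup_archiver.h: each one packs major, minor and revision into a single int as ((major * 256 + minor) * 256 + revision) * 256 + 0, so versions can be compared with ordinary integer comparison. A minimal sketch of that arithmetic follows. The helper names pack_version and version_supported are hypothetical, and the range check is only an assumption about how the "checked in code" comment might be applied; the actual check is not shown in this part of the diff.

```c
#include <assert.h>

/* Pack major/minor/revision the same way the K_VERS_* macros do. */
int pack_version(int maj, int min, int rev)
{
    assert(maj >= 0 && maj < 128);   /* keep the result inside a 32-bit int */
    assert(min >= 0 && min < 256);
    assert(rev >= 0 && rev < 256);
    return ((maj * 256 + min) * 256 + rev) * 256 + 0;
}

/* ASSUMPTION: accept anything from 1.0.0 (K_VERS_1_0) up to
 * 1.2.255 (K_VERS_MAX). Returns 1 if supported, 0 otherwise. */
int version_supported(int version)
{
    return version >= pack_version(1, 0, 0)
        && version <= pack_version(1, 2, 255);
}
```

Because each component occupies its own byte, a later revision of the same minor version always compares greater, which is what makes a single upper bound such as K_VERS_MAX workable.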
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@ -0,0 +1,584 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_backup_custom.c
+ *
+ *	Implements the custom output format.
+ *
+ *	See the headers to pg_restore for more details.
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 28-Jun-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. 
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <stdlib.h>
+#include "pg_backup.h"
+#include "pg_backup_archiver.h"
+
+#include <errno.h>
+
+static void     _ArchiveEntry(ArchiveHandle* AH, TocEntry* te);
+static void	_StartData(ArchiveHandle* AH, TocEntry* te);
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen);
+static void     _EndData(ArchiveHandle* AH, TocEntry* te);
+static int      _WriteByte(ArchiveHandle* AH, const int i);
+static int      _ReadByte(ArchiveHandle* );
+static int      _WriteBuf(ArchiveHandle* AH, const void* buf, int len);
+static int    	_ReadBuf(ArchiveHandle* AH, void* buf, int len);
+static void     _CloseArchive(ArchiveHandle* AH);
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt);
+static void	_WriteExtraToc(ArchiveHandle* AH, TocEntry* te);
+static void	_ReadExtraToc(ArchiveHandle* AH, TocEntry* te);
+static void	_PrintExtraToc(ArchiveHandle* AH, TocEntry* te);
+
+static void	_PrintData(ArchiveHandle* AH);
+static void     _skipData(ArchiveHandle* AH);
+
+#define zlibOutSize	4096
+#define zlibInSize	4096
+
+typedef struct {
+    z_streamp	zp;
+    char*	zlibOut;
+    char*	zlibIn;
+    int		inSize;
+    int		hasSeek;
+    int		filePos;
+    int		dataStart;
+} lclContext;
+
+typedef struct {
+    int		dataPos;
+    int		dataLen;
+} lclTocEntry;
+
+static int	_getFilePos(ArchiveHandle* AH, lclContext* ctx);
+
+static char* progname = "Archiver(custom)";
+
+/*
+ *  Handler functions. 
+ */
+void InitArchiveFmt_Custom(ArchiveHandle* AH) 
+{
+    lclContext*		ctx;
+
+    /* Assuming static functions, this can be copied for each format. */
+    AH->ArchiveEntryPtr = _ArchiveEntry;
+    AH->StartDataPtr = _StartData;
+    AH->WriteDataPtr = _WriteData;
+    AH->EndDataPtr = _EndData;
+    AH->WriteBytePtr = _WriteByte;
+    AH->ReadBytePtr = _ReadByte;
+    AH->WriteBufPtr = _WriteBuf;
+    AH->ReadBufPtr = _ReadBuf;
+    AH->ClosePtr = _CloseArchive;
+    AH->PrintTocDataPtr = _PrintTocData;
+    AH->ReadExtraTocPtr = _ReadExtraToc;
+    AH->WriteExtraTocPtr = _WriteExtraToc;
+    AH->PrintExtraTocPtr = _PrintExtraToc;
+
+    /*
+     *	Set up some special context used in compressing data.
+    */
+    ctx = (lclContext*)malloc(sizeof(lclContext));
+    if (ctx == NULL)
+	die_horribly("%s: Unable to allocate archive context",progname);
+    AH->formatData = (void*)ctx;
+
+    ctx->zp = (z_streamp)malloc(sizeof(z_stream));
+    if (ctx->zp == NULL)
+	die_horribly("%s: unable to allocate zlib stream archive context",progname);
+
+    ctx->zlibOut = (char*)malloc(zlibOutSize + 1);	/* +1: _PrintData NUL-terminates a full buffer */
+    ctx->zlibIn = (char*)malloc(zlibInSize);
+    ctx->inSize = zlibInSize;
+    ctx->filePos = 0;
+
+    if (ctx->zlibOut == NULL || ctx->zlibIn == NULL)
+	die_horribly("%s: unable to allocate buffers in archive context",progname);
+
+    /*
+     * Now open the file
+    */
+    if (AH->mode == archModeWrite) {
+	if (AH->fSpec && strcmp(AH->fSpec,"") != 0) {
+	    AH->FH = fopen(AH->fSpec, PG_BINARY_W);
+	} else {
+	    AH->FH = stdout;
+	}
+
+	if (!AH->FH)
+	    die_horribly("%s: unable to open archive file %s",progname, AH->fSpec);
+
+	ctx->hasSeek = (fseek(AH->FH, 0, SEEK_CUR) == 0);
+
+    } else {
+	if (AH->fSpec && strcmp(AH->fSpec,"") != 0) {
+	    AH->FH = fopen(AH->fSpec, PG_BINARY_R);
+	} else {
+	    AH->FH = stdin;
+	}
+	if (!AH->FH)
+	    die_horribly("%s: unable to open archive file %s",progname, AH->fSpec);
+
+	ctx->hasSeek = (fseek(AH->FH, 0, SEEK_CUR) == 0);
+
+	ReadHead(AH);
+	ReadToc(AH);
+	ctx->dataStart = _getFilePos(AH, ctx);
+    }
+
+}
+
+/*
+ * - Start a new TOC entry
+*/
+static void	_ArchiveEntry(ArchiveHandle* AH, TocEntry* te) 
+{
+    lclTocEntry*	ctx;
+
+    ctx = (lclTocEntry*)malloc(sizeof(lclTocEntry));
+    if (te->dataDumper) {
+	ctx->dataPos = -1;
+    } else {
+	ctx->dataPos = 0;
+    }
+    ctx->dataLen = 0;
+    te->formatData = (void*)ctx;
+
+}
+
+static void	_WriteExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    WriteInt(AH, ctx->dataPos);
+    WriteInt(AH, ctx->dataLen);
+}
+
+static void	_ReadExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    if (ctx == NULL) {
+	ctx = (lclTocEntry*)malloc(sizeof(lclTocEntry));
+	te->formatData = (void*)ctx;
+    }
+
+    ctx->dataPos = ReadInt( AH );
+    ctx->dataLen = ReadInt( AH );
+    
+}
+
+static void	_PrintExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    ahprintf(AH, "-- Data Pos: %d (Length %d)\n", ctx->dataPos, ctx->dataLen);
+}
+
+static void	_StartData(ArchiveHandle* AH, TocEntry* te)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    z_streamp   	zp = ctx->zp;
+    lclTocEntry*	tctx = (lclTocEntry*)te->formatData;
+
+    tctx->dataPos = _getFilePos(AH, ctx);
+
+    WriteInt(AH, te->id); /* For sanity check */
+
+#ifdef HAVE_ZLIB
+
+    if (AH->compression < 0 || AH->compression > 9) {
+	AH->compression = Z_DEFAULT_COMPRESSION;
+    }
+
+    if (AH->compression != 0) {
+	zp->zalloc = Z_NULL;
+	zp->zfree = Z_NULL;
+	zp->opaque = Z_NULL;
+
+	if (deflateInit(zp, AH->compression) != Z_OK)
+	    die_horribly("%s: could not initialize compression library - %s\n",progname, zp->msg);
+    }
+
+#else
+
+    AH->compression = 0;
+
+#endif
+
+    /* Just be paranoid - maybe End is called after Start, with no Write */
+    zp->next_out = ctx->zlibOut;
+    zp->avail_out = zlibOutSize;
+}
+
+static int	_DoDeflate(ArchiveHandle* AH, lclContext* ctx, int flush) 
+{
+    z_streamp   zp = ctx->zp;
+
+#ifdef HAVE_ZLIB
+    char*	out = ctx->zlibOut;
+    int		res = Z_OK;
+
+    if (AH->compression != 0) 
+    {
+	res = deflate(zp, flush);
+	if (res == Z_STREAM_ERROR)
+	    die_horribly("%s: could not compress data - %s\n",progname, zp->msg);
+
+	if 	(      ( (flush == Z_FINISH) && (zp->avail_out < zlibOutSize) )
+		|| (zp->avail_out == 0) 
+		|| (zp->avail_in != 0)
+	    ) 
+	{
+	    /*
+	     * Extra paranoia: avoid zero-length chunks since a zero 
+	     * length chunk is the EOF marker. This should never happen
+	     * but...
+	    */
+	    if (zp->avail_out < zlibOutSize) {
+		/* printf("Wrote %d byte deflated chunk\n", zlibOutSize - zp->avail_out); */
+		WriteInt(AH, zlibOutSize - zp->avail_out);
+		fwrite(out, 1, zlibOutSize - zp->avail_out, AH->FH);
+		ctx->filePos += zlibOutSize - zp->avail_out;
+	    }
+	    zp->next_out = out;
+	    zp->avail_out = zlibOutSize;
+	}
+    } else {
+#endif
+	if (zp->avail_in > 0)
+	{
+	    WriteInt(AH, zp->avail_in);
+	    fwrite(zp->next_in, 1, zp->avail_in, AH->FH);
+	    ctx->filePos += zp->avail_in;
+	    zp->avail_in = 0;
+	} else {
+#ifdef HAVE_ZLIB
+	    if (flush == Z_FINISH)
+		res = Z_STREAM_END;
+#endif
+	}
+
+
+#ifdef HAVE_ZLIB
+    }
+
+    return res;
+#else
+    return 1;
+#endif
+
+}
+
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen)
+{
+    lclContext*	ctx = (lclContext*)AH->formatData;
+    z_streamp	zp = ctx->zp;
+
+    zp->next_in = (void*)data;
+    zp->avail_in = dLen;
+
+    while (zp->avail_in != 0) {
+	/* printf("Deflating %d bytes\n", dLen); */
+	_DoDeflate(AH, ctx, 0);
+    }
+    return dLen;
+}
+
+static void	_EndData(ArchiveHandle* AH, TocEntry* te)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    lclTocEntry*	tctx = (lclTocEntry*) te->formatData;
+
+#ifdef HAVE_ZLIB
+    z_streamp		zp = ctx->zp;
+    int			res;
+
+    if (AH->compression != 0)
+    {
+	zp->next_in = NULL;
+	zp->avail_in = 0;
+
+	do { 	
+	    /* printf("Ending data output\n"); */
+	    res = _DoDeflate(AH, ctx, Z_FINISH);
+	} while (res != Z_STREAM_END);
+
+	if (deflateEnd(zp) != Z_OK)
+	    die_horribly("%s: error closing compression stream - %s\n", progname, zp->msg);
+    }
+#endif
+
+    /* Send the end marker */
+    WriteInt(AH, 0);
+
+    tctx->dataLen = _getFilePos(AH, ctx) - tctx->dataPos;
+
+}
+
+/*
+ * Print data for a given TOC entry
+*/
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt)
+{
+    lclContext* 	ctx = (lclContext*)AH->formatData;
+    int			id;
+    lclTocEntry*	tctx = (lclTocEntry*) te->formatData;
+
+    if (tctx->dataPos == 0) 
+	return;
+
+    if (!ctx->hasSeek || tctx->dataPos < 0) {
+	id = ReadInt(AH);
+
+	while (id != te->id) {
+	    if (TocIDRequired(AH, id, ropt) & 2)
+		die_horribly("%s: Dumping a specific TOC data block out of order is not supported"
+			       " on this input stream (fseek required)\n", progname);
+	    _skipData(AH);
+	    id = ReadInt(AH);
+	}
+    } else {
+
+	if (fseek(AH->FH, tctx->dataPos, SEEK_SET) != 0)
+	    die_horribly("%s: error %d in file seek\n",progname, errno);
+
+	id = ReadInt(AH);
+
+    }
+
+    if (id != te->id)
+	die_horribly("%s: Found unexpected block ID (%d) when reading data - expected %d\n",
+			progname, id, te->id);
+
+    ahprintf(AH, "--\n-- Data for TOC Entry ID %d (OID %s) %s %s\n--\n\n",
+		te->id, te->oid, te->desc, te->name);
+
+    _PrintData(AH);
+
+    ahprintf(AH, "\n\n");
+}
+
+/*
+ * Print data from current file position.
+*/
+static void	_PrintData(ArchiveHandle* AH)
+{
+    lclContext*	ctx = (lclContext*)AH->formatData;
+    z_streamp	zp = ctx->zp;
+    int		blkLen;
+    char*	in = ctx->zlibIn;
+    int		cnt;
+
+#ifdef HAVE_ZLIB
+
+    int		res;
+    char*	out = ctx->zlibOut;
+
+    res = Z_OK;
+
+    if (AH->compression != 0) {
+	zp->zalloc = Z_NULL;
+	zp->zfree = Z_NULL;
+	zp->opaque = Z_NULL;
+
+	if (inflateInit(zp) != Z_OK)
+	    die_horribly("%s: could not initialize compression library - %s\n", progname, zp->msg);
+    }
+
+#endif
+
+    blkLen = ReadInt(AH);
+    while (blkLen != 0) {
+	if (blkLen > ctx->inSize) {
+	    free(ctx->zlibIn);
+	    ctx->zlibIn = NULL;
+	    ctx->zlibIn = (char*)malloc(blkLen);
+	    if (!ctx->zlibIn)
+		die_horribly("%s: failed to allocate decompression buffer\n", progname);
+
+	    ctx->inSize = blkLen;
+	    in = ctx->zlibIn;
+	}
+	cnt = fread(in, 1, blkLen, AH->FH);
+	if (cnt != blkLen) 
+	    die_horribly("%s: could not read data block - expected %d, got %d\n", progname, blkLen, cnt);
+
+	ctx->filePos += blkLen;
+
+	zp->next_in = in;
+	zp->avail_in = blkLen;
+
+#ifdef HAVE_ZLIB
+
+	if (AH->compression != 0) {
+
+	    while (zp->avail_in != 0) {
+		zp->next_out = out;
+		zp->avail_out = zlibOutSize;
+		res = inflate(zp, 0);
+		if (res != Z_OK && res != Z_STREAM_END)
+		    die_horribly("%s: unable to uncompress data - %s\n", progname, zp->msg);
+
+		out[zlibOutSize - zp->avail_out] = '\0';
+		ahwrite(out, 1, zlibOutSize - zp->avail_out, AH);
+	    }
+	} else {
+#endif
+	    ahwrite(in, 1, zp->avail_in, AH);
+	    zp->avail_in = 0;
+
+#ifdef HAVE_ZLIB
+	}
+#endif
+
+	blkLen = ReadInt(AH);
+    }
+
+#ifdef HAVE_ZLIB
+    if (AH->compression != 0) 
+    {
+	zp->next_in = NULL;
+	zp->avail_in = 0;
+	while (res != Z_STREAM_END) {
+	    zp->next_out = out;
+	    zp->avail_out = zlibOutSize;
+	    res = inflate(zp, 0);
+	    if (res != Z_OK && res != Z_STREAM_END)
+		die_horribly("%s: unable to uncompress data - %s\n", progname, zp->msg);
+
+	    out[zlibOutSize - zp->avail_out] = '\0';
+	    ahwrite(out, 1, zlibOutSize - zp->avail_out, AH);
+	}
+    }
+#endif
+
+}
+
+/*
+ * Skip data from current file position.
+*/
+static void	_skipData(ArchiveHandle* AH)
+{
+    lclContext*	ctx = (lclContext*)AH->formatData;
+    int		blkLen;
+    char*	in = ctx->zlibIn;
+    int		cnt;
+
+    blkLen = ReadInt(AH);
+    while (blkLen != 0) {
+	if (blkLen > ctx->inSize) {
+	    free(ctx->zlibIn);
+	    ctx->zlibIn = (char*)malloc(blkLen);
+	    if (!ctx->zlibIn)
+		die_horribly("%s: failed to allocate skip buffer\n", progname);
+	    ctx->inSize = blkLen;
+	    in = ctx->zlibIn;
+	}
+	cnt = fread(in, 1, blkLen, AH->FH);
+	if (cnt != blkLen) 
+	    die_horribly("%s: could not read data block - expected %d, got %d\n", progname, blkLen, cnt);
+
+	ctx->filePos += blkLen;
+
+	blkLen = ReadInt(AH);
+    }
+
+}
+
+static int	_WriteByte(ArchiveHandle* AH, const int i)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+
+    res = fputc(i, AH->FH);
+    if (res != EOF) {
+	ctx->filePos += 1;
+    }
+    return res;
+}
+
+static int    	_ReadByte(ArchiveHandle* AH)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+
+    res = fgetc(AH->FH);
+    if (res != EOF) {
+	ctx->filePos += 1;
+    }
+    return res;
+}
+
+static int	_WriteBuf(ArchiveHandle* AH, const void* buf, int len)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+    res = fwrite(buf, 1, len, AH->FH);
+    ctx->filePos += res;
+    return res;
+}
+
+static int	_ReadBuf(ArchiveHandle* AH, void* buf, int len)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+    res = fread(buf, 1, len, AH->FH);
+    ctx->filePos += res;
+    return res;
+}
+
+static void	_CloseArchive(ArchiveHandle* AH)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			tpos;
+
+    if (AH->mode == archModeWrite) {
+	WriteHead(AH);
+	tpos = ftell(AH->FH);
+	WriteToc(AH);
+	ctx->dataStart = _getFilePos(AH, ctx);
+	WriteDataChunks(AH);
+	/* This is not an essential operation - it is really only
+	 * needed if we expect to be doing seeks to read the data back
+	 * - it may be ok to just use the existing self-consistent block
+	 * formatting.
+	 */
+	if (ctx->hasSeek) {
+	    fseek(AH->FH, tpos, SEEK_SET);
+	    WriteToc(AH);
+	}
+    }
+
+    fclose(AH->FH);
+    AH->FH = NULL; 
+}
+
+static int	_getFilePos(ArchiveHandle* AH, lclContext* ctx) 
+{
+    int		pos;
+    if (ctx->hasSeek) {
+	pos = ftell(AH->FH);
+	if (pos != ctx->filePos) {
+	    fprintf(stderr, "Warning: ftell mismatch with filePos\n");
+	}
+    } else {
+	pos = ctx->filePos;
+    }
+    return pos;
+}
+
+
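Note on the data layout written by pg_backup_custom.c: _DoDeflate emits each block as a length (WriteInt) followed by that many bytes, _EndData closes the entry with a zero length, and _PrintData and _skipData read blocks until they see that zero. The sketch below reproduces the framing on its own, without zlib. It is a simplified illustration under stated assumptions: the 4-byte little-endian length is an assumption (the real WriteInt/ReadInt are in pg_backup_archiver.c, not shown here), and put_len, get_len, write_chunk, end_chunks and read_chunks are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* ASSUMPTION: lengths are stored as 4 little-endian bytes. */
int put_len(FILE *fh, unsigned int n)
{
    int i;

    assert(fh != NULL);
    for (i = 0; i < 4; i++)
        if (fputc((int) ((n >> (8 * i)) & 0xFF), fh) == EOF)
            return -1;
    return 0;
}

/* Read one length; returns -1 on EOF or read error. */
long get_len(FILE *fh)
{
    unsigned long n = 0;
    int i, c;

    assert(fh != NULL);
    for (i = 0; i < 4; i++) {
        if ((c = fgetc(fh)) == EOF)
            return -1;
        n |= (unsigned long) c << (8 * i);
    }
    return (long) n;
}

/* Write one chunk: length prefix, then payload. Empty chunks are
 * skipped because a zero length is reserved as the end marker. */
int write_chunk(FILE *fh, const void *data, unsigned int len)
{
    if (len == 0)
        return 0;
    if (put_len(fh, len) != 0)
        return -1;
    return fwrite(data, 1, len, fh) == len ? 0 : -1;
}

/* Terminate the data section with the zero-length marker. */
int end_chunks(FILE *fh)
{
    return put_len(fh, 0);
}

/* Copy chunks into out until the end marker is seen.
 * Returns the total byte count, or -1 on error or overflow. */
long read_chunks(FILE *fh, char *out, size_t cap)
{
    size_t total = 0;
    long len;

    while ((len = get_len(fh)) > 0) {
        if ((size_t) len > cap - total)
            return -1;
        if (fread(out + total, 1, (size_t) len, fh) != (size_t) len)
            return -1;
        total += (size_t) len;
    }
    return len == 0 ? (long) total : -1;
}
```

Because every block carries its own length, a reader that cannot seek can still skip an unwanted entry by reading and discarding blocks up to the marker, which is what _skipData does.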
--- a/src/bin/pg_dump/pg_backup_files.c
+++ b/src/bin/pg_dump/pg_backup_files.c
@ -0,0 +1,303 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_backup_files.c
+ *
+ *	This file is copied from the 'custom' format file, but dumps data into
+ *	separate files, and the TOC into the 'main' file.
+ *
+ *	IT IS FOR DEMONSTRATION PURPOSES ONLY.
+ *
+ *	(and could probably be used as a basis for writing a tar file)
+ *
+ *	See the headers to pg_restore for more details.
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 28-Jun-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. 
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include "pg_backup.h"
+#include "pg_backup_archiver.h"
+
+static void     _ArchiveEntry(ArchiveHandle* AH, TocEntry* te);
+static void	_StartData(ArchiveHandle* AH, TocEntry* te);
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen);
+static void     _EndData(ArchiveHandle* AH, TocEntry* te);
+static int      _WriteByte(ArchiveHandle* AH, const int i);
+static int      _ReadByte(ArchiveHandle* );
+static int      _WriteBuf(ArchiveHandle* AH, const void* buf, int len);
+static int    	_ReadBuf(ArchiveHandle* AH, void* buf, int len);
+static void     _CloseArchive(ArchiveHandle* AH);
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt);
+static void	_WriteExtraToc(ArchiveHandle* AH, TocEntry* te);
+static void	_ReadExtraToc(ArchiveHandle* AH, TocEntry* te);
+static void	_PrintExtraToc(ArchiveHandle* AH, TocEntry* te);
+
+
+typedef struct {
+    int		hasSeek;
+    int		filePos;
+} lclContext;
+
+typedef struct {
+#ifdef HAVE_ZLIB
+    gzFile	*FH;
+#else
+    FILE	*FH;
+#endif
+    char	*filename;
+} lclTocEntry;
+
+/*
+ *  Initializer
+ */
+void InitArchiveFmt_Files(ArchiveHandle* AH) 
+{
+    lclContext*		ctx;
+
+    /* Assuming static functions, this can be copied for each format. */
+    AH->ArchiveEntryPtr = _ArchiveEntry;
+    AH->StartDataPtr = _StartData;
+    AH->WriteDataPtr = _WriteData;
+    AH->EndDataPtr = _EndData;
+    AH->WriteBytePtr = _WriteByte;
+    AH->ReadBytePtr = _ReadByte;
+    AH->WriteBufPtr = _WriteBuf;
+    AH->ReadBufPtr = _ReadBuf;
+    AH->ClosePtr = _CloseArchive;
+    AH->PrintTocDataPtr = _PrintTocData;
+    AH->ReadExtraTocPtr = _ReadExtraToc;
+    AH->WriteExtraTocPtr = _WriteExtraToc;
+    AH->PrintExtraTocPtr = _PrintExtraToc;
+
+    /*
+     *	Set up the format-specific context (seek flag and file position).
+    */
+    ctx = (lclContext*)malloc(sizeof(lclContext));
+    AH->formatData = (void*)ctx;
+    ctx->filePos = 0;
+
+    /*
+     * Now open the TOC file
+     */
+    if (AH->mode == archModeWrite) {
+	if (AH->fSpec && strcmp(AH->fSpec,"") != 0) {
+	    AH->FH = fopen(AH->fSpec, PG_BINARY_W);
+	} else {
+	    AH->FH = stdout;
+	}
+	ctx->hasSeek = (fseek(AH->FH, 0, SEEK_CUR) == 0);
+
+	if (AH->compression < 0 || AH->compression > 9) {
+	    AH->compression = Z_DEFAULT_COMPRESSION;
+	}
+
+
+    } else {
+	if (AH->fSpec && strcmp(AH->fSpec,"") != 0) {
+	    AH->FH = fopen(AH->fSpec, PG_BINARY_R);
+	} else {
+	    AH->FH = stdin;
+	}
+	ctx->hasSeek = (fseek(AH->FH, 0, SEEK_CUR) == 0);
+
+	ReadHead(AH);
+	ReadToc(AH);
+	fclose(AH->FH); /* Nothing else in the file... */
+    }
+
+}
+
+/*
+ * - Start a new TOC entry
+ *   Setup the output file name.
+ */
+static void	_ArchiveEntry(ArchiveHandle* AH, TocEntry* te) 
+{
+    lclTocEntry*	ctx;
+    char		fn[1024];
+
+    ctx = (lclTocEntry*)malloc(sizeof(lclTocEntry));
+    if (te->dataDumper) {
+#ifdef HAVE_ZLIB
+	if (AH->compression == 0) {
+	    sprintf(fn, "%d.dat", te->id);
+	} else {
+	    sprintf(fn, "%d.dat.gz", te->id);
+	}
+#else
+	sprintf(fn, "%d.dat", te->id);
+#endif
+	ctx->filename = strdup(fn);
+    } else {
+	ctx->filename = NULL;
+	ctx->FH = NULL;
+    }
+    te->formatData = (void*)ctx;
+}
+
+static void	_WriteExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    if (ctx->filename) {
+	WriteStr(AH, ctx->filename);
+    } else {
+	WriteStr(AH, "");
+    }
+}
+
+static void	_ReadExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    if (ctx == NULL) {
+	ctx = (lclTocEntry*)malloc(sizeof(lclTocEntry));
+	te->formatData = (void*)ctx;
+    }
+
+    ctx->filename = ReadStr(AH);
+    if (strlen(ctx->filename) == 0) {
+	free(ctx->filename);
+	ctx->filename = NULL;
+    }
+    ctx->FH = NULL;
+}
+
+static void	_PrintExtraToc(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	ctx = (lclTocEntry*)te->formatData;
+
+    ahprintf(AH, "-- File: %s\n", ctx->filename);
+}
+
+static void	_StartData(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	tctx = (lclTocEntry*)te->formatData;
+    char		fmode[10];
+
+    sprintf(fmode, "wb%d", AH->compression);
+
+#ifdef HAVE_ZLIB
+    tctx->FH = gzopen(tctx->filename, fmode);
+#else
+    tctx->FH = fopen(tctx->filename, PG_BINARY_W);
+#endif
+}
+
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen)
+{
+    lclTocEntry*	tctx = (lclTocEntry*)AH->currToc->formatData;
+
+    GZWRITE((void*)data, 1, dLen, tctx->FH);
+
+    return dLen;
+}
+
+static void	_EndData(ArchiveHandle* AH, TocEntry* te)
+{
+    lclTocEntry*	tctx = (lclTocEntry*) te->formatData;
+
+    /* Close the file */
+    GZCLOSE(tctx->FH);
+    tctx->FH = NULL;
+}
+
+/*
+ * Print data for a given TOC entry
+*/
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt)
+{
+    lclTocEntry*	tctx = (lclTocEntry*) te->formatData;
+    char		buf[4096];
+    int			cnt;
+
+    if (!tctx->filename) 
+	return;
+
+#ifdef HAVE_ZLIB
+    AH->FH = gzopen(tctx->filename,"rb");
+#else
+    AH->FH = fopen(tctx->filename,PG_BINARY_R);
+#endif
+
+    ahprintf(AH, "--\n-- Data for TOC Entry ID %d (OID %s) %s %s\n--\n\n",
+		te->id, te->oid, te->desc, te->name);
+
+    while ( (cnt = GZREAD(buf, 1, 4096, AH->FH)) > 0) {
+	ahwrite(buf, 1, cnt, AH);
+    }
+
+    GZCLOSE(AH->FH);
+
+    ahprintf(AH, "\n\n");
+}
+
+static int	_WriteByte(ArchiveHandle* AH, const int i)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+
+    res = fputc(i, AH->FH);
+    if (res != EOF) {
+	ctx->filePos += 1;
+    }
+    return res;
+}
+
+static int    	_ReadByte(ArchiveHandle* AH)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+
+    res = fgetc(AH->FH);
+    if (res != EOF) {
+	ctx->filePos += 1;
+    }
+    return res;
+}
+
+static int	_WriteBuf(ArchiveHandle* AH, const void* buf, int len)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+    res = fwrite(buf, 1, len, AH->FH);
+    ctx->filePos += res;
+    return res;
+}
+
+static int	_ReadBuf(ArchiveHandle* AH, void* buf, int len)
+{
+    lclContext*		ctx = (lclContext*)AH->formatData;
+    int			res;
+    res = fread(buf, 1, len, AH->FH);
+    ctx->filePos += res;
+    return res;
+}
+
+static void	_CloseArchive(ArchiveHandle* AH)
+{
+    if (AH->mode == archModeWrite) {
+	WriteHead(AH);
+	WriteToc(AH);
+	fclose(AH->FH);
+	WriteDataChunks(AH);
+    }
+
+    AH->FH = NULL; 
+}
+
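Note on data file naming in pg_backup_files.c: _ArchiveEntry names each entry's data file after its TOC id, "<id>.dat" when compression is off and "<id>.dat.gz" when it is on, using sprintf into a fixed 1024-byte buffer. A bounded variant can be sketched as follows. The function name toc_data_filename is hypothetical, and the sketch ignores the HAVE_ZLIB build switch that the patch also consults.

```c
#include <assert.h>
#include <stdio.h>

/* Build the data file name for a TOC entry.
 * Returns 0 on success, -1 if the name does not fit in buf. */
int toc_data_filename(char *buf, size_t cap, int id, int compression)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, cap, compression != 0 ? "%d.dat.gz" : "%d.dat", id);

    /* snprintf reports the length it wanted; reject truncated names. */
    return (n >= 0 && (size_t) n < cap) ? 0 : -1;
}
```

For example, id 42 with compression 0 yields "42.dat", and the same id with compression 6 yields "42.dat.gz". Checking the snprintf result means an undersized buffer is reported to the caller instead of producing a silently truncated file name.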
--- a/src/bin/pg_dump/pg_backup_plain_text.c
+++ b/src/bin/pg_dump/pg_backup_plain_text.c
@ -0,0 +1,115 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_backup_plain_text.c
+ *
+ *	This file is copied from the 'custom' format file, but dumps data
+ *	directly to a text file, and the TOC into the 'main' file.
+ *
+ *	See the headers to pg_restore for more details.
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 01-Jul-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. 
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h> /* for dup */
+#include "pg_backup.h"
+#include "pg_backup_archiver.h"
+
+static void     _ArchiveEntry(ArchiveHandle* AH, TocEntry* te);
+static void	_StartData(ArchiveHandle* AH, TocEntry* te);
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen);
+static void     _EndData(ArchiveHandle* AH, TocEntry* te);
+static int      _WriteByte(ArchiveHandle* AH, const int i);
+static int      _WriteBuf(ArchiveHandle* AH, const void* buf, int len);
+static void     _CloseArchive(ArchiveHandle* AH);
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt);
+
+/*
+ *  Initializer
+ */
+void InitArchiveFmt_PlainText(ArchiveHandle* AH) 
+{
+    /* Assuming static functions, this can be copied for each format. */
+    AH->ArchiveEntryPtr = _ArchiveEntry;
+    AH->StartDataPtr = _StartData;
+    AH->WriteDataPtr = _WriteData;
+    AH->EndDataPtr = _EndData;
+    AH->WriteBytePtr = _WriteByte;
+    AH->WriteBufPtr = _WriteBuf;
+    AH->ClosePtr = _CloseArchive;
+    AH->PrintTocDataPtr = _PrintTocData;
+
+    /*
+     * Now prevent reading...
+     */
+    if (AH->mode == archModeRead)
+	die_horribly("This format can not be read\n");
+
+}
+
+/*
+ * - Start a new TOC entry
+ */
+static void	_ArchiveEntry(ArchiveHandle* AH, TocEntry* te) 
+{
+    /* Don't need to do anything */
+}
+
+static void	_StartData(ArchiveHandle* AH, TocEntry* te)
+{
+    ahprintf(AH, "--\n-- Data for TOC Entry ID %d (OID %s) %s %s\n--\n\n",
+		te->id, te->oid, te->desc, te->name);
+}
+
+static int	_WriteData(ArchiveHandle* AH, const void* data, int dLen)
+{
+    ahwrite(data, 1, dLen, AH);
+    return dLen;
+}
+
+static void	_EndData(ArchiveHandle* AH, TocEntry* te)
+{
+    ahprintf(AH, "\n\n");
+}
+
+/*
+ * Print data for a given TOC entry
+ */
+static void	_PrintTocData(ArchiveHandle* AH, TocEntry* te, RestoreOptions *ropt)
+{
+    if (te->dataDumper)
+	(*te->dataDumper)((Archive*)AH, te->oid, te->dataDumperArg);
+}
+
+static int	_WriteByte(ArchiveHandle* AH, const int i)
+{
+    /* Don't do anything */
+    return 0;
+}
+
+static int	_WriteBuf(ArchiveHandle* AH, const void* buf, int len)
+{
+    /* Don't do anything */
+    return len;
+}
+
+static void	_CloseArchive(ArchiveHandle* AH)
+{
+    /* Nothing to do */
+}
+
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@ -6,7 +6,7 @@
 * Portions Copyright (c) 1996-2000, PostgreSQL, Inc
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $Id: pg_dump.h,v 1.48 2000/04/12 17:16:15 momjian Exp $
+ * $Id: pg_dump.h,v 1.49 2000/07/04 14:25:28 momjian Exp $
 *
 * Modifications - 6/12/96 - dave@bensoft.com - version 1.13.dhb.2
 *
@ -25,6 +25,7 @@

 #include "pqexpbuffer.h"
 #include "catalog/pg_index.h"
+#include "pg_backup.h"

 /* The data structures used to store system catalog information */

@ -64,6 +65,15 @@ typedef struct _funcInfo
 	int			dumped;			/* 1 if already dumped */
 } FuncInfo;

+typedef struct _trigInfo
+{
+	char	   *oid;
+	char	   *tgname;
+	char	   *tgsrc;
+	char	   *tgdel;
+	char	   *tgcomment;
+} TrigInfo;
+
 typedef struct _tableInfo
 {
 	char	   *oid;
@ -94,9 +104,7 @@ typedef struct _tableInfo
 	int			ncheck;			/* # of CHECK expressions */
 	char	  **check_expr;		/* [CONSTRAINT name] CHECK expressions */
 	int			ntrig;			/* # of triggers */
-	char	  **triggers;		/* CREATE TRIGGER ... */
-	char	  **trcomments;		/* COMMENT ON TRIGGER ... */
-	char	  **troids;			/* TRIGGER oids */
+	TrigInfo	*triggers;		/* Triggers on the table */
 	char	   *primary_key;	/* PRIMARY KEY of the table, if any */
 } TableInfo;

@ -162,7 +170,7 @@ typedef struct _oprInfo
 extern bool g_force_quotes;		/* double-quotes for identifiers flag */
 extern bool g_verbose;			/* verbose flag */
 extern int	g_last_builtin_oid; /* value of the last builtin oid */
-extern FILE *g_fout;			/* the script file */
+extern Archive *g_fout;			/* the script file */

 /* placeholders for comment starting and ending delimiters */
 extern char g_comment_start[10];
@ -179,11 +187,14 @@ extern char g_opaque_type[10];	/* name for the opaque type */
 *	common utility functions
 */

-extern TableInfo *dumpSchema(FILE *fout,
+extern TableInfo *dumpSchema(Archive *fout,
 		   int *numTablesPtr,
 		   const char *tablename,
-		   const bool acls);
-extern void dumpSchemaIdx(FILE *fout,
+		   const bool acls,
+		   const bool oids,
+		   const bool schemaOnly,
+		   const bool dataOnly);
+extern void dumpSchemaIdx(Archive *fout,
 			  const char *tablename,
 			  TableInfo *tblinfo,
 			  int numTables);
@ -215,22 +226,23 @@ extern TableInfo *getTables(int *numTables, FuncInfo *finfo, int numFuncs);
 extern InhInfo *getInherits(int *numInherits);
 extern void getTableAttrs(TableInfo *tbinfo, int numTables);
 extern IndInfo *getIndices(int *numIndices);
-extern void dumpDBComment(FILE *outfile);
-extern void dumpTypes(FILE *fout, FuncInfo *finfo, int numFuncs,
+extern void dumpDBComment(Archive *outfile);
+extern void dumpTypes(Archive *fout, FuncInfo *finfo, int numFuncs,
 		  TypeInfo *tinfo, int numTypes);
-extern void dumpProcLangs(FILE *fout, FuncInfo *finfo, int numFuncs,
+extern void dumpProcLangs(Archive *fout, FuncInfo *finfo, int numFuncs,
 			  TypeInfo *tinfo, int numTypes);
-extern void dumpFuncs(FILE *fout, FuncInfo *finfo, int numFuncs,
+extern void dumpFuncs(Archive *fout, FuncInfo *finfo, int numFuncs,
 		  TypeInfo *tinfo, int numTypes);
-extern void dumpAggs(FILE *fout, AggInfo *agginfo, int numAggregates,
+extern void dumpAggs(Archive *fout, AggInfo *agginfo, int numAggregates,
 		 TypeInfo *tinfo, int numTypes);
-extern void dumpOprs(FILE *fout, OprInfo *agginfo, int numOperators,
+extern void dumpOprs(Archive *fout, OprInfo *agginfo, int numOperators,
 		 TypeInfo *tinfo, int numTypes);
-extern void dumpTables(FILE *fout, TableInfo *tbinfo, int numTables,
+extern void dumpTables(Archive *fout, TableInfo *tbinfo, int numTables,
 		   InhInfo *inhinfo, int numInherits,
 		   TypeInfo *tinfo, int numTypes, const char *tablename,
-		   const bool acls);
-extern void dumpIndices(FILE *fout, IndInfo *indinfo, int numIndices,
+		   const bool acls, const bool oids,
+		   const bool schemaOnly, const bool dataOnly);
+extern void dumpIndices(Archive *fout, IndInfo *indinfo, int numIndices,
 			TableInfo *tbinfo, int numTables, const char *tablename);
 extern const char *fmtId(const char *identifier, bool force_quotes);

--- a/src/bin/pg_dump/pg_dumpall.sh
+++ b/src/bin/pg_dump/pg_dumpall.sh
@ -6,7 +6,7 @@
 # and "pg_group" tables, which belong to the whole installation rather
 # than any one individual database.
 #
-# $Header: /cvsroot/pgsql/src/bin/pg_dump/Attic/pg_dumpall.sh,v 1.1 2000/07/03 16:35:39 petere Exp $
+# $Header: /cvsroot/pgsql/src/bin/pg_dump/Attic/pg_dumpall.sh,v 1.2 2000/07/04 14:25:28 momjian Exp $

 CMDNAME=`basename $0`

@ -135,7 +135,7 @@ fi


 PSQL="${PGPATH}/psql $connectopts"
-PGDUMP="${PGPATH}/pg_dump $connectopts $pgdumpextraopts"
+PGDUMP="${PGPATH}/pg_dump $connectopts $pgdumpextraopts -Fp"


 echo "--"
--- a/src/bin/pg_dump/pg_restore.c
+++ b/src/bin/pg_dump/pg_restore.c
@ -0,0 +1,325 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_restore.c
+ *	pg_restore is a utility for extracting postgres database definitions
+ *	from a backup archive created by pg_dump using the archiver 
+ *	interface.
+ *
+ *	pg_restore will read the backup archive and
+ *	dump out a script that reproduces
+ *	the schema of the database in terms of
+ *		  user-defined types
+ *		  user-defined functions
+ *		  tables
+ *		  indices
+ *		  aggregates
+ *		  operators
+ *		  ACL - grant/revoke
+ *
+ * the output script is SQL that is understood by PostgreSQL
+ *
+ * Basic process in a restore operation is:
+ * 
+ * 	Open the Archive and read the TOC.
+ * 	Set flags in TOC entries, and *maybe* reorder them.
+ * 	Generate script to stdout
+ * 	Exit
+ *
+ * Copyright (c) 2000, Philip Warner
+ *      Rights are granted to use this software in any way so long
+ *      as this notice is not removed.
+ *
+ *	The author is not responsible for loss or damages that may
+ *	result from its use.
+ *
+ *
+ * IDENTIFICATION
+ *
+ * Modifications - 28-Jun-2000 - pjw@rhyme.com.au
+ *
+ *	Initial version. Command processing taken from original pg_dump.
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <ctype.h>
+
+
+/*
+#include "postgres.h"
+#include "access/htup.h"
+#include "catalog/pg_type.h"
+#include "catalog/pg_language.h"
+#include "catalog/pg_index.h"
+#include "catalog/pg_trigger.h"
+#include "libpq-fe.h"
+*/
+
+#include "pg_backup.h"
+
+#ifndef HAVE_STRDUP
+#include "strdup.h"
+#endif
+
+#ifdef HAVE_TERMIOS_H
+#include <termios.h>
+#endif
+
+#ifdef HAVE_GETOPT_H 
+#include <getopt.h>
+#else
+#include <unistd.h>
+#endif
+
+/* Forward decls */
+static void usage(const char *progname);
+static char* _cleanupName(char* name);
+
+typedef struct option optType;
+
+#ifdef HAVE_GETOPT_H
+struct option cmdopts[] = {	
+				{ "clean", 0, NULL, 'c' },
+				{ "data-only", 0, NULL, 'a' },
+				{ "file", 1, NULL, 'f' },
+				{ "format", 1, NULL, 'F' },
+				{ "function", 2, NULL, 'p' },
+				{ "index", 2, NULL, 'i'},
+				{ "list", 0, NULL, 'l'},
+				{ "no-acl", 0, NULL, 'x' },
+				{ "oid-order", 0, NULL, 'o'},
+				{ "orig-order", 0, NULL, 'O' },
+				{ "rearrange", 0, NULL, 'r'},
+				{ "schema-only", 0, NULL, 's' },
+				{ "table", 2, NULL, 't'},
+				{ "trigger", 2, NULL, 'T' },
+				{ "use-list", 1, NULL, 'u'},
+				{ "verbose", 0, NULL, 'v' },
+				{ NULL, 0, NULL, 0}
+			    };
+#endif
+
+int main(int argc, char **argv)
+{
+	RestoreOptions	*opts;
+	char		*progname;
+	int		c;
+	Archive*    	AH;
+	char		*fileSpec;
+
+	opts = NewRestoreOptions();
+
+	progname = *argv;
+
+#ifdef HAVE_GETOPT_LONG
+	while ((c = getopt_long(argc, argv, "acf:F:i:loOp:rst:T:u:vx", cmdopts, NULL)) != -1)
+#else
+	while ((c = getopt(argc, argv, "acf:F:i:loOp:rst:T:u:vx")) != -1)
+#endif
+	{
+		switch (c)
+		{
+			case 'a':			/* Dump data only */
+				opts->dataOnly = 1;
+				break;
+			case 'c':			/* clean (i.e., drop) schema prior to
+								 * create */
+				opts->dropSchema = 1;
+				break;
+			case 'f':			/* output file name */
+				opts->filename = strdup(optarg);
+				break;
+			case 'F':
+				if (strlen(optarg) != 0) 
+				    opts->formatName = strdup(optarg);
+				break;
+			case 'o':
+				opts->oidOrder = 1;
+				break;
+			case 'O':
+				opts->origOrder = 1;
+				break;
+			case 'r':
+				opts->rearrange = 1;
+				break;
+
+			case 'p': /* Function */
+				opts->selTypes = 1;
+				opts->selFunction = 1;
+				opts->functionNames = _cleanupName(optarg);
+				break;
+			case 'i': /* Index */
+				opts->selTypes = 1;
+				opts->selIndex = 1;
+				opts->indexNames = _cleanupName(optarg);
+				break;
+			case 'T': /* Trigger */
+				opts->selTypes = 1;
+				opts->selTrigger = 1;
+				opts->triggerNames = _cleanupName(optarg);
+				break;
+			case 's':			/* dump schema only */
+				opts->schemaOnly = 1;
+				break;
+			case 't':			/* Dump data for this table only */
+				opts->selTypes = 1;
+				opts->selTable = 1;
+				opts->tableNames = _cleanupName(optarg);
+				break;
+			case 'l':			/* Dump the TOC summary */
+				opts->tocSummary = 1;
+				break;
+
+			case 'u':			/* input TOC summary file name */
+				opts->tocFile = strdup(optarg);
+				break;
+
+			case 'v':			/* verbose */
+				opts->verbose = 1;
+				break;
+			case 'x':			/* skip ACL dump */
+				opts->aclsSkip = 1;
+				break;
+			default:
+				usage(progname);
+				break;
+		}
+	}
+
+	if (optind < argc) {
+	    fileSpec = argv[optind];
+	} else {
+	    fileSpec = NULL;
+	}
+
+    if (opts->formatName) { 
+
+	switch (opts->formatName[0]) {
+
+	    case 'c':
+	    case 'C':
+		opts->format = archCustom;
+		break;
+
+	    case 'f':
+	    case 'F':
+		opts->format = archFiles;
+		break;
+
+	    default:
+		fprintf(stderr, "%s: Unknown archive format '%s', please specify 'f' or 'c'\n", progname, opts->formatName);
+		exit (1);
+	}
+    }
+
+    AH = OpenArchive(fileSpec, opts->format);
+
+    if (opts->tocFile)
+	SortTocFromFile(AH, opts);
+
+    if (opts->oidOrder)
+	SortTocByOID(AH);
+    else if (opts->origOrder)
+	SortTocByID(AH);
+
+    if (opts->rearrange) {
+	MoveToEnd(AH, "TABLE DATA");
+	MoveToEnd(AH, "INDEX");
+	MoveToEnd(AH, "TRIGGER");
+	MoveToEnd(AH, "RULE");
+	MoveToEnd(AH, "ACL");
+    }
+
+    if (opts->tocSummary) {
+	PrintTOCSummary(AH, opts);
+    } else {
+	RestoreArchive(AH, opts);
+    }
+
+    CloseArchive(AH);
+
+    return 0;
+}
+
+static void usage(const char *progname)
+{
+#ifdef HAVE_GETOPT_LONG
+	fprintf(stderr,
+	"usage:  %s [options] [backup file]\n"
+	    "  -a, --data-only             \t dump out only the data, no schema\n"
+	    "  -c, --clean                 \t clean(drop) schema prior to create\n"
+	    "  -f filename                 \t script output filename\n"
+	    "  -F, --format {c|f}          \t specify backup file format\n"
+	    "  -p, --function[=name]       \t dump functions or named function\n"
+	    "  -i, --index[=name]          \t dump indexes or named index\n"
+	    "  -l, --list                  \t dump summarized TOC for this file\n"
+	    "  -o, --oid-order             \t dump in oid order\n"
+	    "  -O, --orig-order            \t dump in original dump order\n"
+	    "  -r, --rearrange             \t rearrange output to put indexes etc at end\n"
+	    "  -s, --schema-only           \t dump out only the schema, no data\n"
+	    "  -t [table], --table[=table] \t dump for this table only\n"
+	    "  -T, --trigger[=name]        \t dump triggers or named trigger\n"
+	    "  -u, --use-list filename     \t use specified TOC for ordering output from this file\n"
+	    "  -v                          \t verbose\n"
+	    "  -x, --no-acl                \t skip dumping of ACLs (grant/revoke)\n"
+	    , progname);
+#else
+	fprintf(stderr,
+	"usage:  %s [options] [backup file]\n"
+	    "  -a                          \t dump out only the data, no schema\n"
+	    "  -c                          \t clean(drop) schema prior to create\n"
+	    "  -f filename NOT IMPLEMENTED \t script output filename\n"
+	    "  -F           {c|f}          \t specify backup file format\n"
+	    "  -p name                     \t dump functions or named function\n"
+	    "  -i name                     \t dump indexes or named index\n"
+	    "  -l                          \t dump summarized TOC for this file\n"
+	    "  -o                          \t dump in oid order\n"
+	    "  -O                          \t dump in original dump order\n"
+	    "  -r                          \t rearrange output to put indexes etc at end\n"
+	    "  -s                          \t dump out only the schema, no data\n"
+	    "  -t name                     \t dump for this table only\n"
+	    "  -T name                     \t dump triggers or named trigger\n"
+	    "  -u filename                 \t use specified TOC for ordering output from this file\n"
+	    "  -v                          \t verbose\n"
+	    "  -x                          \t skip dumping of ACLs (grant/revoke)\n"
+	    , progname);
+#endif
+	fprintf(stderr,
+			"\nIf [backup file] is not supplied, then standard input "
+			"is used.\n");
+	fprintf(stderr, "\n");
+
+	exit(1);
+}
+
+static char* _cleanupName(char* name)
+{
+    int		i;
+
+    if (!name)
+	return NULL;
+
+    if (strlen(name) == 0)
+	return NULL;
+
+    name = strdup(name);
+
+    if (name[0] == '"')
+    {
+	memmove(name, name + 1, strlen(name));	/* overlapping copy, includes the NUL */
+	if (name[0] != '\0' && name[strlen(name) - 1] == '"')
+	    name[strlen(name) - 1] = '\0';
+    }
+    /* otherwise, convert the name to lowercase... */
+    else
+    {
+	for (i = 0; name[i]; i++)
+	    if (isascii((unsigned char) name[i]) && isupper((unsigned char) name[i]))
+		name[i] = tolower((unsigned char) name[i]);
+    }
+    return name;
+}
+