|
|
|
<REFENTRY ID="SQL-COPY">
|
|
|
|
<REFMETA>
|
|
|
|
<REFENTRYTITLE>
|
|
|
|
COPY
|
|
|
|
</REFENTRYTITLE>
|
|
|
|
<REFMISCINFO>SQL - Language Statements</REFMISCINFO>
|
|
|
|
</REFMETA>
|
|
|
|
<REFNAMEDIV>
|
|
|
|
<REFNAME>
|
|
|
|
COPY
|
|
|
|
</REFNAME>
|
|
|
|
<REFPURPOSE>
|
|
|
|
Copies data between files and tables
|
|
|
|
</REFPURPOSE>
|
|
|
|
<REFSYNOPSISDIV>
|
|
|
|
<REFSYNOPSISDIVINFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSYNOPSISDIVINFO>
|
|
|
|
<SYNOPSIS>
|
|
|
|
COPY [BINARY] <replaceable class="parameter">table</replaceable> [WITH OIDS]
|
|
|
|
TO|FROM '<replaceable class="parameter">filename</replaceable>'|stdin|stdout
|
|
|
|
[USING DELIMITERS '<replaceable class="parameter">delimiter</replaceable>']
|
|
|
|
</SYNOPSIS>
|
|
|
|
|
|
|
|
<REFSECT2 ID="R2-SQL-COPY-1">
|
|
|
|
<REFSECT2INFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSECT2INFO>
|
|
|
|
<TITLE>
|
|
|
|
Inputs
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
</PARA>
|
|
|
|
<VARIABLELIST>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
<VARIABLELIST>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
<ReturnValue><replaceable class="parameter">table</replaceable></ReturnValue>
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
The name of a table.
|
|
|
|
</PARA>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
<ReturnValue><replaceable class="parameter">delimiter</replaceable></ReturnValue>
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
A character that delimits fields.
|
|
|
|
</PARA>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
</variablelist>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
</VARIABLELIST>
|
|
|
|
</REFSECT2>
|
|
|
|
|
|
|
|
<REFSECT2 ID="R2-SQL-COPY-2">
|
|
|
|
<REFSECT2INFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSECT2INFO>
|
|
|
|
<TITLE>
|
|
|
|
Outputs
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
</PARA>
|
|
|
|
<VARIABLELIST>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
Status
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
<VARIABLELIST>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
<ReturnValue>COPY</ReturnValue>
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
The copy completed successfully.
|
|
|
|
</PARA>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
<VARLISTENTRY>
|
|
|
|
<TERM>
|
|
|
|
<ReturnValue>ERROR: <replaceable>error message</replaceable></ReturnValue>
|
|
|
|
</TERM>
|
|
|
|
<LISTITEM>
|
|
|
|
<PARA>
|
|
|
|
The copy failed for the reason stated in the error message.
|
|
|
|
</PARA>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
</variablelist>
|
|
|
|
</LISTITEM>
|
|
|
|
</VARLISTENTRY>
|
|
|
|
</VARIABLELIST>
|
|
|
|
</REFSECT2>
|
|
|
|
</REFSYNOPSISDIV>
|
|
|
|
|
|
|
|
<REFSECT1 ID="R1-SQL-COPY-1">
|
|
|
|
<REFSECT1INFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSECT1INFO>
|
|
|
|
<TITLE>
|
|
|
|
Description
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
<command>COPY</command> moves data between PostgreSQL tables and
|
|
|
|
standard Unix files. The keyword <function>BINARY</function>
|
|
|
|
changes the behavior of field formatting, as described
|
|
|
|
below. <replaceable class="parameter">Table</replaceable> is the
|
|
|
|
name of an existing table. The keyword <function>WITH
|
|
|
|
OIDS</function> copies the internal unique object id (OID) for each
|
|
|
|
row. <replaceable class="parameter">Filename</replaceable> is the
|
|
|
|
absolute Unix pathname of the file. In place of a filename, the
|
|
|
|
keywords <function>stdin</function> and <function>stdout</function>
|
|
|
|
can be used, so that input to <command>COPY</command> can be written
|
|
|
|
by a libpq application and output from <command>COPY</command> can
|
|
|
|
be read by a libpq application.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The <function>BINARY</function> keyword will force all data to be
|
|
|
|
stored/read as binary objects rather than as ASCII text. It is
|
|
|
|
somewhat faster than the normal copy command, but is not
|
|
|
|
generally portable, and the files generated are somewhat larger,
|
|
|
|
although this factor is highly dependent on the data itself. By
|
|
|
|
default, an ASCII copy uses a tab (\t) character as a delimiter.
|
|
|
|
The delimiter may also be changed to any other single character
|
|
|
|
with the keyword <function>USING DELIMITERS</function>. Characters
|
|
|
|
in data fields which happen to match the delimiter character will
|
|
|
|
be quoted.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<REFSECT2 ID="R2-SQL-COPY-3">
|
|
|
|
<REFSECT2INFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSECT2INFO>
|
|
|
|
<TITLE>
|
|
|
|
Notes
|
|
|
|
</TITLE>
|
|
|
|
<para>
|
|
|
|
You must have select access on any table whose values are read by
|
|
|
|
<command>COPY</command>, and either insert or update access to a
|
|
|
|
table into which values are being inserted by <command>COPY</command>.
|
|
|
|
The backend also needs appropriate Unix permissions for any file read
|
|
|
|
or written by <command>COPY</command>.
|
|
|
|
<comment>
|
|
|
|
Is this right? The man page talked of read, write and append access, which
|
|
|
|
is neither SQL nor Unix terminology.
|
|
|
|
</comment>
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The keyword <function>USING DELIMITERS</function> is inaptly
|
|
|
|
named, since only a single character may be specified. (If a
|
|
|
|
group of characters is specified, only the first character is
|
|
|
|
used.)
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
WARNING: do not confuse <command>COPY</command> with the
|
|
|
|
<command>psql</command> instruction <command>\copy</command>.
|
|
|
|
</para>
|
|
|
|
</REFSECT2>
|
|
|
|
</refsect1>
|
|
|
|
<refsect1 ID="R1-SQL-COPY-2">
|
|
|
|
<refsect1info>
|
|
|
|
<date>1998-05-04</date>
|
|
|
|
</refsect1info>
|
|
|
|
<title>Format of output files</title>
|
|
|
|
<refsect2>
|
|
|
|
<refsect2info>
|
|
|
|
<date>1998-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>ASCII copy format</title>
|
|
|
|
<para>
|
|
|
|
When <command>COPY</command> is used without <function>BINARY</function>,
|
|
|
|
the file generated will have each instance on a single line, with each
|
|
|
|
attribute separated by the delimiter character. Embedded
|
|
|
|
delimiter characters will be preceded by a backslash character
|
|
|
|
(\). The attribute values themselves are strings generated by the
|
|
|
|
output function associated with each attribute type. The output
|
|
|
|
function for a type should not try to generate the backslash
|
|
|
|
character; this will be handled by <command>COPY</command> itself.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The actual format for each instance is
|
|
|
|
<programlisting>
|
|
|
|
<attr1><<replaceable class=parameter>separator</replaceable>><attr2><<replaceable class=parameter>separator</replaceable>>...<<replaceable class=parameter>separator</replaceable>><attr<replaceable class="parameter">n</replaceable>><newline></programlisting>
|
|
|
|
The oid is placed on the beginning of the line
|
|
|
|
if <function>WITH OIDS</function> is specified.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
If <command>COPY</command> is sending its output to standard
|
|
|
|
output instead of a file, it will send a backslash(\) and a period
|
|
|
|
(.) followed immediately by a newline, on a separate line,
|
|
|
|
when it is done. Similarly, if <command>COPY</command> is reading
|
|
|
|
from standard input, it will expect a backslash (\) and a period
|
|
|
|
(.) followed by a newline, as the first three characters on a
|
|
|
|
line, to denote end-of-file. However, <command>COPY</command>
|
|
|
|
will terminate (followed by the backend itself) if a true EOF is
|
|
|
|
encountered.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The backslash character has special meaning. NULL attributes are
|
|
|
|
output as \N. A literal backslash character is output as two
|
|
|
|
consecutive backslashes. A literal tab character is represented
|
|
|
|
as a backslash and a tab. A literal newline character is
|
|
|
|
represented as a backslash and a newline. When loading ASCII data
|
|
|
|
not generated by PostgreSQL, you will need to convert backslash
|
|
|
|
characters (\) to double-backslashes (\\) to ensure that they are loaded
|
|
|
|
properly.
|
|
|
|
</para>
|
|
|
|
</refsect2>
|
|
|
|
<refsect2>
|
|
|
|
<refsect2info>
|
|
|
|
<date>1998-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>Binary copy format</title>
|
|
|
|
<para>
|
|
|
|
In the case of <command>COPY BINARY</command>, the first four
|
|
|
|
bytes in the file will be the number of instances in the file. If
|
|
|
|
this number is zero, the <command>COPY BINARY</command> command
|
|
|
|
will read until end of file is encountered. Otherwise, it will
|
|
|
|
stop reading when this number of instances has been read.
|
|
|
|
Remaining data in the file will be ignored.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
The format for each instance in the file is as follows. Note that
|
|
|
|
this format must be followed <emphasis>exactly</emphasis>.
|
|
|
|
Unsigned four-byte integer quantities are called uint32 in the
|
|
|
|
table below.
|
|
|
|
</para>
|
|
|
|
<table frame="all">
|
|
|
|
<title>Contents of a binary copy file</title>
|
|
|
|
<tgroup cols="2"colsep="1" rowsep="1" align="center">
|
|
|
|
<COLSPEC COLNAME="col1">
|
|
|
|
<COLSPEC COLNAME="col2">
|
|
|
|
<spanspec namest="col1" nameend="col2" spanname="subhead">
|
|
|
|
<tbody>
|
|
|
|
<row>
|
|
|
|
<entry align="center" spanname="subhead">At the start of the file</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>uint32</entry>
|
|
|
|
<entry>number of tuples</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry align="center" spanname="subhead">For each tuple</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>uint32</entry>
|
|
|
|
<entry>total length of tuple data</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>uint32</entry>
|
|
|
|
<entry>oid (if specified)</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>uint32</entry>
|
|
|
|
<entry>number of null attributes</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>[uint32</entry>
|
|
|
|
<entry>attribute number of first null attribute, counting from 0</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>...</entry>
|
|
|
|
<entry>...</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>uint32</entry>
|
|
|
|
<entry>attribute number of last null attribute]</entry>
|
|
|
|
</row>
|
|
|
|
<row>
|
|
|
|
<entry>-</entry>
|
|
|
|
<entry><tuple data></entry>
|
|
|
|
</row>
|
|
|
|
</tbody>
|
|
|
|
</tgroup>
|
|
|
|
</table>
|
|
|
|
|
|
|
|
</refsect2>
|
|
|
|
<refsect2>
|
|
|
|
<refsect2info>
|
|
|
|
<date>1998-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>Alignment of binary data</title>
|
|
|
|
<para>
|
|
|
|
On Sun-3s, 2-byte attributes are aligned on two-byte boundaries,
|
|
|
|
and all larger attributes are aligned on four-byte boundaries.
|
|
|
|
Character attributes are aligned on single-byte boundaries. On
|
|
|
|
other machines, all attributes larger than 1 byte are aligned on
|
|
|
|
four-byte boundaries. Note that variable length attributes are
|
|
|
|
preceded by the attribute's length; arrays are simply contiguous
|
|
|
|
streams of the array element type.
|
|
|
|
</para>
|
|
|
|
</refsect2>
|
|
|
|
</refsect1>
|
|
|
|
|
|
|
|
|
|
|
|
<REFSECT1 ID="R1-SQL-COPY-3">
|
|
|
|
<TITLE>
|
|
|
|
Usage
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
To copy a table to standard output, using | as a delimiter
|
|
|
|
</PARA>
|
|
|
|
<ProgramListing>
|
|
|
|
COPY country TO stdout USING DELIMITERS '|';
|
|
|
|
</ProgramListing>
|
|
|
|
<PARA>
|
|
|
|
To copy data from a Unix file into a table:
|
|
|
|
</PARA>
|
|
|
|
<ProgramListing>
|
|
|
|
COPY country FROM '/usr1/proj/bray/sql/country_data';
|
|
|
|
</ProgramListing>
|
|
|
|
<PARA>
|
|
|
|
A sample of data suitable for copying into a table from <filename>stdin</filename> (so it
|
|
|
|
has the termination sequence on the last line):
|
|
|
|
</PARA>
|
|
|
|
<ProgramListing>
|
|
|
|
AF AFGHANISTAN
|
|
|
|
AL ALBANIA
|
|
|
|
DZ ALGERIA
|
|
|
|
...
|
|
|
|
ZM ZAMBIA
|
|
|
|
ZW ZIMBABWE
|
|
|
|
\.
|
|
|
|
</ProgramListing>
|
|
|
|
<PARA>
|
|
|
|
The same data, output in binary format on a Linux Intel machine.
|
|
|
|
The data is shown after filtering through the Unix utility <command>od -c</command>. The table has
|
|
|
|
three fields; the first is <classname>char(2)</classname> and the second is <classname>text</classname>. All the
|
|
|
|
rows have a null value in the third field). Notice how the <classname>char(2)</classname>
|
|
|
|
field is padded with nulls to four bytes and the text field is
|
|
|
|
preceded by its length:
|
|
|
|
</PARA>
|
|
|
|
<ProgramListing>
|
|
|
|
355 \0 \0 \0 027 \0 \0 \0 001 \0 \0 \0 002 \0 \0 \0
|
|
|
|
006 \0 \0 \0 A F \0 \0 017 \0 \0 \0 A F G H
|
|
|
|
A N I S T A N 023 \0 \0 \0 001 \0 \0 \0 002
|
|
|
|
\0 \0 \0 006 \0 \0 \0 A L \0 \0 \v \0 \0 \0 A
|
|
|
|
L B A N I A 023 \0 \0 \0 001 \0 \0 \0 002 \0
|
|
|
|
\0 \0 006 \0 \0 \0 D Z \0 \0 \v \0 \0 \0 A L
|
|
|
|
G E R I A
|
|
|
|
... \n \0 \0 \0 Z A M B I A 024 \0
|
|
|
|
\0 \0 001 \0 \0 \0 002 \0 \0 \0 006 \0 \0 \0 Z W
|
|
|
|
\0 \0 \f \0 \0 \0 Z I M B A B W E
|
|
|
|
</ProgramListing>
|
|
|
|
</refsect1>
|
|
|
|
|
|
|
|
<refsect1 ID="R1-SQL-COPY-4">
|
|
|
|
<title>See also</title>
|
|
|
|
<para>
|
|
|
|
insert(l), create table(l), vacuum(l), libpq.
|
|
|
|
</para>
|
|
|
|
</refsect1>
|
|
|
|
|
|
|
|
<refsect1 ID="R1-SQL-COPY-5">
|
|
|
|
<title>Bugs</title>
|
|
|
|
<para>
|
|
|
|
<command>COPY</command> stops operation at the first error. This
|
|
|
|
should not lead to problems in the event of a copy from, but the
|
|
|
|
target relation will, of course, be partially modified in a copy
|
|
|
|
to. The <command>VACUUM</command> query should be used to clean up
|
|
|
|
after a failed copy.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Because Postgres' current directory is not the same as the user's
|
|
|
|
working directory, the result of copying to a file "foo" (without
|
|
|
|
additional path information) may yield unexpected results for the
|
|
|
|
naive user. In this case, "foo" will wind up in $PGDATA/foo. In
|
|
|
|
general, the full pathname should be used when specifying files to
|
|
|
|
be copied.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
Files used as arguments to the copy command must reside on or be
|
|
|
|
accessible to the database server machine by being either on
|
|
|
|
local disks or on a networked file system.
|
|
|
|
</para>
|
|
|
|
<para>
|
|
|
|
When a TCP/IP connection from one machine to another is used, and a
|
|
|
|
target file is specified, the target file will be written on the
|
|
|
|
machine where the backend is running rather than the user's
|
|
|
|
machine.
|
|
|
|
</para>
|
|
|
|
</refsect1>
|
|
|
|
|
|
|
|
<REFSECT1 ID="R1-SQL-COPY-6">
|
|
|
|
<TITLE>
|
|
|
|
Compatibility
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
</PARA>
|
|
|
|
|
|
|
|
<REFSECT2 ID="R2-SQL-COPY-4">
|
|
|
|
<REFSECT2INFO>
|
|
|
|
<DATE>1998-04-15</DATE>
|
|
|
|
</REFSECT2INFO>
|
|
|
|
<TITLE>
|
|
|
|
SQL92
|
|
|
|
</TITLE>
|
|
|
|
<PARA>
|
|
|
|
There is no COPY statement in SQL92.
|
|
|
|
</PARA>
|
|
|
|
</refsect2>
|
|
|
|
</refsect1>
|
|
|
|
</REFENTRY>
|
|
|
|
|
|
|
|
<!-- Keep this comment at the end of the file
|
|
|
|
Local variables:
|
|
|
|
mode: sgml
|
|
|
|
sgml-omittag:t
|
|
|
|
sgml-shorttag:t
|
|
|
|
sgml-minimize-attributes:nil
|
|
|
|
sgml-always-quote-attributes:t
|
|
|
|
sgml-indent-step:1
|
|
|
|
sgml-indent-data:t
|
|
|
|
sgml-parent-document:nil
|
|
|
|
sgml-default-dtd-file:"../reference.ced"
|
|
|
|
sgml-exposed-tags:nil
|
|
|
|
sgml-local-catalogs:"/usr/lib/sgml/catalog"
|
|
|
|
sgml-local-ecat-files:nil
|
|
|
|
End:
|
|
|
|
-->
|