|
|
|
@ -1,60 +1,60 @@ |
|
|
|
|
<Chapter Id="xaggr"> |
|
|
|
|
<Title>Extending <Acronym>SQL</Acronym>: Aggregates</Title> |
|
|
|
|
<chapter id="xaggr"> |
|
|
|
|
<title>Extending <acronym>SQL</acronym>: Aggregates</title> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
Aggregate functions in <ProductName>Postgres</ProductName> |
|
|
|
|
are expressed as <firstterm>state values</firstterm> |
|
|
|
|
and <firstterm>state transition functions</firstterm>. |
|
|
|
|
That is, an aggregate can be |
|
|
|
|
defined in terms of state that is modified whenever an |
|
|
|
|
input item is processed. To define a new aggregate |
|
|
|
|
function, one selects a datatype for the state value, |
|
|
|
|
an initial value for the state, and a state transition |
|
|
|
|
function. The state transition function is just an |
|
|
|
|
ordinary function that could also be used outside the |
|
|
|
|
context of the aggregate. |
|
|
|
|
</Para> |
|
|
|
|
<para> |
|
|
|
|
Aggregate functions in <productname>Postgres</productname> |
|
|
|
|
are expressed as <firstterm>state values</firstterm> |
|
|
|
|
and <firstterm>state transition functions</firstterm>. |
|
|
|
|
That is, an aggregate can be |
|
|
|
|
defined in terms of state that is modified whenever an |
|
|
|
|
input item is processed. To define a new aggregate |
|
|
|
|
function, one selects a datatype for the state value, |
|
|
|
|
an initial value for the state, and a state transition |
|
|
|
|
function. The state transition function is just an |
|
|
|
|
ordinary function that could also be used outside the |
|
|
|
|
context of the aggregate. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
Actually, in order to make it easier to construct useful |
|
|
|
|
aggregates from existing functions, an aggregate can have |
|
|
|
|
one or two separate state values, one or two transition |
|
|
|
|
functions to update those state values, and a |
|
|
|
|
<firstterm>final function</firstterm> that computes the |
|
|
|
|
actual aggregate result from the ending state values. |
|
|
|
|
</Para> |
|
|
|
|
<para> |
|
|
|
|
Actually, in order to make it easier to construct useful |
|
|
|
|
aggregates from existing functions, an aggregate can have |
|
|
|
|
one or two separate state values, one or two transition |
|
|
|
|
functions to update those state values, and a |
|
|
|
|
<firstterm>final function</firstterm> that computes the |
|
|
|
|
actual aggregate result from the ending state values. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
Thus there can be as many as four datatypes involved: |
|
|
|
|
the type of the input data items, the type of the aggregate's |
|
|
|
|
result, and the types of the two state values. Only the |
|
|
|
|
input and result datatypes are seen by a user of the aggregate. |
|
|
|
|
</Para> |
|
|
|
|
<para> |
|
|
|
|
Thus there can be as many as four datatypes involved: |
|
|
|
|
the type of the input data items, the type of the aggregate's |
|
|
|
|
result, and the types of the two state values. Only the |
|
|
|
|
input and result datatypes are seen by a user of the aggregate. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
Some state transition functions need to look at each successive |
|
|
|
|
input to compute the next state value, while others ignore the |
|
|
|
|
specific input value and simply update their internal state. |
|
|
|
|
(The most useful example of the second kind is a running count |
|
|
|
|
of the number of input items.) The <ProductName>Postgres</ProductName> |
|
|
|
|
aggregate machinery defines <Acronym>sfunc1</Acronym> for |
|
|
|
|
an aggregate as a function that is passed both the old state |
|
|
|
|
value and the current input value, while <Acronym>sfunc2</Acronym> |
|
|
|
|
is a function that is passed only the old state value. |
|
|
|
|
</Para> |
|
|
|
|
<para> |
|
|
|
|
Some state transition functions need to look at each successive |
|
|
|
|
input to compute the next state value, while others ignore the |
|
|
|
|
specific input value and simply update their internal state. |
|
|
|
|
(The most useful example of the second kind is a running count |
|
|
|
|
of the number of input items.) The <productname>Postgres</productname> |
|
|
|
|
aggregate machinery defines <acronym>sfunc1</acronym> for |
|
|
|
|
an aggregate as a function that is passed both the old state |
|
|
|
|
value and the current input value, while <acronym>sfunc2</acronym> |
|
|
|
|
is a function that is passed only the old state value. |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
If we define an aggregate that uses only <Acronym>sfunc1</Acronym>, |
|
|
|
|
we have an aggregate that computes a running function of |
|
|
|
|
the attribute values from each instance. "Sum" is an |
|
|
|
|
example of this kind of aggregate. "Sum" starts at |
|
|
|
|
zero and always adds the current instance's value to |
|
|
|
|
its running total. For example, if we want to make a Sum |
|
|
|
|
aggregate to work on a datatype for complex numbers, |
|
|
|
|
we only need the addition function for that datatype. |
|
|
|
|
The aggregate definition is: |
|
|
|
|
|
|
|
|
|
<ProgramListing> |
|
|
|
|
<para> |
|
|
|
|
If we define an aggregate that uses only <acronym>sfunc1</acronym>, |
|
|
|
|
we have an aggregate that computes a running function of |
|
|
|
|
the attribute values from each instance. "Sum" is an |
|
|
|
|
example of this kind of aggregate. "Sum" starts at |
|
|
|
|
zero and always adds the current instance's value to |
|
|
|
|
its running total. For example, if we want to make a Sum |
|
|
|
|
aggregate to work on a datatype for complex numbers, |
|
|
|
|
we only need the addition function for that datatype. |
|
|
|
|
The aggregate definition is: |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
CREATE AGGREGATE complex_sum ( |
|
|
|
|
sfunc1 = complex_add, |
|
|
|
|
basetype = complex, |
|
|
|
@ -69,27 +69,27 @@ SELECT complex_sum(a) FROM test_complex; |
|
|
|
|
+------------+ |
|
|
|
|
|(34,53.9) | |
|
|
|
|
+------------+ |
|
|
|
|
</ProgramListing> |
|
|
|
|
</programlisting> |
|
|
|
|
|
|
|
|
|
(In practice, we'd just name the aggregate "sum", and rely on |
|
|
|
|
<ProductName>Postgres</ProductName> to figure out which kind |
|
|
|
|
of sum to apply to a complex column.) |
|
|
|
|
</Para> |
|
|
|
|
(In practice, we'd just name the aggregate "sum", and rely on |
|
|
|
|
<productname>Postgres</productname> to figure out which kind |
|
|
|
|
of sum to apply to a complex column.) |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
If we define only <Acronym>sfunc2</Acronym>, we are |
|
|
|
|
specifying an aggregate |
|
|
|
|
that computes a running function that is independent of |
|
|
|
|
the attribute values from each instance. |
|
|
|
|
"Count" is the most common example of this kind of |
|
|
|
|
aggregate. "Count" starts at zero and adds one to its |
|
|
|
|
running total for each instance, ignoring the instance |
|
|
|
|
value. Here, we use the built-in |
|
|
|
|
<Acronym>int4inc</Acronym> routine to do |
|
|
|
|
the work for us. This routine increments (adds one to) |
|
|
|
|
its argument. |
|
|
|
|
|
|
|
|
|
<ProgramListing> |
|
|
|
|
<para> |
|
|
|
|
If we define only <acronym>sfunc2</acronym>, we are |
|
|
|
|
specifying an aggregate |
|
|
|
|
that computes a running function that is independent of |
|
|
|
|
the attribute values from each instance. |
|
|
|
|
"Count" is the most common example of this kind of |
|
|
|
|
aggregate. "Count" starts at zero and adds one to its |
|
|
|
|
running total for each instance, ignoring the instance |
|
|
|
|
value. Here, we use the built-in |
|
|
|
|
<acronym>int4inc</acronym> routine to do |
|
|
|
|
the work for us. This routine increments (adds one to) |
|
|
|
|
its argument. |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
CREATE AGGREGATE my_count ( |
|
|
|
|
sfunc2 = int4inc, -- add one |
|
|
|
|
basetype = int4, |
|
|
|
@ -104,21 +104,22 @@ SELECT my_count(*) as emp_count from EMP; |
|
|
|
|
+----------+ |
|
|
|
|
|5 | |
|
|
|
|
+----------+ |
|
|
|
|
</ProgramListing> |
|
|
|
|
</Para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
"Average" is an example of an aggregate that requires |
|
|
|
|
both a function to compute the running sum and a function |
|
|
|
|
to compute the running count. When all of the |
|
|
|
|
instances have been processed, the final answer for the |
|
|
|
|
aggregate is the running sum divided by the running |
|
|
|
|
count. We use the <Acronym>int4pl</Acronym> and <Acronym>int4inc</Acronym> routines we used |
|
|
|
|
before as well as the <ProductName>Postgres</ProductName> integer division |
|
|
|
|
routine, <Acronym>int4div</Acronym>, to compute the division of the sum by |
|
|
|
|
the count. |
|
|
|
|
|
|
|
|
|
<ProgramListing> |
|
|
|
|
</programlisting> |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<para> |
|
|
|
|
"Average" is an example of an aggregate that requires |
|
|
|
|
both a function to compute the running sum and a function |
|
|
|
|
to compute the running count. When all of the |
|
|
|
|
instances have been processed, the final answer for the |
|
|
|
|
aggregate is the running sum divided by the running |
|
|
|
|
count. We use the <acronym>int4pl</acronym> and |
|
|
|
|
<acronym>int4inc</acronym> routines we used |
|
|
|
|
before as well as the <productname>Postgres</productname> integer division |
|
|
|
|
routine, <acronym>int4div</acronym>, to compute the division of the sum by |
|
|
|
|
the count. |
|
|
|
|
|
|
|
|
|
<programlisting> |
|
|
|
|
CREATE AGGREGATE my_average ( |
|
|
|
|
sfunc1 = int4pl, -- sum |
|
|
|
|
basetype = int4, |
|
|
|
@ -137,12 +138,34 @@ SELECT my_average(salary) as emp_average FROM EMP; |
|
|
|
|
+------------+ |
|
|
|
|
|1640 | |
|
|
|
|
+------------+ |
|
|
|
|
</ProgramListing> |
|
|
|
|
</Para> |
|
|
|
|
</programlisting> |
|
|
|
|
</para> |
|
|
|
|
|
|
|
|
|
<Para> |
|
|
|
|
For further details see |
|
|
|
|
<xref endterm="sql-createaggregate-title" |
|
|
|
|
linkend="sql-createaggregate-title">. |
|
|
|
|
</Para> |
|
|
|
|
</Chapter> |
|
|
|
|
<para> |
|
|
|
|
For further details see |
|
|
|
|
<!-- |
|
|
|
|
Not available in the Programmer's Guide |
|
|
|
|
<xref endterm="sql-createaggregate-title" |
|
|
|
|
linkend="sql-createaggregate-title">. |
|
|
|
|
--> |
|
|
|
|
<command>CREATE AGGREGATE</command> in |
|
|
|
|
<citetitle>The PostgreSQL User's Guide</citetitle>. |
|
|
|
|
</para> |
|
|
|
|
</chapter> |
|
|
|
|
|
|
|
|
|
<!-- Keep this comment at the end of the file |
|
|
|
|
Local variables: |
|
|
|
|
mode:sgml |
|
|
|
|
sgml-omittag:nil |
|
|
|
|
sgml-shorttag:t |
|
|
|
|
sgml-minimize-attributes:nil |
|
|
|
|
sgml-always-quote-attributes:t |
|
|
|
|
sgml-indent-step:1 |
|
|
|
|
sgml-indent-data:t |
|
|
|
|
sgml-parent-document:nil |
|
|
|
|
sgml-default-dtd-file:"./reference.ced" |
|
|
|
|
sgml-exposed-tags:nil |
|
|
|
|
sgml-local-catalogs:("/usr/lib/sgml/catalog") |
|
|
|
|
sgml-local-ecat-files:nil |
|
|
|
|
End: |
|
|
|
|
--> |
|
|
|
|