postgres/src/test/regress/expected/opr_sanity.out

--
-- OPR_SANITY
-- Sanity checks for common errors in making operator/procedure system tables:
-- pg_operator, pg_proc, pg_cast, pg_conversion, pg_aggregate, pg_am,
-- pg_amop, pg_amproc, pg_opclass, pg_opfamily, pg_index.
--
-- Every test failure in this file should be closely inspected.
-- The description of the failing test should be read carefully before
-- adjusting the expected output. In most cases, the queries should
-- not find *any* matching entries.
--
-- NB: we assume the oidjoins test will have caught any dangling links,
-- that is OID or REGPROC fields that are not zero and do not match some
-- row in the linked-to table. However, if we want to enforce that a link
-- field can't be 0, we have to check it here.
--
-- NB: run this test earlier than the create_operator test, because
-- that test creates some bogus operators...
-- **************** pg_proc ****************
-- Look for illegal values in pg_proc fields.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prolang = 0 OR p1.prorettype = 0 OR
p1.pronargs < 0 OR
p1.pronargdefaults < 0 OR
p1.pronargdefaults > p1.pronargs OR
array_lower(p1.proargtypes, 1) != 0 OR
array_upper(p1.proargtypes, 1) != p1.pronargs-1 OR
0::oid = ANY (p1.proargtypes) OR
procost <= 0 OR
CASE WHEN proretset THEN prorows <= 0 ELSE prorows != 0 END OR
prokind NOT IN ('f', 'a', 'w', 'p') OR
provolatile NOT IN ('i', 's', 'v') OR
proparallel NOT IN ('s', 'r', 'u');
oid | proname
-----+---------
(0 rows)
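-- (Illustrative aside, not in the stock test: the one-letter codes checked
-- above stand for prokind f/a/w/p = function/aggregate/window/procedure,
-- provolatile i/s/v = immutable/stable/volatile, and proparallel
-- s/r/u = safe/restricted/unsafe. A quick census:)
SELECT prokind, provolatile, proparallel, count(*)
FROM pg_proc
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;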
-- prosrc should never be null; it can be empty only if prosqlbody isn't null
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prosrc IS NULL;
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE (prosrc = '' OR prosrc = '-') AND prosqlbody IS NULL;
oid | proname
-----+---------
(0 rows)
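-- (Illustrative aside, not in the stock test: the prosqlbody case allowed for
-- above arises with SQL-standard function bodies, available since v14, which
-- store the parsed definition instead of source text.)
CREATE FUNCTION add_one(a integer) RETURNS integer
LANGUAGE SQL
RETURN a + 1;
SELECT prosqlbody IS NOT NULL AS has_sql_body
FROM pg_proc WHERE proname = 'add_one';  -- expect t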
-- proretset should only be set for normal functions
SELECT p1.oid, p1.proname
FROM pg_proc AS p1
WHERE proretset AND prokind != 'f';
oid | proname
-----+---------
(0 rows)
-- currently, no built-in functions should be SECURITY DEFINER;
-- this might change in future, but there will probably never be many.
SELECT p1.oid, p1.proname
FROM pg_proc AS p1
WHERE prosecdef
ORDER BY 1;
oid | proname
-----+---------
(0 rows)
-- pronargdefaults should be 0 iff proargdefaults is null
SELECT p1.oid, p1.proname
FROM pg_proc AS p1
WHERE (pronargdefaults <> 0) != (proargdefaults IS NOT NULL);
oid | proname
-----+---------
(0 rows)
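-- (Illustrative aside, not in the stock test: declaring a parameter default
-- populates both fields together.)
CREATE FUNCTION defaults_demo(a integer, b integer DEFAULT 42) RETURNS integer
LANGUAGE SQL AS 'SELECT $1 + $2';
SELECT pronargdefaults, proargdefaults IS NOT NULL AS has_defaults
FROM pg_proc WHERE proname = 'defaults_demo';  -- expect 1 | t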
-- probin should be non-empty for C functions, null everywhere else
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 AND (probin IS NULL OR probin = '' OR probin = '-');
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang != 13 AND probin IS NOT NULL;
oid | proname
-----+---------
(0 rows)
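-- (Illustrative aside, not in the stock test: the hard-wired language OIDs
-- used above can be confirmed against pg_language; in a stock catalog
-- 12 = internal, 13 = c, 14 = sql.)
SELECT oid, lanname FROM pg_language WHERE oid IN (12, 13, 14);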
-- Look for conflicting proc definitions (same names and input datatypes).
-- (This test should be dead code now that we have the unique index
-- pg_proc_proname_args_nsp_index, but I'll leave it in anyway.)
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.proname = p2.proname AND
p1.pronargs = p2.pronargs AND
p1.proargtypes = p2.proargtypes;
oid | proname | oid | proname
-----+---------+-----+---------
(0 rows)
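-- (Illustrative aside, not in the stock test: the unique index relied on
-- above is visible among pg_proc's indexes.)
SELECT indexrelid::regclass
FROM pg_index
WHERE indrelid = 'pg_proc'::regclass
ORDER BY 1;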
-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields). It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of. (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.prokind != 'a' OR p2.prokind != 'a') AND
(p1.prolang != p2.prolang OR
p1.prokind != p2.prokind OR
p1.prosecdef != p2.prosecdef OR
p1.proleakproof != p2.proleakproof OR
p1.proisstrict != p2.proisstrict OR
p1.proretset != p2.proretset OR
p1.provolatile != p2.provolatile OR
p1.pronargs != p2.pronargs);
oid | proname | oid | proname
-----+---------+-----+---------
(0 rows)
-- Look for uses of different type OIDs in the argument/result type fields
-- for different aliases of the same built-in function.
-- This indicates that the types are being presumed to be binary-equivalent,
-- or that the built-in function is prepared to deal with different types.
-- That's not wrong, necessarily, but we make lists of all the types being
-- so treated. Note that the expected output of this part of the test will
-- need to be modified whenever new pairs of types are made binary-equivalent,
-- or when new polymorphic built-in functions are added!
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function. Likewise, ignore range and multirange constructor
-- functions.
SELECT DISTINCT p1.prorettype::regtype, p2.prorettype::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
p1.prosrc NOT LIKE E'range\\_constructor_' AND
p2.prosrc NOT LIKE E'range\\_constructor_' AND
p1.prosrc NOT LIKE E'multirange\\_constructor_' AND
p2.prosrc NOT LIKE E'multirange\\_constructor_' AND
(p1.prorettype < p2.prorettype)
ORDER BY 1, 2;
prorettype | prorettype
-----------------------------+--------------------------
bigint | xid8
text | character varying
timestamp without time zone | timestamp with time zone
txid_snapshot | pg_snapshot
(4 rows)
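-- (Illustrative aside, not in the stock test: binary-equivalent pairs such as
-- text and character varying are recorded as binary coercions in pg_cast and
-- can be cross-checked there.)
SELECT castsource::regtype, casttarget::regtype
FROM pg_cast
WHERE castmethod = 'b' AND 'text'::regtype IN (castsource, casttarget)
ORDER BY 1, 2;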
SELECT DISTINCT p1.proargtypes[0]::regtype, p2.proargtypes[0]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
p1.prosrc NOT LIKE E'range\\_constructor_' AND
p2.prosrc NOT LIKE E'range\\_constructor_' AND
p1.prosrc NOT LIKE E'multirange\\_constructor_' AND
p2.prosrc NOT LIKE E'multirange\\_constructor_' AND
(p1.proargtypes[0] < p2.proargtypes[0])
ORDER BY 1, 2;
proargtypes | proargtypes
-----------------------------+--------------------------
bigint | xid8
text | character
text | character varying
timestamp without time zone | timestamp with time zone
bit | bit varying
txid_snapshot | pg_snapshot
(6 rows)
SELECT DISTINCT p1.proargtypes[1]::regtype, p2.proargtypes[1]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
p1.prosrc NOT LIKE E'range\\_constructor_' AND
p2.prosrc NOT LIKE E'range\\_constructor_' AND
p1.prosrc NOT LIKE E'multirange\\_constructor_' AND
p2.prosrc NOT LIKE E'multirange\\_constructor_' AND
(p1.proargtypes[1] < p2.proargtypes[1])
ORDER BY 1, 2;
proargtypes | proargtypes
-----------------------------+--------------------------
integer | xid
timestamp without time zone | timestamp with time zone
bit | bit varying
txid_snapshot | pg_snapshot
anyrange | anymultirange
(5 rows)
SELECT DISTINCT p1.proargtypes[2]::regtype, p2.proargtypes[2]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[2] < p2.proargtypes[2])
ORDER BY 1, 2;
proargtypes | proargtypes
-----------------------------+--------------------------
timestamp without time zone | timestamp with time zone
(1 row)
SELECT DISTINCT p1.proargtypes[3]::regtype, p2.proargtypes[3]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[3] < p2.proargtypes[3])
ORDER BY 1, 2;
proargtypes | proargtypes
-----------------------------+--------------------------
timestamp without time zone | timestamp with time zone
(1 row)
SELECT DISTINCT p1.proargtypes[4]::regtype, p2.proargtypes[4]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[4] < p2.proargtypes[4])
ORDER BY 1, 2;
proargtypes | proargtypes
-------------+-------------
(0 rows)
SELECT DISTINCT p1.proargtypes[5]::regtype, p2.proargtypes[5]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[5] < p2.proargtypes[5])
ORDER BY 1, 2;
proargtypes | proargtypes
-------------+-------------
(0 rows)
SELECT DISTINCT p1.proargtypes[6]::regtype, p2.proargtypes[6]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[6] < p2.proargtypes[6])
ORDER BY 1, 2;
proargtypes | proargtypes
-------------+-------------
(0 rows)
SELECT DISTINCT p1.proargtypes[7]::regtype, p2.proargtypes[7]::regtype
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
p1.prokind != 'a' AND p2.prokind != 'a' AND
(p1.proargtypes[7] < p2.proargtypes[7])
ORDER BY 1, 2;
proargtypes | proargtypes
-------------+-------------
(0 rows)
-- Look for functions that return type "internal" and do not have any
-- "internal" argument. Such a function would be a security hole since
-- it might be used to call an internal function from an SQL command.
-- As of 7.3 this query should find only internal_in, which is safe because
-- it always throws an error when called.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype = 'internal'::regtype AND NOT
'internal'::regtype = ANY (p1.proargtypes);
oid | proname
------+-------------
2304 | internal_in
(1 row)
-- Look for functions that return a polymorphic type and do not have any
-- polymorphic argument. Calls of such functions would be unresolvable
-- at parse time. As of 9.6 this query should find only some input functions
-- and GiST support functions associated with these pseudotypes.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype IN
('anyelement'::regtype, 'anyarray'::regtype, 'anynonarray'::regtype,
'anyenum'::regtype)
AND NOT
('anyelement'::regtype = ANY (p1.proargtypes) OR
'anyarray'::regtype = ANY (p1.proargtypes) OR
'anynonarray'::regtype = ANY (p1.proargtypes) OR
'anyenum'::regtype = ANY (p1.proargtypes) OR
'anyrange'::regtype = ANY (p1.proargtypes) OR
'anymultirange'::regtype = ANY (p1.proargtypes))
ORDER BY 2;
oid | proname
------+----------------
2296 | anyarray_in
2502 | anyarray_recv
2312 | anyelement_in
3504 | anyenum_in
2777 | anynonarray_in
750 | array_in
2400 | array_recv
3506 | enum_in
3532 | enum_recv
(9 rows)
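-- (Illustrative aside, not in the stock test: an ordinary polymorphic
-- function resolves its result type from a polymorphic argument, which is why
-- argument-less polymorphic results are flagged above.)
CREATE FUNCTION first_elem(anyarray) RETURNS anyelement
LANGUAGE SQL AS 'SELECT $1[1]';
SELECT pg_typeof(first_elem(ARRAY[1,2,3]));  -- resolves to integer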
-- anyrange and anymultirange are tighter than the rest: they can only be
-- resolved from each other
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype IN ('anyrange'::regtype, 'anymultirange'::regtype)
AND NOT
('anyrange'::regtype = ANY (p1.proargtypes) OR
'anymultirange'::regtype = ANY (p1.proargtypes))
ORDER BY 2;
oid | proname
------+------------------
4229 | anymultirange_in
3832 | anyrange_in
4231 | multirange_in
4233 | multirange_recv
3876 | range_gist_union
3834 | range_in
3836 | range_recv
(7 rows)
-- similarly for the anycompatible family
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype IN
('anycompatible'::regtype, 'anycompatiblearray'::regtype,
'anycompatiblenonarray'::regtype)
AND NOT
('anycompatible'::regtype = ANY (p1.proargtypes) OR
'anycompatiblearray'::regtype = ANY (p1.proargtypes) OR
'anycompatiblenonarray'::regtype = ANY (p1.proargtypes) OR
'anycompatiblerange'::regtype = ANY (p1.proargtypes))
ORDER BY 2;
oid | proname
------+--------------------------
5086 | anycompatible_in
5088 | anycompatiblearray_in
5090 | anycompatiblearray_recv
5092 | anycompatiblenonarray_in
(4 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype = 'anycompatiblerange'::regtype
AND NOT
'anycompatiblerange'::regtype = ANY (p1.proargtypes)
ORDER BY 2;
oid | proname
------+-----------------------
5094 | anycompatiblerange_in
(1 row)
-- Look for functions that accept cstring and are neither datatype input
-- functions nor encoding conversion functions. It's almost never a good
-- idea to use cstring input for a function meant to be called from SQL;
-- text should be used instead, because cstring lacks suitable casts.
-- As of 9.6 this query should find only cstring_out and cstring_send.
-- However, we must manually exclude shell_in, which might or might not be
-- rejected by the EXISTS clause depending on whether there are currently
-- any shell types.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE 'cstring'::regtype = ANY (p1.proargtypes)
AND NOT EXISTS(SELECT 1 FROM pg_type WHERE typinput = p1.oid)
AND NOT EXISTS(SELECT 1 FROM pg_conversion WHERE conproc = p1.oid)
AND p1.oid != 'shell_in(cstring)'::regprocedure
ORDER BY 1;
oid | proname
------+--------------
2293 | cstring_out
2501 | cstring_send
(2 rows)
-- Likewise, look for functions that return cstring and aren't datatype output
-- functions nor typmod output functions.
-- As of 9.6 this query should find only cstring_in and cstring_recv.
-- However, we must manually exclude shell_out.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE p1.prorettype = 'cstring'::regtype
AND NOT EXISTS(SELECT 1 FROM pg_type WHERE typoutput = p1.oid)
AND NOT EXISTS(SELECT 1 FROM pg_type WHERE typmodout = p1.oid)
AND p1.oid != 'shell_out(void)'::regprocedure
ORDER BY 1;
oid | proname
------+--------------
2292 | cstring_in
2500 | cstring_recv
(2 rows)
-- Check for length inconsistencies between the various argument-info arrays.
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proallargtypes IS NOT NULL AND
array_length(proallargtypes,1) < array_length(proargtypes,1);
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proargmodes IS NOT NULL AND
array_length(proargmodes,1) < array_length(proargtypes,1);
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proargnames IS NOT NULL AND
array_length(proargnames,1) < array_length(proargtypes,1);
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proallargtypes IS NOT NULL AND proargmodes IS NOT NULL AND
array_length(proallargtypes,1) <> array_length(proargmodes,1);
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proallargtypes IS NOT NULL AND proargnames IS NOT NULL AND
array_length(proallargtypes,1) <> array_length(proargnames,1);
oid | proname
-----+---------
(0 rows)
SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE proargmodes IS NOT NULL AND proargnames IS NOT NULL AND
array_length(proargmodes,1) <> array_length(proargnames,1);
oid | proname
-----+---------
(0 rows)
-- Check that proallargtypes matches proargtypes
SELECT p1.oid, p1.proname, p1.proargtypes, p1.proallargtypes, p1.proargmodes
FROM pg_proc as p1
WHERE proallargtypes IS NOT NULL AND
ARRAY(SELECT unnest(proargtypes)) <>
ARRAY(SELECT proallargtypes[i]
FROM generate_series(1, array_length(proallargtypes, 1)) g(i)
WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v'));
oid | proname | proargtypes | proallargtypes | proargmodes
-----+---------+-------------+----------------+-------------
(0 rows)
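-- (Illustrative aside, not in the stock test: for a function with OUT
-- parameters, proargtypes lists only the inputs while proallargtypes covers
-- every slot, with proargmodes marking each one.)
CREATE FUNCTION modes_demo(IN a integer, OUT b integer, OUT c text)
LANGUAGE SQL AS 'SELECT $1, $1::text';
SELECT proargtypes::regtype[], proallargtypes::regtype[], proargmodes
FROM pg_proc WHERE proname = 'modes_demo';
-- expect {integer} | {integer,integer,text} | {i,o,o}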
-- Check for type of the variadic array parameter's elements.
-- provariadic should be ANYOID if the type of the last element is ANYOID,
-- ANYELEMENTOID if the type of the last element is ANYARRAYOID,
-- ANYCOMPATIBLEOID if the type of the last element is ANYCOMPATIBLEARRAYOID,
-- and otherwise the element type corresponding to the array type.
SELECT oid::regprocedure, provariadic::regtype, proargtypes::regtype[]
FROM pg_proc
WHERE provariadic != 0
AND case proargtypes[array_length(proargtypes, 1)-1]
WHEN '"any"'::regtype THEN '"any"'::regtype
WHEN 'anyarray'::regtype THEN 'anyelement'::regtype
WHEN 'anycompatiblearray'::regtype THEN 'anycompatible'::regtype
ELSE (SELECT t.oid
FROM pg_type t
WHERE t.typarray = proargtypes[array_length(proargtypes, 1)-1])
END != provariadic;
oid | provariadic | proargtypes
-----+-------------+-------------
(0 rows)
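-- (Illustrative aside, not in the stock test: two built-in instances of the
-- convention checked above; format's VARIADIC "any" yields provariadic =
-- "any", while jsonb_extract_path's VARIADIC text[] yields provariadic =
-- text.)
SELECT oid::regprocedure, provariadic::regtype
FROM pg_proc
WHERE proname IN ('format', 'jsonb_extract_path') AND provariadic != 0
ORDER BY 1;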
-- Check that all and only those functions with a variadic type have
-- a variadic argument.
SELECT oid::regprocedure, proargmodes, provariadic
FROM pg_proc
WHERE (proargmodes IS NOT NULL AND 'v' = any(proargmodes))
IS DISTINCT FROM
(provariadic != 0);
oid | proargmodes | provariadic
-----+-------------+-------------
(0 rows)
-- Check for prosupport functions with the wrong signature
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p2.oid = p1.prosupport AND
(p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1
OR p2.proargtypes[0] != 'internal'::regtype);
oid | proname | oid | proname
-----+---------+-----+---------
(0 rows)
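-- (Illustrative aside, not in the stock test: the built-in functions with a
-- prosupport link all point at support functions of the internal -> internal
-- shape enforced above.)
SELECT p.oid::regprocedure AS func, p.prosupport
FROM pg_proc p
WHERE p.prosupport != 0
ORDER BY 1
LIMIT 3;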
-- Insist that all built-in pg_proc entries have descriptions
SELECT p1.oid, p1.proname
FROM pg_proc as p1 LEFT JOIN pg_description as d
ON p1.tableoid = d.classoid and p1.oid = d.objoid and d.objsubid = 0
WHERE d.classoid IS NULL AND p1.oid <= 9999;
oid | proname
-----+---------
(0 rows)
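-- (Illustrative aside, not in the stock test: an individual description can
-- be fetched with obj_description.)
SELECT obj_description('texteq(text,text)'::regprocedure, 'pg_proc');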
-- List of built-in leakproof functions
--
-- Leakproof functions should only be added after carefully
-- scrutinizing all possibly executed codepaths for possible
-- information leaks. Don't add functions here unless you know what a
-- leakproof function is. If unsure, don't mark it as such.
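-- (Illustrative aside, not in the stock test: leakproofness is declared with
-- CREATE FUNCTION ... LEAKPROOF or added afterwards; only a superuser may do
-- so.)
CREATE FUNCTION leak_demo(integer, integer) RETURNS boolean
LANGUAGE SQL IMMUTABLE AS 'SELECT $1 = $2';
ALTER FUNCTION leak_demo(integer, integer) LEAKPROOF;
SELECT proleakproof FROM pg_proc WHERE proname = 'leak_demo';  -- expect t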
-- temporarily disable fancy output, so catalog changes create less diff noise
\a\t
SELECT p1.oid::regprocedure
FROM pg_proc p1 JOIN pg_namespace pn
ON pronamespace = pn.oid
WHERE nspname = 'pg_catalog' AND proleakproof
ORDER BY 1;
boollt(boolean,boolean)
boolgt(boolean,boolean)
booleq(boolean,boolean)
chareq("char","char")
nameeq(name,name)
int2eq(smallint,smallint)
int2lt(smallint,smallint)
int4eq(integer,integer)
int4lt(integer,integer)
texteq(text,text)
xideq(xid,xid)
cideq(cid,cid)
charne("char","char")
charle("char","char")
chargt("char","char")
charge("char","char")
boolne(boolean,boolean)
int4ne(integer,integer)
int2ne(smallint,smallint)
int2gt(smallint,smallint)
int4gt(integer,integer)
int2le(smallint,smallint)
int4le(integer,integer)
int4ge(integer,integer)
int2ge(smallint,smallint)
textne(text,text)
int24eq(smallint,integer)
int42eq(integer,smallint)
int24lt(smallint,integer)
int42lt(integer,smallint)
int24gt(smallint,integer)
int42gt(integer,smallint)
int24ne(smallint,integer)
int42ne(integer,smallint)
int24le(smallint,integer)
int42le(integer,smallint)
int24ge(smallint,integer)
int42ge(integer,smallint)
oideq(oid,oid)
oidne(oid,oid)
float8(smallint)
float4(smallint)
nameeqtext(name,text)
namelttext(name,text)
nameletext(name,text)
namegetext(name,text)
namegttext(name,text)
namenetext(name,text)
btnametextcmp(name,text)
texteqname(text,name)
textltname(text,name)
textlename(text,name)
textgename(text,name)
textgtname(text,name)
textnename(text,name)
bttextnamecmp(text,name)
float4eq(real,real)
float4ne(real,real)
float4lt(real,real)
float4le(real,real)
float4gt(real,real)
float4ge(real,real)
float8eq(double precision,double precision)
float8ne(double precision,double precision)
float8lt(double precision,double precision)
float8le(double precision,double precision)
float8gt(double precision,double precision)
float8ge(double precision,double precision)
float48eq(real,double precision)
float48ne(real,double precision)
float48lt(real,double precision)
float48le(real,double precision)
float48gt(real,double precision)
float48ge(real,double precision)
float84eq(double precision,real)
float84ne(double precision,real)
float84lt(double precision,real)
float84le(double precision,real)
float84gt(double precision,real)
float84ge(double precision,real)
float8(real)
int4(smallint)
float8(integer)
float4(integer)
btint2cmp(smallint,smallint)
btint4cmp(integer,integer)
btfloat4cmp(real,real)
btfloat8cmp(double precision,double precision)
btoidcmp(oid,oid)
btcharcmp("char","char")
btnamecmp(name,name)
bttextcmp(text,text)
cash_cmp(money,money)
btoidvectorcmp(oidvector,oidvector)
text(name)
name(text)
name(character)
text_larger(text,text)
text_smaller(text,text)
int8eq(bigint,bigint)
int8ne(bigint,bigint)
int8lt(bigint,bigint)
int8gt(bigint,bigint)
int8le(bigint,bigint)
int8ge(bigint,bigint)
int84eq(bigint,integer)
int84ne(bigint,integer)
int84lt(bigint,integer)
int84gt(bigint,integer)
int84le(bigint,integer)
int84ge(bigint,integer)
int8(integer)
float8(bigint)
oidvectorne(oidvector,oidvector)
float4(bigint)
namelt(name,name)
namele(name,name)
namegt(name,name)
namege(name,name)
namene(name,name)
oidvectorlt(oidvector,oidvector)
oidvectorle(oidvector,oidvector)
oidvectoreq(oidvector,oidvector)
oidvectorge(oidvector,oidvector)
oidvectorgt(oidvector,oidvector)
oidlt(oid,oid)
oidle(oid,oid)
text_lt(text,text)
text_le(text,text)
text_gt(text,text)
text_ge(text,text)
int8(smallint)
macaddr_eq(macaddr,macaddr)
macaddr_lt(macaddr,macaddr)
macaddr_le(macaddr,macaddr)
macaddr_gt(macaddr,macaddr)
macaddr_ge(macaddr,macaddr)
macaddr_ne(macaddr,macaddr)
macaddr_cmp(macaddr,macaddr)
btint8cmp(bigint,bigint)
int48eq(integer,bigint)
int48ne(integer,bigint)
int48lt(integer,bigint)
int48gt(integer,bigint)
int48le(integer,bigint)
int48ge(integer,bigint)
cash_eq(money,money)
cash_ne(money,money)
cash_lt(money,money)
cash_le(money,money)
cash_gt(money,money)
cash_ge(money,money)
network_eq(inet,inet)
network_lt(inet,inet)
network_le(inet,inet)
network_gt(inet,inet)
network_ge(inet,inet)
network_ne(inet,inet)
network_cmp(inet,inet)
lseg_eq(lseg,lseg)
bpchareq(character,character)
bpcharlt(character,character)
bpcharle(character,character)
bpchargt(character,character)
bpcharge(character,character)
bpcharne(character,character)
bpchar_larger(character,character)
bpchar_smaller(character,character)
bpcharcmp(character,character)
date_eq(date,date)
date_lt(date,date)
date_le(date,date)
date_gt(date,date)
date_ge(date,date)
date_ne(date,date)
date_cmp(date,date)
time_lt(time without time zone,time without time zone)
time_le(time without time zone,time without time zone)
time_gt(time without time zone,time without time zone)
time_ge(time without time zone,time without time zone)
time_ne(time without time zone,time without time zone)
time_cmp(time without time zone,time without time zone)
time_eq(time without time zone,time without time zone)
timestamptz_eq(timestamp with time zone,timestamp with time zone)
timestamptz_ne(timestamp with time zone,timestamp with time zone)
timestamptz_lt(timestamp with time zone,timestamp with time zone)
timestamptz_le(timestamp with time zone,timestamp with time zone)
timestamptz_ge(timestamp with time zone,timestamp with time zone)
timestamptz_gt(timestamp with time zone,timestamp with time zone)
interval_eq(interval,interval)
interval_ne(interval,interval)
interval_lt(interval,interval)
interval_le(interval,interval)
interval_ge(interval,interval)
interval_gt(interval,interval)
charlt("char","char")
tidne(tid,tid)
int8(oid)
tideq(tid,tid)
timestamptz_cmp(timestamp with time zone,timestamp with time zone)
interval_cmp(interval,interval)
xideqint4(xid,integer)
timetz_eq(time with time zone,time with time zone)
timetz_ne(time with time zone,time with time zone)
timetz_lt(time with time zone,time with time zone)
timetz_le(time with time zone,time with time zone)
timetz_ge(time with time zone,time with time zone)
timetz_gt(time with time zone,time with time zone)
timetz_cmp(time with time zone,time with time zone)
"interval"(time without time zone)
name(character varying)
"varchar"(name)
circle_eq(circle,circle)
circle_ne(circle,circle)
circle_lt(circle,circle)
circle_gt(circle,circle)
circle_le(circle,circle)
circle_ge(circle,circle)
lseg_ne(lseg,lseg)
lseg_lt(lseg,lseg)
lseg_le(lseg,lseg)
lseg_gt(lseg,lseg)
lseg_ge(lseg,lseg)
biteq(bit,bit)
bitne(bit,bit)
bitge(bit,bit)
bitgt(bit,bit)
bitle(bit,bit)
bitlt(bit,bit)
bitcmp(bit,bit)
oidgt(oid,oid)
oidge(oid,oid)
varbiteq(bit varying,bit varying)
varbitne(bit varying,bit varying)
varbitge(bit varying,bit varying)
varbitgt(bit varying,bit varying)
varbitle(bit varying,bit varying)
varbitlt(bit varying,bit varying)
varbitcmp(bit varying,bit varying)
boolle(boolean,boolean)
boolge(boolean,boolean)
btboolcmp(boolean,boolean)
"numeric"(integer)
"numeric"(real)
"numeric"(double precision)
"numeric"(bigint)
"numeric"(smallint)
int28eq(smallint,bigint)
int28ne(smallint,bigint)
int28lt(smallint,bigint)
int28gt(smallint,bigint)
int28le(smallint,bigint)
int28ge(smallint,bigint)
int82eq(bigint,smallint)
int82ne(bigint,smallint)
int82lt(bigint,smallint)
int82gt(bigint,smallint)
int82le(bigint,smallint)
int82ge(bigint,smallint)
byteaeq(bytea,bytea)
bytealt(bytea,bytea)
byteale(bytea,bytea)
byteagt(bytea,bytea)
byteage(bytea,bytea)
byteane(bytea,bytea)
byteacmp(bytea,bytea)
timestamp_cmp(timestamp without time zone,timestamp without time zone)
timestamp_eq(timestamp without time zone,timestamp without time zone)
timestamp_ne(timestamp without time zone,timestamp without time zone)
timestamp_lt(timestamp without time zone,timestamp without time zone)
timestamp_le(timestamp without time zone,timestamp without time zone)
timestamp_ge(timestamp without time zone,timestamp without time zone)
timestamp_gt(timestamp without time zone,timestamp without time zone)
text_pattern_lt(text,text)
text_pattern_le(text,text)
text_pattern_ge(text,text)
text_pattern_gt(text,text)
bttext_pattern_cmp(text,text)
bpchar_pattern_lt(character,character)
bpchar_pattern_le(character,character)
bpchar_pattern_ge(character,character)
bpchar_pattern_gt(character,character)
btbpchar_pattern_cmp(character,character)
btint48cmp(integer,bigint)
btint84cmp(bigint,integer)
btint24cmp(smallint,integer)
btint42cmp(integer,smallint)
btint28cmp(smallint,bigint)
btint82cmp(bigint,smallint)
btfloat48cmp(real,double precision)
btfloat84cmp(double precision,real)
md5(text)
md5(bytea)
bool(integer)
int4(boolean)
tidgt(tid,tid)
tidlt(tid,tid)
tidge(tid,tid)
tidle(tid,tid)
bttidcmp(tid,tid)
uuid_lt(uuid,uuid)
uuid_le(uuid,uuid)
uuid_eq(uuid,uuid)
uuid_ge(uuid,uuid)
uuid_gt(uuid,uuid)
uuid_ne(uuid,uuid)
uuid_cmp(uuid,uuid)
pg_lsn_lt(pg_lsn,pg_lsn)
pg_lsn_le(pg_lsn,pg_lsn)
pg_lsn_eq(pg_lsn,pg_lsn)
pg_lsn_ge(pg_lsn,pg_lsn)
pg_lsn_gt(pg_lsn,pg_lsn)
pg_lsn_ne(pg_lsn,pg_lsn)
pg_lsn_cmp(pg_lsn,pg_lsn)
xidneq(xid,xid)
xidneqint4(xid,integer)
sha224(bytea)
sha256(bytea)
sha384(bytea)
sha512(bytea)
gen_random_uuid()
starts_with(text,text)
macaddr8_eq(macaddr8,macaddr8)
macaddr8_lt(macaddr8,macaddr8)
macaddr8_le(macaddr8,macaddr8)
macaddr8_gt(macaddr8,macaddr8)
macaddr8_ge(macaddr8,macaddr8)
macaddr8_ne(macaddr8,macaddr8)
macaddr8_cmp(macaddr8,macaddr8)
macaddr8(macaddr)
xid8lt(xid8,xid8)
xid8gt(xid8,xid8)
xid8le(xid8,xid8)
xid8ge(xid8,xid8)
xid8eq(xid8,xid8)
xid8ne(xid8,xid8)
xid8cmp(xid8,xid8)
-- restore normal output mode
\a\t
-- List of functions used by libpq's fe-lobj.c
--
-- If the output of this query changes, you probably broke libpq.
-- lo_initialize() assumes that there will be at most one match for
-- each listed name.
select proname, oid from pg_catalog.pg_proc
where proname in (
'lo_open',
'lo_close',
'lo_creat',
'lo_create',
'lo_unlink',
'lo_lseek',
'lo_lseek64',
'lo_tell',
'lo_tell64',
'lo_truncate',
'lo_truncate64',
'loread',
'lowrite')
and pronamespace = (select oid from pg_catalog.pg_namespace
where nspname = 'pg_catalog')
order by 1;
proname | oid
---------------+------
lo_close | 953
lo_creat | 957
lo_create | 715
lo_lseek | 956
lo_lseek64 | 3170
lo_open | 952
lo_tell | 958
lo_tell64 | 3171
lo_truncate | 1004
lo_truncate64 | 3172
lo_unlink | 964
loread | 954
lowrite | 955
(13 rows)
-- Check that all immutable functions are marked parallel safe
SELECT p1.oid, p1.proname
FROM pg_proc AS p1
WHERE provolatile = 'i' AND proparallel = 'u';
oid | proname
-----+---------
(0 rows)
-- **************** pg_cast ****************
-- Catch bogus values in pg_cast columns (other than cases detected by
-- oidjoins test).
SELECT *
FROM pg_cast c
WHERE castsource = 0 OR casttarget = 0 OR castcontext NOT IN ('e', 'a', 'i')
OR castmethod NOT IN ('f', 'b' ,'i');
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
-- Check that castfunc is nonzero only for cast methods that need a function,
-- and zero otherwise
SELECT *
FROM pg_cast c
WHERE (castmethod = 'f' AND castfunc = 0)
OR (castmethod IN ('b', 'i') AND castfunc <> 0);
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
-- Look for casts to/from the same type that aren't length coercion functions.
-- (We assume they are length coercions if they take multiple arguments.)
-- Such entries are not necessarily harmful, but they are useless.
SELECT *
FROM pg_cast c
WHERE castsource = casttarget AND castfunc = 0;
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
SELECT c.*
FROM pg_cast c, pg_proc p
WHERE c.castfunc = p.oid AND p.pronargs < 2 AND castsource = casttarget;
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
-- Look for cast functions that don't have the right signature. The
-- argument and result types in pg_proc must be the same as, or binary
-- compatible with, what it says in pg_cast.
-- As a special case, we allow casts from CHAR(n) that use functions
-- declared to take TEXT. This does not pass the binary-coercibility test
-- because CHAR(n)-to-TEXT normally invokes rtrim(). However, the results
-- are the same, so long as the function is one that ignores trailing blanks.
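-- (As a side illustration, not part of the check itself: the rtrim() effect
-- means that
--   SELECT 'ab '::character(4)::text;   -- yields 'ab'
-- i.e. trailing blanks are dropped, so a TEXT-taking function is acceptable
-- here only if trailing blanks cannot affect its result.)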
SELECT c.*
FROM pg_cast c, pg_proc p
WHERE c.castfunc = p.oid AND
(p.pronargs < 1 OR p.pronargs > 3
OR NOT (binary_coercible(c.castsource, p.proargtypes[0])
OR (c.castsource = 'character'::regtype AND
p.proargtypes[0] = 'text'::regtype))
OR NOT binary_coercible(p.prorettype, c.casttarget));
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
SELECT c.*
FROM pg_cast c, pg_proc p
WHERE c.castfunc = p.oid AND
((p.pronargs > 1 AND p.proargtypes[1] != 'int4'::regtype) OR
(p.pronargs > 2 AND p.proargtypes[2] != 'bool'::regtype));
oid | castsource | casttarget | castfunc | castcontext | castmethod
-----+------------+------------+----------+-------------+------------
(0 rows)
-- Look for binary compatible casts that do not have the reverse
-- direction registered as well, or where the reverse direction is not
-- also binary compatible. This is legal, but usually not intended.
-- As of 7.4, this finds the casts from text and varchar to bpchar, because
-- those are binary-compatible while the reverse way goes through rtrim().
-- As of 8.2, this finds the cast from cidr to inet, because that is a
-- trivial binary coercion while the other way goes through inet_to_cidr().
-- As of 8.3, this finds the casts from xml to text, varchar, and bpchar,
-- because those are binary-compatible while the reverse goes through
-- texttoxml(), which does an XML syntax check.
-- As of 9.1, this finds the cast from pg_node_tree to text, which we
-- intentionally do not provide a reverse pathway for.
SELECT castsource::regtype, casttarget::regtype, castfunc, castcontext
FROM pg_cast c
WHERE c.castmethod = 'b' AND
NOT EXISTS (SELECT 1 FROM pg_cast k
WHERE k.castmethod = 'b' AND
k.castsource = c.casttarget AND
k.casttarget = c.castsource);
castsource | casttarget | castfunc | castcontext
-------------------+-------------------+----------+-------------
text | character | 0 | i
character varying | character | 0 | i
pg_node_tree | text | 0 | i
pg_ndistinct | bytea | 0 | i
pg_dependencies | bytea | 0 | i
pg_mcv_list | bytea | 0 | i
cidr | inet | 0 | i
xml | text | 0 | a
xml | character varying | 0 | a
xml | character | 0 | a
(10 rows)
-- **************** pg_conversion ****************
-- Look for illegal values in pg_conversion fields.
SELECT c.oid, c.conname
FROM pg_conversion as c
WHERE c.conproc = 0 OR
pg_encoding_to_char(conforencoding) = '' OR
pg_encoding_to_char(contoencoding) = '';
oid | conname
-----+---------
(0 rows)
-- Look for conprocs that don't have the expected signature.
SELECT p.oid, p.proname, c.oid, c.conname
FROM pg_proc p, pg_conversion c
WHERE p.oid = c.conproc AND
(p.prorettype != 'int4'::regtype OR p.proretset OR
p.pronargs != 6 OR
p.proargtypes[0] != 'int4'::regtype OR
p.proargtypes[1] != 'int4'::regtype OR
p.proargtypes[2] != 'cstring'::regtype OR
p.proargtypes[3] != 'internal'::regtype OR
p.proargtypes[4] != 'int4'::regtype OR
p.proargtypes[5] != 'bool'::regtype);
oid | proname | oid | conname
-----+---------+-----+---------
(0 rows)
-- Check for conprocs that don't perform the specific conversion that
-- pg_conversion alleges they do, by trying to invoke each conversion
-- on some simple ASCII data. (The conproc should throw an error if
-- it doesn't accept the encodings that are passed to it.)
-- Unfortunately, we can't test non-default conprocs this way, because
-- there is no way to ask convert() to invoke them, and we cannot call
-- them directly from SQL. But there are no non-default built-in
-- conversions anyway.
-- (Similarly, this doesn't cope with any search path issues.)
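-- (Illustration only, assuming the standard built-in conversions: a direct
-- call such as
--   SELECT convert('ABC'::bytea, 'LATIN1', 'UTF8');   -- expected \x414243
-- exercises the same default-conversion machinery; plain ASCII bytes pass
-- through every built-in conversion unchanged, which is what the query
-- below relies on.)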
SELECT c.oid, c.conname
FROM pg_conversion as c
WHERE condefault AND
convert('ABC'::bytea, pg_encoding_to_char(conforencoding),
pg_encoding_to_char(contoencoding)) != 'ABC';
oid | conname
-----+---------
(0 rows)
-- **************** pg_operator ****************
-- Look for illegal values in pg_operator fields.
SELECT o1.oid, o1.oprname
FROM pg_operator as o1
WHERE (o1.oprkind != 'b' AND o1.oprkind != 'l') OR
o1.oprresult = 0 OR o1.oprcode = 0;
oid | oprname
-----+---------
(0 rows)
-- Look for missing or unwanted operand types
SELECT o1.oid, o1.oprname
FROM pg_operator as o1
WHERE (o1.oprleft = 0 and o1.oprkind != 'l') OR
(o1.oprleft != 0 and o1.oprkind = 'l') OR
o1.oprright = 0;
oid | oprname
-----+---------
(0 rows)
-- Look for conflicting operator definitions (same names and input datatypes).
SELECT o1.oid, o1.oprcode, o2.oid, o2.oprcode
FROM pg_operator AS o1, pg_operator AS o2
WHERE o1.oid != o2.oid AND
o1.oprname = o2.oprname AND
o1.oprkind = o2.oprkind AND
o1.oprleft = o2.oprleft AND
o1.oprright = o2.oprright;
oid | oprcode | oid | oprcode
-----+---------+-----+---------
(0 rows)
-- Look for commutative operators that don't commute.
-- DEFINITIONAL NOTE: If A.oprcom = B, then x A y has the same result as y B x.
-- We expect that B will always say that B.oprcom = A as well; that's not
-- inherently essential, but it would be inefficient not to mark it so.
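-- (For instance, 2 < 3 and 3 > 2 yield the same answer, so '<' and '>' name
-- each other in oprcom; a non-cross-type operator such as int4 '=' simply
-- lists itself as its own commutator.)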
SELECT o1.oid, o1.oprcode, o2.oid, o2.oprcode
FROM pg_operator AS o1, pg_operator AS o2
WHERE o1.oprcom = o2.oid AND
(o1.oprkind != 'b' OR
o1.oprleft != o2.oprright OR
o1.oprright != o2.oprleft OR
o1.oprresult != o2.oprresult OR
o1.oid != o2.oprcom);
oid | oprcode | oid | oprcode
-----+---------+-----+---------
(0 rows)
-- Look for negatory operators that don't agree.
-- DEFINITIONAL NOTE: If A.oprnegate = B, then both A and B must yield
-- boolean results, and (x A y) == ! (x B y), or the equivalent for
-- single-operand operators.
-- We expect that B will always say that B.oprnegate = A as well; that's not
-- inherently essential, but it would be inefficient not to mark it so.
-- Also, A and B had better not be the same operator.
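-- (For instance, for non-null x and y, (x = y) is exactly NOT (x <> y), so
-- '=' and '<>' reference each other via oprnegate.)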
SELECT o1.oid, o1.oprcode, o2.oid, o2.oprcode
FROM pg_operator AS o1, pg_operator AS o2
WHERE o1.oprnegate = o2.oid AND
(o1.oprkind != o2.oprkind OR
o1.oprleft != o2.oprleft OR
o1.oprright != o2.oprright OR
o1.oprresult != 'bool'::regtype OR
o2.oprresult != 'bool'::regtype OR
o1.oid != o2.oprnegate OR
o1.oid = o2.oid);
oid | oprcode | oid | oprcode
-----+---------+-----+---------
(0 rows)
-- Make a list of the names of operators that are claimed to be commutator
-- pairs. This list will grow over time, but before accepting a new entry
-- make sure you didn't link the wrong operators.
SELECT DISTINCT o1.oprname AS op1, o2.oprname AS op2
FROM pg_operator o1, pg_operator o2
WHERE o1.oprcom = o2.oid AND o1.oprname <= o2.oprname
ORDER BY 1, 2;
op1 | op2
------+------
# | #
& | &
&& | &&
* | *
*< | *>
*<= | *>=
*<> | *<>
*= | *=
+ | +
-|- | -|-
< | >
<-> | <->
<< | >>
<<= | >>=
<= | >=
<> | <>
<@ | @>
= | =
?# | ?#
?- | ?-
?-| | ?-|
?| | ?|
?|| | ?||
@@ | @@
@@@ | @@@
| | |
~<=~ | ~>=~
~<~ | ~>~
~= | ~=
(29 rows)
-- Likewise for negator pairs.
SELECT DISTINCT o1.oprname AS op1, o2.oprname AS op2
FROM pg_operator o1, pg_operator o2
WHERE o1.oprnegate = o2.oid AND o1.oprname <= o2.oprname
ORDER BY 1, 2;
op1 | op2
------+------
!~ | ~
!~* | ~*
!~~ | ~~
!~~* | ~~*
*< | *>=
*<= | *>
*<> | *=
< | >=
<= | >
<> | =
<> | ~=
~<=~ | ~>~
~<~ | ~>=~
(13 rows)
-- A mergejoinable or hashjoinable operator must be binary, must return
-- boolean, and must have a commutator (itself, unless it's a cross-type
-- operator).
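-- (Concretely, the cross-type operator int4 = int8 can be marked merge- and
-- hashjoinable only because its commutator int8 = int4 exists; a same-type
-- equality operator is its own commutator.)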
SELECT o1.oid, o1.oprname FROM pg_operator AS o1
WHERE (o1.oprcanmerge OR o1.oprcanhash) AND NOT
(o1.oprkind = 'b' AND o1.oprresult = 'bool'::regtype AND o1.oprcom != 0);
oid | oprname
-----+---------
(0 rows)
-- What's more, the commutator had better be mergejoinable/hashjoinable too.
SELECT o1.oid, o1.oprname, o2.oid, o2.oprname
FROM pg_operator AS o1, pg_operator AS o2
WHERE o1.oprcom = o2.oid AND
(o1.oprcanmerge != o2.oprcanmerge OR
o1.oprcanhash != o2.oprcanhash);
oid | oprname | oid | oprname
-----+---------+-----+---------
(0 rows)
-- Mergejoinable operators should appear as equality members of btree index
-- opfamilies.
SELECT o1.oid, o1.oprname
FROM pg_operator AS o1
WHERE o1.oprcanmerge AND NOT EXISTS
(SELECT 1 FROM pg_amop
WHERE amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree') AND
amopopr = o1.oid AND amopstrategy = 3);
oid | oprname
-----+---------
(0 rows)
-- And the converse.
SELECT o1.oid, o1.oprname, p.amopfamily
FROM pg_operator AS o1, pg_amop p
WHERE amopopr = o1.oid
AND amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree')
AND amopstrategy = 3
AND NOT o1.oprcanmerge;
oid | oprname | amopfamily
-----+---------+------------
(0 rows)
-- Hashable operators should appear as members of hash index opfamilies.
SELECT o1.oid, o1.oprname
FROM pg_operator AS o1
WHERE o1.oprcanhash AND NOT EXISTS
(SELECT 1 FROM pg_amop
WHERE amopmethod = (SELECT oid FROM pg_am WHERE amname = 'hash') AND
amopopr = o1.oid AND amopstrategy = 1);
oid | oprname
-----+---------
(0 rows)
-- And the converse.
SELECT o1.oid, o1.oprname, p.amopfamily
FROM pg_operator AS o1, pg_amop p
WHERE amopopr = o1.oid
AND amopmethod = (SELECT oid FROM pg_am WHERE amname = 'hash')
AND NOT o1.oprcanhash;
oid | oprname | amopfamily
-----+---------+------------
(0 rows)
-- Check that each operator defined in pg_operator matches its oprcode entry
-- in pg_proc. Easiest to do this separately for each oprkind.
SELECT o1.oid, o1.oprname, p1.oid, p1.proname
FROM pg_operator AS o1, pg_proc AS p1
WHERE o1.oprcode = p1.oid AND
o1.oprkind = 'b' AND
(p1.pronargs != 2
OR NOT binary_coercible(p1.prorettype, o1.oprresult)
OR NOT binary_coercible(o1.oprleft, p1.proargtypes[0])
OR NOT binary_coercible(o1.oprright, p1.proargtypes[1]));
oid | oprname | oid | proname
-----+---------+-----+---------
(0 rows)
SELECT o1.oid, o1.oprname, p1.oid, p1.proname
FROM pg_operator AS o1, pg_proc AS p1
WHERE o1.oprcode = p1.oid AND
o1.oprkind = 'l' AND
(p1.pronargs != 1
OR NOT binary_coercible(p1.prorettype, o1.oprresult)
OR NOT binary_coercible(o1.oprright, p1.proargtypes[0])
OR o1.oprleft != 0);
oid | oprname | oid | proname
-----+---------+-----+---------
(0 rows)
-- If the operator is mergejoinable or hashjoinable, its underlying function
-- should not be volatile.
SELECT o1.oid, o1.oprname, p1.oid, p1.proname
FROM pg_operator AS o1, pg_proc AS p1
WHERE o1.oprcode = p1.oid AND
(o1.oprcanmerge OR o1.oprcanhash) AND
p1.provolatile = 'v';
oid | oprname | oid | proname
-----+---------+-----+---------
(0 rows)
-- If oprrest is set, the operator must return boolean,
-- and it must link to a proc with the right signature
-- to be a restriction selectivity estimator.
-- The proc signature we want is: float8 proc(internal, oid, internal, int4)
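-- (eqsel, the restriction estimator behind '=', is the canonical example:
--   eqsel(internal, oid, internal, integer) RETURNS float8 )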
SELECT o1.oid, o1.oprname, p2.oid, p2.proname
FROM pg_operator AS o1, pg_proc AS p2
WHERE o1.oprrest = p2.oid AND
(o1.oprresult != 'bool'::regtype OR
p2.prorettype != 'float8'::regtype OR p2.proretset OR
p2.pronargs != 4 OR
p2.proargtypes[0] != 'internal'::regtype OR
p2.proargtypes[1] != 'oid'::regtype OR
p2.proargtypes[2] != 'internal'::regtype OR
p2.proargtypes[3] != 'int4'::regtype);
oid | oprname | oid | proname
-----+---------+-----+---------
(0 rows)
-- If oprjoin is set, the operator must be a binary boolean op,
-- and it must link to a proc with the right signature
-- to be a join selectivity estimator.
-- The proc signature we want is: float8 proc(internal, oid, internal, int2, internal)
-- (Note: the old signature with only 4 args is still allowed, but no core
-- estimator should be using it.)
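-- (eqjoinsel, the join estimator behind '=', is the canonical example:
--   eqjoinsel(internal, oid, internal, smallint, internal) RETURNS float8 )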
SELECT o1.oid, o1.oprname, p2.oid, p2.proname
FROM pg_operator AS o1, pg_proc AS p2
WHERE o1.oprjoin = p2.oid AND
(o1.oprkind != 'b' OR o1.oprresult != 'bool'::regtype OR
p2.prorettype != 'float8'::regtype OR p2.proretset OR
p2.pronargs != 5 OR
p2.proargtypes[0] != 'internal'::regtype OR
p2.proargtypes[1] != 'oid'::regtype OR
p2.proargtypes[2] != 'internal'::regtype OR
p2.proargtypes[3] != 'int2'::regtype OR
p2.proargtypes[4] != 'internal'::regtype);
oid | oprname | oid | proname
-----+---------+-----+---------
(0 rows)
-- Insist that all built-in pg_operator entries have descriptions
SELECT o1.oid, o1.oprname
FROM pg_operator as o1 LEFT JOIN pg_description as d
ON o1.tableoid = d.classoid and o1.oid = d.objoid and d.objsubid = 0
WHERE d.classoid IS NULL AND o1.oid <= 9999;
oid | oprname
-----+---------
(0 rows)
-- Check that operators' underlying functions have suitable comments,
-- namely 'implementation of XXX operator'. (Note: it's not necessary to
-- put such comments into pg_proc.dat; initdb will generate them as needed.)
-- In some cases involving legacy names for operators, there are multiple
-- operators referencing the same pg_proc entry, so ignore operators whose
-- comments say they are deprecated.
-- We also have a few functions that are both operator support and meant to
-- be called directly; those should have comments matching their operator.
WITH funcdescs AS (
SELECT p.oid as p_oid, proname, o.oid as o_oid,
pd.description as prodesc,
'implementation of ' || oprname || ' operator' as expecteddesc,
od.description as oprdesc
FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid
LEFT JOIN pg_description pd ON
(pd.objoid = p.oid and pd.classoid = p.tableoid and pd.objsubid = 0)
LEFT JOIN pg_description od ON
(od.objoid = o.oid and od.classoid = o.tableoid and od.objsubid = 0)
WHERE o.oid <= 9999
)
SELECT * FROM funcdescs
WHERE prodesc IS DISTINCT FROM expecteddesc
AND oprdesc NOT LIKE 'deprecated%'
AND prodesc IS DISTINCT FROM oprdesc;
p_oid | proname | o_oid | prodesc | expecteddesc | oprdesc
-------+---------+-------+---------+--------------+---------
(0 rows)
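-- As an illustrative sketch (not run by this test), the convention enforced
-- above is satisfied by a comment like the following, where int4pl is the
-- pg_proc entry behind the int4 + operator:
--     COMMENT ON FUNCTION int4pl(int4, int4)
--         IS 'implementation of + operator';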
-- Show all the operator-implementation functions that have their own
-- comments. This should happen only in cases where the function and
-- operator syntaxes are both documented at the user level.
-- This should be a pretty short list; it's mostly legacy cases.
WITH funcdescs AS (
SELECT p.oid as p_oid, proname, o.oid as o_oid,
pd.description as prodesc,
'implementation of ' || oprname || ' operator' as expecteddesc,
od.description as oprdesc
FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid
LEFT JOIN pg_description pd ON
(pd.objoid = p.oid and pd.classoid = p.tableoid and pd.objsubid = 0)
LEFT JOIN pg_description od ON
(od.objoid = o.oid and od.classoid = o.tableoid and od.objsubid = 0)
WHERE o.oid <= 9999
)
SELECT p_oid, proname, prodesc FROM funcdescs
WHERE prodesc IS DISTINCT FROM expecteddesc
AND oprdesc NOT LIKE 'deprecated%'
ORDER BY 1;
p_oid | proname | prodesc
-------+-------------------------+-------------------------------------------------
378 | array_append | append element onto end of array
379 | array_prepend | prepend element onto front of array
1035 | aclinsert | add/update ACL item
1036 | aclremove | remove ACL item
1037 | aclcontains | contains
3217 | jsonb_extract_path | get value from jsonb with path elements
3940 | jsonb_extract_path_text | get value from jsonb as text with path elements
3951 | json_extract_path | get value from json with path elements
3953 | json_extract_path_text | get value from json as text with path elements
(9 rows)
-- Operators that are commutator pairs should have identical volatility
-- and leakproofness markings on their implementation functions.
SELECT o1.oid, o1.oprcode, o2.oid, o2.oprcode
FROM pg_operator AS o1, pg_operator AS o2, pg_proc AS p1, pg_proc AS p2
WHERE o1.oprcom = o2.oid AND p1.oid = o1.oprcode AND p2.oid = o2.oprcode AND
(p1.provolatile != p2.provolatile OR
p1.proleakproof != p2.proleakproof);
oid | oprcode | oid | oprcode
-----+---------+-----+---------
(0 rows)
-- Likewise for negator pairs.
SELECT o1.oid, o1.oprcode, o2.oid, o2.oprcode
FROM pg_operator AS o1, pg_operator AS o2, pg_proc AS p1, pg_proc AS p2
WHERE o1.oprnegate = o2.oid AND p1.oid = o1.oprcode AND p2.oid = o2.oprcode AND
(p1.provolatile != p2.provolatile OR
p1.proleakproof != p2.proleakproof);
oid | oprcode | oid | oprcode
-----+---------+-----+---------
(0 rows)
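-- To illustrate the rule (hypothetical names, not run by this test), an
-- operator declared with commutator and negator links, e.g.
--     CREATE OPERATOR === (FUNCTION = my_int4_eq, LEFTARG = int4,
--                          RIGHTARG = int4, COMMUTATOR = ===, NEGATOR = !==);
-- must use implementation functions (here my_int4_eq and whatever implements
-- !==) that carry identical volatility and LEAKPROOF markings, or the two
-- checks above would complain.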
-- Btree comparison operators' functions should have the same volatility
-- and leakproofness markings as the associated comparison support function.
SELECT pp.oid::regprocedure as proc, pp.provolatile as vp, pp.proleakproof as lp,
po.oid::regprocedure as opr, po.provolatile as vo, po.proleakproof as lo
FROM pg_proc pp, pg_proc po, pg_operator o, pg_amproc ap, pg_amop ao
WHERE pp.oid = ap.amproc AND po.oid = o.oprcode AND o.oid = ao.amopopr AND
ao.amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree') AND
ao.amopfamily = ap.amprocfamily AND
ao.amoplefttype = ap.amproclefttype AND
ao.amoprighttype = ap.amprocrighttype AND
ap.amprocnum = 1 AND
(pp.provolatile != po.provolatile OR
pp.proleakproof != po.proleakproof)
ORDER BY 1;
proc | vp | lp | opr | vo | lo
------+----+----+-----+----+----
(0 rows)
-- **************** pg_aggregate ****************
-- Look for illegal values in pg_aggregate fields.
SELECT ctid, aggfnoid::oid
FROM pg_aggregate as a
WHERE aggfnoid = 0 OR aggtransfn = 0 OR
aggkind NOT IN ('n', 'o', 'h') OR
aggnumdirectargs < 0 OR
(aggkind = 'n' AND aggnumdirectargs > 0) OR
aggfinalmodify NOT IN ('r', 's', 'w') OR
aggmfinalmodify NOT IN ('r', 's', 'w') OR
aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
ctid | aggfnoid
------+----------
(0 rows)
-- Make sure the matching pg_proc entry is sensible, too.
SELECT a.aggfnoid::oid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggfnoid = p.oid AND
(p.prokind != 'a' OR p.proretset OR p.pronargs < a.aggnumdirectargs);
aggfnoid | proname
----------+---------
(0 rows)
-- Make sure there are no prokind = PROKIND_AGGREGATE pg_proc entries without matches.
SELECT oid, proname
FROM pg_proc as p
WHERE p.prokind = 'a' AND
NOT EXISTS (SELECT 1 FROM pg_aggregate a WHERE a.aggfnoid = p.oid);
oid | proname
-----+---------
(0 rows)
-- If there is no finalfn then the output type must be the transtype.
SELECT a.aggfnoid::oid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggfnoid = p.oid AND
a.aggfinalfn = 0 AND p.prorettype != a.aggtranstype;
aggfnoid | proname
----------+---------
(0 rows)
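-- For illustration (hypothetical aggregate, not run by this test): an
-- aggregate declared without a FINALFUNC, such as
--     CREATE AGGREGATE my_sum (int8)
--         (SFUNC = int8pl, STYPE = int8, INITCOND = '0');
-- must deliver its transition type (here int8) as its result type, which is
-- exactly what the query above verifies.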
-- Cross-check transfn against its entry in pg_proc.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr
WHERE a.aggfnoid = p.oid AND
a.aggtransfn = ptr.oid AND
(ptr.proretset
OR NOT (ptr.pronargs =
CASE WHEN a.aggkind = 'n' THEN p.pronargs + 1
ELSE greatest(p.pronargs - a.aggnumdirectargs, 1) + 1 END)
OR NOT binary_coercible(ptr.prorettype, a.aggtranstype)
OR NOT binary_coercible(a.aggtranstype, ptr.proargtypes[0])
OR (p.pronargs > 0 AND
NOT binary_coercible(p.proargtypes[0], ptr.proargtypes[1]))
OR (p.pronargs > 1 AND
NOT binary_coercible(p.proargtypes[1], ptr.proargtypes[2]))
OR (p.pronargs > 2 AND
NOT binary_coercible(p.proargtypes[2], ptr.proargtypes[3]))
OR (p.pronargs > 3 AND
NOT binary_coercible(p.proargtypes[3], ptr.proargtypes[4]))
-- we could carry the check further, but 4 args is enough for now
OR (p.pronargs > 4)
);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
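-- In words, the pattern verified above: for agg(a, b, ...) with transition
-- type S, the transition function must look like transfn(S, a, b, ...)
-- RETURNS S.  A built-in instance: sum(int4) has aggtranstype int8 and uses
--     int4_sum(int8, int4) RETURNS int8
-- as its transition function.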
-- Cross-check finalfn (if present) against its entry in pg_proc.
SELECT a.aggfnoid::oid, p.proname, pfn.oid, pfn.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS pfn
WHERE a.aggfnoid = p.oid AND
a.aggfinalfn = pfn.oid AND
(pfn.proretset OR
NOT binary_coercible(pfn.prorettype, p.prorettype) OR
NOT binary_coercible(a.aggtranstype, pfn.proargtypes[0]) OR
CASE WHEN a.aggfinalextra THEN pfn.pronargs != p.pronargs + 1
ELSE pfn.pronargs != a.aggnumdirectargs + 1 END
OR (pfn.pronargs > 1 AND
NOT binary_coercible(p.proargtypes[0], pfn.proargtypes[1]))
OR (pfn.pronargs > 2 AND
NOT binary_coercible(p.proargtypes[1], pfn.proargtypes[2]))
OR (pfn.pronargs > 3 AND
NOT binary_coercible(p.proargtypes[2], pfn.proargtypes[3]))
-- we could carry the check further, but 4 args is enough for now
OR (pfn.pronargs > 4)
);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
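-- A built-in instance of the pattern verified above: avg(int4) accumulates
-- into an int8[] state and finishes with
--     int8_avg(int8[]) RETURNS numeric
-- matching both the binary-coercibility requirement on proargtypes[0] and
-- the aggregate's declared result type.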
-- If transfn is strict then either initval should be non-NULL, or
-- input type should match transtype so that the first non-null input
-- can be assigned as the state value.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr
WHERE a.aggfnoid = p.oid AND
a.aggtransfn = ptr.oid AND ptr.proisstrict AND
a.agginitval IS NULL AND
NOT binary_coercible(p.proargtypes[0], a.aggtranstype);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
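-- max(int4) illustrates the rule above: its transition function
-- int4larger(int4, int4) is strict and its agginitval is NULL, which is
-- legal only because the input type matches the transition type, letting the
-- first non-null input seed the state.  A hypothetical equivalent
-- declaration (not run by this test):
--     CREATE AGGREGATE my_max (int4)
--         (SFUNC = int4larger, STYPE = int4, SORTOP = >);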
-- Check for inconsistent specifications of moving-aggregate columns.
SELECT ctid, aggfnoid::oid
FROM pg_aggregate as a
WHERE aggmtranstype != 0 AND
(aggmtransfn = 0 OR aggminvtransfn = 0);
ctid | aggfnoid
------+----------
(0 rows)
SELECT ctid, aggfnoid::oid
FROM pg_aggregate as a
WHERE aggmtranstype = 0 AND
(aggmtransfn != 0 OR aggminvtransfn != 0 OR aggmfinalfn != 0 OR
aggmtransspace != 0 OR aggminitval IS NOT NULL);
ctid | aggfnoid
------+----------
(0 rows)
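-- A hypothetical moving-aggregate declaration (not run by this test) showing
-- why the forward and inverse transition functions travel together:
--     CREATE AGGREGATE my_msum (int4) (
--         SFUNC = int4pl, STYPE = int4,
--         MSFUNC = int4pl, MINVFUNC = int4mi, MSTYPE = int4
--     );
-- Catalog entries that supply only one of the m-transition functions are
-- exactly the inconsistency the two queries above reject.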
-- If there is no mfinalfn then the output type must be the mtranstype.
SELECT a.aggfnoid::oid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggfnoid = p.oid AND
a.aggmtransfn != 0 AND
a.aggmfinalfn = 0 AND p.prorettype != a.aggmtranstype;
aggfnoid | proname
----------+---------
(0 rows)
-- Cross-check mtransfn (if present) against its entry in pg_proc.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr
WHERE a.aggfnoid = p.oid AND
a.aggmtransfn = ptr.oid AND
(ptr.proretset
OR NOT (ptr.pronargs =
CASE WHEN a.aggkind = 'n' THEN p.pronargs + 1
ELSE greatest(p.pronargs - a.aggnumdirectargs, 1) + 1 END)
OR NOT binary_coercible(ptr.prorettype, a.aggmtranstype)
OR NOT binary_coercible(a.aggmtranstype, ptr.proargtypes[0])
OR (p.pronargs > 0 AND
NOT binary_coercible(p.proargtypes[0], ptr.proargtypes[1]))
OR (p.pronargs > 1 AND
NOT binary_coercible(p.proargtypes[1], ptr.proargtypes[2]))
OR (p.pronargs > 2 AND
NOT binary_coercible(p.proargtypes[2], ptr.proargtypes[3]))
-- we could carry the check further, but 3 args is enough for now
OR (p.pronargs > 3)
);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
-- Cross-check minvtransfn (if present) against its entry in pg_proc.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr
WHERE a.aggfnoid = p.oid AND
a.aggminvtransfn = ptr.oid AND
(ptr.proretset
OR NOT (ptr.pronargs =
CASE WHEN a.aggkind = 'n' THEN p.pronargs + 1
ELSE greatest(p.pronargs - a.aggnumdirectargs, 1) + 1 END)
OR NOT binary_coercible(ptr.prorettype, a.aggmtranstype)
OR NOT binary_coercible(a.aggmtranstype, ptr.proargtypes[0])
OR (p.pronargs > 0 AND
NOT binary_coercible(p.proargtypes[0], ptr.proargtypes[1]))
OR (p.pronargs > 1 AND
NOT binary_coercible(p.proargtypes[1], ptr.proargtypes[2]))
OR (p.pronargs > 2 AND
NOT binary_coercible(p.proargtypes[2], ptr.proargtypes[3]))
-- we could carry the check further, but 3 args is enough for now
OR (p.pronargs > 3)
);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
-- Cross-check mfinalfn (if present) against its entry in pg_proc.
SELECT a.aggfnoid::oid, p.proname, pfn.oid, pfn.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS pfn
WHERE a.aggfnoid = p.oid AND
a.aggmfinalfn = pfn.oid AND
(pfn.proretset OR
NOT binary_coercible(pfn.prorettype, p.prorettype) OR
NOT binary_coercible(a.aggmtranstype, pfn.proargtypes[0]) OR
CASE WHEN a.aggmfinalextra THEN pfn.pronargs != p.pronargs + 1
ELSE pfn.pronargs != a.aggnumdirectargs + 1 END
OR (pfn.pronargs > 1 AND
NOT binary_coercible(p.proargtypes[0], pfn.proargtypes[1]))
OR (pfn.pronargs > 2 AND
NOT binary_coercible(p.proargtypes[1], pfn.proargtypes[2]))
OR (pfn.pronargs > 3 AND
NOT binary_coercible(p.proargtypes[2], pfn.proargtypes[3]))
-- we could carry the check further, but 4 args is enough for now
OR (pfn.pronargs > 4)
);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
-- If mtransfn is strict then either minitval should be non-NULL, or
-- input type should match mtranstype so that the first non-null input
-- can be assigned as the state value.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr
WHERE a.aggfnoid = p.oid AND
a.aggmtransfn = ptr.oid AND ptr.proisstrict AND
a.aggminitval IS NULL AND
NOT binary_coercible(p.proargtypes[0], a.aggmtranstype);
aggfnoid | proname | oid | proname
----------+---------+-----+---------
(0 rows)
-- mtransfn and minvtransfn should have same strictness setting.
SELECT a.aggfnoid::oid, p.proname, ptr.oid, ptr.proname, iptr.oid, iptr.proname
FROM pg_aggregate AS a, pg_proc AS p, pg_proc AS ptr, pg_proc AS iptr
WHERE a.aggfnoid = p.oid AND
a.aggmtransfn = ptr.oid AND
a.aggminvtransfn = iptr.oid AND
ptr.proisstrict != iptr.proisstrict;
aggfnoid | proname | oid | proname | oid | proname
----------+---------+-----+---------+-----+---------
(0 rows)
-- Check that all combine functions have signature
-- combine(transtype, transtype) returns transtype
SELECT a.aggfnoid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggcombinefn = p.oid AND
(p.pronargs != 2 OR
p.prorettype != p.proargtypes[0] OR
p.prorettype != p.proargtypes[1] OR
NOT binary_coercible(a.aggtranstype, p.proargtypes[0]));
aggfnoid | proname
----------+---------
(0 rows)
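-- A built-in instance of that shape: sum(int4) carries an int8 transition
-- state and combines partial states with int8pl(int8, int8) RETURNS int8.
-- A hypothetical parallel-safe equivalent (not run by this test):
--     CREATE AGGREGATE my_sum4 (int4) (
--         SFUNC = int4_sum, STYPE = int8,
--         COMBINEFUNC = int8pl, PARALLEL = SAFE
--     );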
-- Check that no combine function for an INTERNAL transtype is strict.
SELECT a.aggfnoid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggcombinefn = p.oid AND
a.aggtranstype = 'internal'::regtype AND p.proisstrict;
aggfnoid | proname
----------+---------
(0 rows)
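-- (Rationale: with an INTERNAL transition state the executor cannot
-- substitute a copy of the non-null argument for a null one, so the combine
-- function must accept and handle null states itself rather than rely on
-- strictness.)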
-- serialize/deserialize functions should be specified only for aggregates
-- with transtype internal and a combine function, and we should have both
-- or neither of them.
SELECT aggfnoid, aggtranstype, aggserialfn, aggdeserialfn
FROM pg_aggregate
WHERE (aggserialfn != 0 OR aggdeserialfn != 0)
AND (aggtranstype != 'internal'::regtype OR aggcombinefn = 0 OR
aggserialfn = 0 OR aggdeserialfn = 0);
aggfnoid | aggtranstype | aggserialfn | aggdeserialfn
----------+--------------+-------------+---------------
(0 rows)
-- Check that all serialization functions have signature
-- serialize(internal) returns bytea
-- Also insist that they be strict; it's wasteful to run them on NULLs.
SELECT a.aggfnoid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggserialfn = p.oid AND
(p.prorettype != 'bytea'::regtype OR p.pronargs != 1 OR
p.proargtypes[0] != 'internal'::regtype OR
NOT p.proisstrict);
aggfnoid | proname
----------+---------
(0 rows)
-- Check that all deserialization functions have signature
-- deserialize(bytea, internal) returns internal
-- Also insist that they be strict; it's wasteful to run them on NULLs.
SELECT a.aggfnoid, p.proname
FROM pg_aggregate as a, pg_proc as p
WHERE a.aggdeserialfn = p.oid AND
(p.prorettype != 'internal'::regtype OR p.pronargs != 2 OR
p.proargtypes[0] != 'bytea'::regtype OR
p.proargtypes[1] != 'internal'::regtype OR
NOT p.proisstrict);
aggfnoid | proname
----------+---------
(0 rows)
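-- A built-in pair matching both shapes above, used by the internal-state
-- avg(numeric) machinery:
--     numeric_avg_serialize(internal) RETURNS bytea
--     numeric_avg_deserialize(bytea, internal) RETURNS internal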
-- Check that aggregates which have the same transition function also have
-- the same combine, serialization, and deserialization functions.
-- While that isn't strictly necessary, it's fishy if they don't.
SELECT a.aggfnoid, a.aggcombinefn, a.aggserialfn, a.aggdeserialfn,
b.aggfnoid, b.aggcombinefn, b.aggserialfn, b.aggdeserialfn
FROM
pg_aggregate a, pg_aggregate b
WHERE
a.aggfnoid < b.aggfnoid AND a.aggtransfn = b.aggtransfn AND
(a.aggcombinefn != b.aggcombinefn OR a.aggserialfn != b.aggserialfn
OR a.aggdeserialfn != b.aggdeserialfn);
aggfnoid | aggcombinefn | aggserialfn | aggdeserialfn | aggfnoid | aggcombinefn | aggserialfn | aggdeserialfn
----------+--------------+-------------+---------------+----------+--------------+-------------+---------------
(0 rows)
-- Cross-check aggsortop (if present) against pg_operator.
-- We expect to find entries for bool_and, bool_or, every, max, and min.
SELECT DISTINCT proname, oprname
FROM pg_operator AS o, pg_aggregate AS a, pg_proc AS p
WHERE a.aggfnoid = p.oid AND a.aggsortop = o.oid
ORDER BY 1, 2;
proname | oprname
----------+---------
bool_and | <
bool_or | >
every | <
max | >
min | <
(5 rows)
-- Check datatypes match
SELECT a.aggfnoid::oid, o.oid
FROM pg_operator AS o, pg_aggregate AS a, pg_proc AS p
WHERE a.aggfnoid = p.oid AND a.aggsortop = o.oid AND
(oprkind != 'b' OR oprresult != 'boolean'::regtype
OR oprleft != p.proargtypes[0] OR oprright != p.proargtypes[0]);
aggfnoid | oid
----------+-----
(0 rows)
-- Check operator is a suitable btree opfamily member
SELECT a.aggfnoid::oid, o.oid
FROM pg_operator AS o, pg_aggregate AS a, pg_proc AS p
WHERE a.aggfnoid = p.oid AND a.aggsortop = o.oid AND
NOT EXISTS(SELECT 1 FROM pg_amop
WHERE amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree')
AND amopopr = o.oid
AND amoplefttype = o.oprleft
AND amoprighttype = o.oprright);
aggfnoid | oid
----------+-----
(0 rows)
-- Check correspondence of btree strategies and names
SELECT DISTINCT proname, oprname, amopstrategy
FROM pg_operator AS o, pg_aggregate AS a, pg_proc AS p,
pg_amop as ao
WHERE a.aggfnoid = p.oid AND a.aggsortop = o.oid AND
amopopr = o.oid AND
amopmethod = (SELECT oid FROM pg_am WHERE amname = 'btree')
ORDER BY 1, 2;
proname | oprname | amopstrategy
----------+---------+--------------
bool_and | < | 1
bool_or | > | 5
every | < | 1
max | > | 5
min | < | 1
(5 rows)
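-- (Btree strategy 1 is less-than and strategy 5 is greater-than, so the
-- min-like aggregates above pair with strategy 1 and the max-like ones with
-- strategy 5, as expected.)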
-- Check that there are no aggregates with the same name and different
-- numbers of arguments. While not technically wrong, we have a project policy
-- to avoid this because it opens the door for confusion in connection with
-- ORDER BY: novices frequently put the ORDER BY in the wrong place
-- (a sketch of the hazard follows the query below).
-- See the fate of the single-argument form of string_agg() for history.
-- (Note: we don't forbid users from creating such aggregates; the policy is
-- just to think twice before creating built-in aggregates like this.)
-- The only aggregates that should show up here are count(x) and count(*).
SELECT p1.oid::regprocedure, p2.oid::regprocedure
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND p1.proname = p2.proname AND
p1.prokind = 'a' AND p2.prokind = 'a' AND
array_dims(p1.proargtypes) != array_dims(p2.proargtypes)
ORDER BY 1;
oid | oid
--------------+---------
count("any") | count()
(1 row)
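-- The hazard, sketched with a hypothetical table t(v text) (not run here):
--     SELECT string_agg(v, ',' ORDER BY v) FROM t;  -- what the user means
--     SELECT string_agg(v ORDER BY v, ',') FROM t;  -- misplaced ORDER BY
-- The second form would silently resolve to a one-argument string_agg(text)
-- if such an aggregate existed, which is why none is provided.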
-- For the same reason, built-in aggregates with default arguments are no good.
SELECT oid, proname
FROM pg_proc AS p
WHERE prokind = 'a' AND proargdefaults IS NOT NULL;
oid | proname
-----+---------
(0 rows)
-- For the same reason, we avoid creating built-in variadic aggregates, except
-- that variadic ordered-set aggregates are OK (since they have special syntax
-- that is not subject to the misplaced ORDER BY issue).
SELECT p.oid, proname
FROM pg_proc AS p JOIN pg_aggregate AS a ON a.aggfnoid = p.oid
WHERE prokind = 'a' AND provariadic != 0 AND a.aggkind = 'n';
oid | proname
-----+---------
(0 rows)
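-- (Ordered-set aggregates take their ordering in WITHIN GROUP, e.g.
--     SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY v) FROM t;
-- with hypothetical t(v), so a variadic argument list cannot be confused
-- with a trailing ORDER BY.)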
-- **************** pg_opfamily ****************
-- Look for illegal values in pg_opfamily fields
SELECT f.oid
FROM pg_opfamily as f
WHERE f.opfmethod = 0 OR f.opfnamespace = 0;
oid
-----
(0 rows)
-- Look for opfamilies having no opclasses. While most validation of
-- opfamilies is now handled by AM-specific amvalidate functions, that's
-- driven from pg_opclass entries below, so an empty opfamily would not
-- get noticed.
SELECT oid, opfname FROM pg_opfamily f
WHERE NOT EXISTS (SELECT 1 FROM pg_opclass WHERE opcfamily = f.oid);
oid | opfname
-----+---------
(0 rows)
-- **************** pg_opclass ****************
-- Look for illegal values in pg_opclass fields
SELECT c1.oid
FROM pg_opclass AS c1
WHERE c1.opcmethod = 0 OR c1.opcnamespace = 0 OR c1.opcfamily = 0
OR c1.opcintype = 0;
oid
-----
(0 rows)
-- opcmethod must match owning opfamily's opfmethod
SELECT c1.oid, f1.oid
FROM pg_opclass AS c1, pg_opfamily AS f1
WHERE c1.opcfamily = f1.oid AND c1.opcmethod != f1.opfmethod;
oid | oid
-----+-----
(0 rows)
-- There should not be multiple entries in pg_opclass with opcdefault true
-- and the same opcmethod/opcintype combination.
SELECT c1.oid, c2.oid
FROM pg_opclass AS c1, pg_opclass AS c2
WHERE c1.oid != c2.oid AND
c1.opcmethod = c2.opcmethod AND c1.opcintype = c2.opcintype AND
c1.opcdefault AND c2.opcdefault;
oid | oid
-----+-----
(0 rows)
-- Ask access methods to validate opclasses
-- (this replaces a lot of SQL-level checks that used to be done in this file)
SELECT oid, opcname FROM pg_opclass WHERE NOT amvalidate(oid);
oid | opcname
-----+---------
(0 rows)
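-- amvalidate() can also be invoked by hand against a single opclass, e.g.:
--     SELECT amvalidate(oid) FROM pg_opclass WHERE opcname = 'int4_ops';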
-- **************** pg_am ****************
-- Look for illegal values in pg_am fields
SELECT a1.oid, a1.amname
FROM pg_am AS a1
WHERE a1.amhandler = 0;
oid | amname
-----+--------
(0 rows)
-- Check for index amhandler functions with the wrong signature
SELECT a1.oid, a1.amname, p1.oid, p1.proname
FROM pg_am AS a1, pg_proc AS p1
WHERE p1.oid = a1.amhandler AND a1.amtype = 'i' AND
(p1.prorettype != 'index_am_handler'::regtype
OR p1.proretset
OR p1.pronargs != 1
OR p1.proargtypes[0] != 'internal'::regtype);
oid | amname | oid | proname
-----+--------+-----+---------
(0 rows)
-- Check for table amhandler functions with the wrong signature
SELECT a1.oid, a1.amname, p1.oid, p1.proname
FROM pg_am AS a1, pg_proc AS p1
WHERE p1.oid = a1.amhandler AND a1.amtype = 't' AND
(p1.prorettype != 'table_am_handler'::regtype
OR p1.proretset
OR p1.pronargs != 1
OR p1.proargtypes[0] != 'internal'::regtype);
oid | amname | oid | proname
-----+--------+-----+---------
(0 rows)
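-- For reference, a handler satisfying the table-AM shape would be declared
-- along these lines (hypothetical names, typical of an extension script):
--     CREATE FUNCTION my_tam_handler(internal) RETURNS table_am_handler
--         AS 'MODULE_PATHNAME', 'my_tam_handler' LANGUAGE C STRICT;
--     CREATE ACCESS METHOD my_tam TYPE TABLE HANDLER my_tam_handler;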
-- **************** pg_amop ****************
-- Look for illegal values in pg_amop fields
SELECT a1.amopfamily, a1.amopstrategy
FROM pg_amop as a1
WHERE a1.amopfamily = 0 OR a1.amoplefttype = 0 OR a1.amoprighttype = 0
OR a1.amopopr = 0 OR a1.amopmethod = 0 OR a1.amopstrategy < 1;
amopfamily | amopstrategy
------------+--------------
(0 rows)
SELECT a1.amopfamily, a1.amopstrategy
FROM pg_amop as a1
WHERE NOT ((a1.amoppurpose = 's' AND a1.amopsortfamily = 0) OR
(a1.amoppurpose = 'o' AND a1.amopsortfamily <> 0));
amopfamily | amopstrategy
------------+--------------
(0 rows)
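-- Illustration (not part of the test): a search operator (amoppurpose 's')
-- is registered without a sort family, while an ordering operator ('o')
-- names the btree opfamily its distances sort by; sketched for a
-- hypothetical opfamily "my_gist_ops":
--   ALTER OPERATOR FAMILY my_gist_ops USING gist ADD
--     OPERATOR 15 <-> (point, point) FOR ORDER BY pg_catalog.float_ops;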
-- amopmethod must match owning opfamily's opfmethod
SELECT a1.oid, f1.oid
FROM pg_amop AS a1, pg_opfamily AS f1
WHERE a1.amopfamily = f1.oid AND a1.amopmethod != f1.opfmethod;
oid | oid
-----+-----
(0 rows)
-- Make a list of all the distinct operator names being used in particular
-- strategy slots. This is a bit hokey, since the list might need to change
-- in future releases, but it's an effective way of spotting mistakes such as
-- swapping two operators within a family.
SELECT DISTINCT amopmethod, amopstrategy, oprname
FROM pg_amop a1 LEFT JOIN pg_operator o1 ON amopopr = o1.oid
ORDER BY 1, 2, 3;
amopmethod | amopstrategy | oprname
------------+--------------+---------
403 | 1 | *<
403 | 1 | <
403 | 1 | ~<~
403 | 2 | *<=
403 | 2 | <=
403 | 2 | ~<=~
403 | 3 | *=
403 | 3 | =
403 | 4 | *>=
403 | 4 | >=
403 | 4 | ~>=~
403 | 5 | *>
403 | 5 | >
403 | 5 | ~>~
405 | 1 | =
783 | 1 | <<
783 | 1 | @@
783 | 2 | &<
783 | 3 | &&
783 | 4 | &>
783 | 5 | >>
783 | 6 | -|-
783 | 6 | ~=
783 | 7 | @>
783 | 8 | <@
783 | 9 | &<|
783 | 10 | <<|
783 | 11 | |>>
783 | 12 | |&>
783 | 15 | <->
783 | 16 | @>
783 | 18 | =
783 | 19 | <>
783 | 20 | <
783 | 21 | <=
783 | 22 | >
783 | 23 | >=
783 | 24 | <<
783 | 25 | <<=
783 | 26 | >>
783 | 27 | >>=
783 | 28 | <@
783 | 29 | <^
783 | 30 | >^
783 | 48 | <@
783 | 68 | <@
2742 | 1 | &&
2742 | 1 | @@
2742 | 2 | @>
2742 | 2 | @@@
2742 | 3 | <@
2742 | 4 | =
2742 | 7 | @>
2742 | 9 | ?
2742 | 10 | ?|
2742 | 11 | ?&
2742 | 15 | @?
2742 | 16 | @@
3580 | 1 | <
3580 | 1 | <<
3580 | 1 | =
3580 | 2 | &<
3580 | 2 | <=
3580 | 3 | &&
3580 | 3 | =
3580 | 4 | &>
3580 | 4 | >=
3580 | 5 | >
3580 | 5 | >>
3580 | 6 | ~=
3580 | 7 | >>=
3580 | 7 | @>
3580 | 8 | <<=
3580 | 8 | <@
3580 | 9 | &<|
3580 | 10 | <<|
3580 | 11 | |>>
3580 | 12 | |&>
3580 | 16 | @>
3580 | 17 | -|-
3580 | 18 | =
3580 | 20 | <
3580 | 21 | <=
3580 | 22 | >
3580 | 23 | >=
3580 | 24 | >>
3580 | 26 | <<
4000 | 1 | <<
4000 | 1 | ~<~
4000 | 2 | &<
4000 | 2 | ~<=~
4000 | 3 | &&
4000 | 3 | =
4000 | 4 | &>
4000 | 4 | ~>=~
4000 | 5 | >>
4000 | 5 | ~>~
4000 | 6 | -|-
4000 | 6 | ~=
4000 | 7 | @>
4000 | 8 | <@
4000 | 9 | &<|
4000 | 10 | <<|
4000 | 11 | <
4000 | 11 | |>>
4000 | 12 | <=
4000 | 12 | |&>
4000 | 14 | >=
4000 | 15 | <->
4000 | 15 | >
4000 | 16 | @>
4000 | 18 | =
4000 | 19 | <>
4000 | 20 | <
4000 | 21 | <=
4000 | 22 | >
4000 | 23 | >=
4000 | 24 | <<
4000 | 25 | <<=
4000 | 26 | >>
4000 | 27 | >>=
4000 | 28 | ^@
4000 | 29 | <^
4000 | 30 | >^
(124 rows)
-- Check that all opclass search operators have selectivity estimators.
-- This is not absolutely required, but it seems a reasonable thing
-- to insist on for all standard datatypes.
SELECT a1.amopfamily, a1.amopopr, o1.oid, o1.oprname
FROM pg_amop AS a1, pg_operator AS o1
WHERE a1.amopopr = o1.oid AND a1.amoppurpose = 's' AND
(o1.oprrest = 0 OR o1.oprjoin = 0);
amopfamily | amopopr | oid | oprname
------------+---------+-----+---------
(0 rows)
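-- Illustration (not part of the test): estimators are attached when an
-- operator is created; sketched for a hypothetical type "mytype":
--   CREATE OPERATOR === (
--     LEFTARG = mytype, RIGHTARG = mytype, FUNCTION = mytype_eq,
--     RESTRICT = eqsel, JOIN = eqjoinsel);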
-- Check that each opclass in an opfamily has associated operators, that is
-- ones whose oprleft matches opcintype (possibly by coercion).
SELECT c1.opcname, c1.opcfamily
FROM pg_opclass AS c1
WHERE NOT EXISTS(SELECT 1 FROM pg_amop AS a1
WHERE a1.amopfamily = c1.opcfamily
AND binary_coercible(c1.opcintype, a1.amoplefttype));
opcname | opcfamily
---------+-----------
(0 rows)
-- Check that each operator listed in pg_amop has an associated opclass,
-- that is one whose opcintype matches oprleft (possibly by coercion).
-- Otherwise the operator is useless because it cannot be matched to an index.
-- (In principle it could be useful to list such operators in multiple-datatype
-- btree opfamilies, but in practice you'd expect there to be an opclass for
-- every datatype the family knows about.)
SELECT a1.amopfamily, a1.amopstrategy, a1.amopopr
FROM pg_amop AS a1
WHERE NOT EXISTS(SELECT 1 FROM pg_opclass AS c1
WHERE c1.opcfamily = a1.amopfamily
AND binary_coercible(c1.opcintype, a1.amoplefttype));
amopfamily | amopstrategy | amopopr
------------+--------------+---------
(0 rows)
-- Operators that are primary members of opclasses must be immutable (else
-- it suggests that the index ordering isn't fixed). Operators that are
-- cross-type members need only be stable, since they are just shorthands
-- for index probe queries.
SELECT a1.amopfamily, a1.amopopr, o1.oprname, p1.prosrc
FROM pg_amop AS a1, pg_operator AS o1, pg_proc AS p1
WHERE a1.amopopr = o1.oid AND o1.oprcode = p1.oid AND
a1.amoplefttype = a1.amoprighttype AND
p1.provolatile != 'i';
amopfamily | amopopr | oprname | prosrc
------------+---------+---------+--------
(0 rows)
SELECT a1.amopfamily, a1.amopopr, o1.oprname, p1.prosrc
FROM pg_amop AS a1, pg_operator AS o1, pg_proc AS p1
WHERE a1.amopopr = o1.oid AND o1.oprcode = p1.oid AND
a1.amoplefttype != a1.amoprighttype AND
p1.provolatile = 'v';
amopfamily | amopopr | oprname | prosrc
------------+---------+---------+--------
(0 rows)
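-- Illustration (not part of the test): the function backing a same-type
-- opclass operator should be declared IMMUTABLE; sketched for a hypothetical
-- type "mytype":
--   CREATE FUNCTION mytype_lt(mytype, mytype) RETURNS boolean
--     AS 'MODULE_PATHNAME', 'mytype_lt' LANGUAGE C IMMUTABLE STRICT;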
-- **************** pg_amproc ****************
-- Look for illegal values in pg_amproc fields
SELECT a1.amprocfamily, a1.amprocnum
FROM pg_amproc as a1
WHERE a1.amprocfamily = 0 OR a1.amproclefttype = 0 OR a1.amprocrighttype = 0
OR a1.amprocnum < 0 OR a1.amproc = 0;
amprocfamily | amprocnum
--------------+-----------
(0 rows)
-- Support routines that are primary members of opfamilies must be immutable
-- (else it suggests that the index ordering isn't fixed). But cross-type
-- members need only be stable, since they are just shorthands
-- for index probe queries.
SELECT a1.amprocfamily, a1.amproc, p1.prosrc
FROM pg_amproc AS a1, pg_proc AS p1
WHERE a1.amproc = p1.oid AND
a1.amproclefttype = a1.amprocrighttype AND
p1.provolatile != 'i';
amprocfamily | amproc | prosrc
--------------+--------+--------
(0 rows)
SELECT a1.amprocfamily, a1.amproc, p1.prosrc
FROM pg_amproc AS a1, pg_proc AS p1
WHERE a1.amproc = p1.oid AND
a1.amproclefttype != a1.amprocrighttype AND
p1.provolatile = 'v';
amprocfamily | amproc | prosrc
--------------+--------+--------
(0 rows)
-- Almost all of the core distribution's Btree opclasses can use one of the
-- two generic "equalimage" functions as their support function 4. Look here
-- for opclasses that don't unconditionally allow deduplication.
--
-- Newly added Btree opclasses don't have to support deduplication. It will
-- usually be trivial to add support, though. Note that the expected output
-- of this part of the test will need to be updated whenever a new opclass
-- cannot support deduplication via the generic btequalimage function.
SELECT amp.amproc::regproc AS proc, opf.opfname AS opfamily_name,
opc.opcname AS opclass_name, opc.opcintype::regtype AS opcintype
FROM pg_am AS am
JOIN pg_opclass AS opc ON opc.opcmethod = am.oid
JOIN pg_opfamily AS opf ON opc.opcfamily = opf.oid
LEFT JOIN pg_amproc AS amp ON amp.amprocfamily = opf.oid AND
amp.amproclefttype = opc.opcintype AND amp.amprocnum = 4
WHERE am.amname = 'btree' AND
amp.amproc IS DISTINCT FROM 'btequalimage'::regproc
ORDER BY 1, 2, 3;
proc | opfamily_name | opclass_name | opcintype
--------------------+------------------+------------------+------------------
btvarstrequalimage | bpchar_ops | bpchar_ops | character
btvarstrequalimage | text_ops | name_ops | name
btvarstrequalimage | text_ops | text_ops | text
btvarstrequalimage | text_ops | varchar_ops | text
| array_ops | array_ops | anyarray
| float_ops | float4_ops | real
| float_ops | float8_ops | double precision
| interval_ops | interval_ops | interval
| jsonb_ops | jsonb_ops | jsonb
| multirange_ops | multirange_ops | anymultirange
| numeric_ops | numeric_ops | numeric
| range_ops | range_ops | anyrange
| record_image_ops | record_image_ops | record
| record_ops | record_ops | record
| tsquery_ops | tsquery_ops | tsquery
| tsvector_ops | tsvector_ops | tsvector
(16 rows)
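-- Illustration (not part of the test): an opclass opts into deduplication by
-- registering support function 4; sketched for a hypothetical btree opfamily
-- "my_ops" over type "mytype":
--   ALTER OPERATOR FAMILY my_ops USING btree ADD
--     FUNCTION 4 (mytype) btequalimage(oid);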
-- **************** pg_index ****************
-- Look for illegal values in pg_index fields.
SELECT indexrelid, indrelid
FROM pg_index
WHERE indexrelid = 0 OR indrelid = 0 OR
indnatts <= 0 OR indnatts > 32;
indexrelid | indrelid
------------+----------
(0 rows)
-- oidvector and int2vector fields should be of length indnatts.
SELECT indexrelid, indrelid
FROM pg_index
WHERE array_lower(indkey, 1) != 0 OR array_upper(indkey, 1) != indnatts-1 OR
array_lower(indclass, 1) != 0 OR array_upper(indclass, 1) != indnatts-1 OR
array_lower(indcollation, 1) != 0 OR array_upper(indcollation, 1) != indnatts-1 OR
array_lower(indoption, 1) != 0 OR array_upper(indoption, 1) != indnatts-1;
indexrelid | indrelid
------------+----------
(0 rows)
-- Check that opclasses and collations match the underlying columns.
-- (As written, this test ignores expression indexes.)
SELECT indexrelid::regclass, indrelid::regclass, attname, atttypid::regtype, opcname
FROM (SELECT indexrelid, indrelid, unnest(indkey) as ikey,
unnest(indclass) as iclass, unnest(indcollation) as icoll
FROM pg_index) ss,
pg_attribute a,
pg_opclass opc
WHERE a.attrelid = indrelid AND a.attnum = ikey AND opc.oid = iclass AND
(NOT binary_coercible(atttypid, opcintype) OR icoll != attcollation);
indexrelid | indrelid | attname | atttypid | opcname
------------+----------+---------+----------+---------
(0 rows)
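-- (Expression columns are stored as 0 in indkey; no pg_attribute row has
-- attnum = 0, so the join above silently skips them.)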
-- For system catalogs, be even tighter: nearly all indexes should be
-- exact type matches not binary-coercible matches. At this writing
-- the only exception is an OID index on a regproc column.
SELECT indexrelid::regclass, indrelid::regclass, attname, atttypid::regtype, opcname
FROM (SELECT indexrelid, indrelid, unnest(indkey) as ikey,
unnest(indclass) as iclass, unnest(indcollation) as icoll
FROM pg_index
WHERE indrelid < 16384) ss,
pg_attribute a,
pg_opclass opc
WHERE a.attrelid = indrelid AND a.attnum = ikey AND opc.oid = iclass AND
(opcintype != atttypid OR icoll != attcollation)
ORDER BY 1;
indexrelid | indrelid | attname | atttypid | opcname
--------------------------+--------------+----------+----------+---------
pg_aggregate_fnoid_index | pg_aggregate | aggfnoid | regproc | oid_ops
(1 row)
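-- (regproc has the same physical representation as oid and is
-- binary-coercible to it, so indexing pg_aggregate.aggfnoid with oid_ops is
-- safe; it shows up here only because the types aren't identical.)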
-- Check for system catalogs with collation-sensitive ordering. This is not
-- a representational error in pg_index, but simply wrong catalog design.
-- It's bad because we expect to be able to clone template0 and assign the
-- copy a different database collation. It would especially not work for
-- shared catalogs.
SELECT relname, attname, attcollation
FROM pg_class c, pg_attribute a
WHERE c.oid = attrelid AND c.oid < 16384 AND
c.relkind != 'v' AND -- we don't care about columns in views
attcollation != 0 AND
attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
relname | attname | attcollation
---------+---------+--------------
(0 rows)
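-- Illustration (not part of the test): the "C" collation compares bytes, so
-- its ordering is the same in every database; for example,
--   SELECT 'B' < 'a' COLLATE "C";
-- is true ('B' is byte 0x42, 'a' is 0x61), whereas many locale collations
-- would sort 'a' before 'B'.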
-- Double-check that collation-sensitive indexes have "C" collation, too.
SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
FROM (SELECT indexrelid, indrelid,
unnest(indclass) as iclass, unnest(indcollation) as icoll
FROM pg_index
WHERE indrelid < 16384) ss
WHERE icoll != 0 AND
icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');
indexrelid | indrelid | iclass | icoll
------------+----------+--------+-------
(0 rows)