pg_dump: Fix incorrect parsing of object types in pg_dump --filter.

Previously, pg_dump --filter could misinterpret invalid object types
in the filter file as valid ones. For example, the invalid object type
"table-data" (likely a typo for the valid "table_data") could be
mistakenly recognized as "table", causing pg_dump to succeed
when it should have failed.

This happened because pg_dump identified keywords as sequences of
ASCII alphabetic characters, treating non-alphabetic characters
(like hyphens) as keyword boundaries. As a result, "table-data" was
parsed as "table".
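
The mis-parse can be reproduced with a minimal standalone sketch of the
old scanning rule. Note that old_get_keyword is a hypothetical stand-in
written for illustration: it takes the line directly rather than a pointer
to it, and uses the standard assert() instead of PostgreSQL's Assert(). It
is not the code from filter.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the pre-fix scan: a keyword must start with an
 * ASCII alphabetic character and continues over alphabetic characters and
 * '_'.  Returns a pointer to the keyword (or NULL if none is found) and
 * stores its length in *size.
 */
static const char *
old_get_keyword(const char *line, int *size)
{
	const char *ptr = line;
	const char *result = NULL;

	assert(line != NULL);
	*size = 0;

	/* Skip leading whitespace */
	while (isspace((unsigned char) *ptr))
		ptr++;

	if (isalpha((unsigned char) *ptr))
	{
		result = ptr++;

		/* Any other character, including '-', ends the keyword */
		while (isalpha((unsigned char) *ptr) || *ptr == '_')
			ptr++;

		*size = (int) (ptr - result);
	}

	return result;
}
```

Called on "table-data one", this sketch stops at the hyphen and reports a
five-character keyword, "table", which is how the invalid type could pass
for a valid one.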

To fix this, pg_dump --filter now treats keywords as strings of
non-whitespace characters, ensuring invalid types like "table-data"
are correctly rejected.
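
As a rough sketch of the new rule, the standalone functions below scan a
keyword as a run of non-whitespace characters and then require an exact
match against a list of known types. Both new_get_keyword and
is_known_object_type are hypothetical names used only for illustration,
and the type list is a deliberately incomplete subset (just the two types
named in this message), not the list pg_dump actually accepts.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the post-fix scan: a keyword is any run of
 * non-whitespace characters.  Returns a pointer to the keyword (or NULL if
 * the line holds none) and stores its length in *size.
 */
static const char *
new_get_keyword(const char *line, int *size)
{
	const char *ptr = line;
	const char *result = NULL;

	assert(line != NULL);
	*size = 0;

	/* Skip leading whitespace */
	while (isspace((unsigned char) *ptr))
		ptr++;

	if (*ptr != '\0' && !isspace((unsigned char) *ptr))
	{
		result = ptr++;

		/* Only whitespace or end of string ends the keyword */
		while (*ptr != '\0' && !isspace((unsigned char) *ptr))
			ptr++;

		*size = (int) (ptr - result);
	}

	return result;
}

/*
 * Hypothetical exact-match lookup.  The list is an illustrative subset,
 * not pg_dump's real set of filter object types.
 */
static bool
is_known_object_type(const char *keyword, int size)
{
	static const char *const known[] = {"table", "table_data"};

	for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++)
	{
		if ((int) strlen(known[i]) == size &&
			strncmp(keyword, known[i], (size_t) size) == 0)
			return true;
	}

	return false;
}
```

In this sketch "table-data" comes back whole as a ten-character keyword and
fails the exact-match lookup, while "table_data" still passes.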

Back-patch to v17, where the --filter option was introduced.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAHGQGwFzPKUwiV5C-NLBqz1oK1+z9K8cgrF+LcxFem-p3_Ftug@mail.gmail.com
Backpatch-through: 17
parent 62a1211d33
commit 85ccd7e30a
 src/bin/pg_dump/filter.c                    | 13
 src/bin/pg_dump/t/005_pg_dump_filterfile.pl | 14

--- a/src/bin/pg_dump/filter.c
+++ b/src/bin/pg_dump/filter.c
@@ -171,9 +171,8 @@ pg_log_filter_error(FilterStateData *fstate, const char *fmt,...)
 /*
  * filter_get_keyword - read the next filter keyword from buffer
  *
- * Search for keywords (limited to ascii alphabetic characters) in
- * the passed in line buffer. Returns NULL when the buffer is empty or the first
- * char is not alpha. The char '_' is allowed, except as the first character.
+ * Search for keywords (strings of non-whitespace characters) in the passed
+ * in line buffer. Returns NULL when the buffer is empty or no keyword exists.
  * The length of the found keyword is returned in the size parameter.
  */
 static const char *
@@ -182,6 +181,9 @@ filter_get_keyword(const char **line, int *size)
 	const char *ptr = *line;
 	const char *result = NULL;
 
+	/* The passed buffer must not be NULL */
+	Assert(*line != NULL);
+
 	/* Set returned length preemptively in case no keyword is found */
 	*size = 0;
@@ -189,11 +191,12 @@ filter_get_keyword(const char **line, int *size)
 	while (isspace((unsigned char) *ptr))
 		ptr++;
 
-	if (isalpha((unsigned char) *ptr))
+	/* Grab one keyword that's the string of non-whitespace characters */
+	if (*ptr != '\0' && !isspace((unsigned char) *ptr))
 	{
 		result = ptr++;
 
-		while (isalpha((unsigned char) *ptr) || *ptr == '_')
+		while (*ptr != '\0' && !isspace((unsigned char) *ptr))
 			ptr++;
 
 		*size = ptr - result;

--- a/src/bin/pg_dump/t/005_pg_dump_filterfile.pl
+++ b/src/bin/pg_dump/t/005_pg_dump_filterfile.pl
@@ -418,10 +418,16 @@ command_fails_like(
 	qr/invalid filter command/,
 	"invalid syntax: incorrect filter command");
 
-# Test invalid object type
+# Test invalid object type.
+#
+# This test also verifies that keywords are correctly recognized as strings of
+# non-whitespace characters. If the parser incorrectly treats non-whitespace
+# delimiters (like hyphens) as keyword boundaries, "table-data" might be
+# misread as the valid object type "table". To catch such issues,
+# "table-data" is used here as an intentionally invalid object type.
 open $inputfile, '>', "$tempdir/inputfile.txt"
   or die "unable to open filterfile for writing";
-print $inputfile "include xxx";
+print $inputfile "exclude table-data one";
 close $inputfile;
 
 command_fails_like(
@@ -432,8 +438,8 @@ command_fails_like(
 		'--filter' => "$tempdir/inputfile.txt",
 		'postgres'
 	],
-	qr/unsupported filter object type: "xxx"/,
-	"invalid syntax: invalid object type specified, should be table, schema, foreign_data or data"
+	qr/unsupported filter object type: "table-data"/,
+	"invalid syntax: invalid object type specified"
 );
 
 # Test missing object identifier pattern
