Don't reset the pathlist of partitioned joinrels.

apply_scanjoin_target_to_paths wants to avoid useless work and platform-specific dependencies by throwing away the path list created prior to applying the final scan/join target and constructing a whole new one using the final scan/join target. However, this is only valid when we'll consider all the same strategies after the pathlist reset as before. After resetting the path list, we reconsider Append and MergeAppend paths with the modified target list; therefore, it's only valid for a partitioned relation. However, what the previous coding missed is that it cannot be a partitioned join relation, because that also has paths that are not Append or MergeAppend paths and will not be reconsidered. Thus, before this patch, we'd sometimes choose a partitionwise strategy with a higher total cost than cheapest non-partitionwise strategy, which is not good. We had a surprising number of tests cases that were relying on this bug to work as they did. A big part of the reason for this is that row counts in regression test cases tend to be low, which brings the cost of partitionwise and non-partitionwise strategies very close together, especially for merge joins, where the real and perceived advantages of a partitionwise approach are minimal. In addition, one test case included a row-count-inflating join. In such cases, a partitionwise join can easily be a loser on cost, because the total number of tuples passing through an Append node is much higher than it is with a non-partitionwise strategy. That test case is adjusted by adding additional join clauses to avoid the row count inflation. Although the failure of the planner to choose the lowest-cost path is a bug, we generally do not back-patch fixes of this type, because planning is not an exact science and there is always a possibility that some user will end up with a plan that has a lower estimated cost but actually runs more slowly. Hence, no backpatch here, either. The code change here is exactly what was originally proposed by Ashutosh, but the changes to the comments and test cases have been very heavily rewritten by me, helped along by some very useful advice from Richard Guo. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Arne Roland <arne.roland@malkut.net> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com
2 weeks ago · 014f9a831a
parent 8f1791c618
commit 014f9a831a
5 changed files with 83 additions and 112 deletions
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@ -7906,17 +7906,23 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
 	check_stack_depth();

 	/*
-	 * If the rel is partitioned, we want to drop its existing paths and
-	 * generate new ones.  This function would still be correct if we kept the
-	 * existing paths: we'd modify them to generate the correct target above
-	 * the partitioning Append, and then they'd compete on cost with paths
-	 * generating the target below the Append.  However, in our current cost
-	 * model the latter way is always the same or cheaper cost, so modifying
-	 * the existing paths would just be useless work.  Moreover, when the cost
-	 * is the same, varying roundoff errors might sometimes allow an existing
-	 * path to be picked, resulting in undesirable cross-platform plan
-	 * variations.  So we drop old paths and thereby force the work to be done
-	 * below the Append, except in the case of a non-parallel-safe target.
+	 * If the rel only has Append and MergeAppend paths, we want to drop its
+	 * existing paths and generate new ones.  This function would still be
+	 * correct if we kept the existing paths: we'd modify them to generate the
+	 * correct target above the partitioning Append, and then they'd compete
+	 * on cost with paths generating the target below the Append.  However, in
+	 * our current cost model the latter way is always the same or cheaper
+	 * cost, so modifying the existing paths would just be useless work.
+	 * Moreover, when the cost is the same, varying roundoff errors might
+	 * sometimes allow an existing path to be picked, resulting in undesirable
+	 * cross-platform plan variations.  So we drop old paths and thereby force
+	 * the work to be done below the Append.
+	 *
+	 * However, there are several cases when this optimization is not safe. If
+	 * the rel isn't partitioned, then none of the paths will be Append or
+	 * MergeAppend paths, so we should definitely not do this. If it is
+	 * parititoned but is a joinrel, it may have Append and MergeAppend paths,
+	 * but it can also have join paths that we can't afford to discard.
 	 *
 	 * Some care is needed, because we have to allow
 	 * generate_useful_gather_paths to see the old partial paths in the next
@ -7924,7 +7930,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
 	 * generate_useful_gather_paths to add path(s) to the main list, and
 	 * finally zap the partial pathlist.
 	 */
-	if (rel_is_partitioned)
+	if (rel_is_partitioned && IS_SIMPLE_REL(rel))
 		rel->pathlist = NIL;

 	/*
@ -7950,7 +7956,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
 	}

 	/* Finish dropping old paths for a partitioned rel, per comment above */
-	if (rel_is_partitioned)
+	if (rel_is_partitioned && IS_SIMPLE_REL(rel))
 		rel->partial_pathlist = NIL;

 	/* Extract SRF-free scan/join target. */
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@ -63,18 +63,21 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b =
 (4 rows)

 -- inner join with partially-redundant join clauses
+-- (avoid a mergejoin, because the planner thinks that an non-partitionwise
+-- merge join is the cheapest plan, and we want to test a partitionwise join)
+BEGIN;
+SET LOCAL enable_mergejoin = false;
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
-                          QUERY PLAN                           
---------------------------------------------------------------
+                    QUERY PLAN                    
+--------------------------------------------------
 Sort
   Sort Key: t1.a
   ->  Append
-         ->  Merge Join
-               Merge Cond: (t1_1.a = t2_1.a)
-               ->  Index Scan using iprt1_p1_a on prt1_p1 t1_1
-               ->  Sort
-                     Sort Key: t2_1.b
+         ->  Hash Join
+               Hash Cond: (t1_1.a = t2_1.a)
+               ->  Seq Scan on prt1_p1 t1_1
+               ->  Hash
                     ->  Seq Scan on prt2_p1 t2_1
                           Filter: (a = b)
         ->  Hash Join
@ -89,7 +92,7 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a =
               ->  Hash
                     ->  Seq Scan on prt2_p3 t2_3
                           Filter: (a = b)
-(22 rows)
+(21 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
 a  |  c   | b  |  c   
@ -101,6 +104,7 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a =
 24 | 0024 | 24 | 0024
 (5 rows)

+COMMIT;
 -- left outer join, 3-way
 EXPLAIN (COSTS OFF)
 SELECT COUNT(*) FROM prt1 t1
@ -1244,11 +1248,12 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (
 450 | 0 | 0450
 (4 rows)

-- test merge joins
+-- test merge joins, slightly modifying the query to ensure that we still
+-- get a fully partitionwise join
 SET enable_hashjoin TO off;
 SET enable_nestloop TO off;
 EXPLAIN (COSTS OFF)
-SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;
+SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) ORDER BY t1.a;
                            QUERY PLAN                            
 ------------------------------------------------------------------
 Merge Append
@ -1258,7 +1263,6 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (
         ->  Sort
               Sort Key: t1_3.a
               ->  Seq Scan on prt1_p1 t1_3
-                     Filter: (b = 0)
         ->  Merge Semi Join
               Merge Cond: (t1_6.b = (((t1_9.a + t1_9.b) / 2)))
               ->  Sort
@ -1273,7 +1277,6 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (
         ->  Sort
               Sort Key: t1_4.a
               ->  Seq Scan on prt1_p2 t1_4
-                     Filter: (b = 0)
         ->  Merge Semi Join
               Merge Cond: (t1_7.b = (((t1_10.a + t1_10.b) / 2)))
               ->  Sort
@ -1288,7 +1291,6 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (
         ->  Sort
               Sort Key: t1_5.a
               ->  Seq Scan on prt1_p3 t1_5
-                     Filter: (b = 0)
         ->  Merge Semi Join
               Merge Cond: (t1_8.b = (((t1_11.a + t1_11.b) / 2)))
               ->  Sort
@ -1298,9 +1300,9 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (
                     Sort Key: (((t1_11.a + t1_11.b) / 2))
                     ->  Seq Scan on prt1_e_p3 t1_11
                           Filter: (c = 0)
-(47 rows)
+(44 rows)

-SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;
+SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) ORDER BY t1.a;
  a  | b |  c   
 -----+---+------
   0 | 0 | 0000
@ -4922,27 +4924,27 @@ ANALYZE plt3_adv;
 -- merged partition when re-called with plt1_adv_p1 for the second list value
 -- '0001' of that partition
 EXPLAIN (COSTS OFF)
-SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
+SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.a = t3.a AND t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
                                          QUERY PLAN                                           
 -----------------------------------------------------------------------------------------------
 Sort
   Sort Key: t1.c, t1.a, t2.a, t3.a
   ->  Append
         ->  Hash Full Join
-               Hash Cond: (t1_1.c = t3_1.c)
+               Hash Cond: ((t1_1.a = t3_1.a) AND (t1_1.c = t3_1.c))
               Filter: (((COALESCE(t1_1.a, 0) % 5) <> 3) AND ((COALESCE(t1_1.a, 0) % 5) <> 4))
               ->  Hash Left Join
-                     Hash Cond: (t1_1.c = t2_1.c)
+                     Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.c = t2_1.c))
                     ->  Seq Scan on plt1_adv_p1 t1_1
                     ->  Hash
                           ->  Seq Scan on plt2_adv_p1 t2_1
               ->  Hash
                     ->  Seq Scan on plt3_adv_p1 t3_1
         ->  Hash Full Join
-               Hash Cond: (t1_2.c = t3_2.c)
+               Hash Cond: ((t1_2.a = t3_2.a) AND (t1_2.c = t3_2.c))
               Filter: (((COALESCE(t1_2.a, 0) % 5) <> 3) AND ((COALESCE(t1_2.a, 0) % 5) <> 4))
               ->  Hash Left Join
-                     Hash Cond: (t1_2.c = t2_2.c)
+                     Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.c = t2_2.c))
                     ->  Seq Scan on plt1_adv_p2 t1_2
                     ->  Hash
                           ->  Seq Scan on plt2_adv_p2 t2_2
@ -4950,7 +4952,7 @@ SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t
                     ->  Seq Scan on plt3_adv_p2 t3_2
 (23 rows)

-SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
+SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.a = t3.a AND t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
 a  |  c   | a  |  c   | a  |  c   
 ----+------+----+------+----+------
  0 | 0000 |    |      |    | 
@ -4959,56 +4961,16 @@ SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t
 15 | 0000 |    |      |    | 
 20 | 0000 |    |      |    | 
  1 | 0001 |    |      |  1 | 0001
-  1 | 0001 |    |      |  6 | 0001
-  1 | 0001 |    |      | 11 | 0001
-  1 | 0001 |    |      | 16 | 0001
-  1 | 0001 |    |      | 21 | 0001
-  6 | 0001 |    |      |  1 | 0001
  6 | 0001 |    |      |  6 | 0001
-  6 | 0001 |    |      | 11 | 0001
-  6 | 0001 |    |      | 16 | 0001
-  6 | 0001 |    |      | 21 | 0001
- 11 | 0001 |    |      |  1 | 0001
- 11 | 0001 |    |      |  6 | 0001
 11 | 0001 |    |      | 11 | 0001
- 11 | 0001 |    |      | 16 | 0001
- 11 | 0001 |    |      | 21 | 0001
- 16 | 0001 |    |      |  1 | 0001
- 16 | 0001 |    |      |  6 | 0001
- 16 | 0001 |    |      | 11 | 0001
 16 | 0001 |    |      | 16 | 0001
- 16 | 0001 |    |      | 21 | 0001
- 21 | 0001 |    |      |  1 | 0001
- 21 | 0001 |    |      |  6 | 0001
- 21 | 0001 |    |      | 11 | 0001
- 21 | 0001 |    |      | 16 | 0001
 21 | 0001 |    |      | 21 | 0001
  2 | 0002 |  2 | 0002 |    | 
-  2 | 0002 |  7 | 0002 |    | 
-  2 | 0002 | 12 | 0002 |    | 
-  2 | 0002 | 17 | 0002 |    | 
-  2 | 0002 | 22 | 0002 |    | 
-  7 | 0002 |  2 | 0002 |    | 
  7 | 0002 |  7 | 0002 |    | 
-  7 | 0002 | 12 | 0002 |    | 
-  7 | 0002 | 17 | 0002 |    | 
-  7 | 0002 | 22 | 0002 |    | 
- 12 | 0002 |  2 | 0002 |    | 
- 12 | 0002 |  7 | 0002 |    | 
 12 | 0002 | 12 | 0002 |    | 
- 12 | 0002 | 17 | 0002 |    | 
- 12 | 0002 | 22 | 0002 |    | 
- 17 | 0002 |  2 | 0002 |    | 
- 17 | 0002 |  7 | 0002 |    | 
- 17 | 0002 | 12 | 0002 |    | 
 17 | 0002 | 17 | 0002 |    | 
- 17 | 0002 | 22 | 0002 |    | 
- 22 | 0002 |  2 | 0002 |    | 
- 22 | 0002 |  7 | 0002 |    | 
- 22 | 0002 | 12 | 0002 |    | 
- 22 | 0002 | 17 | 0002 |    | 
 22 | 0002 | 22 | 0002 |    | 
-(55 rows)
+(15 rows)

 DROP TABLE plt1_adv;
 DROP TABLE plt2_adv;
@ -5233,8 +5195,11 @@ CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
 INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
 ANALYZE fract_t;
 -- verify plan; nested index only scans
+-- (avoid merge joins, because the costs of partitionwise and non-partitionwise
+-- merge joins tend to be almost equal, and we want this test to be stable)
 SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
+SET enable_mergejoin = off;
 EXPLAIN (COSTS OFF)
 SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
                              QUERY PLAN                               
@ -5242,14 +5207,14 @@ SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id AS
 Limit
   ->  Merge Append
         Sort Key: x.id
-         ->  Merge Left Join
-               Merge Cond: (x_1.id = y_1.id)
+         ->  Nested Loop Left Join
               ->  Index Only Scan using fract_t0_pkey on fract_t0 x_1
               ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
-         ->  Merge Left Join
-               Merge Cond: (x_2.id = y_2.id)
+                     Index Cond: (id = x_1.id)
+         ->  Nested Loop Left Join
               ->  Index Only Scan using fract_t1_pkey on fract_t1 x_2
               ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
+                     Index Cond: (id = x_2.id)
 (11 rows)

 EXPLAIN (COSTS OFF)
@ -5366,6 +5331,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM pht1 p1 JOIN pht1 p2 USING (c) LIMIT 1000;
                     ->  Seq Scan on pht1_p3 p2_3
 (17 rows)

+RESET enable_mergejoin;
 SET max_parallel_workers_per_gather = 1;
 SET debug_parallel_query = on;
 -- Partial paths should also be smart enough to employ limits
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@ -844,16 +844,19 @@ create index on unique_tbl_p3(a);
 analyze unique_tbl_p;
 set enable_partitionwise_join to on;
 -- Ensure that the unique-ification works for partition-wise join
+-- (Only one of the two joins will be done partitionwise, but that's good
+-- enough for our purposes.)
 explain (verbose, costs off)
 select * from unique_tbl_p t1, unique_tbl_p t2
 where (t1.a, t2.a) in (select a, a from unique_tbl_p t3)
 order by t1.a, t2.a;
                                           QUERY PLAN                                           
 ------------------------------------------------------------------------------------------------
- Merge Append
-   Sort Key: t1.a
-   ->  Nested Loop
-         Output: t1_1.a, t1_1.b, t2_1.a, t2_1.b
+ Merge Join
+   Output: t1.a, t1.b, t2.a, t2.b
+   Merge Cond: (t1.a = t2.a)
+   ->  Merge Append
+         Sort Key: t1.a
         ->  Nested Loop
               Output: t1_1.a, t1_1.b, t3_1.a
               ->  Unique
@ -863,15 +866,6 @@ order by t1.a, t2.a;
               ->  Index Scan using unique_tbl_p1_a_idx on public.unique_tbl_p1 t1_1
                     Output: t1_1.a, t1_1.b
                     Index Cond: (t1_1.a = t3_1.a)
-         ->  Memoize
-               Output: t2_1.a, t2_1.b
-               Cache Key: t1_1.a
-               Cache Mode: logical
-               ->  Index Scan using unique_tbl_p1_a_idx on public.unique_tbl_p1 t2_1
-                     Output: t2_1.a, t2_1.b
-                     Index Cond: (t2_1.a = t1_1.a)
-   ->  Nested Loop
-         Output: t1_2.a, t1_2.b, t2_2.a, t2_2.b
         ->  Nested Loop
               Output: t1_2.a, t1_2.b, t3_2.a
               ->  Unique
@ -881,15 +875,6 @@ order by t1.a, t2.a;
               ->  Index Scan using unique_tbl_p2_a_idx on public.unique_tbl_p2 t1_2
                     Output: t1_2.a, t1_2.b
                     Index Cond: (t1_2.a = t3_2.a)
-         ->  Memoize
-               Output: t2_2.a, t2_2.b
-               Cache Key: t1_2.a
-               Cache Mode: logical
-               ->  Index Scan using unique_tbl_p2_a_idx on public.unique_tbl_p2 t2_2
-                     Output: t2_2.a, t2_2.b
-                     Index Cond: (t2_2.a = t1_2.a)
-   ->  Nested Loop
-         Output: t1_3.a, t1_3.b, t2_3.a, t2_3.b
         ->  Nested Loop
               Output: t1_3.a, t1_3.b, t3_3.a
               ->  Unique
@ -902,14 +887,16 @@ order by t1.a, t2.a;
               ->  Index Scan using unique_tbl_p3_a_idx on public.unique_tbl_p3 t1_3
                     Output: t1_3.a, t1_3.b
                     Index Cond: (t1_3.a = t3_3.a)
-         ->  Memoize
-               Output: t2_3.a, t2_3.b
-               Cache Key: t1_3.a
-               Cache Mode: logical
+   ->  Materialize
+         Output: t2.a, t2.b
+         ->  Append
+               ->  Index Scan using unique_tbl_p1_a_idx on public.unique_tbl_p1 t2_1
+                     Output: t2_1.a, t2_1.b
+               ->  Index Scan using unique_tbl_p2_a_idx on public.unique_tbl_p2 t2_2
+                     Output: t2_2.a, t2_2.b
               ->  Index Scan using unique_tbl_p3_a_idx on public.unique_tbl_p3 t2_3
                     Output: t2_3.a, t2_3.b
-                     Index Cond: (t2_3.a = t1_3.a)
-(59 rows)
+(44 rows)

 reset enable_partitionwise_join;
 drop table unique_tbl_p;
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@ -35,9 +35,14 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b =
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;

 -- inner join with partially-redundant join clauses
+-- (avoid a mergejoin, because the planner thinks that an non-partitionwise
+-- merge join is the cheapest plan, and we want to test a partitionwise join)
+BEGIN;
+SET LOCAL enable_mergejoin = false;
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
+COMMIT;

 -- left outer join, 3-way
 EXPLAIN (COSTS OFF)
@ -219,13 +224,14 @@ EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;
 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;

-- test merge joins
+-- test merge joins, slightly modifying the query to ensure that we still
+-- get a fully partitionwise join
 SET enable_hashjoin TO off;
 SET enable_nestloop TO off;

 EXPLAIN (COSTS OFF)
-SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;
-SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) AND t1.b = 0 ORDER BY t1.a;
+SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) ORDER BY t1.a;
+SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1 WHERE t1.c = 0)) ORDER BY t1.a;

 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
@ -1155,8 +1161,8 @@ ANALYZE plt3_adv;
 -- merged partition when re-called with plt1_adv_p1 for the second list value
 -- '0001' of that partition
 EXPLAIN (COSTS OFF)
-SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
-SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
+SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.a = t3.a AND t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;
+SELECT t1.a, t1.c, t2.a, t2.c, t3.a, t3.c FROM (plt1_adv t1 LEFT JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c)) FULL JOIN plt3_adv t3 ON (t1.a = t3.a AND t1.c = t3.c) WHERE coalesce(t1.a, 0) % 5 != 3 AND coalesce(t1.a, 0) % 5 != 4 ORDER BY t1.c, t1.a, t2.a, t3.a;

 DROP TABLE plt1_adv;
 DROP TABLE plt2_adv;
@ -1216,8 +1222,11 @@ INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
 ANALYZE fract_t;

 -- verify plan; nested index only scans
+-- (avoid merge joins, because the costs of partitionwise and non-partitionwise
+-- merge joins tend to be almost equal, and we want this test to be stable)
 SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
+SET enable_mergejoin = off;

 EXPLAIN (COSTS OFF)
 SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
@ -1240,6 +1249,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM pht1 p1 JOIN pht1 p2 USING (c) LIMIT 100;
 -- If almost all the data should be fetched - prefer SeqScan
 EXPLAIN (COSTS OFF) SELECT * FROM pht1 p1 JOIN pht1 p2 USING (c) LIMIT 1000;

+RESET enable_mergejoin;
 SET max_parallel_workers_per_gather = 1;
 SET debug_parallel_query = on;
 -- Partial paths should also be smart enough to employ limits
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@ -419,6 +419,8 @@ analyze unique_tbl_p;
 set enable_partitionwise_join to on;

 -- Ensure that the unique-ification works for partition-wise join
+-- (Only one of the two joins will be done partitionwise, but that's good
+-- enough for our purposes.)
 explain (verbose, costs off)
 select * from unique_tbl_p t1, unique_tbl_p t2
 where (t1.a, t2.a) in (select a, a from unique_tbl_p t3)