We've long contended with isolation test results that aren't entirely
stable. Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures. I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.
We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders. This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session. We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm. It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.
Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.
Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
step updwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) UPDATE accounts a SET balance = doup.balance + 100 FROM doup RETURNING *; <waiting ...>
step updwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) UPDATE accounts a SET balance = doup.balance + 100 FROM doup RETURNING *; <waiting ...>
step c1: COMMIT;
step c1: COMMIT;
step updwctefail: <... completed>
step updwctefail: <... completed>
error in steps c1 updwctefail: ERROR: tuple to be updated was already modified by an operation triggered by the current command
ERROR: tuple to be updated was already modified by an operation triggered by the current command
step c2: COMMIT;
step c2: COMMIT;
step read: SELECT * FROM accounts ORDER BY accountid;
step read: SELECT * FROM accounts ORDER BY accountid;
accountid balance
accountid balance
@ -584,7 +584,7 @@ balance
step delwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) DELETE FROM accounts a USING doup RETURNING *; <waiting ...>
step delwctefail: WITH doup AS (UPDATE accounts SET balance = balance + 1100 WHERE accountid = 'checking' RETURNING *, update_checking(999)) DELETE FROM accounts a USING doup RETURNING *; <waiting ...>
step c1: COMMIT;
step c1: COMMIT;
step delwctefail: <... completed>
step delwctefail: <... completed>
error in steps c1 delwctefail: ERROR: tuple to be deleted was already modified by an operation triggered by the current command
ERROR: tuple to be deleted was already modified by an operation triggered by the current command
step c2: COMMIT;
step c2: COMMIT;
step read: SELECT * FROM accounts ORDER BY accountid;
step read: SELECT * FROM accounts ORDER BY accountid;
step "s1_upsert" { INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s1') ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s1'; }
step "s1_upsert" { INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s1') ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s1'; }
step "s1_insert_toast" { INSERT INTO upserttest VALUES('k2', ctoast_large_val()) ON CONFLICT DO NOTHING; }
step "s1_insert_toast" { INSERT INTO upserttest VALUES('k2', ctoast_large_val()) ON CONFLICT DO NOTHING; }
step "s1_commit" { COMMIT; }
step "s1_commit" { COMMIT; }
step "s1_noop" { }
session "s2"
session "s2"
setup
setup
@ -95,6 +96,7 @@ step "s2_begin" { BEGIN; }
step "s2_upsert" { INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s2') ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s2'; }
step "s2_upsert" { INSERT INTO upserttest(key, data) VALUES('k1', 'inserted s2') ON CONFLICT (blurt_and_lock_123(key)) DO UPDATE SET data = upserttest.data || ' with conflict update s2'; }
step "s2_insert_toast" { INSERT INTO upserttest VALUES('k2', ctoast_large_val()) ON CONFLICT DO NOTHING; }
step "s2_insert_toast" { INSERT INTO upserttest VALUES('k2', ctoast_large_val()) ON CONFLICT DO NOTHING; }
step "s2_commit" { COMMIT; }
step "s2_commit" { COMMIT; }
step "s2_noop" { }
# Test that speculative locks are correctly acquired and released, s2
# Test that speculative locks are correctly acquired and released, s2
# inserts, s1 updates.
# inserts, s1 updates.
@ -223,7 +225,8 @@ permutation
"controller_show"
"controller_show"
"s2_begin"
"s2_begin"
# Both sessions wait on advisory locks
# Both sessions wait on advisory locks
"s1_upsert" "s2_upsert"
# (but don't show s2_upsert as complete till we've seen all of s1's notices)
"s1_upsert" "s2_upsert" ("s1_upsert" notices 10)
"controller_show"
"controller_show"
# Switch both sessions to wait on the other lock next time (the speculative insertion)
# Switch both sessions to wait on the other lock next time (the speculative insertion)
"controller_unlock_1_1" "controller_unlock_2_1"
"controller_unlock_1_1" "controller_unlock_2_1"
@ -244,14 +247,15 @@ permutation
"controller_unlock_1_2"
"controller_unlock_1_2"
# Should report s1 is waiting on speculative lock
# Should report s1 is waiting on speculative lock
"controller_print_speculative_locks"
"controller_print_speculative_locks"
# Allow s2 to insert into the non-unique index and complete s1 will
# Allow s2 to insert into the non-unique index and complete. s1 will
# no longer wait on speculative lock, but proceed to wait on the
# no longer wait on speculative lock, but proceed to wait on the
# transaction to finish.
# transaction to finish. The no-op step is needed to ensure that
"controller_unlock_2_4"
# we don't advance to the reporting step until s2_upsert has completed.
# Should report that s1 is now waiting for s1 to commit
"controller_unlock_2_4" "s2_noop"
# Should report that s1 is now waiting for s2 to commit
"controller_print_speculative_locks"
"controller_print_speculative_locks"
# Once s2 commits, s1 is finally free to continue to update
# Once s2 commits, s1 is finally free to continue to update