I have a simple scenario: I want to atomically read and modify the state of a row, but the row may not exist yet.
For this example, I use the user_group_membership table:
user_id (pk) | group_id (pk) | state
-------------------------------------
1 | 3 | joined
- User 1 is a member of Group 3 with state joined (can also be invited, left, or banned).
- User 2 is not a member of Group 3 and has never been a member of that group, since there's no row in the table.
The state value works like a state machine. There's a limited set of transitions:
null (no row present) -> invited, banned
invited -> joined, banned
joined -> left, banned
left -> invited, banned
banned -> invited, left
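On the application side, these rules could be kept in a small lookup table. The following is only a sketch in Python: the names `ALLOWED_TRANSITIONS` and `is_allowed` are made up for illustration, and `None` is assumed to stand for "no row present".

```python
from typing import Dict, FrozenSet, Optional

# Allowed target states per current state, copied from the list above.
# None represents "no row present" (an assumption of this sketch).
ALLOWED_TRANSITIONS: Dict[Optional[str], FrozenSet[str]] = {
    None: frozenset({"invited", "banned"}),
    "invited": frozenset({"joined", "banned"}),
    "joined": frozenset({"left", "banned"}),
    "left": frozenset({"invited", "banned"}),
    "banned": frozenset({"invited", "left"}),
}


def is_allowed(current: Optional[str], target: str) -> bool:
    """Return True if moving from `current` to `target` is a listed transition."""
    # Unknown current states have no allowed targets.
    return target in ALLOWED_TRANSITIONS.get(current, frozenset())
```

For example, `is_allowed("joined", "left")` is true, while `is_allowed(None, "joined")` is false because a user cannot join without being invited first.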
If a row is already present I can use a SELECT ... FOR UPDATE to get the current state, validate the transition, update the state and commit the transaction. All other concurrent transactions will "wait" for the lock to be released. That's fine. In this case all state transitions run sequentially.
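To make the existing-row flow concrete, here is a rough sketch in Python. It assumes a DB-API 2.0 cursor with `%s` placeholders (as psycopg provides), a transaction that is already open and that the caller commits or rolls back, and a caller-supplied `is_allowed` check. The function name `change_state_of_existing_row` is hypothetical.

```python
from typing import Callable, Optional


def change_state_of_existing_row(
    cur,  # assumed: DB-API 2.0 cursor inside an open transaction
    user_id: int,
    group_id: int,
    target: str,
    is_allowed: Callable[[Optional[str], str], bool],  # hypothetical validator
) -> bool:
    """Lock the row, validate the transition, and update the state.

    Returns False if the row is missing or the transition is not allowed.
    The caller is responsible for COMMIT or ROLLBACK.
    """
    # FOR UPDATE makes concurrent transactions wait here until we finish.
    cur.execute(
        "SELECT state FROM user_group_membership "
        "WHERE user_id = %s AND group_id = %s FOR UPDATE",
        (user_id, group_id),
    )
    row = cur.fetchone()
    if row is None:
        return False  # no row, so nothing was locked
    if not is_allowed(row[0], target):
        return False
    cur.execute(
        "UPDATE user_group_membership SET state = %s "
        "WHERE user_id = %s AND group_id = %s",
        (target, user_id, group_id),
    )
    return True
```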
But if there is no row in the table, there's nothing to lock. So all concurrent transactions will try to execute an INSERT. The first will succeed and the rest will fail because of the duplicate primary key.
At this point I could just "rerun" the whole code, because now I know that the row exists and the rerun will use the SELECT ... FOR UPDATE for locking/waiting. But I don't want to execute the same code twice. I'm looking for a more elegant solution.
What I came up with so far
This is a replacement for the SELECT ... FOR UPDATE:
INSERT INTO user_group_membership (user_id, group_id, state)
VALUES (2, 3, 'DUMMY_FOR_THE_ROW_LOCK')
ON CONFLICT (user_id, group_id) DO UPDATE
SET user_id = EXCLUDED.user_id
RETURNING *;
-- application code for validating state transition
UPDATE user_group_membership
SET state = 'invited'
WHERE user_id = 2 AND group_id = 3;
This should prevent the situation where multiple concurrent transactions try to INSERT and hit a duplicate key error. The DO UPDATE part is basically a no-op, but it seems to be necessary to get RETURNING to work properly. This effectively replaces the SELECT.
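In the application code, the state returned by `RETURNING` would then have to be translated back before validation, since the placeholder means that no real state has been stored yet. A minimal sketch of that step in Python, where `PLACEHOLDER_STATE` and `previous_state` are hypothetical names and `None` is assumed to mean "no row present":

```python
from typing import Optional

# Same literal as in the INSERT above; it must never be stored as a real state.
PLACEHOLDER_STATE = "DUMMY_FOR_THE_ROW_LOCK"


def previous_state(returned_state: str) -> Optional[str]:
    """Map the state returned by the upsert to the state to validate against.

    The placeholder means no real state has been stored yet, so it is
    treated like "no row present" (None, by assumption in this sketch).
    """
    if returned_state == PLACEHOLDER_STATE:
        return None
    return returned_state
```

If the validation then fails, rolling back the transaction also removes a placeholder row that this transaction inserted.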
Questions
- Is this the right way to handle this scenario?
- Is it "safe"?
- Is there a better / easier solution?
Followup questions
- How to properly handle dummy values? The state column is not nullable and of type enum (invited, joined, left, banned). Introducing a new enum value that should never be used outside of this locking mechanism feels wrong. But I need some value to create and lock the row. Any ideas?