Fix tag updates atomicity and deadlocks #483
a222e25 to a7acf06
a7acf06 to ba2fcdc
ba2fcdc to 945b6a3
lib/philomena/tags.ex (outdated)
    ok
    {:error, err} ->
      raise "get_or_create_tags failed: #{inspect(err)}\ntag_names: #{inspect(tag_names)}"
I don't like this error handling story, but this function never returned a Result in the first place, even though it wasn't annotated with the ! suffix. In fact, it seems that none of the Repo.*() functions used here return a Result. Is this expected? I suppose they use raise internally, but shouldn't they have a ! suffix then?
e120fae to e72f7db
Found this error during seeding after recent change, trying to debug:
Hm, can't reproduce that error mode after
…eturned as per docs
@Meow I discussed this deadlock in PM with Liam. Note that Postgres logs every deadlock occurrence in its server logs like this:
Here, there is enough context about which queries were caught in the deadlock, so it becomes much easier to find the root cause. Postgres never actually hangs on a deadlock; it detects it and aborts one of the participating transactions so the others can proceed (there can be more than 2 transactions in the deadlock chain, which happened to me while testing my dev seeds parallelization efforts), so it's technically not a critical error. I recommend dumping the Postgres logs in prod and grepping them for
lib/philomena/images.ex (outdated)
    tags = Tag |> where([t], t.id in ^tag_ids)
    {count, nil} = repo.update_all(tags, inc: [images_count: 1])
    count = Tags.update_images_counts(repo, +1, tag_ids)
can this not be written simply as 1 instead of +1?
lib/philomena/tags.ex (outdated)
    """
    @spec update_images_counts(term(), integer(), [integer()]) :: integer()
    def update_images_counts(repo, diff, tag_ids) do
      case tag_ids do
could this case statement not be eliminated by using pattern matching in the function definition, like `def update_image_counts(repo, diff, tag_ids) when length(tag_ids) == 0, do: 0`
i'd also rename the function to update_image_counts, instead of plural images
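A minimal sketch of the function-level match suggested above, assuming nothing about the real query: `TagCountsSketch` is a hypothetical module, and the second clause merely stands in for the bulk update by returning how many tags it would touch. Matching on `[]` in the function head avoids the `length/1` guard, which walks the whole list, and leaving the first argument unconstrained means the empty-list clause applies whatever repo is passed.

```elixir
defmodule TagCountsSketch do
  # Empty list: nothing to update, so return 0 without issuing a query.
  # The repo argument is deliberately not matched against any value.
  def update_image_counts(_repo, _diff, []), do: 0

  # Non-empty list: placeholder for the real bulk update (an assumption,
  # not the PR's code). It only reports how many tags would be touched.
  def update_image_counts(_repo, _diff, tag_ids) when is_list(tag_ids) do
    length(tag_ids)
  end
end

IO.inspect(TagCountsSketch.update_image_counts(:any_repo, 1, []))
IO.inspect(TagCountsSketch.update_image_counts(:any_repo, 1, [10, 20, 30]))
```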
lib/philomena/tags.ex (outdated)
    |> Enum.sort()
    |> Enum.dedup()
    |> Enum.map(
      &(%Tag{}
i'd not use this shorthand form here for clarity's sake, and instead would write Enum.map(fn tag_name -> ...
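For comparison, a small sketch of the two spellings being discussed. `CaptureSketch` is a hypothetical module, and plain maps are used as a stand-in for the `%Tag{}` struct, whose fields are not shown in the diff.

```elixir
defmodule CaptureSketch do
  # Capture-operator shorthand: &1 is the single argument.
  def with_capture(tag_names), do: Enum.map(tag_names, &%{name: &1})

  # Explicit anonymous function: the argument gets a descriptive name.
  def with_fn(tag_names) do
    Enum.map(tag_names, fn tag_name -> %{name: tag_name} end)
  end
end

IO.inspect(CaptureSketch.with_capture(["safe", "solo"]) == CaptureSketch.with_fn(["safe", "solo"]))
```

Both forms build the same list; the explicit `fn` mainly helps once the body grows beyond a single expression.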
…ase` with a function-level match
lib/philomena/tags.ex (outdated)
    Tag
    |> where([t], t.id in ^tag_ids)
    |> Repo.update_all(inc: [images_count: 1])
    update_images_counts(Repo, +1, tag_ids)
lib/philomena/tags.ex (outdated)
    _ ->
      locked_tags = vectorized_mutation_lock("FOR NO KEY UPDATE", tag_ids)
    def update_image_counts(nil, _diff, []), do: 0
match repo to nil? what if repo isn't nil but tag_ids is []?
Right, I hadn't noticed that nil. Copilot autocomplete betrayed me there 💀. Fixed
* Fix the tag updates atomicity and deadlocks
* Replace `Map.drop()` with `Map.take()` as per Liam's feedback in DM
* Fix error handling since there will be a tuple of more than 2 items returned as per docs
* Replace `+1` with `1` for update_images_counts
* Rename `update_images_counts` to `update_image_counts` and replace `case` with a function-level match
* Replace the capture operator with an explicit `fn tag_name ->`
* Fix remove repo nil match autocompleted by Github Copilot (bruh)
Before you begin
I noticed unique constraint violation errors in `get_or_create_tags` while working on #481, so I replaced the existing implementation with two separate queries: `INSERT ON CONFLICT DO NOTHING` and a subsequent `SELECT`.
But then I started seeing deadlocks in the `update_all` queries that increment the image counts on tags, so I fixed it this way.
Deadlock demo diagram is below. This can only happen in bulk-update transactions.
The problem is that Postgres doesn't guarantee the order in which row-level locks are acquired during a bulk update. So one transaction can acquire some of the locks while another transaction acquires the rest; each then waits for the other to release its rows, and they end up in a deadlock.
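The ordering argument can be modelled without a database. The sketch below is an illustration only: `LockOrderSketch` and its functions are hypothetical, and acquiring locks in sorted ID order is a common remedy assumed here, not something confirmed to be what `vectorized_mutation_lock` does. It checks whether two transactions take their shared rows in different relative orders, which is the precondition for the mutual wait described above.

```elixir
defmodule LockOrderSketch do
  # True when the two acquisition orders (lists of row IDs, no duplicates)
  # visit their shared rows in different relative orders. In that case each
  # transaction can end up holding a row the other one is waiting for.
  def conflicting_order?(order_a, order_b) do
    shared = MapSet.intersection(MapSet.new(order_a), MapSet.new(order_b))
    a = Enum.filter(order_a, &MapSet.member?(shared, &1))
    b = Enum.filter(order_b, &MapSet.member?(shared, &1))
    a != b
  end

  # Assumed remedy: every transaction locks rows in ascending ID order.
  def canonical(ids), do: ids |> Enum.sort() |> Enum.dedup()
end

tx1 = [3, 1, 2]
tx2 = [2, 3, 1]

# Arbitrary orders: the shared rows are visited differently, prints true.
IO.inspect(LockOrderSketch.conflicting_order?(tx1, tx2))

# Sorted orders: both transactions agree, prints false.
IO.inspect(
  LockOrderSketch.conflicting_order?(
    LockOrderSketch.canonical(tx1),
    LockOrderSketch.canonical(tx2)
  )
)
```

When every transaction agrees on the order, whichever one gets the lowest shared ID first can take the rest without waiting on a row the other already holds.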
Tested it with the image seeding script in #483, that does ~50 image uploads concurrently.