Skip to content

Conversation

@lionelchauvin
Copy link

@lionelchauvin lionelchauvin commented Sep 11, 2025

The current implementation of batches doesn't work properly. In the following example, TestCallback is executed too soon:

  batch = Sidekiq::Batch.new
  batch.on(:complete, TestCallback)
  batch.jobs do
     TestWorker.perform_async # <- batch is marked as completed here 
     sleep 5
     TestWorker.perform_async
  end

I think #69 tried to solve this issue.

In this pull request, an additional redis variable "block_executed" has been added in order to know if the batch.jobs{} is actually executed before execute callbacks.
If batch.jobs{} is completed and the last pending job is completed then callbacks are processed.
If all jobs are completed when batch.jobs{} is completed then callbacks are processed.

@lionelchauvin lionelchauvin changed the title batch should not be marked as completed before batch.jobs{} is actually complete batch should not be marked as completed before batch.jobs{} is actually completed Sep 12, 2025
@lionelchauvin
Copy link
Author

@nglx Hi, if you have time, could you review my pull request ?

@lionelchauvin lionelchauvin force-pushed the fix_bug_completed_before_finish_execute_jobs branch from b61d901 to 1220a43 Compare October 13, 2025 07:47
@aleksclark
Copy link
Contributor

@lionelchauvin #69 did indeed attempt to solve the issue, but it requires specific configuration - I believe your PR may be duplicating that effort, specifically your duplication code is exactly what I used to demonstrate success. Did you try the config here? Can you post a minimal repro that uses the config value?

@lionelchauvin
Copy link
Author

lionelchauvin commented Oct 30, 2025

@aleksclark Yes I tried the config and it was not working with sidekiq 8 even with batch_push_interval = 0. I don't see what in the code prevent this condition to be true when you are still enqueuing jobs.

I believe your PR may be duplicating that effort, specifically your duplication code is exactly what I used to demonstrate success.

It is not exactly the same code. In your PR you create one job in batch.jobs{} and this job performs several other jobs every 5 seconds. In the present PR, I create several jobs in the same batch.jobs{} but take time to enqueue them.

I can try to add the config batch_push_interval = 0 in this test but I am almost sure it will fail.

@lionelchauvin
Copy link
Author

As I expected, the test fail (with the code from master) even if I simulate the config:

        batch_push_interval = 0
        batch.instance_variable_set(:@incremental_push, !batch_push_interval.nil?)
        batch.instance_variable_set(:@batch_push_interval, batch_push_interval)

in the condition, pending.to_i and failed.to_i are equal to 1 and children and complete are equal to 0, so it enqueues the complete callback too soon.

@lionelchauvin
Copy link
Author

For more context, I detected this bug because I have my own implementation of Sidekiq::Batch::Status#join. I use it in my specs in order to wait that all jobs are finished before the checks. I could provide a PR for Sidekiq::Batch::Status#join but it can't work without this fix.

@aleksclark
Copy link
Contributor

aleksclark commented Oct 31, 2025 via email

@lionelchauvin lionelchauvin force-pushed the fix_bug_completed_before_finish_execute_jobs branch from 1220a43 to fcebd41 Compare October 31, 2025 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants