Collaborator

@tomgi tomgi commented Nov 25, 2025

Hide whitespace changes for much cleaner diff


Draft to get feedback before looking to raise a PR at https://github.com/que-rb/que


Diff on top of our existing changes: wrap_job_in_rails_executor_and_poll-interval-variance...seek-pass-oss:que:wrap_job_in_rails_executor_and_poll-interval-variance_and_skip_poll_based_on_buffer_fullness


Context

During high-throughput spikes, Que's job polling can put very high load on the database by polling continuously, even when the job buffer is already nearly full and workers are actively processing jobs.

What may be happening currently:

  • the locker accepts enough jobs via listen to fill the buffer
  • the jobs are very fast, so in the few milliseconds between the wait and the next poll, workers have already worked a few of them
  • the poller polls for just a few jobs to top the buffer up
  • the locker immediately accepts more jobs via listen (potentially displacing any polled low-importance jobs)
  • workers work a few jobs again
  • the previous poll was satisfied?, so the poller polls again immediately (potentially for the same low-importance jobs)

Changes

Add an optional skip-poll-when-buffer-above-threshold CLI parameter, defaulting to 1.0.

When the buffer is already filled to at least the skip-poll-when-buffer-above-threshold fraction of its capacity, polling is skipped for that iteration.
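
A minimal sketch of the check described above, not the actual Que code. The method name skip_poll? and its arguments are hypothetical; only the comparison of the filled fraction against the threshold comes from the description.

```ruby
# Hypothetical helper illustrating the threshold check. The name and
# signature are assumptions and do not come from Que's source.
def skip_poll?(buffered_jobs, buffer_capacity, threshold = 1.0)
  # Skip polling when the filled fraction of the buffer is at or
  # above the configured threshold.
  buffered_jobs.to_f / buffer_capacity >= threshold
end

skip_poll?(6, 8, 0.75) # => true  (6/8 = 0.75, at the threshold)
skip_poll?(5, 8, 0.75) # => false (5/8 = 0.625)
skip_poll?(7, 8)       # => false (default 1.0: only a full buffer skips)
skip_poll?(8, 8)       # => true
```

With the default of 1.0 the check reduces to "skip only when the buffer is completely full".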

Backward-compatibility

Currently, polling is skipped only when the buffer is completely full.

When the new skip-poll-when-buffer-above-threshold parameter is not provided, it defaults to 1.0, so there is no change from the current behaviour.

@tomgi tomgi force-pushed the skip_poll_based_on_buffer_fullness branch from 2076b95 to 7a7ca9e Compare November 25, 2025 06:03
@tomgi tomgi force-pushed the skip_poll_based_on_buffer_fullness branch from 7a7ca9e to 1d66a70 Compare November 25, 2025 06:03
assert_equal 1, locker_polled_events.size

# Should have locked first 11 only because there are 8 buffer slots and 3 open workers
assert_equal ids[0...11], locked_ids

Is 0..11 12 elements or 11? I think it is 12, right?

Collaborator Author

.. vs ...

> (0..100).to_a[0..11]
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

> (0..100).to_a[0...11]
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

This spec is hugely copy-pasted from (or rather, inspired by) the one above:

que/spec/que/locker_spec.rb

Lines 422 to 423 in a58d350

# Should have locked first 11 only.
assert_equal ids[0...11], locked_ids

locker_polled_events = internal_messages(event: 'poller_polled')
assert_equal 1, locker_polled_events.size

3.times { $q2.push nil }
Collaborator Author

@tomgi tomgi Nov 25, 2025

No idea what it does, but 🙈 since it hangs without it and 5 other tests also have it 🤷‍♂️

I'm going to follow the existing pattern and resist shaving this yak 🙂

Probably something to do with this mutex, @cv.wait(mutex), that push signals:

que/lib/que/job_buffer.rb

Lines 248 to 252 in 1d66a70

def _push(item)
  Que.assert(waiting_count > 0)
  @items << item
  @cv.signal
end
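
For reference, a toy sketch of the general wait/signal pattern that _push takes part in. ToyBuffer is hypothetical and is not Que's JobBuffer, and whether this is what 3.times { $q2.push nil } actually unblocks in the spec is not established here.

```ruby
# Toy buffer (an assumption for illustration, not Que::JobBuffer):
# a consumer blocks in @cv.wait until a push signals it.
class ToyBuffer
  def initialize
    @items = []
    @mutex = Mutex.new
    @cv    = ConditionVariable.new
  end

  def push(item)
    @mutex.synchronize do
      @items << item
      @cv.signal # wake one thread blocked in #shift
    end
  end

  def shift
    @mutex.synchronize do
      # wait releases the mutex while sleeping and re-acquires it on wake-up.
      @cv.wait(@mutex) while @items.empty?
      @items.shift
    end
  end
end

buffer    = ToyBuffer.new
consumers = Array.new(3) { Thread.new { buffer.shift } }
3.times { buffer.push(nil) } # without these pushes the consumers would block forever
results = consumers.map(&:value) # joins each thread and collects what it shifted
```

Each push wakes at most one waiting consumer, so three blocked threads need three pushes before they can all finish.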


@lee-treehouse lee-treehouse left a comment

🎖️ I think this is a great idea

Collaborator Author

tomgi commented Nov 26, 2025

Opened a draft in que-rb#441

@tomgi tomgi closed this Nov 26, 2025