I'm currently working to fix the duplicate, very long-running jobs that are crashing our system. This PR contains two related fixes:
## Prevent blocked jobs from being released while a job is still executing
When a job runs longer than its concurrency_duration, the semaphore expires and gets deleted by the dispatcher's concurrency maintenance. Previously, BlockedExecution.releasable would consider a concurrency key releasable when its semaphore was missing, causing blocked jobs to be released and run concurrently with the still-executing job.
This violated the concurrency guarantee: with limits_concurrency set to 1, two jobs with the same concurrency key could run simultaneously if the first job exceeded its concurrency_duration.
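For reference, the guarantee in question comes from a setup along these lines (the job class and key are illustrative, not from this codebase):

```ruby
class ProcessOrderJob < ApplicationJob
  # At most one job per order may run at a time. The semaphore guarding
  # the key expires after 5 minutes: that is the concurrency_duration
  # referenced above.
  limits_concurrency to: 1, key: ->(order) { order.id }, duration: 5.minutes

  def perform(order)
    # If this takes longer than 5 minutes, the dispatcher deletes the
    # expired semaphore even though the job is still running.
  end
end
```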
Fix: check for claimed executions before marking a concurrency key as releasable. A key is only releasable when:

- its semaphore is missing or expired, and
- no claimed execution with the same concurrency key is still running.
This ensures concurrency limits are respected even when jobs exceed their configured duration.
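A minimal sketch of the new condition, assuming Solid Queue's standard schema (`solid_queue_jobs` carries the `concurrency_key`). The actual fix expresses this inside the `releasable` scope; the helper name here is hypothetical:

```ruby
module SolidQueue
  class BlockedExecution < Execution
    # Hypothetical helper illustrating the fixed condition; the real
    # patch folds this into the releasable scope.
    def self.key_releasable?(concurrency_key)
      semaphore_gone = Semaphore.where(key: concurrency_key).none?

      # New check: is any claimed execution still running under this key?
      still_executing = ClaimedExecution
        .joins(:job)
        .where(solid_queue_jobs: { concurrency_key: concurrency_key })
        .exists?

      semaphore_gone && !still_executing
    end
  end
end
```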
## Respect concurrency limits when releasing claimed executions during shutdown
When a worker gracefully shuts down, claimed executions are released back to ready state via ClaimedExecution#release. Previously, this always used dispatch_bypassing_concurrency_limits, which ignored whether another job with the same concurrency key was already running.
This could cause duplicate concurrent executions in the following scenario:

1. Job A, with concurrency key K and limits_concurrency set to 1, runs longer than its concurrency_duration, so the dispatcher deletes its expired semaphore.
2. Job B with the same key K is then dispatched (nothing holds the key anymore) and starts executing on another worker.
3. The worker running job A shuts down gracefully; A is released via dispatch_bypassing_concurrency_limits straight into the ready queue.
4. A is claimed again and runs concurrently with B: two jobs with key K at once.
Fix: before releasing a job, check if any other jobs with the same concurrency key are currently executing. If so, go through normal dispatch, which respects the concurrency policy (block or discard). If not, continue to bypass limits for performance.
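A sketch of the release path under those rules. Only `release` and `dispatch_bypassing_concurrency_limits` are named in this description; the `job.dispatch` call and the conflict-check helper are assumptions about the surrounding API, not the exact patch:

```ruby
module SolidQueue
  class ClaimedExecution < Execution
    def release
      transaction do
        if another_job_executing_with_same_key?
          # Conflict: route through normal dispatch so the concurrency
          # policy (block or discard) is applied.
          job.dispatch
        else
          # No conflict: keep the fast path that skips semaphore checks.
          job.dispatch_bypassing_concurrency_limits
        end
        destroy!
      end
    end

    private

    # Hypothetical helper: true when a different claimed execution holds
    # the same concurrency key.
    def another_job_executing_with_same_key?
      job.concurrency_key.present? &&
        self.class.joins(:job)
            .where.not(id: id)
            .where(solid_queue_jobs: { concurrency_key: job.concurrency_key })
            .exists?
    end
  end
end
```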
This ensures that graceful shutdown doesn't violate concurrency guarantees, even when semaphores have expired during long-running job execution.