I have a set of dependent Slurm jobs that submit successfully. The jobs are set up as:

    a   b
     \ /
      c
      |
      d
      |
      e

I need to submit this set of jobs thousands of times, each time parameterized slightly differently. If I were submitting a large batch of jobs without dependencies, I would use a job array to be kind to other users and the scheduler. What is the best practice for submitting job arrays of dependent jobs?

Possible wrinkle: each job (a/b through e) has slightly different SBATCH parameters (nodes, tasks-per-node, etc.).

1 Answer

The --dependency option of sbatch accepts aftercorr to link each job in an array to the corresponding job (the job with the same task array ID) in another array.

The sequence would be (untested)

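    # --parsable makes sbatch print only the job ID instead of "Submitted batch job <id>"
    ArrayAID=$(sbatch --parsable --array=1-1000 A.sh)
    ArrayBID=$(sbatch --parsable --array=1-1000 B.sh)
    # Several job IDs under one dependency type are separated by colons
    ArrayCID=$(sbatch --parsable --array=1-1000 --dependency=aftercorr:$ArrayAID:$ArrayBID C.sh)
    ArrayDID=$(sbatch --parsable --array=1-1000 --dependency=aftercorr:$ArrayCID D.sh)
    ArrayEID=$(sbatch --parsable --array=1-1000 --dependency=aftercorr:$ArrayDID E.sh)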
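The dependency chain only works if each variable holds a bare job ID. By default sbatch prints "Submitted batch job <id>", and with --parsable it prints the ID, optionally followed by ";" and a cluster name. As a minimal sketch, a small helper (job_id_from is a hypothetical name, not part of Slurm) could normalize those forms:

```shell
#!/bin/bash
# Hypothetical helper: extract the numeric job ID from sbatch output.
# Handles "Submitted batch job 123", "123" and "123;cluster".
job_id_from() {
    out=$1
    out=${out##* }       # drop everything up to the last space, if any
    echo "${out%%;*}"    # drop a trailing ";cluster", if any
}
```

It would be used as ArrayAID=$(job_id_from "$(sbatch --array=1-1000 A.sh)"), so the same line works with or without --parsable.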

The ith job in array C will wait for the ith job in both arrays A and B to complete successfully before starting.
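Because aftercorr pairs tasks by array index, every stage can use SLURM_ARRAY_TASK_ID to pick the same parameter set, while the per-stage resource requests (the "wrinkle" in the question) stay in each script's own #SBATCH header. A rough sketch of what C.sh might look like, assuming a hypothetical params.txt with one parameter set per line and a hypothetical helper params_for_task:

```shell
#!/bin/bash
#SBATCH --nodes=2              # example values; each stage script sets its own
#SBATCH --ntasks-per-node=8

# Hypothetical helper: print line N (1-based) of a parameter file.
params_for_task() {
    sed -n "${2}p" "$1"
}

# Assumption: params.txt holds one parameter set per line, shared by all stages.
PARAMS_FILE=${PARAMS_FILE:-params.txt}
TASK_ID=${SLURM_ARRAY_TASK_ID:-1}    # falls back to 1 when run outside Slurm

if [ -f "$PARAMS_FILE" ]; then
    PARAMS=$(params_for_task "$PARAMS_FILE" "$TASK_ID")
    echo "task $TASK_ID uses parameters: $PARAMS"
    # The real work for this stage would be launched here with $PARAMS.
fi
```

With this layout, task 17 of every array reads line 17 of the same file, so the five stages of one workflow stay consistent with each other.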

Slurm will most probably schedule the jobs in array A before those in the other arrays, but that depends on the characteristics of the jobs and the load on the cluster. You can use the --nice option to alter the ordering and guide it the way you want: either have all of array A finish as soon as possible, or have entire workflows finish as soon as possible, in which case you would give array E a higher priority than array D, and array D a higher priority than array C. Keep in mind that a positive --nice value lowers a job's priority.
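To favour finishing whole workflows, one option is to give earlier stages a larger nice value so that later stages win when both are eligible. The sketch below only prints the sbatch commands rather than running them, and leaves out the dependencies for brevity. The nice values and the helper name nice_for_stage are arbitrary assumptions, and how much effect they have depends on the cluster's priority configuration.

```shell
#!/bin/bash
# Hypothetical mapping from stage to nice value.
# A larger nice value means a lower priority, so E keeps the default
# priority and the earlier stages yield to it.
nice_for_stage() {
    case $1 in
        A|B) echo 300 ;;
        C)   echo 200 ;;
        D)   echo 100 ;;
        E)   echo 0 ;;
        *)   return 1 ;;
    esac
}

for stage in A B C D E; do
    # Dry run: print the command instead of submitting it.
    echo "sbatch --parsable --array=1-1000 --nice=$(nice_for_stage "$stage") $stage.sh"
done
```

Reversing the mapping would instead push all of array A through first.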


7 Comments

Is there any way that this could be set up so that jobs would be started alternatively from two arrays, A and B, in the order A[i], B[i], A[i+1], B[i+1], ... ?
Yes, with --dependency=aftercorr... if I recall correctly.
I can get B[i] to run after A[i] by using aftercorr, but A[i+1] will start after A[i], not after B[i]; i.e. B is dependent on A, but not the other way round.
Sorry, I misread your question. In that case I would treat this as a single job array whose behaviour depends on whether SLURM_ARRAY_TASK_ID is odd or even, and restrict it to one job at a time with --array=....%1
Typo in --dependendy, it should be --dependency.
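The single-array suggestion from the comments could be sketched roughly as follows. The helper names stage_for_task and pair_for_task are hypothetical, the array size assumes 1000 A/B pairs, and the sketch assumes tasks start in index order; %1 itself only limits how many tasks run at once, and a failed task does not stop the following ones.

```shell
#!/bin/bash
#SBATCH --array=1-2000%1   # 1000 A/B pairs, at most one task running at a time

# Odd task IDs run stage A, even task IDs run stage B.
stage_for_task() {
    if [ $(( $1 % 2 )) -eq 1 ]; then echo A; else echo B; fi
}

# Map task IDs 1,2 to pair 1; 3,4 to pair 2; and so on.
pair_for_task() {
    echo $(( ($1 + 1) / 2 ))
}

TASK_ID=${SLURM_ARRAY_TASK_ID:-1}   # falls back to 1 when run outside Slurm
STAGE=$(stage_for_task "$TASK_ID")
PAIR=$(pair_for_task "$TASK_ID")
echo "task ${TASK_ID} runs ${STAGE}[${PAIR}]"
# The work for stage $STAGE of pair $PAIR would be launched here.
```

One limitation of this approach is that A and B share a single set of #SBATCH resource requests, since they are tasks of the same array.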