
I want to run a file two times with different arguments, each task on one node; for example, task 1 on node 1 and task 2 on node 2. With my code, only the first task is executed. I don't know what the problem is; I'm new to this. This is my code:

 #!/bin/bash

 node_names=(compute-0-4 compute-0-6)
 parameter=(parte__00 parte__01)

 #SBATCH -N 2
 #SBATCH -n 2
 #SBATCH -c 1

 srun -n1 -N1 -w $node_names[0] file.sh $parameter[0] &
 srun -n1 -N1 -w $node_names[1] file.sh $parameter[1] &
 wait

When I run the code, just the last job is queued. If I execute scontrol show job, I get this output:

which shows just the second job queued; the first job is not queued.

  • So... "only the first task is executed", but "the first job is not queued". How are you confirming that the first task is executed? Is it possible that with your original script the first task is run twice? Also, exactly what code caused the condition in the screenshot you added to your question? Your original code should not have been able to queue the second job. Commented Oct 2, 2019 at 17:33

2 Answers


The #SBATCH lines have to come before any non-comment line. Try something like this:

 #!/bin/bash
 #SBATCH -N 2
 #SBATCH -n 2
 #SBATCH -c 1

 node_names=(compute-0-4 compute-0-6)
 parameter=(parte__00 parte__01)


 srun -n1 -N1 -w ${node_names[0]} file.sh ${parameter[0]} &
 srun -n1 -N1 -w ${node_names[1]} file.sh ${parameter[1]} &
 wait

Also, you can just submit 2 jobs if your applications are completely independent, instead of trying to run everything in just 1 job.
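A minimal sketch of that approach, assuming file.sh is itself a valid batch script (with its own shebang) and reusing the node and parameter names from the question:

```shell
#!/bin/bash
# Submit one independent job per parameter instead of packing both
# srun steps into a single allocation. Adjust names for your cluster.
node_names=(compute-0-4 compute-0-6)
parameter=(parte__00 parte__01)

for i in "${!node_names[@]}"; do
  sbatch -N1 -n1 -w "${node_names[$i]}" file.sh "${parameter[$i]}"
done
```

Each sbatch call returns immediately with its own job ID, so the two jobs are scheduled independently and can start and finish at different times.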




The problem with your existing script is that you need to use curly braces to refer to array items in bash. Where you have $node_names[0] you need ${node_names[0]}.
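A quick illustration of the difference, using the node names from the question:

```shell
#!/bin/bash
node_names=(compute-0-4 compute-0-6)

# Without braces, bash expands $node_names to the FIRST element only
# and treats "[0]" as literal text:
echo "$node_names[0]"     # prints: compute-0-4[0]

# With braces, bash performs real array indexing:
echo "${node_names[0]}"   # prints: compute-0-4
echo "${node_names[1]}"   # prints: compute-0-6
```

So in the original script, srun received the literal string compute-0-4[0] as its -w argument, which matches no node.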

That said... Do the parameters in the second array map directly to the nodes in the first array? If that's the case, then something like this might work a little better for you:

#!/bin/bash

node_names=(compute-0-3 compute-0-4)
parameter=(parte__00 parte__01)

for i in "${!node_names[@]}"; do
  srun -n1 -N1 -w "${node_names[$i]}" file.sh "${parameter[$i]}" &
done

wait

This runs a loop with $i set to each index in $node_names. For each iteration, the script backgrounds an srun. You can grow your cluster by modifying the arrays.
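If the loop misbehaves, a dry run that echoes each expanded command line instead of executing srun is a quick sanity check (a sketch, reusing the arrays above):

```shell
#!/bin/bash
# Dry run: print each fully-expanded command line so you can confirm
# the array indexing produces the srun invocations you expect.
node_names=(compute-0-4 compute-0-6)
parameter=(parte__00 parte__01)

for i in "${!node_names[@]}"; do
  echo srun -n1 -N1 -w "${node_names[$i]}" file.sh "${parameter[$i]}"
done
```

Once the printed lines look right, drop the echo and re-add the trailing & to background each step.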

7 Comments

Thanks for your help, but I want tasks 1 and 2 to run in parallel. When executing the current code, only the first task is executed, even with the changes you recommended. When I run squeue, just the first node is occupied.
The ampersand at the end of the srun line backgrounds the task. Perhaps you can update your question with the details of your revised attempt. If there's a problem with your Slurm configuration, it's unlikely we'll be able to identify it here.
But some troubleshooting... If you run the two srun commands manually, with the ampersand to background them, do they work properly? Try the script with an echo before the srun to make sure the command lines it is producing are correct. Also, run scontrol show job to see your job list. Are your second jobs not being queued, or are they queued and not running? See scontrol show job <jobid> for details, if it's in the queue.
I tried the echo before srun and it is fine. I executed scontrol show job and the second job is running on the second node, but the first is not even queued.
Just one job is generated, the last one. I put an image with the result of scontrol show job in the question. When I run the two srun commands manually, with the ampersand to background them, it is the same: only one job is queued, the last one.
