5

I've seen the following question: "Bash run two commands and get output from both", which almost meets my needs.

However, the wait command is blocking. That means that if command 2 fails before command 1 succeeds, the script does not return when command 2 fails, but only when command 1 succeeds.

Is it possible to run multiple commands in parallel, returning 1 as soon as one of them fails and returning 0 if all of them succeed (in both cases returning as soon as possible)?

It would be even better if that is possible using standard commands (like xargs or parallel), but also ok if it is written using bash.

6
  • If one of the commands fails, do you want to kill the other before returning, or keep the other running? Commented Aug 27, 2015 at 12:07
  • Yes, that would be better (killing the other command). Commented Aug 27, 2015 at 12:08
  • Have you looked at this thread? It simulates the behaviour of a barrier in bash. Commented Aug 27, 2015 at 12:12
  • @Ploutox That's a good start, but it crucially lacks the facility to exit early if one of the subprocesses fails. (And, it's trivial.) Commented Aug 27, 2015 at 12:14
  • @tripleee How would you implement it? Commented Aug 27, 2015 at 12:29

3 Answers

3

This code returns the right exit code and kills the surviving process:

#!/bin/bash

# Trap SIGTERM and set RET_VAL to false
trap "RET_VAL=false" SIGTERM

MY_PID=$$
# Initialize RET_VAL to true
RET_VAL=true

# This function will be executed in a separate job (see below)
thread_listener() {
    # Start the long-running job
    ./longJob.sh &
    PID=$!
    # Trap SIGTERM and kill the long-running process
    trap "kill $PID" SIGTERM
    echo "waiting for $PID"
    echo "Parent $MY_PID"
    # Send SIGTERM to the parent job in case of failure
    wait "$PID" || kill "$MY_PID"
    exit
}

echo "$MY_PID"

# Run the first thread listener in a separate job
thread_listener &
PID1=$!

# Run the second thread listener in a separate job
thread_listener &
PID2=$!

wait
# Send SIGTERM to PID1 and PID2 if they are still running
kill "$PID1" 2> /dev/null
kill "$PID2" 2> /dev/null
# Return RET_VAL: running `true` or `false` sets the script's exit status
$RET_VAL

See the comments for an explanation of the code. The trick is to start jobs that can receive a signal from the parent job and send one to it when needed.

Each child job sends a signal to the parent if its long-running job fails, and the parent sends a signal to its children after the wait (if the parent receives a signal, the wait returns immediately).
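The mechanism described above, a trapped signal interrupting the parent's wait, can be isolated in a minimal sketch. The names `demo_wait_interrupt` and `job_failed` and the `sleep` stand-ins are hypothetical and only for illustration; they are not part of the script above. `$BASHPID` assumes bash 4 or later.

```bash
#!/usr/bin/env bash
# Sketch only: shows that a trapped SIGTERM makes `wait` return early.
# demo_wait_interrupt, job_failed and the sleep stand-ins are hypothetical.

# $1: command string for the short job, $2: seconds the long job sleeps
demo_wait_interrupt() {
    local parent=$BASHPID slow
    job_failed=false
    # The handler only records the failure; its arrival interrupts `wait`.
    trap 'job_failed=true' TERM

    sleep "$2" &                      # stand-in for a long job that succeeds
    slow=$!
    # Short job: signal the parent shell if the command fails.
    ( bash -c "$1" || kill -TERM "$parent" ) &

    wait                              # returns early if SIGTERM is trapped
    kill "$slow" 2>/dev/null || true  # stop the survivor, if still running
    wait 2>/dev/null                  # reap whatever is left
    trap - TERM
    [ "$job_failed" = false ]
}
```

For example, `demo_wait_interrupt "sleep 1; false" 10` should return 1 after about one second instead of ten, while `demo_wait_interrupt "true" 1` should return 0 once the long job finishes.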

0

Recent versions of GNU Parallel address exactly this problem. Kill running children if a single one fails:

parallel --halt now,fail=1 'echo {};{}' ::: true false true true false

Kill running children if a single one succeeds:

parallel --halt now,success=1 'echo {};{}' ::: true false true true false

Kill running children if 20% of the jobs fail:

parallel -j1 --halt now,fail=20% 'echo {#} {};{}' ::: true true true false true true false true false true

Kill the children with the signal sequence TERM, TERM, TERM, KILL, waiting 50 ms between signals:

parallel --termseq TERM,50,TERM,50,TERM,50,KILL -u --halt now,fail=1 'trap "echo TERM" SIGTERM; sleep 1;echo {};{}' ::: true false true true false

3 Comments

How can you use commands longer than one word? For example, parallel --halt now,fail=1 'echo {};{}' ::: true "echo 'hi'" fails with /bin/bash: echo 'hi': command not found
It works if I use: parallel --halt now,fail=1 "echo {}; eval {}" ::: "echo 'ho'" "echo 'hi'"
Yup. Either use eval or use only {}: parallel --halt now,fail=1 {} ::: "echo 'ho'" "echo 'hi'". Typically it will be used as: cat file | parallel --halt now,fail=1

0

I solved a similar situation; sharing it here in case it helps someone.

I have three commands to run.

  • server1 and server2 are long-running commands, e.g. web servers.
  • healthcheck runs some checks to make sure that the servers are okay. It requires both servers to be up to perform its tests.

Required behavior:

  • If healthcheck succeeds (return code 0), block for as long as the servers are running. This is normal operation.
  • If healthcheck fails, return immediately. This is what I want when there is a problem with the servers.

The following does the job for me:

server1 & server2 & healthcheck && wait
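To try the one-liner without real servers, here is a small sketch with stand-ins. The function bodies, the `run_stack` wrapper, and the sleep durations are assumptions for illustration; only the names server1, server2 and healthcheck come from the answer.

```bash
#!/usr/bin/env bash
# Sketch only: stand-ins for the three commands in the one-liner above.
# The bodies, run_stack and the durations are hypothetical.

server1()     { sleep 4; }               # pretend long-running server
server2()     { sleep 4; }               # pretend long-running server
healthcheck() { sleep 1; return "$1"; }  # exits with the status passed in

# $1: exit status the fake healthcheck should report (default 0)
run_stack() {
    server1 & server2 &
    # On success, block until the background jobs exit; on failure,
    # return the healthcheck status immediately.
    healthcheck "${1:-0}" && wait
}
```

`run_stack 0` should block until both stand-in servers exit and then return 0, while `run_stack 1` should return 1 after about one second. Note that on failure the one-liner only returns; the servers are left running in the background, so they would need to be killed separately if that is not wanted.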
