
Goal

Execute a long-running command remotely, with the following capabilities:

  1. We can kill the task using its PID
  2. If the task completes, its exit code should be written to a file
  3. The task should be detached from the launching SSH session
  4. (Stretch goal) STDERR and STDOUT should use line buffering as they are written to file

For my use case, multiple instances of this task will be launched concurrently on different servers. However, all of the servers (including the launch server) will be using the same working directory (it's mounted at the same location on all servers).

Issue

As of right now I am able to meet the first 3 requirements. The task is a C++ program named main, and we can assume the relative path from $HOME to our working directory is working_directory_path. The current solution uses two helper scripts (run.sh and helper.sh) to achieve its goals.

First, the ssh command:

$ ssh {server} "cd {wdpath}; ./run.sh"

Then we use run.sh to detach from the ssh session and write the PID to a file.

#!/bin/bash

nohup ./helper.sh &>/dev/null &
echo $! > process.pid

Next, we use helper.sh to record the exit status of our program.

#!/bin/bash

./main > process.stdout 2> process.stderr
echo $? > process.status

This seems to work, but there are a few things I don't like.

  1. I have to create 2 temporary scripts per remote invocation, since each invocation of main is fed command-line arguments that differ (and I need to adjust the output file names accordingly).
  2. STDOUT and STDERR are being block buffered due to nohup.
  3. The PID I'm recording doesn't actually belong to the main process, but its parent shell.

Is it possible to improve my solution so that some of these issues are resolved?

  • I think I'd do all that directly in the C++ program instead of kludging up some shell scripts. fork() + setsid() for example. Commented Oct 2 at 18:19
  • nohup sometimes redirects the standard streams of the commands it runs, but it does not manipulate their buffering, and in this case it won't do anything with them at all because you already redirect to /dev/null, which is not a terminal. What's more, that's one layer removed from the execution of your ./main anyway. Your ./main's STDOUT is fully buffered because it's connected to a regular file. On the other hand, your STDERR is not fully buffered (it's probably unbuffered) unless ./main is manipulating that itself or something is not happening according to spec. Commented Oct 2 at 18:39
  • The C standard library function setvbuf() can be used to adjust the buffering of the standard streams from within your program. For invocation from C++, I guess you would include <cstdio>, and call it as std::setvbuf Commented Oct 2 at 18:45
  • (1) could be addressed by making the program read the varying inputs from a file instead of from command-line arguments. You might still have two scripts (plus a parameter file), but they would be the same two scripts for every run. I'm usually much happier to provide a custom input file than a custom script. Commented Oct 2 at 20:02
  • The PID I'm recording doesn't actually belong to the main process, but its parent shell. Why do you record parent pid, instead of main's pid? Commented Oct 4 at 9:57
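The fourth comment's "parameter file" suggestion could look something like the sketch below. The file name params.txt and its layout (line 1: output-file prefix, line 2: arguments for ./main) are hypothetical, and a stand-in shell function replaces the real ./main so the sketch is self-contained:

```shell
#!/bin/bash
# Sketch of the parameter-file idea from the comments: run.sh/helper.sh
# stay identical for every run, and the varying bits live in params.txt.
# The file name and layout here are hypothetical.
printf '%s\n' 'run42' '--seed 7 --n 100' >params.txt   # example parameter file

{ read -r prefix; read -r argline; } <params.txt
main() { echo "main ran with: $*"; }     # stand-in for the real ./main
main $argline >"$prefix.stdout" 2>"$prefix.stderr"     # unquoted on purpose: word-split the args
echo $? >"$prefix.status"
```

With this shape, the helper script never changes between runs; only the parameter file (and hence the output prefix) does.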

1 Answer


Something like this should work:

{ trap '' HUP
  stdbuf -oL -eL ./main >process.stdout 2>process.stderr
  echo $? >process.status
} >/dev/null 2>&1 &
ps -o pgid= -p $! >process.pgid

And you can terminate the task like so:

kill -TERM -$(cat process.pgid)
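For a quick local sanity check of this pattern, here is a minimal sketch with sleep 1 standing in for ./main (stdbuf is from GNU coreutils; the wait is only so the demo can inspect the files before the shell exits):

```shell
#!/bin/bash
# Local demo of the answer's block; `sleep 1` stands in for ./main.
{ trap '' HUP
  stdbuf -oL -eL sleep 1 >process.stdout 2>process.stderr
  echo $? >process.status
} >/dev/null 2>&1 &
ps -o pgid= -p $! >process.pgid
wait    # demo only; the real launcher exits and leaves the group running
```

After it finishes, process.status contains the exit code (0 here) and process.pgid holds the process-group ID that the kill command above targets.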

2 Comments

I had to make a slight adjustment from echo $! >process.pgid to ps -o pgid= -p $! >process.pgid, but this solution works great.
Yeah, that's better
