I have a program that takes three arguments, -t, -a and -s. For example:

./Run -t 1500 -a 150000 -s filename

This program will append data as a row (of 7 columns) to the end of the file "filename".

I want to study how the two parameters t and a affect my output, for t in (1500,150000 [steps of 5000]) and a in (500,600000 [steps of 500]). At the moment I am running:

parallel -j+0 ./Run -t {2} -a {1} :::: <(seq 500 500 600000) :::: <(seq 1500 5000 15000)

As can be seen the parameter t is swept through its range for each value of parameter a. This prints out all the data into the file, all right.

To make the data easier to process, I want 2 blank lines added to the file after each value of parameter a has been completely evaluated. This means that I should run

echo "" >> filename

each time the parameter a is updated.

How do I do this with GNU parallel?

  • "This program will append data as a row (of 7 columns) to the end of the file "filename"." What happens if two programs append at exactly the same time? Commented Apr 2, 2016 at 13:57

1 Answer

I find appending to the same file in parallel scary: there are certain situations where it is safe, but there are many more where it is not:

# Generate files with a single very long line
parallel -j0 perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} ::: {a..z}
rm -f out.par 
# Grep for the single line in parallel - append to same file
parallel -j0 'grep 1 >> out.par' ::: {a..z}
# This ought to only give a single line for each letter
# But because of race condition some lines are split into two
parallel --tag 'grep {} out.par | wc -l' ::: {a..z}
rm out.par
# Do the same in serial (no race condition)
parallel -j1 'grep 1 >> out.par' ::: {a..z}
# Only a single line per letter
parallel --tag 'grep {} out.par | wc -l' ::: {a..z}
# Do the same in parallel but with serialized output (no race condition)
parallel -j0 grep 1 ::: {a..z} > out.par
# Only a single line per letter
parallel --tag 'grep {} out.par | wc -l' ::: {a..z}

So if I were you I would first change ./Run to output to stdout (standard output), so you can do:

./Run -t 1500 -a 500 > filename
# And in parallel:
parallel ./Run -t {2} -a {1} :::: <(seq 500 500 600000) :::: <(seq 1500 5000 15000) > filename

To solve your original question, we first need to agree that order matters: it is unacceptable if the output of the jobs arrives in random order. We therefore need --keep-order (-k), which prints the output in the same order as the input arguments.

parallel -k ./Run -t {2} -a {1} :::: <(seq 500 500 600000) :::: <(seq 1500 5000 15000) > filename
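
For reference, the order that -k preserves is the same as a plain nested loop with a in the outer loop and t in the inner loop, because GNU parallel varies the last input source fastest. A minimal serial sketch, where sweep_order is a hypothetical helper that only prints the argument pairs instead of calling ./Run:

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU parallel or ./Run): print the
# "t a" pairs in the order in which `parallel -k` emits the job output.
# Arguments: first, step and last value of a.
sweep_order() {
    for a in $(seq "$1" "$2" "$3"); do
        for t in $(seq 1500 5000 15000); do
            printf '%s %s\n' "$t" "$a"
        done
    done
}

# Small demo range for a (500 and 1000) instead of the full sweep.
sweep_order 500 500 1000
```

Every block of three consecutive lines shares one value of a, which is what makes it possible to emit a separator after the last t of each block.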

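Now we just need something that runs only when the second argument ({2}, the value of t) is 11500, which is the last value that seq 1500 5000 15000 produces. Repeat the echo if you want two blank lines instead of one: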

parallel -k './Run -t {2} -a {1}; if [ {2} -eq 11500 ]; then echo ""; fi' :::: <(seq 500 500 600000) :::: <(seq 1500 5000 15000) > filename
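
The value 11500 is hard-coded, so the test silently stops matching if the t range changes. One way around that, sketched under the assumption that ./Run has been changed to write to stdout and accepts -t and -a as in the question, is to compute the last t value from the same seq call. The names last_t and run_sweep are illustrative only:

```shell
#!/bin/sh
# Last value that the t sweep produces (1500, 6500, 11500).
last_t=$(seq 1500 5000 15000 | tail -n 1)

# Hypothetical wrapper around the sweep. The job string is double-quoted
# so the shell expands $last_t before parallel sees the command.
# ::: with $(seq ...) is used instead of :::: <(seq ...) only to keep
# the sketch valid in plain sh.
run_sweep() {
    parallel -k "./Run -t {2} -a {1}; if [ {2} -eq $last_t ]; then echo; fi" \
        ::: $(seq 500 500 600000) ::: $(seq 1500 5000 15000)
}

echo "last t value: $last_t"
```

Calling run_sweep > filename would then run the full sweep; nothing is executed until the function is called.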

I am not sure what you need it for, but you might want to take a look at --tag as that might be useful for you.
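
As one way --tag might help: assuming the tag is the argument list followed by a TAB, the value of a is the first whitespace-separated field of every output line, so the blank lines could be added afterwards instead of inside the job. A sketch with a hypothetical filter add_group_breaks, fed with simulated input rather than real ./Run output:

```shell
#!/bin/sh
# Hypothetical post-processing filter: print a blank line whenever the
# first field (the value of a in the tag) differs from the previous line.
add_group_breaks() {
    awk 'NR > 1 && $1 != prev { print "" } { prev = $1; print }'
}

# Simulated tagged output. Real input would come from something like
#   parallel -k --tag ./Run -t {2} -a {1} ...
printf '500 1500\trow1\n500 6500\trow2\n1000 1500\trow3\n' | add_group_breaks
```

This puts a blank line between groups rather than after the final one, and the tag stays in front of each row; piping through cut -f2- would drop it again if it is not wanted.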
