I have a shell script job.sh.

Its contents are below:

#!/bin/bash

table=$1

sqoop job --exec "${table}"

Now when I do ./job.sh table1

The script executes successfully.

I have the table names in a file tables.txt.

Now I want to loop over the tables.txt file and execute the job.sh script 10 times in parallel.

How can I do that?

Ideally, when I execute the script, I want it to run the following commands in parallel:

./job.sh table1
./job.sh table2
./job.sh table3
./job.sh table4
./job.sh table5
./job.sh table6
./job.sh table7
./job.sh table8
./job.sh table9
./job.sh table10

What are the options available?

3 Answers

5

This is simple with GNU Parallel:

parallel -a tables.txt --dry-run sqoop job --exec {}

Sample Output

sqoop job --exec table7
sqoop job --exec table8
sqoop job --exec table9
sqoop job --exec table6
sqoop job --exec table5
sqoop job --exec table4
sqoop job --exec table3
sqoop job --exec table2
sqoop job --exec table1
sqoop job --exec table10

If that looks correct, just remove the --dry-run and run again for real.

If you would like 4 jobs run at a time, use:

parallel -j 4 ....

If you would like one job per CPU core, that is the default, so you don't need to do anything.

If you would like the output to be kept in input order, add the -k option:

parallel -k ...
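Putting those options together for the 10-at-a-time case from the question, a minimal sketch might look like the following. It assumes GNU Parallel is installed and writes a scratch tables file with mktemp so nothing real is touched. The ./job.sh name comes from the question, and because of --dry-run the commands are only printed, never executed.

```shell
#!/bin/bash
# Sketch only: a scratch file stands in for the real tables.txt.
tables=$(mktemp)
printf 'table%s\n' {1..10} > "$tables"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -j 10     : run at most 10 jobs at once
    # -k        : print output in the same order as the input lines
    # --dry-run : show the commands without executing them
    planned=$(parallel -j 10 -k --dry-run -a "$tables" ./job.sh {})
else
    planned=""
    echo "GNU Parallel not found; install it to try this sketch." >&2
fi

printf '%s\n' "$planned"
```

Once the printed commands look right, dropping --dry-run and pointing -a at the real tables.txt would run them for real.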
2 Comments

@CharlesDuffy I didn't see it mentioned that OP had busybox, I was expecting a fairly decently specified machine if running sqoop.
You're right -- I must have been thinking about a different question.
3

You can just do

< tables.txt xargs -I% -P10 echo sqoop job --exec %

The -P10 runs up to 10 processes in parallel, and you don't even need the helper script. The echo only prints each command; remove it to run the jobs for real.

As @CharlesDuffy commented, you don't need the -I, so this is even simpler:

< tables.txt xargs -n1 -P10 echo sqoop job --exec
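If you would rather keep the helper script, the same idea can be sketched as below. Everything here is a stand-in: the script works in a temporary directory, writes a hypothetical tables.txt with ten names, and creates a job.sh that echoes the command instead of calling sqoop, on the assumption that the real job.sh takes a single table name.

```shell
#!/bin/bash
# Sketch only: scratch directory with a stand-in job.sh that prints
# the sqoop command instead of running it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'table%s\n' {1..10} > tables.txt

cat > job.sh <<'EOF'
#!/bin/bash
table=$1
echo "sqoop job --exec ${table}"
EOF
chmod +x job.sh

# -n1  : pass one table name to each invocation
# -P10 : keep at most 10 invocations running at once
xargs -n1 -P10 ./job.sh < tables.txt > out.log
status=$?   # non-zero if any invocation failed

echo "xargs exit status: $status"
sort out.log
```

Checking the exit status of xargs afterwards is a cheap way to notice that at least one table failed, although it does not tell you which one.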

2 Comments

@CharlesDuffy True! The -I isn't needed in this case. It could be helpful in a case like printf "%s\n" {1..20} | xargs -I% -P10 echo sqoop job --exec table%
Sure, though one could use table{1..20} there as well, and avoid the hairiness that comes with -I. Granted, the 255-byte limit on constructed arguments isn't an immediate issue here, and neither is the tendency of -I to be abused in ways that lead to injection attacks, nor the POSIX-specified limit (of 5) on the number of substitutions per command line, but it's something that just strikes me as a smell.
0

Option 1

Start all scripts as background processes by appending &, e.g.

./job.sh table1 &
./job.sh table2 &
./job.sh table3 &

However, this will run all jobs at the same time!
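One way to keep the background-process approach but cap how many run at once is to count the running jobs and block on wait -n when the cap is reached. The sketch below assumes bash 4.3 or newer (for wait -n). The run_job function is a hypothetical stand-in for ./job.sh, and the tables file is a scratch file with 25 made-up names.

```shell
#!/bin/bash
# Sketch only: run_job stands in for ./job.sh; requires bash >= 4.3 for wait -n.
max_jobs=10
tables=$(mktemp)
log=$(mktemp)
printf 'table%s\n' {1..25} > "$tables"

run_job() {
    sleep 0.1                 # pretend to do some work
    echo "finished $1"
}

running=0
{
    while read -r table; do
        if (( running >= max_jobs )); then
            wait -n           # block until any one background job exits
            running=$((running - 1))
        fi
        run_job "$table" &
        running=$((running + 1))
    done < "$tables"
    wait                      # wait for the jobs that are still running
} > "$log"

sort "$log"
```

The counter can only overestimate the number of live jobs, so the cap of max_jobs is never exceeded.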

Option 2

For more time- or memory-consuming scripts, you can run a limited number of tasks at the same time using xargs, as described here.
