- software practice
- parallel
- tasks
- clusterlike
- admin
- knowledge backup

I’ve used GNU Parallel in the past, and it’s an exceptional tool for processing a large number of tasks. I highly recommend it.
Today, I started a task where we generate 5,000 things. Each task runs for about 10-15 minutes, generates a bunch of data, modifies it, and writes it elsewhere.
Last week, I ran this by running parallel commands separately on five different machines, which was kind of a pain to manage. So today, when kicking off another batch of 5,000, I came up with this command:
parallel --eta --progress --joblog jobs.log --tagstring {} --results output -j 10 --delay 10 --retries 4 -S ubuntu@10.0.0.213,ubuntu@10.0.0.66,ubuntu@10.0.0.58,ubuntu@10.0.0.71,ubuntu@10.0.0.14 "IDENTIFIER={} REALM=2 /home/ubuntu/bin/run" ::: {1..4999}
Which produces this lovely output.
Computers / CPU cores / Max jobs to run
1:ubuntu@10.0.0.71 / 16 / 10
2:ubuntu@10.0.0.14 / 16 / 10
3:ubuntu@10.0.0.213 / 16 / 10
4:ubuntu@10.0.0.58 / 16 / 10
5:ubuntu@10.0.0.66 / 16 / 10
Computer:jobs running/jobs completed/%of started jobs
ETA: 0s Left: 4999 AVG: 0.00s 1:10/0/20%/0.0s 2:10/0/
Now I’ve got 50 jobs running across five machines, and as they complete, more will start until we hit all 4,999 tasks (I already created job 0 earlier, which brings the total to 5,000).
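Because the command writes jobs.log, progress can also be checked from a second terminal while the batch runs. The following is a minimal sketch, assuming the usual tab-separated joblog layout in which the first line is a header and the seventh column is the exit value. The function names count_finished and failed_jobs are made up for illustration.

```shell
# Sketch: summarize a GNU Parallel joblog. Assumes the tab-separated columns
# Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command.

# Print how many jobs have been logged so far (the header line is skipped).
count_finished() {
    awk -F '\t' 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Print the sequence number and host of every job with a non-zero exit value.
failed_jobs() {
    awk -F '\t' 'NR > 1 && $7 != 0 { print $1, $2 }' "$1"
}
```

Running count_finished jobs.log prints a single number, and failed_jobs jobs.log prints one line per failed job, which gives a quick way to see whether one machine is misbehaving.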
Quick breakdown.
- --eta outputs the expected completion time based on how long completed tasks have taken. (It currently shows 0 because no jobs have finished yet.)
- --progress shows how far along we are.
- --joblog jobs.log maintains a file of completed jobs. It can be used to restart the run if something goes wrong.
- --tagstring {} prefixes any stdout that goes to the screen with x, where x is the input (1-4999).
- --results output stores each job’s stdout and stderr in files under output.
- -j 10 runs up to 10 jobs on each machine.
- --delay 10 waits 10 seconds before starting each new job.
- -S ubuntu@... is a comma-separated list of hosts to run jobs on. You can include localhost too. It’s also possible to set a specific number of jobs per host.
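The host list and the per-host job counts can be moved out of the command line and into a file. This is a minimal sketch, assuming GNU Parallel’s --sshloginfile option and the N/ prefix on a host entry, which sets the number of CPU cores Parallel assumes for that host. The file name nodes.txt, the slot counts, and the run_batch wrapper are illustrative and not part of the original run.

```shell
# Sketch: keep the hosts in a file instead of a long -S list.
# nodes.txt and the slot counts below are illustrative assumptions.
cat > nodes.txt <<'EOF'
# one host per line; an optional "N/" prefix sets the core count Parallel assumes
10/ubuntu@10.0.0.213
10/ubuntu@10.0.0.66
10/ubuntu@10.0.0.58
10/ubuntu@10.0.0.71
5/ubuntu@10.0.0.14
EOF

# Hypothetical wrapper: defined here but not called.
# With no absolute -j given, each host gets one job slot per assumed core,
# so the last host above would run 5 jobs and the others 10.
run_batch() {
    parallel --eta --progress --joblog jobs.log --tagstring {} --results output \
        --delay 10 --retries 4 --sshloginfile nodes.txt \
        "IDENTIFIER={} REALM=2 /home/ubuntu/bin/run" ::: "$@"
}
```

A call such as run_batch $(seq 1 4999) would then start the batch. Note that an absolute value like -j 10 applies to every host regardless of the prefix, which is why the wrapper leaves -j out.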
Finally, you have the command, followed by ::: {1..4999}. The shell expands this to every integer from 1 to 4999, and each one is passed to the command in place of {}.
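Brace expansion happens in the shell before parallel even starts, so it is easy to check on its own with echo. Here is a small sketch showing that the range form {1..N} and the comma form {1,N} expand differently. It calls bash explicitly because brace expansion is a bash feature that a plain POSIX sh may not support.

```shell
# {a..b} is a range: every integer from a to b.
bash -c 'echo {1..5}'        # prints: 1 2 3 4 5

# {a,b} is a list: only the items named.
bash -c 'echo {1,5}'         # prints: 1 5

# Count the inputs that the range used in the command produces (4999 lines).
bash -c 'printf "%s\n" {1..4999} | wc -l'
```

With the comma form, parallel would receive only two inputs, so the range form is the one to use for a batch like this.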
This was fun and I hope you find an excuse to use parallel in the future.