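#!/bin/sh
batchsize=5    # files per batch directory
batchcount=10  # maximum number of batch directories
tdir=1         # number of the current target directory
buffer=0       # files copied into the current directory so far
for i in *; do
    # The current directory is full: move on to the next one
    [ "$buffer" -eq "$batchsize" ] && tdir=$((tdir + 1)) && buffer=0
    # Stop once the requested number of directories has been filled
    [ "$tdir" -gt "$batchcount" ] && break
    [ -d "$tdir" ] || mkdir -p "$tdir"
    # Copy regular files only; subdirectories are skipped
    if [ -f "$i" ]; then
        buffer=$((buffer + 1))
        cp -- "$i" "$tdir/"
    fi
done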
Running this script in a directory creates a directory called 1 and copies the first 5 files into it, then creates a directory called 2 for the next 5 files, and so on until it has created and filled 10 directories.
If you want to test this, replace the cp command with echo cp so it shows you the copy commands instead of actually running them, and comment out the mkdir command so it doesn't create the directories.
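The dry run can also be built into the script instead of editing it by hand. This is only a sketch of that idea: the batch_copy function and the DRY_RUN variable are names made up for this example, not part of the script above.

```shell
#!/bin/sh
# Sketch: the same batching loop with an optional dry-run mode.
# batch_copy and DRY_RUN are hypothetical names introduced for this example.
# Usage: batch_copy BATCHSIZE BATCHCOUNT
batch_copy() {
    batchsize=$1
    batchcount=$2
    tdir=1
    buffer=0
    for i in *; do
        # Skip anything that is not a regular file
        [ -f "$i" ] || continue
        if [ "$buffer" -eq "$batchsize" ]; then
            tdir=$((tdir + 1))
            buffer=0
        fi
        [ "$tdir" -gt "$batchcount" ] && break
        if [ "${DRY_RUN:-0}" -eq 1 ]; then
            # Dry run: only print what would be copied, create nothing
            echo "cp $i $tdir/"
        else
            mkdir -p "$tdir" && cp -- "$i" "$tdir/"
        fi
        buffer=$((buffer + 1))
    done
}
```

With this, DRY_RUN=1 batch_copy 5 10 only prints the copy commands, and batch_copy 5 10 performs them.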
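Undoing it is simple too. The files were copied, not moved, so you only need to delete the copies and remove the directories (the brace expansion needs bash, it won't work in plain sh):
rm {1..10}/* && rmdir {1..10}
But if you had any other directories named 1 to 10 that weren't created by the script, all files in them would be lost too, so be careful with this one.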
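If that risk worries you, a more cautious undo is possible. The sketch below is one way to do it, under the assumption that the batch directories are numbered from 1 and that every file in them is a copy of a file in the current directory. undo_batches is a name made up for this example.

```shell
#!/bin/sh
# Sketch: remove batch directories 1..N, deleting only verified copies.
# undo_batches is a hypothetical helper introduced for this example.
# Usage: undo_batches N
undo_batches() {
    n=1
    while [ "$n" -le "$1" ]; do
        if [ -d "$n" ]; then
            for f in "$n"/*; do
                [ -f "$f" ] || continue
                orig=${f#"$n"/}
                # Delete the copy only if an identical original still exists
                if [ -f "./$orig" ] && cmp -s "$f" "./$orig"; then
                    rm "$f"
                fi
            done
            # rmdir refuses to remove a directory that still has content
            rmdir "$n" 2>/dev/null || echo "kept $n (not empty)" >&2
        fi
        n=$((n + 1))
    done
}
```

Anything that is not a byte-for-byte copy of a file in the current directory is left alone, and its directory is kept.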
I don't think there's a more convenient way to do this; this kind of stuff is what shell scripts are for.
Here is an alternative version that automatically fills in either batchcount or batchsize if one of them is not set.
#!/bin/bash
batchsize=auto
batchcount=5
ceil_div(){ # Prints $1 / $2 rounded up, using integer arithmetic only
    echo $(( ($1 + $2 - 1) / $2 ))
}
if [[ ! "${batchsize}${batchcount}" =~ ^-?[0-9]+$ ]]; then # One of the variables is not an integer
    # Count the regular files that the copy loop below will see
    filecount=0
    for i in *; do
        [ -f "$i" ] && filecount=$((filecount + 1))
    done
    if [[ ! "$batchsize" =~ ^-?[0-9]+$ ]] && [[ ! "$batchcount" =~ ^-?[0-9]+$ ]]; then # Neither of the variables is set
        echo "Error: batchsize and batchcount are both unset, please set at least one of them." >&2
        exit 1
    elif [[ ! "$batchsize" =~ ^-?[0-9]+$ ]]; then # Only batchsize is unset
        batchsize=$(ceil_div "$filecount" "$batchcount")
    elif [[ ! "$batchcount" =~ ^-?[0-9]+$ ]]; then # Only batchcount is unset
        batchcount=$(ceil_div "$filecount" "$batchsize")
    fi
fi
tdir=1
buffer=0
for i in *; do
    # The current directory is full: move on to the next one
    [ "$buffer" -eq "$batchsize" ] && tdir=$((tdir + 1)) && buffer=0
    [ "$tdir" -gt "$batchcount" ] && break
    [ -d "$tdir" ] || mkdir -p "$tdir"
    if [ -f "$i" ]; then
        buffer=$((buffer + 1))
        cp -- "$i" "$tdir/"
    fi
done
So with the configuration above, it will divide the files into 5 batches: if you have 23 files, the batch size becomes 5 (23 / 5 rounded up), giving four batches of 5 and a last batch of only 3.
If you set batchcount to auto and batchsize to 3, those 23 files go into 8 batches: seven of 3 and a last one of 2 (21 files would divide evenly into 7 batches of 3).
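The auto-fill arithmetic is just a division rounded up, which can be checked on its own. A minimal sketch, where ceil_div is a helper name chosen for this example:

```shell
#!/bin/sh
# ceil_div A B: print A / B rounded up, using integer arithmetic only.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

ceil_div 23 5   # 23 files, batchcount=5 -> batchsize, prints 5
ceil_div 23 3   # 23 files, batchsize=3 -> batchcount, prints 8
```

Adding the divisor minus one before dividing is what turns the shell's truncating integer division into a ceiling.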
You define files= but you don't then use it in your code (other than as a comment). What's its relevance? In fact, I can't see the relevance of /data/samples.txt at all.
Does /data/samples.txt contain a list of all the files you need to copy in batches? If so, you can use split to split it into 4 roughly equal "chunks" with e.g. split -n r/4 --numeric-suffixes=1 data/samples.txt samples. (the trailing dot is part of the output prefix). Then you can just iterate over the contents of samples.01, samples.02, samples.03, and samples.04.
If you have GNU cp (which is standard on Linux), you can use the -t, --target-directory=DIRECTORY option, so you don't need to use -I {} with xargs, e.g. something like xargs -d '\n' cp -t batch.04/ < samples.04. Note: this assumes that none of your filenames in samples.txt contain newlines. If they do, you'll need to regenerate that file using NUL as the separator (split, xargs and many other tools can work with NUL-separated input). BTW, NUL is the ONLY truly safe separator to use because it's the only character that cannot be in a filename.
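Put together, the split and xargs approach could look roughly like the sketch below. It assumes GNU split, xargs and cp, and a list file with one filename per line and no newlines in the names. copy_from_list and the batch.NN directory names are made up for this example, and --numeric-suffixes=1 is used so that the chunk files are numbered from 01.

```shell
#!/bin/sh
# Sketch: copy the files named in a list into N batch directories.
# Assumes GNU split, xargs and cp; copy_from_list is a hypothetical helper.
# Usage: copy_from_list LISTFILE N
copy_from_list() {
    list=$1
    chunks=$2
    # Deal the lines round-robin into samples.01 .. samples.N
    split -n "r/$chunks" --numeric-suffixes=1 "$list" samples.
    for chunk in samples.[0-9]*; do
        dir=batch.${chunk#samples.}
        mkdir -p "$dir"
        # -r: do not run cp at all if the chunk is empty
        xargs -r -d '\n' cp -t "$dir/" < "$chunk"
    done
}
```

For example, copy_from_list data/samples.txt 4 would fill batch.01 to batch.04. The chunk files are written to the current directory, so leftover samples.NN files from an earlier run should be removed first.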