I'm developing a bash script for a project that uses NFS (specifically NFSv3 and NFSv4) to manage critical sections. The script must handle more than a thousand concurrent processes spread across multiple machines. Currently I'm using Bash's noclobber option for file locking, but I'm uncertain whether it is suitable and reliable in this high-concurrency, distributed setting.
#!/bin/bash
lockfile="/mnt/nfs_dir/mylockfile.lock"

# Function to clean up the lockfile
cleanup() {
    rm -f "$lockfile"
}

# Attempt to acquire the lock: with noclobber set, the redirection fails if
# the lockfile already exists, so only one process succeeds.
if ( set -o noclobber; echo "$$" > "$lockfile" ) 2> /dev/null; then
    # Install the trap only in the process that actually holds the lock,
    # so a failed contender never deletes someone else's lockfile.
    trap 'cleanup; exit' INT TERM EXIT

    # Critical section starts
    # ...
    # Critical section ends

    cleanup
    trap - INT TERM EXIT
else
    echo "Failed to acquire lock." >&2
    exit 1
fi
Questions & Concerns:
- Scalability and reliability: Can the noclobber approach effectively scale in a high-concurrency environment, especially over NFS with more than a thousand workers on different computers?
- Alternative methods: Would flock or another file-locking mechanism be more appropriate in this scenario (see the sketch after this list)? What about DLM solutions?
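For reference, here is a minimal sketch of what an flock-based variant might look like. It assumes the flock(1) utility from util-linux is available and that the NFS client is recent enough to map flock() onto NFS byte-range locks (as modern Linux clients do); the 30-second timeout is an arbitrary placeholder.

#!/bin/bash
lockfile="/mnt/nfs_dir/mylockfile.lock"

# Open the lockfile on file descriptor 9 for the lifetime of the script.
exec 9> "$lockfile" || exit 1

# -w 30: wait up to 30 seconds for the lock instead of failing immediately.
if flock -w 30 9; then
    # Critical section starts
    # ...
    # Critical section ends
    flock -u 9    # release explicitly (also released when FD 9 is closed)
else
    echo "Failed to acquire lock within 30 seconds." >&2
    exit 1
fi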
mkdir on an NFS filesystem is atomic, so it can be used for advisory locking: if mkdir /lock/dir; then : critical_section; rmdir /lock/dir; else echo failed; fi. But do you really want every contender to fail outright instead of queueing for the lock?
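If waiting is preferable to failing outright, the mkdir approach can be wrapped in a retry loop. A rough sketch, assuming a lock directory on the same NFS mount (the attempt limit and sleep interval are arbitrary placeholders):

#!/bin/bash
lockdir="/mnt/nfs_dir/mylock.d"

# Try to create the lock directory, retrying up to 60 times, 1 second apart.
acquire_lock() {
    local tries=0
    until mkdir "$lockdir" 2> /dev/null; do
        (( tries++ >= 60 )) && return 1
        sleep 1
    done
    return 0
}

if acquire_lock; then
    trap 'rmdir "$lockdir"; exit' INT TERM EXIT
    # Critical section starts
    # ...
    # Critical section ends
    rmdir "$lockdir"
    trap - INT TERM EXIT
else
    echo "Gave up waiting for the lock." >&2
    exit 1
fi

This still polls rather than queueing contenders in any fair order, so it only addresses the "fail vs. wait" part of the question.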