cuda-help: Use flock
Currently the script assumes that it is never called multiple times in
parallel. This assumption is wrong: for job-init, the script is called by
the forked user process, so several instances can run concurrently.

Use flock to avoid GPU allocation races.
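
The locking pattern this commit introduces can be tried in isolation. The following is a minimal sketch, not the actual helper: the function name `claim_slot`, the temporary slot directories and the job IDs 101 to 104 are all hypothetical, and it assumes `flock(1)` from util-linux is installed. Four callers race for two slots, and each slot ends up with exactly one owner.

```shell
#!/bin/sh
# Hypothetical sketch of lock-protected slot allocation; claim_slot and the
# directory layout are illustrative and not taken from helper/gpu-setup.

claim_slot() {
    base=$1
    pid=$2
    for d in "$base"/*/; do
        d=${d%/}
        [ -d "$d" ] || continue
        # flock opens (creating if needed) "$d/pid", takes an exclusive lock,
        # and only then runs the command, so the check and the write cannot
        # interleave with another caller that locks the same file.
        if pid=$pid f=$d/pid flock "$d/pid" -c 'test -s "$f" && exit 1; echo "$pid" > "$f"'; then
            echo "$d"
            return 0
        fi
    done
    return 1    # no free slot
}

# Demo: two slots, four concurrent claimants.
base=$(mktemp -d)
mkdir "$base/000" "$base/001"
for job in 101 102 103 104; do
    claim_slot "$base" "$job" > "$base/out.$job" &
done
wait

# Each successful claimant printed the slot it got; the others printed nothing.
cat "$base"/out.*
```

Because `flock` creates the lock file when it does not exist, the mere existence of the `pid` file can no longer signal that a slot is taken. The check therefore tests for a non-empty file (`test -s`) rather than for an existing one.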
donald committed Feb 17, 2022
1 parent 713212b commit 4bad691
Showing 1 changed file with 1 addition and 4 deletions.
5 changes: 1 addition & 4 deletions helper/gpu-setup
@@ -179,13 +179,11 @@ job_init() {
     pid=$1
     uid=$2
 
-    # we have no locking here, but mxqd is single-threaded
-
     test -d /dev/shm/mxqd/gpu_devs || die "$0: Not initialized (no dir /dev/shm/mxqd/gpu_devs)"
 
     shopt -s nullglob
     for d in /dev/shm/mxqd/gpu_devs/???; do
-        if [ ! -e $d/pid ]; then
+        if pid=$pid f=$d/pid flock $d/pid -c 'test -s $f && exit 1; echo $pid>$f'; then
             for f in $(cat $d/access-files); do
                 case $f in
                     /dev/nvidia-caps/nvidia-cap*)
@@ -198,7 +196,6 @@ job_init() {
                         ;;
                 esac
             done
-            echo $pid > $d/pid
             cat $d/uuid
             exit
         fi