I just spotted this. It's a race. I think the bad outcome is mostly theoretical, because it is nearly impossible for it to occur in practice. The result would be that the CUDA access files are owned by root and not by the UID of the job, so that accessing the GPUs would fail.
Interleaving of MXQ, job1 and job2:
* [MXQ] fork job1
* [job1] other initialization
* [job1] reserve gpu:
* * find slot without pid
* * change access to UID
* [job1] run user program
* [job1] exit
* [MXQ] fork job2
* [job2] other initialization
* [MXQ] cleanup job 1:
* * rm .../pid
* [job2] reserve gpu:
* * find slot without pid
* * change access to UID
* [MXQ] cleanup job 1, continued:
* * change access to root
* [job2] run user program
mxq/helper/gpu-setup
Line 271 in f3d9fb8
We should keep the lock while we modify the access file; otherwise we race with a new allocation.
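To illustrate the proposed ordering, here is a minimal Python sketch, not the actual gpu-setup code. The slot layout (a `pid` file and an `access` file per slot, one shared `lock` file) and all function names are assumptions made for this example, and ownership is simulated by writing the owner into the `access` file instead of calling chown, so the sketch runs without root. The point is only that cleanup resets access and removes the pid file inside the same locked section that allocation uses.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(lock_path):
    """Hold an exclusive flock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def set_owner(slot, owner):
    # Stand-in for chown on the real access file (hypothetical layout).
    with open(os.path.join(slot, "access"), "w") as f:
        f.write(str(owner))


def get_owner(slot):
    with open(os.path.join(slot, "access")) as f:
        return f.read()


def reserve(base, slots, pid, uid):
    """Find a slot without a pid file and hand it to uid, all under the lock."""
    with locked(os.path.join(base, "lock")):
        for name in slots:
            slot = os.path.join(base, name)
            pid_file = os.path.join(slot, "pid")
            if not os.path.exists(pid_file):
                with open(pid_file, "w") as f:
                    f.write(str(pid))
                set_owner(slot, uid)
                return name
    return None


def release(base, name):
    """Cleanup: reset access and free the slot without dropping the lock."""
    slot = os.path.join(base, name)
    with locked(os.path.join(base, "lock")):
        set_owner(slot, "root")               # access goes back to root first
        os.remove(os.path.join(slot, "pid"))  # only now does the slot look free
```

With this ordering a new allocation cannot see the slot as free until its access has already been reset, so the late "change access to root" step can no longer overwrite the new job's ownership.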