nfsd.service: Order after mxmount.service #217

Merged
merged 1 commit into from Nov 17, 2021

donald
Collaborator

@donald donald commented Nov 17, 2021

Currently there is no ordering between nfsd.service and mxmount.service.
If mxmount is slow (e.g. when file systems have to replay their journals
after a server crash), nfsd startup might execute `exportfs -ra` while
the file systems are not yet mounted, thereby exporting the unmounted
mountpoints.

This can result in mount failures on NFS clients, or in "stale NFS
handle" errors on clients that had the file systems mounted before the
crash.

mxmount.service uses `mxmount --noexport`, so no re-export is triggered
after the file systems are mounted. nfsd.service executes additional
`exportfs -ra` commands 10, 20 and 30 seconds after nfsd startup, but
30 seconds might not be enough time to mount all file systems after a
fileserver crash.

These errors can persist. They are partly resolved by running
`exportfs -ra` manually at a later time, which makes the file systems
available for mounting. However, the "stale NFS handle" problem might
still be visible on clients that picked up the inodes of the bare
mountpoints, which are now covered by the mounted file systems.

Order nfsd.service after mxmount.service so that we don't export
unmounted mountpoints.
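
The change itself is not shown in this conversation. As a minimal sketch,
assuming the ordering is expressed with systemd's `After=` directive in
the `[Unit]` section of nfsd.service, the relevant fragment could look
like this:

```ini
# nfsd.service (fragment; the rest of the unit is omitted here)
[Unit]
# Assumption: nfsd is started only after mxmount.service has finished
# starting up, so that `exportfs -ra` sees the mounted file systems.
After=mxmount.service
```

`After=` only orders the two units when both are being started; it does
not pull in mxmount.service by itself. Whether the ordering is in effect
can be checked with `systemctl show -p After nfsd.service`.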

@donald
Collaborator Author

donald commented Nov 17, 2021

Tested on dose (with an artificial `sleep()` delay in mxmount).

@pmenzel pmenzel merged commit da1c66b into master Nov 17, 2021