Skip to content

Commit

Permalink
nfsd: close cached files prior to a REMOVE or RENAME that would repla…
Browse files Browse the repository at this point in the history
…ce target

It's not uncommon for some workloads to do a bunch of I/O to a file and
delete it just afterward. If knfsd has a cached open file however, then
the file may still be open when the dentry is unlinked. If the
underlying filesystem is nfs, then that could trigger it to do a
sillyrename.

On a REMOVE or RENAME scan the nfsd_file cache for open files that
correspond to the inode, and proactively unhash and put their
references. This should prevent any delete-on-last-close activity from
occurring, solely due to knfsd's open file cache.

This must be done synchronously though so we use the variants that call
flush_delayed_fput. There are deadlock possibilities if you call
flush_delayed_fput while holding locks, however. In the case of
nfsd_rename, we don't even do the lookups of the dentries to be renamed
until we've locked for rename.

Once we've figured out what the target dentry is for a rename, check to
see whether there are cached open files associated with it. If there
are, then unwind all of the locking, close them all, and then reattempt
the rename.

None of this is really necessary for "typical" filesystems though. It's
mostly of use for NFS, so declare a new export op flag and use that to
determine whether to close the files beforehand.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
  • Loading branch information
Jeff Layton authored and Chuck Lever committed Dec 9, 2020
1 parent ba5e818 commit 7f84b48
Show file tree
Hide file tree
Showing 4 changed files with 26 additions and 10 deletions.
13 changes: 13 additions & 0 deletions Documentation/filesystems/nfs/exporting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -202,3 +202,16 @@ following flags are defined:
This flag exempts the filesystem from subtree checking and causes
exportfs to get back an error if it tries to enable subtree checking
on it.

EXPORT_OP_CLOSE_BEFORE_UNLINK - always close cached files before unlinking
On some exportable filesystems (such as NFS) unlinking a file that
is still open can cause a fair bit of extra work. For instance,
the NFS client will do a "sillyrename" to ensure that the file
sticks around while it's still open. When reexporting, that open
file is held by nfsd so we usually end up doing a sillyrename, and
then immediately deleting the sillyrenamed file just afterward when
the link count actually goes to zero. Sometimes this delete can race
with other operations (for instance an rmdir of the parent directory).
This flag causes nfsd to close any open files for this inode _before_
calling into the vfs to do an unlink or a rename that would replace
an existing file.
2 changes: 1 addition & 1 deletion fs/nfs/export.c
Original file line number Diff line number Diff line change
Expand Up @@ -171,5 +171,5 @@ const struct export_operations nfs_export_ops = {
.encode_fh = nfs_encode_fh,
.fh_to_dentry = nfs_fh_to_dentry,
.get_parent = nfs_get_parent,
.flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK,
.flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK|EXPORT_OP_CLOSE_BEFORE_UNLINK,
};
16 changes: 9 additions & 7 deletions fs/nfsd/vfs.c
Original file line number Diff line number Diff line change
Expand Up @@ -1724,7 +1724,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
struct inode *fdir, *tdir;
__be32 err;
int host_err;
bool has_cached = false;
bool close_cached = false;

err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
if (err)
Expand Down Expand Up @@ -1783,8 +1783,9 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
goto out_dput_new;

if (nfsd_has_cached_files(ndentry)) {
has_cached = true;
if ((ndentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK) &&
nfsd_has_cached_files(ndentry)) {
close_cached = true;
goto out_dput_old;
} else {
host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL, 0);
Expand All @@ -1805,7 +1806,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
* as that would do the wrong thing if the two directories
* were the same, so again we do it by hand.
*/
if (!has_cached) {
if (!close_cached) {
fill_post_wcc(ffhp);
fill_post_wcc(tfhp);
}
Expand All @@ -1819,8 +1820,8 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
* shouldn't be done with locks held however, so we delay it until this
* point and then reattempt the whole shebang.
*/
if (has_cached) {
has_cached = false;
if (close_cached) {
close_cached = false;
nfsd_close_cached_files(ndentry);
dput(ndentry);
goto retry;
Expand Down Expand Up @@ -1872,7 +1873,8 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
type = d_inode(rdentry)->i_mode & S_IFMT;

if (type != S_IFDIR) {
nfsd_close_cached_files(rdentry);
if (rdentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK)
nfsd_close_cached_files(rdentry);
host_err = vfs_unlink(dirp, rdentry, NULL);
} else {
host_err = vfs_rmdir(dirp, rdentry);
Expand Down
5 changes: 3 additions & 2 deletions include/linux/exportfs.h
Original file line number Diff line number Diff line change
Expand Up @@ -213,8 +213,9 @@ struct export_operations {
bool write, u32 *device_generation);
int (*commit_blocks)(struct inode *inode, struct iomap *iomaps,
int nr_iomaps, struct iattr *iattr);
#define EXPORT_OP_NOWCC (0x1) /* Don't collect wcc data for NFSv3 replies */
#define EXPORT_OP_NOSUBTREECHK (0x2) /* Subtree checking is not supported! */
#define EXPORT_OP_NOWCC (0x1) /* don't collect v3 wcc data */
#define EXPORT_OP_NOSUBTREECHK (0x2) /* no subtree checking */
#define EXPORT_OP_CLOSE_BEFORE_UNLINK (0x4) /* close files before unlink */
unsigned long flags;
};

Expand Down

0 comments on commit 7f84b48

Please sign in to comment.