Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
 "In addition to bug fixes and cleanups there are two new features from
  Amir:

   - Consistent inode number support for the case when layers are not
     all on the same filesystem (feature is dubbed "xino").

   - Optimize overlayfs file handle decoding. This one touches the
     exportfs interface to allow detecting the disconnected directory
     case"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: update documentation w.r.t "xino" feature
  ovl: add support for "xino" mount and config options
  ovl: consistent d_ino for non-samefs with xino
  ovl: consistent i_ino for non-samefs with xino
  ovl: constant st_ino for non-samefs with xino
  ovl: allocate anon bdev per unique lower fs
  ovl: factor out ovl_map_dev_ino() helper
  ovl: cleanup ovl_update_time()
  ovl: add WARN_ON() for non-dir redirect cases
  ovl: cleanup setting OVL_INDEX
  ovl: set d->is_dir and d->opaque for last path element
  ovl: Do not check for redirect if this is last layer
  ovl: lookup in inode cache first when decoding lower file handle
  ovl: do not try to reconnect a disconnected origin dentry
  ovl: disambiguate ovl_encode_fh()
  ovl: set lower layer st_dev only if setting lower st_ino
  ovl: fix lookup with middle layer opaque dir and absolute path redirects
  ovl: Set d->last properly during lookup
  ovl: set i_ino to the value of st_ino for NFS export
Linus Torvalds committed Apr 13, 2018
2 parents ba2b137 + 1614901 commit 4802310
Showing 12 changed files with 510 additions and 172 deletions.
39 changes: 33 additions & 6 deletions Documentation/filesystems/overlayfs.txt
@@ -14,9 +14,13 @@ The result will inevitably fail to look exactly like a normal
filesystem for various technical reasons. The expectation is that
many use cases will be able to ignore these differences.

This approach is 'hybrid' because the objects that appear in the
filesystem do not all appear to belong to that filesystem. In many
cases an object accessed in the union will be indistinguishable

Overlay objects
---------------

The overlay filesystem approach is 'hybrid', because the objects that
appear in the filesystem do not always appear to belong to that filesystem.
In many cases, an object accessed in the union will be indistinguishable
from accessing the corresponding object from the original filesystem.
This is most obvious from the 'st_dev' field returned by stat(2).

@@ -34,6 +38,19 @@ make the overlay mount more compliant with filesystem scanners and
overlay objects will be distinguishable from the corresponding
objects in the original filesystem.

On 64bit systems, even if all overlay layers are not on the same
underlying filesystem, the same compliant behavior could be achieved
with the "xino" feature. The "xino" feature composes a unique object
identifier from the real object st_ino and an underlying fsid index.
If all underlying filesystems support NFS file handles and export file
handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
will use the high inode number bits for fsid. Even when the underlying
filesystem uses 64bit inode numbers, users can still enable the "xino"
feature with the "-o xino=on" overlay mount option. That is useful for the
case of underlying filesystems like xfs and tmpfs, which use 64bit inode
numbers, but are very unlikely to use the high inode number bit.


Upper and Lower
---------------

@@ -290,10 +307,19 @@ Non-standard behavior
---------------------

The copy_up operation essentially creates a new, identical file and
moves it over to the old name. The new file may be on a different
filesystem, so both st_dev and st_ino of the file may change.
moves it over to the old name. Any open files referring to this inode
will access the old data.

The new file may be on a different filesystem, so both st_dev and st_ino
of the real file may change. The values of st_dev and st_ino returned by
stat(2) on an overlay object are often not the same as the real file
stat(2) values to prevent the values from changing on copy_up.

Any open files referring to this inode will access the old data.
Unless "xino" feature is enabled, when overlay layers are not all on the
same underlying filesystem, the value of st_dev may be different for two
non-directory objects in the same overlay filesystem and the value of
st_ino for directory objects may be non persistent and could change even
while the overlay filesystem is still mounted.

Unless "inode index" feature is enabled, if a file with multiple hard
links is copied up, then this will "break" the link. Changes will not be
@@ -302,6 +328,7 @@ propagated to other names referring to the same inode.
Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
directory will fail with EXDEV.


Changes to underlying filesystems
---------------------------------

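The documentation change above says that, without "xino", two non-directory objects in one overlay mount may report different st_dev values when the layers are on different filesystems. A minimal user-space sketch for observing this with stat(2) follows. The helper names `same_st_dev` and `print_dev_ino` are illustrative and not part of this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if both paths report the same st_dev, 0 if not, -1 on error. */
int same_st_dev(const char *a, const char *b)
{
	struct stat sa, sb;

	assert(a != NULL && b != NULL);
	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return -1;
	return sa.st_dev == sb.st_dev;
}

/* Print the (st_dev, st_ino) pair that stat(2) reports for a path. */
int print_dev_ino(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) != 0) {
		perror(path);
		return -1;
	}
	printf("%s: st_dev=%ju st_ino=%ju\n", path,
	       (uintmax_t)st.st_dev, (uintmax_t)st.st_ino);
	return 0;
}
```

Calling `same_st_dev()` on an upper-layer file and a lower-layer file inside the same overlay mount is one way to check which behavior a given mount exhibits. The result depends on the kernel version, the layer layout, and the mount options.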
9 changes: 9 additions & 0 deletions fs/exportfs/expfs.c
@@ -435,6 +435,15 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
if (IS_ERR_OR_NULL(result))
return ERR_PTR(-ESTALE);

/*
* If no acceptance criteria was specified by caller, a disconnected
* dentry is also acceptable. Callers may use this mode to query if
* file handle is stale or to get a reference to an inode without
* risking the high overhead caused by directory reconnect.
*/
if (!acceptable)
return result;

if (d_is_dir(result)) {
/*
* This request is for a directory.
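The early return added to exportfs_decode_fh() treats a NULL acceptance callback as "any decoded dentry will do, even a disconnected one". The user-space model below sketches only that control flow. `struct fake_dentry`, `accept_fn`, `decode_model` and `reject_disconnected` are hypothetical stand-ins, not kernel types or functions, and the reconnect and alias-search logic of the real function is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct fake_dentry {
	int is_dir;
	int disconnected;
};

/* Hypothetical acceptance callback: nonzero means "accept this dentry". */
typedef int (*accept_fn)(void *ctx, struct fake_dentry *d);

/*
 * Simplified model of the decode path. Returns the decoded object, or
 * NULL where the real code would return an error.
 */
struct fake_dentry *decode_model(struct fake_dentry *result,
				 accept_fn acceptable, void *ctx)
{
	if (result == NULL)
		return NULL;	/* models ERR_PTR(-ESTALE) */

	/* No acceptance criteria: return as-is, skipping any reconnect. */
	if (acceptable == NULL)
		return result;

	/* Model only: the real code reconnects directories first. */
	return acceptable(ctx, result) ? result : NULL;
}

/* Example policy: refuse dentries that are still disconnected. */
int reject_disconnected(void *ctx, struct fake_dentry *d)
{
	(void)ctx;
	return !d->disconnected;
}
```

In this model a disconnected directory is returned unchanged when no callback is given, which corresponds to the cheap "is this handle stale?" query described in the new comment.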
17 changes: 17 additions & 0 deletions fs/overlayfs/Kconfig
@@ -86,3 +86,20 @@ config OVERLAY_FS_NFS_EXPORT
case basis with the "nfs_export=on" mount option.

Say N unless you fully understand the consequences.

config OVERLAY_FS_XINO_AUTO
bool "Overlayfs: auto enable inode number mapping"
default n
depends on OVERLAY_FS
help
If this config option is enabled then overlay filesystems will use
unused high bits in underlying filesystem inode numbers to map all
inodes to a unified address space. The mapped 64bit inode numbers
might not be compatible with applications that expect 32bit inodes.

If compatibility with applications that expect 32bit inodes is not an
issue, then it is safe and recommended to say Y here.

For more information, see Documentation/filesystems/overlayfs.txt

If unsure, say N.
6 changes: 3 additions & 3 deletions fs/overlayfs/copy_up.c
@@ -232,7 +232,7 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
return err;
}

struct ovl_fh *ovl_encode_fh(struct dentry *real, bool is_upper)
struct ovl_fh *ovl_encode_real_fh(struct dentry *real, bool is_upper)
{
struct ovl_fh *fh;
int fh_type, fh_len, dwords;
@@ -300,7 +300,7 @@ int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
* up and a pure upper inode.
*/
if (ovl_can_decode_fh(lower->d_sb)) {
fh = ovl_encode_fh(lower, false);
fh = ovl_encode_real_fh(lower, false);
if (IS_ERR(fh))
return PTR_ERR(fh);
}
@@ -321,7 +321,7 @@ static int ovl_set_upper_fh(struct dentry *upper, struct dentry *index)
const struct ovl_fh *fh;
int err;

fh = ovl_encode_fh(upper, true);
fh = ovl_encode_real_fh(upper, true);
if (IS_ERR(fh))
return PTR_ERR(fh);

75 changes: 40 additions & 35 deletions fs/overlayfs/export.c
@@ -228,8 +228,8 @@ static int ovl_d_to_fh(struct dentry *dentry, char *buf, int buflen)
goto fail;

/* Encode an upper or lower file handle */
fh = ovl_encode_fh(enc_lower ? ovl_dentry_lower(dentry) :
ovl_dentry_upper(dentry), !enc_lower);
fh = ovl_encode_real_fh(enc_lower ? ovl_dentry_lower(dentry) :
ovl_dentry_upper(dentry), !enc_lower);
err = PTR_ERR(fh);
if (IS_ERR(fh))
goto fail;
@@ -267,8 +267,8 @@ static int ovl_dentry_to_fh(struct dentry *dentry, u32 *fid, int *max_len)
return OVL_FILEID;
}

static int ovl_encode_inode_fh(struct inode *inode, u32 *fid, int *max_len,
struct inode *parent)
static int ovl_encode_fh(struct inode *inode, u32 *fid, int *max_len,
struct inode *parent)
{
struct dentry *dentry;
int type;
@@ -305,15 +305,12 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
if (d_is_dir(upper ?: lower))
return ERR_PTR(-EIO);

inode = ovl_get_inode(sb, dget(upper), lower, index, !!lower);
inode = ovl_get_inode(sb, dget(upper), lowerpath, index, !!lower);
if (IS_ERR(inode)) {
dput(upper);
return ERR_CAST(inode);
}

if (index)
ovl_set_flag(OVL_INDEX, inode);

dentry = d_find_any_alias(inode);
if (!dentry) {
dentry = d_alloc_anon(inode->i_sb);
@@ -685,7 +682,7 @@ static struct dentry *ovl_upper_fh_to_d(struct super_block *sb,
if (!ofs->upper_mnt)
return ERR_PTR(-EACCES);

upper = ovl_decode_fh(fh, ofs->upper_mnt);
upper = ovl_decode_real_fh(fh, ofs->upper_mnt, true);
if (IS_ERR_OR_NULL(upper))
return upper;

@@ -703,25 +700,39 @@ static struct dentry *ovl_lower_fh_to_d(struct super_block *sb,
struct ovl_path *stack = &origin;
struct dentry *dentry = NULL;
struct dentry *index = NULL;
struct inode *inode = NULL;
bool is_deleted = false;
struct inode *inode;
int err;

/* First lookup indexed upper by fh */
/* First lookup overlay inode in inode cache by origin fh */
err = ovl_check_origin_fh(ofs, fh, false, NULL, &stack);
if (err)
return ERR_PTR(err);

if (!d_is_dir(origin.dentry) ||
!(origin.dentry->d_flags & DCACHE_DISCONNECTED)) {
inode = ovl_lookup_inode(sb, origin.dentry, false);
err = PTR_ERR(inode);
if (IS_ERR(inode))
goto out_err;
if (inode) {
dentry = d_find_any_alias(inode);
iput(inode);
if (dentry)
goto out;
}
}

/* Then lookup indexed upper/whiteout by origin fh */
if (ofs->indexdir) {
index = ovl_get_index_fh(ofs, fh);
err = PTR_ERR(index);
if (IS_ERR(index)) {
if (err != -ESTALE)
return ERR_PTR(err);

/* Found a whiteout index - treat as deleted inode */
is_deleted = true;
index = NULL;
goto out_err;
}
}

/* Then try to get upper dir by index */
/* Then try to get a connected upper dir by index */
if (index && d_is_dir(index)) {
struct dentry *upper = ovl_index_upper(ofs, index);

@@ -734,32 +745,26 @@ static struct dentry *ovl_lower_fh_to_d(struct super_block *sb,
goto out;
}

/* Then lookup origin by fh */
err = ovl_check_origin_fh(ofs, fh, NULL, &stack);
if (err) {
goto out_err;
} else if (index) {
err = ovl_verify_origin(index, origin.dentry, false);
/* Otherwise, get a connected non-upper dir or disconnected non-dir */
if (d_is_dir(origin.dentry) &&
(origin.dentry->d_flags & DCACHE_DISCONNECTED)) {
dput(origin.dentry);
origin.dentry = NULL;
err = ovl_check_origin_fh(ofs, fh, true, NULL, &stack);
if (err)
goto out_err;
} else if (is_deleted) {
/* Lookup deleted non-dir by origin inode */
if (!d_is_dir(origin.dentry))
inode = ovl_lookup_inode(sb, origin.dentry, false);
err = -ESTALE;
if (!inode || atomic_read(&inode->i_count) == 1)
}
if (index) {
err = ovl_verify_origin(index, origin.dentry, false);
if (err)
goto out_err;

/* Deleted but still open? */
index = dget(ovl_i_dentry_upper(inode));
}

dentry = ovl_get_dentry(sb, NULL, &origin, index);

out:
dput(origin.dentry);
dput(index);
iput(inode);
return dentry;

out_err:
@@ -829,7 +834,7 @@ static struct dentry *ovl_get_parent(struct dentry *dentry)
}

const struct export_operations ovl_export_operations = {
.encode_fh = ovl_encode_inode_fh,
.encode_fh = ovl_encode_fh,
.fh_to_dentry = ovl_fh_to_dentry,
.fh_to_parent = ovl_fh_to_parent,
.get_name = ovl_get_name,