Skip to content

Commit

Permalink
Merge branch 'net-fix-sysfs-permssions-when-device-changes-network'
Browse files Browse the repository at this point in the history
Christian Brauner says:

====================
net: fix sysfs permssions when device changes network

/* v7 */
This is v7 with a build warning fixup that slipped past me the last
time. It removes to unused variables in sysfs_group_change_owner(). I
observed no warning when building just now.

/* v6 */
This is v6 with two small fixups. I missed adapting the commit message
to reflect the renamed helper for changing the owner of sysfs files and
I also forgot to make the new dpm helper static inline.

/* v5 */
This is v5 with a small fixup requested by Rafael.

/* v4 */
This is v4 with more documentation and other fixes that Greg requested.

/* v3 */
This is v3 with explicit uid and gid parameters added to functions that
change sysfs object ownership as Greg requested.

(I've tagged this with net-next since it's triggered by a bug for
 network device files but it also touches driver core aspects so it's
 not clear-cut. I can of course split this series into separate
 patchsets.)
We have been struggling with a bug surrounding the ownership of network
device sysfs files when moving network devices between network
namespaces owned by different user namespaces reported by multiple
users.

Currently, when moving network devices between network namespaces the
ownership of the corresponding sysfs entries is not changed. This leads
to problems when tools try to operate on the corresponding sysfs files.

I also causes a bug when creating a network device in a network
namespaces owned by a user namespace and moving that network device back
to the host network namespaces. Because when a network device is created
in a network namespaces it will be owned by the root user of the user
namespace and all its associated sysfs files will also be owned by the
root user of the corresponding user namespace.
If such a network device has to be moved back to the host network
namespace the permissions will still be set to the root user of the
owning user namespaces of the originating network namespace. This means
unprivileged users can e.g. re-trigger uevents for such incorrectly
owned devices on the host or in other network namespaces. They can also
modify the settings of the device itself through sysfs when they
wouldn't be able to do the same through netlink. Both of these things
are unwanted.

For example, quite a few workloads will create network devices in the
host network namespace. Other tools will then proceed to move such
devices between network namespaces owner by other user namespaces. While
the ownership of the device itself is updated in
net/core/net-sysfs.c:dev_change_net_namespace() the corresponding sysfs
entry for the device is not. Below you'll find that moving a network
device (here a veth device) from a network namespace into another
network namespaces owned by a different user namespace with a different
id mapping. As you can see the permissions are wrong even though it is
owned by the userns root user after it has been moved and can be
interacted with through netlink:

drwxr-xr-x 5 nobody nobody    0 Jan 25 18:08 .
drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_assign_type
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 addr_len
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 address
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 broadcast
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_changes
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_down_count
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 carrier_up_count
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_id
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dev_port
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 dormant
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 duplex
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 flags
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 gro_flush_timeout
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifalias
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 ifindex
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 iflink
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 link_mode
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 mtu
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 name_assign_type
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 netdev_group
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 operstate
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_id
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_port_name
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 phys_switch_id
drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 power
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 proto_down
drwxr-xr-x 4 nobody nobody    0 Jan 25 18:09 queues
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 speed
drwxr-xr-x 2 nobody nobody    0 Jan 25 18:09 statistics
lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:08 subsystem -> ../../../../class/net
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:09 tx_queue_len
-r--r--r-- 1 nobody nobody 4096 Jan 25 18:09 type
-rw-r--r-- 1 nobody nobody 4096 Jan 25 18:08 uevent

Constrast this with creating a device of the same type in the network
namespace directly. In this case the device's sysfs permissions will be
correctly updated.
(Please also note, that in a lot of workloads this strategy of creating
 the network device directly in the network device to workaround this
 issue can not be used. Either because the network device is dedicated
 after it has been created or because it used by a process that is
 heavily sandboxed and couldn't create network devices itself.):

drwxr-xr-x 5 root   root      0 Jan 25 18:12 .
drwxr-xr-x 9 nobody nobody    0 Jan 25 18:08 ..
-r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_assign_type
-r--r--r-- 1 root   root   4096 Jan 25 18:12 addr_len
-r--r--r-- 1 root   root   4096 Jan 25 18:12 address
-r--r--r-- 1 root   root   4096 Jan 25 18:12 broadcast
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 carrier
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_changes
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_down_count
-r--r--r-- 1 root   root   4096 Jan 25 18:12 carrier_up_count
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_id
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dev_port
-r--r--r-- 1 root   root   4096 Jan 25 18:12 dormant
-r--r--r-- 1 root   root   4096 Jan 25 18:12 duplex
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 flags
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 gro_flush_timeout
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 ifalias
-r--r--r-- 1 root   root   4096 Jan 25 18:12 ifindex
-r--r--r-- 1 root   root   4096 Jan 25 18:12 iflink
-r--r--r-- 1 root   root   4096 Jan 25 18:12 link_mode
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 mtu
-r--r--r-- 1 root   root   4096 Jan 25 18:12 name_assign_type
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 netdev_group
-r--r--r-- 1 root   root   4096 Jan 25 18:12 operstate
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_id
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_port_name
-r--r--r-- 1 root   root   4096 Jan 25 18:12 phys_switch_id
drwxr-xr-x 2 root   root      0 Jan 25 18:12 power
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 proto_down
drwxr-xr-x 4 root   root      0 Jan 25 18:12 queues
-r--r--r-- 1 root   root   4096 Jan 25 18:12 speed
drwxr-xr-x 2 root   root      0 Jan 25 18:12 statistics
lrwxrwxrwx 1 nobody nobody    0 Jan 25 18:12 subsystem -> ../../../../class/net
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 tx_queue_len
-r--r--r-- 1 root   root   4096 Jan 25 18:12 type
-rw-r--r-- 1 root   root   4096 Jan 25 18:12 uevent

Now, when creating a network device in a network namespace owned by a
user namespace and moving it to the host the permissions will be set to
the id that the user namespace root user has been mapped to on the host
leading to all sorts of permission issues mentioned above:

458752
drwxr-xr-x 5 458752 458752      0 Jan 25 18:12 .
drwxr-xr-x 9 root   root        0 Jan 25 18:08 ..
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_assign_type
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 addr_len
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 address
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 broadcast
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_changes
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_down_count
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 carrier_up_count
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_id
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dev_port
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 dormant
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 duplex
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 flags
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 gro_flush_timeout
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 ifalias
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 ifindex
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 iflink
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 link_mode
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 mtu
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 name_assign_type
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 netdev_group
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 operstate
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_id
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_port_name
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 phys_switch_id
drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 power
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 proto_down
drwxr-xr-x 4 458752 458752      0 Jan 25 18:12 queues
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 speed
drwxr-xr-x 2 458752 458752      0 Jan 25 18:12 statistics
lrwxrwxrwx 1 root   root        0 Jan 25 18:12 subsystem -> ../../../../class/net
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 tx_queue_len
-r--r--r-- 1 458752 458752   4096 Jan 25 18:12 type
-rw-r--r-- 1 458752 458752   4096 Jan 25 18:12 uevent

Fix this by changing the basic sysfs files associated with network
devices when moving them between network namespaces. To this end we add
some infrastructure to sysfs.

The patchset takes care to only do this when the owning user namespaces
changes and the kids differ. So there's only a performance overhead,
when the owning user namespace of the network namespace is different
__and__ the kid mappings for the root user are different for the two
user namespaces:
Assume we have a netdev eth0 which we create in netns1 owned by userns1.
userns1 has an id mapping of 0 100000 100000. Now we move eth0 into
netns2 which is owned by userns2 which also defines an id mapping of 0
100000 100000. In this case sysfs doesn't need updating. The patch will
handle this case and not do any needless work. Now assume eth0 is moved
into netns3 which is owned by userns3 which defines an id mapping of 0
123456 65536. In this case the root user in each namespace corresponds
to different kid and sysfs needs updating.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Feb 27, 2020
2 parents d1c73cb + ef6a4c8 commit ebb4a4b
Show file tree
Hide file tree
Showing 10 changed files with 630 additions and 2 deletions.
120 changes: 120 additions & 0 deletions drivers/base/core.c
Original file line number Diff line number Diff line change
Expand Up @@ -3458,6 +3458,126 @@ int device_move(struct device *dev, struct device *new_parent,
}
EXPORT_SYMBOL_GPL(device_move);

static int device_attrs_change_owner(struct device *dev, kuid_t kuid,
kgid_t kgid)
{
struct kobject *kobj = &dev->kobj;
struct class *class = dev->class;
const struct device_type *type = dev->type;
int error;

if (class) {
/*
* Change the device groups of the device class for @dev to
* @kuid/@kgid.
*/
error = sysfs_groups_change_owner(kobj, class->dev_groups, kuid,
kgid);
if (error)
return error;
}

if (type) {
/*
* Change the device groups of the device type for @dev to
* @kuid/@kgid.
*/
error = sysfs_groups_change_owner(kobj, type->groups, kuid,
kgid);
if (error)
return error;
}

/* Change the device groups of @dev to @kuid/@kgid. */
error = sysfs_groups_change_owner(kobj, dev->groups, kuid, kgid);
if (error)
return error;

if (device_supports_offline(dev) && !dev->offline_disabled) {
/* Change online device attributes of @dev to @kuid/@kgid. */
error = sysfs_file_change_owner(kobj, dev_attr_online.attr.name,
kuid, kgid);
if (error)
return error;
}

return 0;
}

/**
* device_change_owner - change the owner of an existing device.
* @dev: device.
* @kuid: new owner's kuid
* @kgid: new owner's kgid
*
* This changes the owner of @dev and its corresponding sysfs entries to
* @kuid/@kgid. This function closely mirrors how @dev was added via driver
* core.
*
* Returns 0 on success or error code on failure.
*/
int device_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
{
int error;
struct kobject *kobj = &dev->kobj;

dev = get_device(dev);
if (!dev)
return -EINVAL;

/*
* Change the kobject and the default attributes and groups of the
* ktype associated with it to @kuid/@kgid.
*/
error = sysfs_change_owner(kobj, kuid, kgid);
if (error)
goto out;

/*
* Change the uevent file for @dev to the new owner. The uevent file
* was created in a separate step when @dev got added and we mirror
* that step here.
*/
error = sysfs_file_change_owner(kobj, dev_attr_uevent.attr.name, kuid,
kgid);
if (error)
goto out;

/*
* Change the device groups, the device groups associated with the
* device class, and the groups associated with the device type of @dev
* to @kuid/@kgid.
*/
error = device_attrs_change_owner(dev, kuid, kgid);
if (error)
goto out;

error = dpm_sysfs_change_owner(dev, kuid, kgid);
if (error)
goto out;

#ifdef CONFIG_BLOCK
if (sysfs_deprecated && dev->class == &block_class)
goto out;
#endif

/*
* Change the owner of the symlink located in the class directory of
* the device class associated with @dev which points to the actual
* directory entry for @dev to @kuid/@kgid. This ensures that the
* symlink shows the same permissions as its target.
*/
error = sysfs_link_change_owner(&dev->class->p->subsys.kobj, &dev->kobj,
dev_name(dev), kuid, kgid);
if (error)
goto out;

out:
put_device(dev);
return error;
}
EXPORT_SYMBOL_GPL(device_change_owner);

/**
* device_shutdown - call ->shutdown() on each device to shutdown.
*/
Expand Down
3 changes: 3 additions & 0 deletions drivers/base/power/power.h
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,7 @@ extern int pm_qos_sysfs_add_flags(struct device *dev);
extern void pm_qos_sysfs_remove_flags(struct device *dev);
extern int pm_qos_sysfs_add_latency_tolerance(struct device *dev);
extern void pm_qos_sysfs_remove_latency_tolerance(struct device *dev);
extern int dpm_sysfs_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid);

#else /* CONFIG_PM */

Expand All @@ -88,6 +89,8 @@ static inline void pm_runtime_remove(struct device *dev) {}

static inline int dpm_sysfs_add(struct device *dev) { return 0; }
static inline void dpm_sysfs_remove(struct device *dev) {}
static inline int dpm_sysfs_change_owner(struct device *dev, kuid_t kuid,
kgid_t kgid) { return 0; }

#endif

Expand Down
55 changes: 54 additions & 1 deletion drivers/base/power/sysfs.c
Original file line number Diff line number Diff line change
Expand Up @@ -480,6 +480,14 @@ static ssize_t wakeup_last_time_ms_show(struct device *dev,
return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
}

static inline int dpm_sysfs_wakeup_change_owner(struct device *dev, kuid_t kuid,
kgid_t kgid)
{
if (dev->power.wakeup && dev->power.wakeup->dev)
return device_change_owner(dev->power.wakeup->dev, kuid, kgid);
return 0;
}

static DEVICE_ATTR_RO(wakeup_last_time_ms);

#ifdef CONFIG_PM_AUTOSLEEP
Expand All @@ -501,7 +509,13 @@ static ssize_t wakeup_prevent_sleep_time_ms_show(struct device *dev,

static DEVICE_ATTR_RO(wakeup_prevent_sleep_time_ms);
#endif /* CONFIG_PM_AUTOSLEEP */
#endif /* CONFIG_PM_SLEEP */
#else /* CONFIG_PM_SLEEP */
static inline int dpm_sysfs_wakeup_change_owner(struct device *dev, kuid_t kuid,
kgid_t kgid)
{
return 0;
}
#endif

#ifdef CONFIG_PM_ADVANCED_DEBUG
static ssize_t runtime_usage_show(struct device *dev,
Expand Down Expand Up @@ -684,6 +698,45 @@ int dpm_sysfs_add(struct device *dev)
return rc;
}

int dpm_sysfs_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid)
{
int rc;

if (device_pm_not_required(dev))
return 0;

rc = sysfs_group_change_owner(&dev->kobj, &pm_attr_group, kuid, kgid);
if (rc)
return rc;

if (pm_runtime_callbacks_present(dev)) {
rc = sysfs_group_change_owner(
&dev->kobj, &pm_runtime_attr_group, kuid, kgid);
if (rc)
return rc;
}

if (device_can_wakeup(dev)) {
rc = sysfs_group_change_owner(&dev->kobj, &pm_wakeup_attr_group,
kuid, kgid);
if (rc)
return rc;

rc = dpm_sysfs_wakeup_change_owner(dev, kuid, kgid);
if (rc)
return rc;
}

if (dev->power.set_latency_tolerance) {
rc = sysfs_group_change_owner(
&dev->kobj, &pm_qos_latency_tolerance_attr_group, kuid,
kgid);
if (rc)
return rc;
}
return 0;
}

int wakeup_sysfs_add(struct device *dev)
{
return sysfs_merge_group(&dev->kobj, &pm_wakeup_attr_group);
Expand Down
148 changes: 148 additions & 0 deletions fs/sysfs/file.c
Original file line number Diff line number Diff line change
Expand Up @@ -558,3 +558,151 @@ void sysfs_remove_bin_file(struct kobject *kobj,
kernfs_remove_by_name(kobj->sd, attr->attr.name);
}
EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);

static int internal_change_owner(struct kernfs_node *kn, kuid_t kuid,
kgid_t kgid)
{
struct iattr newattrs = {
.ia_valid = ATTR_UID | ATTR_GID,
.ia_uid = kuid,
.ia_gid = kgid,
};
return kernfs_setattr(kn, &newattrs);
}

/**
* sysfs_link_change_owner - change owner of a sysfs file.
* @kobj: object of the kernfs_node the symlink is located in.
* @targ: object of the kernfs_node the symlink points to.
* @name: name of the link.
* @kuid: new owner's kuid
* @kgid: new owner's kgid
*
* This function looks up the sysfs symlink entry @name under @kobj and changes
* the ownership to @kuid/@kgid. The symlink is looked up in the namespace of
* @targ.
*
* Returns 0 on success or error code on failure.
*/
int sysfs_link_change_owner(struct kobject *kobj, struct kobject *targ,
const char *name, kuid_t kuid, kgid_t kgid)
{
struct kernfs_node *kn = NULL;
int error;

if (!name || !kobj->state_in_sysfs || !targ->state_in_sysfs)
return -EINVAL;

error = -ENOENT;
kn = kernfs_find_and_get_ns(kobj->sd, name, targ->sd->ns);
if (!kn)
goto out;

error = -EINVAL;
if (kernfs_type(kn) != KERNFS_LINK)
goto out;
if (kn->symlink.target_kn->priv != targ)
goto out;

error = internal_change_owner(kn, kuid, kgid);

out:
kernfs_put(kn);
return error;
}

/**
* sysfs_file_change_owner - change owner of a sysfs file.
* @kobj: object.
* @name: name of the file to change.
* @kuid: new owner's kuid
* @kgid: new owner's kgid
*
* This function looks up the sysfs entry @name under @kobj and changes the
* ownership to @kuid/@kgid.
*
* Returns 0 on success or error code on failure.
*/
int sysfs_file_change_owner(struct kobject *kobj, const char *name, kuid_t kuid,
kgid_t kgid)
{
struct kernfs_node *kn;
int error;

if (!name)
return -EINVAL;

if (!kobj->state_in_sysfs)
return -EINVAL;

kn = kernfs_find_and_get(kobj->sd, name);
if (!kn)
return -ENOENT;

error = internal_change_owner(kn, kuid, kgid);

kernfs_put(kn);

return error;
}
EXPORT_SYMBOL_GPL(sysfs_file_change_owner);

/**
* sysfs_change_owner - change owner of the given object.
* @kobj: object.
* @kuid: new owner's kuid
* @kgid: new owner's kgid
*
* Change the owner of the default directory, files, groups, and attributes of
* @kobj to @kuid/@kgid. Note that sysfs_change_owner mirrors how the sysfs
* entries for a kobject are added by driver core. In summary,
* sysfs_change_owner() takes care of the default directory entry for @kobj,
* the default attributes associated with the ktype of @kobj and the default
* attributes associated with the ktype of @kobj.
* Additional properties not added by driver core have to be changed by the
* driver or subsystem which created them. This is similar to how
* driver/subsystem specific entries are removed.
*
* Returns 0 on success or error code on failure.
*/
int sysfs_change_owner(struct kobject *kobj, kuid_t kuid, kgid_t kgid)
{
int error;
const struct kobj_type *ktype;

if (!kobj->state_in_sysfs)
return -EINVAL;

/* Change the owner of the kobject itself. */
error = internal_change_owner(kobj->sd, kuid, kgid);
if (error)
return error;

ktype = get_ktype(kobj);
if (ktype) {
struct attribute **kattr;

/*
* Change owner of the default attributes associated with the
* ktype of @kobj.
*/
for (kattr = ktype->default_attrs; kattr && *kattr; kattr++) {
error = sysfs_file_change_owner(kobj, (*kattr)->name,
kuid, kgid);
if (error)
return error;
}

/*
* Change owner of the default groups associated with the
* ktype of @kobj.
*/
error = sysfs_groups_change_owner(kobj, ktype->default_groups,
kuid, kgid);
if (error)
return error;
}

return 0;
}
EXPORT_SYMBOL_GPL(sysfs_change_owner);
Loading

0 comments on commit ebb4a4b

Please sign in to comment.