Skip to content

Commit

Permalink
Merge branch 'rtnetlink-Add-support-for-rigid-checking-of-data-in-dum…
Browse files Browse the repository at this point in the history
…p-request'

David Ahern says:

====================
rtnetlink: Add support for rigid checking of data in dump request

There are many use cases where a user wants to influence what is
returned in a dump for some rtnetlink command: one is wanting data
for a different namespace than the one the request is received and
another is limiting the amount of data returned in the dump to a
specific set of interest to userspace, reducing the cpu overhead of
both kernel and userspace. Unfortunately, the kernel has historically
not been strict with checking for the proper header or checking the
values passed in the header. This lenient implementation has allowed
iproute2 and other packages to pass any struct or data in the dump
request as long as the family is the first byte. For example, ifinfomsg
struct is used by iproute2 for all generic dump requests - links,
addresses, routes and rules when it is really only valid for link
requests.

There is 1 is example where the kernel deals with the wrong struct: link
dumps after VF support was added. Older iproute2 was sending rtgenmsg as
the header instead of ifinfomsg so a patch was added to try and detect
old userspace vs new:
e5eca6d ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")

The latest example is Christian's patch set wanting to return addresses for
a target namespace. It guesses the header struct is an ifaddrmsg and if it
guesses wrong a netlink warning is generated in the kernel log on every
address dump which is unacceptable.

Another example where the kernel is a bit lenient is route dumps: iproute2
can send either a request with either ifinfomsg or a rtmsg as the header
struct, yet the kernel always treats the header as an rtmsg (see
inet_dump_fib and rtm_flags check). The header inconsistency impacts the
ability to add kernel side filters for route dumps - a necessary feature
for scale setups with 100k+ routes.

How to resolve the problem of not breaking old userspace yet be able to
move forward with new features such as kernel side filtering which are
crucial for efficient operation at high scale?

This patch set addresses the problem by adding a new socket flag,
NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
request strict checking of headers and attributes on dump requests and
hence unlock the ability to use kernel side filters as they are added.

Kernel side, the dump handlers are updated to verify the message contains
at least the expected header struct:
    RTM_GETLINK:       ifinfomsg
    RTM_GETADDR:       ifaddrmsg
    RTM_GETMULTICAST:  ifaddrmsg
    RTM_GETANYCAST:    ifaddrmsg
    RTM_GETADDRLABEL:  ifaddrlblmsg
    RTM_GETROUTE:      rtmsg
    RTM_GETSTATS:      if_stats_msg
    RTM_GETNEIGH:      ndmsg
    RTM_GETNEIGHTBL:   ndtmsg
    RTM_GETNSID:       rtgenmsg
    RTM_GETRULE:       fib_rule_hdr
    RTM_GETNETCONF:    netconfmsg
    RTM_GETMDB:        br_port_msg

And then every field in the header struct should be 0 with the exception
of the family. There are a few exceptions to this rule where the kernel
already influences the data returned by values in the struct. Next the
message should not contain attributes unless the kernel implements
filtering for it. Any unexpected data causes the dump to fail with EINVAL.
If the new flag is honored by the kernel and the dump contents adjusted
by any data passed in the request, the dump handler can set the
NLM_F_DUMP_FILTERED flag in the netlink message header.

For old userspace on new kernel there is no impact as all checks are
wrapped in a check on the new strict flag. For new userspace on old
kernel, the data in the headers and any appended attributes are
silently ignored though the setsockopt failing is the clue to userspace
the feature is not supported. New userspace on new kernel gets the
requested data dump.

iproute2 patches can be found here:
    https://github.com/dsahern/iproute2 dump-enhancements

Major changes since v1
- inner header is supposed to be 4-bytes aligned. So for dumps that
  should not have attributes appended changed the check to use:
        if (nlmsg_attrlen(nlh, sizeof(hdr)))
  Only impacts patches with headers that are not multiples of 4-bytes
  (rtgenmsg, netconfmsg), but applied the change to all patches not
  calling nlmsg_parse for consistency.

- Added nlmsg_parse_strict and nla_parse_strict for tighter control on
  attribute parsing. There should be no unknown attribute types or extra
  bytes.

- Moved validation to a helper in most cases

Changes since rfc-v2
- dropped the NLM_F_DUMP_FILTERED flag from target nsid dumps per
  Jiri's objections
- changed the opt-in uapi from a netlink message flag to a socket
  flag. setsockopt provides an api for userspace to definitively
  know if the kernel supports strict checking on dumps.
- re-ordered patches to peel off the extack on dumps if needed to
  keep this set size within limits
- misc cleanups in patches based on testing
====================

Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Oct 8, 2018
2 parents 272a661 + 8c6e137 commit cd7f7df
Show file tree
Hide file tree
Showing 27 changed files with 908 additions and 165 deletions.
2 changes: 2 additions & 0 deletions include/linux/netlink.h
Original file line number Diff line number Diff line change
Expand Up @@ -176,8 +176,10 @@ struct netlink_callback {
void *data;
/* the module that dump function belong to */
struct module *module;
struct netlink_ext_ack *extack;
u16 family;
u16 min_dump_alloc;
bool strict_check;
unsigned int prev_seq, seq;
long args[6];
};
Expand Down
2 changes: 2 additions & 0 deletions include/net/ip_fib.h
Original file line number Diff line number Diff line change
Expand Up @@ -452,4 +452,6 @@ static inline void fib_proc_exit(struct net *net)

u32 ip_mtu_from_fib_result(struct fib_result *res, __be32 daddr);

int ip_valid_fib_dump_req(const struct nlmsghdr *nlh,
struct netlink_ext_ack *extack);
#endif /* _NET_FIB_H */
21 changes: 20 additions & 1 deletion include/net/netlink.h
Original file line number Diff line number Diff line change
Expand Up @@ -373,6 +373,9 @@ int nla_validate(const struct nlattr *head, int len, int maxtype,
int nla_parse(struct nlattr **tb, int maxtype, const struct nlattr *head,
int len, const struct nla_policy *policy,
struct netlink_ext_ack *extack);
int nla_parse_strict(struct nlattr **tb, int maxtype, const struct nlattr *head,
int len, const struct nla_policy *policy,
struct netlink_ext_ack *extack);
int nla_policy_len(const struct nla_policy *, int);
struct nlattr *nla_find(const struct nlattr *head, int len, int attrtype);
size_t nla_strlcpy(char *dst, const struct nlattr *nla, size_t dstsize);
Expand Down Expand Up @@ -516,13 +519,29 @@ static inline int nlmsg_parse(const struct nlmsghdr *nlh, int hdrlen,
const struct nla_policy *policy,
struct netlink_ext_ack *extack)
{
if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen))
if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen)) {
NL_SET_ERR_MSG(extack, "Invalid header length");
return -EINVAL;
}

return nla_parse(tb, maxtype, nlmsg_attrdata(nlh, hdrlen),
nlmsg_attrlen(nlh, hdrlen), policy, extack);
}

static inline int nlmsg_parse_strict(const struct nlmsghdr *nlh, int hdrlen,
struct nlattr *tb[], int maxtype,
const struct nla_policy *policy,
struct netlink_ext_ack *extack)
{
if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen)) {
NL_SET_ERR_MSG(extack, "Invalid header length");
return -EINVAL;
}

return nla_parse_strict(tb, maxtype, nlmsg_attrdata(nlh, hdrlen),
nlmsg_attrlen(nlh, hdrlen), policy, extack);
}

/**
* nlmsg_find_attr - find a specific attribute in a netlink message
* @nlh: netlink message header
Expand Down
1 change: 1 addition & 0 deletions include/uapi/linux/netlink.h
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,7 @@ enum nlmsgerr_attrs {
#define NETLINK_LIST_MEMBERSHIPS 9
#define NETLINK_CAP_ACK 10
#define NETLINK_EXT_ACK 11
#define NETLINK_DUMP_STRICT_CHK 12

struct nl_pktinfo {
__u32 group;
Expand Down
48 changes: 36 additions & 12 deletions lib/nlattr.c
Original file line number Diff line number Diff line change
Expand Up @@ -391,9 +391,10 @@ EXPORT_SYMBOL(nla_policy_len);
*
* Returns 0 on success or a negative error code.
*/
int nla_parse(struct nlattr **tb, int maxtype, const struct nlattr *head,
int len, const struct nla_policy *policy,
struct netlink_ext_ack *extack)
static int __nla_parse(struct nlattr **tb, int maxtype,
const struct nlattr *head, int len,
bool strict, const struct nla_policy *policy,
struct netlink_ext_ack *extack)
{
const struct nlattr *nla;
int rem;
Expand All @@ -403,27 +404,50 @@ int nla_parse(struct nlattr **tb, int maxtype, const struct nlattr *head,
nla_for_each_attr(nla, head, len, rem) {
u16 type = nla_type(nla);

if (type > 0 && type <= maxtype) {
if (policy) {
int err = validate_nla(nla, maxtype, policy,
extack);

if (err < 0)
return err;
if (type == 0 || type > maxtype) {
if (strict) {
NL_SET_ERR_MSG(extack, "Unknown attribute type");
return -EINVAL;
}
continue;
}
if (policy) {
int err = validate_nla(nla, maxtype, policy, extack);

tb[type] = (struct nlattr *)nla;
if (err < 0)
return err;
}

tb[type] = (struct nlattr *)nla;
}

if (unlikely(rem > 0))
if (unlikely(rem > 0)) {
pr_warn_ratelimited("netlink: %d bytes leftover after parsing attributes in process `%s'.\n",
rem, current->comm);
NL_SET_ERR_MSG(extack, "bytes leftover after parsing attributes");
if (strict)
return -EINVAL;
}

return 0;
}

int nla_parse(struct nlattr **tb, int maxtype, const struct nlattr *head,
int len, const struct nla_policy *policy,
struct netlink_ext_ack *extack)
{
return __nla_parse(tb, maxtype, head, len, false, policy, extack);
}
EXPORT_SYMBOL(nla_parse);

int nla_parse_strict(struct nlattr **tb, int maxtype, const struct nlattr *head,
int len, const struct nla_policy *policy,
struct netlink_ext_ack *extack)
{
return __nla_parse(tb, maxtype, head, len, true, policy, extack);
}
EXPORT_SYMBOL(nla_parse_strict);

/**
* nla_find - Find a specific attribute in a stream of attributes
* @head: head of attribute stream
Expand Down
30 changes: 30 additions & 0 deletions net/bridge/br_mdb.c
Original file line number Diff line number Diff line change
Expand Up @@ -162,13 +162,43 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
return err;
}

static int br_mdb_valid_dump_req(const struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct br_port_msg *bpm;

if (nlh->nlmsg_len < nlmsg_msg_size(sizeof(*bpm))) {
NL_SET_ERR_MSG_MOD(extack, "Invalid header for mdb dump request");
return -EINVAL;
}

bpm = nlmsg_data(nlh);
if (bpm->ifindex) {
NL_SET_ERR_MSG_MOD(extack, "Filtering by device index is not supported for mdb dump request");
return -EINVAL;
}
if (nlmsg_attrlen(nlh, sizeof(*bpm))) {
NL_SET_ERR_MSG(extack, "Invalid data after header in mdb dump request");
return -EINVAL;
}

return 0;
}

static int br_mdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
{
struct net_device *dev;
struct net *net = sock_net(skb->sk);
struct nlmsghdr *nlh = NULL;
int idx = 0, s_idx;

if (cb->strict_check) {
int err = br_mdb_valid_dump_req(cb->nlh, cb->extack);

if (err < 0)
return err;
}

s_idx = cb->args[0];

rcu_read_lock();
Expand Down
2 changes: 1 addition & 1 deletion net/core/devlink.c
Original file line number Diff line number Diff line change
Expand Up @@ -3504,7 +3504,7 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
start_offset = *((u64 *)&cb->args[0]);

err = nlmsg_parse(cb->nlh, GENL_HDRLEN + devlink_nl_family.hdrsize,
attrs, DEVLINK_ATTR_MAX, ops->policy, NULL);
attrs, DEVLINK_ATTR_MAX, ops->policy, cb->extack);
if (err)
goto out;

Expand Down
36 changes: 35 additions & 1 deletion net/core/fib_rules.c
Original file line number Diff line number Diff line change
Expand Up @@ -1063,13 +1063,47 @@ static int dump_rules(struct sk_buff *skb, struct netlink_callback *cb,
return err;
}

static int fib_valid_dumprule_req(const struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct fib_rule_hdr *frh;

if (nlh->nlmsg_len < nlmsg_msg_size(sizeof(*frh))) {
NL_SET_ERR_MSG(extack, "Invalid header for fib rule dump request");
return -EINVAL;
}

frh = nlmsg_data(nlh);
if (frh->dst_len || frh->src_len || frh->tos || frh->table ||
frh->res1 || frh->res2 || frh->action || frh->flags) {
NL_SET_ERR_MSG(extack,
"Invalid values in header for fib rule dump request");
return -EINVAL;
}

if (nlmsg_attrlen(nlh, sizeof(*frh))) {
NL_SET_ERR_MSG(extack, "Invalid data after header in fib rule dump request");
return -EINVAL;
}

return 0;
}

static int fib_nl_dumprule(struct sk_buff *skb, struct netlink_callback *cb)
{
const struct nlmsghdr *nlh = cb->nlh;
struct net *net = sock_net(skb->sk);
struct fib_rules_ops *ops;
int idx = 0, family;

family = rtnl_msg_family(cb->nlh);
if (cb->strict_check) {
int err = fib_valid_dumprule_req(nlh, cb->extack);

if (err < 0)
return err;
}

family = rtnl_msg_family(nlh);
if (family != AF_UNSPEC) {
/* Protocol specific dump request */
ops = lookup_rules_ops(net, family);
Expand Down
Loading

0 comments on commit cd7f7df

Please sign in to comment.