Skip to content

Commit

Permalink
Merge branch 'cls-hw-offload-rtnl'
Browse files Browse the repository at this point in the history
Vlad Buslov says:

====================
Refactor cls hardware offload API to support rtnl-independent drivers

Currently, all cls API hardware offloads driver callbacks require caller
to hold rtnl lock when calling them. This patch set introduces new API
that allows drivers to register callbacks that are not dependent on rtnl
lock and unlocked classifiers to offload filters without obtaining rtnl
lock first, which is intended to allow offloading tc rules in parallel.

Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are
already registered with this flag and only take rtnl lock when qdisc or
classifier requires it. Classifiers can indicate that their ops
callbacks don't require caller to hold rtnl lock by setting the
TCF_PROTO_OPS_DOIT_UNLOCKED flag. Unlocked implementation of flower
classifier is now upstreamed. However, this implementation still obtains
rtnl lock before calling hardware offloads API.

Implement following cls API changes:

- Introduce new "unlocked_driver_cb" flag to struct flow_block_offload
  to allow registering and unregistering block hardware offload
  callbacks that do not require caller to hold rtnl lock. Drivers that
  doesn't require users of its tc offload callbacks to hold rtnl lock
  sets the flag to true on block bind/unbind. Internally tcf_block is
  extended with additional lockeddevcnt counter that is used to count
  number of devices that require rtnl lock that block is bound to. When
  this counter is zero, tc_setup_cb_*() functions execute callbacks
  without obtaining rtnl lock.

- Extend cls API single hardware rule update tc_setup_cb_call() function
  with tc_setup_cb_add(), tc_setup_cb_replace(), tc_setup_cb_destroy()
  and tc_setup_cb_reoffload() functions. These new APIs are needed to
  move management of block offload counter, filter in hardware counter
  and flag from classifier implementations to cls API, which is now
  responsible for managing them in concurrency-safe manner. Access to
  cb_list from callback execution code is synchronized by obtaining new
  'cb_lock' rw_semaphore in read mode, which allows executing callbacks
  in parallel, but excludes any modifications of data from
  register/unregister code. tcf_block offloads counter type is changed
  to atomic integer to allow updating the counter concurrently.

- Extend classifier ops with new ops->hw_add() and ops->hw_del()
  callbacks which are used to notify unlocked classifiers when filter is
  successfully added or deleted to hardware without releasing cb_lock.
  This is necessary to update classifier state atomically with callback
  list traversal and updating of all relevant counters and allows
  unlocked classifiers to synchronize with concurrent reoffload without
  requiring any changes to driver callback API implementations.

New tc flow_action infrastructure is also modified to allow its user to
execute without rtnl lock protection. Function tc_setup_flow_action() is
modified to conditionally obtain rtnl lock before accessing action
state. Action data that is accessed by reference is either copied or
reference counted to prevent concurrent action overwrite from
deallocating it. New function tc_cleanup_flow_action() is introduced to
cleanup/release all such data obtained by tc_setup_flow_action().

Flower classifier (only unlocked classifier at the moment) is modified
to use new cls hardware offloads API and no longer obtains rtnl lock
before calling it.
====================

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Aug 26, 2019
2 parents 0846e16 + 918190f commit 72991b5
Show file tree
Hide file tree
Showing 11 changed files with 478 additions and 172 deletions.
2 changes: 2 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -3470,10 +3470,12 @@ static int mlx5e_setup_tc(struct net_device *dev, enum tc_setup_type type,
void *type_data)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct flow_block_offload *f = type_data;

switch (type) {
#ifdef CONFIG_MLX5_ESWITCH
case TC_SETUP_BLOCK:
f->unlocked_driver_cb = true;
return flow_block_cb_setup_simple(type_data,
&mlx5e_block_cb_list,
mlx5e_setup_tc_block_cb,
Expand Down
3 changes: 3 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
Original file line number Diff line number Diff line change
Expand Up @@ -763,6 +763,7 @@ mlx5e_rep_indr_setup_tc_block(struct net_device *netdev,
if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
return -EOPNOTSUPP;

f->unlocked_driver_cb = true;
f->driver_block_list = &mlx5e_block_cb_list;

switch (f->command) {
Expand Down Expand Up @@ -1245,9 +1246,11 @@ static int mlx5e_rep_setup_tc(struct net_device *dev, enum tc_setup_type type,
void *type_data)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct flow_block_offload *f = type_data;

switch (type) {
case TC_SETUP_BLOCK:
f->unlocked_driver_cb = true;
return flow_block_cb_setup_simple(type_data,
&mlx5e_rep_block_cb_list,
mlx5e_rep_setup_tc_cb,
Expand Down
1 change: 1 addition & 0 deletions include/net/flow_offload.h
Original file line number Diff line number Diff line change
Expand Up @@ -284,6 +284,7 @@ struct flow_block_offload {
enum flow_block_command command;
enum flow_block_binder_type binder_type;
bool block_shared;
bool unlocked_driver_cb;
struct net *net;
struct flow_block *block;
struct list_head cb_list;
Expand Down
21 changes: 19 additions & 2 deletions include/net/pkt_cls.h
Original file line number Diff line number Diff line change
Expand Up @@ -504,9 +504,26 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
}

int tc_setup_flow_action(struct flow_action *flow_action,
const struct tcf_exts *exts);
const struct tcf_exts *exts, bool rtnl_held);
void tc_cleanup_flow_action(struct flow_action *flow_action);

int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
void *type_data, bool err_stop);
void *type_data, bool err_stop, bool rtnl_held);
int tc_setup_cb_add(struct tcf_block *block, struct tcf_proto *tp,
enum tc_setup_type type, void *type_data, bool err_stop,
u32 *flags, unsigned int *in_hw_count, bool rtnl_held);
int tc_setup_cb_replace(struct tcf_block *block, struct tcf_proto *tp,
enum tc_setup_type type, void *type_data, bool err_stop,
u32 *old_flags, unsigned int *old_in_hw_count,
u32 *new_flags, unsigned int *new_in_hw_count,
bool rtnl_held);
int tc_setup_cb_destroy(struct tcf_block *block, struct tcf_proto *tp,
enum tc_setup_type type, void *type_data, bool err_stop,
u32 *flags, unsigned int *in_hw_count, bool rtnl_held);
int tc_setup_cb_reoffload(struct tcf_block *block, struct tcf_proto *tp,
bool add, flow_setup_cb_t *cb,
enum tc_setup_type type, void *type_data,
void *cb_priv, u32 *flags, unsigned int *in_hw_count);
unsigned int tcf_exts_num_actions(struct tcf_exts *exts);

struct tc_cls_u32_knode {
Expand Down
41 changes: 9 additions & 32 deletions include/net/sch_generic.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@
#include <linux/refcount.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
#include <linux/atomic.h>
#include <net/gen_stats.h>
#include <net/rtnetlink.h>
#include <net/flow_offload.h>
Expand Down Expand Up @@ -310,6 +312,10 @@ struct tcf_proto_ops {
int (*reoffload)(struct tcf_proto *tp, bool add,
flow_setup_cb_t *cb, void *cb_priv,
struct netlink_ext_ack *extack);
void (*hw_add)(struct tcf_proto *tp,
void *type_data);
void (*hw_del)(struct tcf_proto *tp,
void *type_data);
void (*bind_class)(void *, u32, unsigned long);
void * (*tmplt_create)(struct net *net,
struct tcf_chain *chain,
Expand Down Expand Up @@ -396,11 +402,13 @@ struct tcf_block {
refcount_t refcnt;
struct net *net;
struct Qdisc *q;
struct rw_semaphore cb_lock; /* protects cb_list and offload counters */
struct flow_block flow_block;
struct list_head owner_list;
bool keep_dst;
unsigned int offloadcnt; /* Number of oddloaded filters */
atomic_t offloadcnt; /* Number of oddloaded filters */
unsigned int nooffloaddevcnt; /* Number of devs unable to do offload */
unsigned int lockeddevcnt; /* Number of devs that require rtnl lock. */
struct {
struct tcf_chain *chain;
struct list_head filter_chain_list;
Expand Down Expand Up @@ -436,37 +444,6 @@ static inline bool lockdep_tcf_proto_is_locked(struct tcf_proto *tp)
#define tcf_proto_dereference(p, tp) \
rcu_dereference_protected(p, lockdep_tcf_proto_is_locked(tp))

static inline void tcf_block_offload_inc(struct tcf_block *block, u32 *flags)
{
if (*flags & TCA_CLS_FLAGS_IN_HW)
return;
*flags |= TCA_CLS_FLAGS_IN_HW;
block->offloadcnt++;
}

static inline void tcf_block_offload_dec(struct tcf_block *block, u32 *flags)
{
if (!(*flags & TCA_CLS_FLAGS_IN_HW))
return;
*flags &= ~TCA_CLS_FLAGS_IN_HW;
block->offloadcnt--;
}

static inline void
tc_cls_offload_cnt_update(struct tcf_block *block, u32 *cnt,
u32 *flags, bool add)
{
if (add) {
if (!*cnt)
tcf_block_offload_inc(block, flags);
(*cnt)++;
} else {
(*cnt)--;
if (!*cnt)
tcf_block_offload_dec(block, flags);
}
}

static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
{
struct qdisc_skb_cb *qcb;
Expand Down
17 changes: 17 additions & 0 deletions include/net/tc_act/tc_tunnel_key.h
Original file line number Diff line number Diff line change
Expand Up @@ -59,4 +59,21 @@ static inline struct ip_tunnel_info *tcf_tunnel_info(const struct tc_action *a)
return NULL;
#endif
}

static inline struct ip_tunnel_info *
tcf_tunnel_info_copy(const struct tc_action *a)
{
#ifdef CONFIG_NET_CLS_ACT
struct ip_tunnel_info *tun = tcf_tunnel_info(a);

if (tun) {
size_t tun_size = sizeof(*tun) + tun->options_len;
struct ip_tunnel_info *tun_copy = kmemdup(tun, tun_size,
GFP_KERNEL);

return tun_copy;
}
#endif
return NULL;
}
#endif /* __NET_TC_TUNNEL_KEY_H */
Loading

0 comments on commit 72991b5

Please sign in to comment.