Skip to content

Commit

Permalink
Merge tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/ke…
Browse files Browse the repository at this point in the history
…rnel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter updates for net-next:

1) Unbreak set size settings for rbtree set backend, intervals in
   rbtree are represented as two elements, this detailed is leaked
   to userspace leading to bogus ENOSPC from control plane.

2) Remove dead code in br_netfilter's br_nf_pre_routing_finish()
   due to never matching error when looking up for route,
   from Antoine Tenart.

3) Simplify check for device already in use in flowtable,
   from Phil Sutter.

4) Three patches to restore interface name field in struct nft_hook
   and use it, this is to prepare for wildcard interface support.
   From Phil Sutter.

5) Do not remove netdev basechain when last device is gone, this is
   for consistency with the flowtable behaviour. This allows for netdev
   basechains without devices. Another patch to simplify netdev event
   notifier after this update. Also from Phil.

6) Two patches to add missing spinlock when flowtable updates TCP
   state flags, from Florian Westphal.

7) Simplify __nf_ct_refresh_acct() by removing skbuff parameter,
   also from Florian.

8) Flowtable gc now extends ct timeout for offloaded flow. This
   is to address a possible race that leads to handing over flow
   to classic path with long ct timeouts.

9) Tear down flow if cached rt_mtu is stale, before this patch,
   packet is handed over to classic path but flow entry still remained
   in place.

10) Revisit the flowtable teardown strategy, which was originally
    designed to release flowtable hardware entries early. Add a new
    CLOSING flag that still allows hardware to release entries when
    fin/rst is seen, but keeps the flow entry in place when the
    TCP connection is closed. Release flow after timeout or when a new
    syn packet is seen for TCP reopen scenario.

* tag 'nf-next-25-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: flowtable: add CLOSING state
  netfilter: flowtable: teardown flow if cached mtu is stale
  netfilter: conntrack: rework offload nf_conn timeout extension logic
  netfilter: conntrack: remove skb argument from nf_ct_refresh
  netfilter: nft_flow_offload: update tcp state flags under lock
  netfilter: nft_flow_offload: clear tcp MAXACK flag before moving to slowpath
  netfilter: nf_tables: Simplify chain netdev notifier
  netfilter: nf_tables: Tolerate chains with no remaining hooks
  netfilter: nf_tables: Compare netdev hooks based on stored name
  netfilter: nf_tables: Use stored ifname in netdev hook dumps
  netfilter: nf_tables: Store user-defined hook ifname
  netfilter: nf_tables: Flowtable hook's pf value never varies
  netfilter: br_netfilter: remove unused conditional and dead code
  netfilter: nf_tables: fix set size with rbtree backend
====================

Link: https://patch.msgid.link/20250119172051.8261-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  • Loading branch information
Jakub Kicinski committed Jan 20, 2025
2 parents 01f5f35 + fdbaf51 commit 4fd001f
Show file tree
Hide file tree
Showing 16 changed files with 332 additions and 185 deletions.
18 changes: 3 additions & 15 deletions include/net/netfilter/nf_conntrack.h
Original file line number Diff line number Diff line change
Expand Up @@ -204,24 +204,22 @@ bool nf_ct_get_tuplepr(const struct sk_buff *skb, unsigned int nhoff,
struct nf_conntrack_tuple *tuple);

void __nf_ct_refresh_acct(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
const struct sk_buff *skb,
u32 extra_jiffies, bool do_acct);
u32 extra_jiffies, unsigned int bytes);

/* Refresh conntrack for this many jiffies and do accounting */
static inline void nf_ct_refresh_acct(struct nf_conn *ct,
enum ip_conntrack_info ctinfo,
const struct sk_buff *skb,
u32 extra_jiffies)
{
__nf_ct_refresh_acct(ct, ctinfo, skb, extra_jiffies, true);
__nf_ct_refresh_acct(ct, ctinfo, extra_jiffies, skb->len);
}

/* Refresh conntrack for this many jiffies */
static inline void nf_ct_refresh(struct nf_conn *ct,
const struct sk_buff *skb,
u32 extra_jiffies)
{
__nf_ct_refresh_acct(ct, 0, skb, extra_jiffies, false);
__nf_ct_refresh_acct(ct, 0, extra_jiffies, 0);
}

/* kill conntrack and do accounting */
Expand Down Expand Up @@ -314,16 +312,6 @@ static inline bool nf_ct_should_gc(const struct nf_conn *ct)

#define NF_CT_DAY (86400 * HZ)

/* Set an arbitrary timeout large enough not to ever expire, this save
* us a check for the IPS_OFFLOAD_BIT from the packet path via
* nf_ct_is_expired().
*/
static inline void nf_ct_offload_timeout(struct nf_conn *ct)
{
if (nf_ct_expires(ct) < NF_CT_DAY / 2)
WRITE_ONCE(ct->timeout, nfct_time_stamp + NF_CT_DAY);
}

struct kernel_param;

int nf_conntrack_set_hashsize(const char *val, const struct kernel_param *kp);
Expand Down
1 change: 1 addition & 0 deletions include/net/netfilter/nf_flow_table.h
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,7 @@ struct flow_offload_tuple_rhash {
enum nf_flow_flags {
NF_FLOW_SNAT,
NF_FLOW_DNAT,
NF_FLOW_CLOSING,
NF_FLOW_TEARDOWN,
NF_FLOW_HW,
NF_FLOW_HW_DYING,
Expand Down
10 changes: 8 additions & 2 deletions include/net/netfilter/nf_tables.h
Original file line number Diff line number Diff line change
Expand Up @@ -442,6 +442,9 @@ struct nft_set_ext;
* @remove: remove element from set
* @walk: iterate over all set elements
* @get: get set elements
* @ksize: kernel set size
* @usize: userspace set size
* @adjust_maxsize: delta to adjust maximum set size
* @commit: commit set elements
* @abort: abort set elements
* @privsize: function to return size of set private data
Expand Down Expand Up @@ -495,6 +498,9 @@ struct nft_set_ops {
const struct nft_set *set,
const struct nft_set_elem *elem,
unsigned int flags);
u32 (*ksize)(u32 size);
u32 (*usize)(u32 size);
u32 (*adjust_maxsize)(const struct nft_set *set);
void (*commit)(struct nft_set *set);
void (*abort)(const struct nft_set *set);
u64 (*privsize)(const struct nlattr * const nla[],
Expand Down Expand Up @@ -1195,6 +1201,8 @@ struct nft_hook {
struct list_head list;
struct nf_hook_ops ops;
struct rcu_head rcu;
char ifname[IFNAMSIZ];
u8 ifnamelen;
};

/**
Expand Down Expand Up @@ -1230,8 +1238,6 @@ static inline bool nft_is_base_chain(const struct nft_chain *chain)
return chain->flags & NFT_CHAIN_BASE;
}

int __nft_release_basechain(struct nft_ctx *ctx);

unsigned int nft_do_chain(struct nft_pktinfo *pkt, void *priv);

static inline bool nft_use_inc(u32 *use)
Expand Down
30 changes: 1 addition & 29 deletions net/bridge/br_netfilter_hooks.c
Original file line number Diff line number Diff line change
Expand Up @@ -393,38 +393,10 @@ static int br_nf_pre_routing_finish(struct net *net, struct sock *sk, struct sk_
reason = ip_route_input(skb, iph->daddr, iph->saddr,
ip4h_dscp(iph), dev);
if (reason) {
struct in_device *in_dev = __in_dev_get_rcu(dev);

/* If err equals -EHOSTUNREACH the error is due to a
* martian destination or due to the fact that
* forwarding is disabled. For most martian packets,
* ip_route_output_key() will fail. It won't fail for 2 types of
* martian destinations: loopback destinations and destination
* 0.0.0.0. In both cases the packet will be dropped because the
* destination is the loopback device and not the bridge. */
if (reason != SKB_DROP_REASON_IP_INADDRERRORS || !in_dev ||
IN_DEV_FORWARD(in_dev))
goto free_skb;

rt = ip_route_output(net, iph->daddr, 0,
ip4h_dscp(iph), 0,
RT_SCOPE_UNIVERSE);
if (!IS_ERR(rt)) {
/* - Bridged-and-DNAT'ed traffic doesn't
* require ip_forwarding. */
if (rt->dst.dev == dev) {
skb_dst_drop(skb);
skb_dst_set(skb, &rt->dst);
goto bridged_dnat;
}
ip_rt_put(rt);
}
free_skb:
kfree_skb(skb);
kfree_skb_reason(skb, reason);
return 0;
} else {
if (skb_dst(skb)->dev == dev) {
bridged_dnat:
skb->dev = br_indev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
Expand Down
2 changes: 1 addition & 1 deletion net/netfilter/nf_conntrack_amanda.c
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ static int amanda_help(struct sk_buff *skb,

/* increase the UDP timeout of the master connection as replies from
* Amanda clients to the server can be quite delayed */
nf_ct_refresh(ct, skb, master_timeout * HZ);
nf_ct_refresh(ct, master_timeout * HZ);

/* No data? */
dataoff = protoff + sizeof(struct udphdr);
Expand Down
2 changes: 1 addition & 1 deletion net/netfilter/nf_conntrack_broadcast.c
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ int nf_conntrack_broadcast_help(struct sk_buff *skb,
nf_ct_expect_related(exp, 0);
nf_ct_expect_put(exp);

nf_ct_refresh(ct, skb, timeout * HZ);
nf_ct_refresh(ct, timeout * HZ);
out:
return NF_ACCEPT;
}
Expand Down
13 changes: 3 additions & 10 deletions net/netfilter/nf_conntrack_core.c
Original file line number Diff line number Diff line change
Expand Up @@ -1544,12 +1544,6 @@ static void gc_worker(struct work_struct *work)

tmp = nf_ct_tuplehash_to_ctrack(h);

if (test_bit(IPS_OFFLOAD_BIT, &tmp->status)) {
nf_ct_offload_timeout(tmp);
if (!nf_conntrack_max95)
continue;
}

if (expired_count > GC_SCAN_EXPIRED_MAX) {
rcu_read_unlock();

Expand Down Expand Up @@ -2089,9 +2083,8 @@ EXPORT_SYMBOL_GPL(nf_conntrack_in);
/* Refresh conntrack for this many jiffies and do accounting if do_acct is 1 */
void __nf_ct_refresh_acct(struct nf_conn *ct,
enum ip_conntrack_info ctinfo,
const struct sk_buff *skb,
u32 extra_jiffies,
bool do_acct)
unsigned int bytes)
{
/* Only update if this is not a fixed timeout */
if (test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status))
Expand All @@ -2104,8 +2097,8 @@ void __nf_ct_refresh_acct(struct nf_conn *ct,
if (READ_ONCE(ct->timeout) != extra_jiffies)
WRITE_ONCE(ct->timeout, extra_jiffies);
acct:
if (do_acct)
nf_ct_acct_update(ct, CTINFO2DIR(ctinfo), skb->len);
if (bytes)
nf_ct_acct_update(ct, CTINFO2DIR(ctinfo), bytes);
}
EXPORT_SYMBOL_GPL(__nf_ct_refresh_acct);

Expand Down
4 changes: 2 additions & 2 deletions net/netfilter/nf_conntrack_h323_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -1385,7 +1385,7 @@ static int process_rcf(struct sk_buff *skb, struct nf_conn *ct,
if (info->timeout > 0) {
pr_debug("nf_ct_ras: set RAS connection timeout to "
"%u seconds\n", info->timeout);
nf_ct_refresh(ct, skb, info->timeout * HZ);
nf_ct_refresh(ct, info->timeout * HZ);

/* Set expect timeout */
spin_lock_bh(&nf_conntrack_expect_lock);
Expand Down Expand Up @@ -1433,7 +1433,7 @@ static int process_urq(struct sk_buff *skb, struct nf_conn *ct,
info->sig_port[!dir] = 0;

/* Give it 30 seconds for UCF or URJ */
nf_ct_refresh(ct, skb, 30 * HZ);
nf_ct_refresh(ct, 30 * HZ);

return 0;
}
Expand Down
4 changes: 2 additions & 2 deletions net/netfilter/nf_conntrack_sip.c
Original file line number Diff line number Diff line change
Expand Up @@ -1553,7 +1553,7 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
if (dataoff >= skb->len)
return NF_ACCEPT;

nf_ct_refresh(ct, skb, sip_timeout * HZ);
nf_ct_refresh(ct, sip_timeout * HZ);

if (unlikely(skb_linearize(skb)))
return NF_DROP;
Expand Down Expand Up @@ -1624,7 +1624,7 @@ static int sip_help_udp(struct sk_buff *skb, unsigned int protoff,
if (dataoff >= skb->len)
return NF_ACCEPT;

nf_ct_refresh(ct, skb, sip_timeout * HZ);
nf_ct_refresh(ct, sip_timeout * HZ);

if (unlikely(skb_linearize(skb)))
return NF_DROP;
Expand Down
Loading

0 comments on commit 4fd001f

Please sign in to comment.