Skip to content

Commit

Permalink
Merge branch 'TLS-offload-rx-netdev-and-mlx5'
Browse files Browse the repository at this point in the history
Boris Pismenny says:

====================
TLS offload rx, netdev & mlx5

The following series provides TLS RX inline crypto offload.

v5->v4:
    - Remove the Kconfig to mutually exclude both IPsec and TLS

v4->v3:
    - Remove the iov revert for zero copy send flow

v2->v3:
    - Fix typo
    - Adjust cover letter
    - Fix bug in zero copy flows
    - Use network byte order for the record number in resync
    - Adjust the sequence provided in resync

v1->v2:
    - Fix bisectability problems due to variable name changes
    - Fix potential uninitialized return value

This series completes the generic infrastructure to offload TLS crypto to
a network devices. It enables the kernel TLS socket to skip decryption and
authentication operations for SKBs marked as decrypted on the receive
side of the data path. Leaving those computationally expensive operations
to the NIC.

This infrastructure doesn't require a TCP offload engine. Instead, the
NIC decrypts a packet's payload if the packet contains the expected TCP
sequence number. The TLS record authentication tag remains unmodified
regardless of decryption. If the packet is decrypted successfully and it
contains an authentication tag, then the authentication check has passed.
Otherwise, if the authentication fails, then the packet is provided
unmodified and the KTLS layer is responsible for handling it.
Out-Of-Order TCP packets are provided unmodified. As a result,
in the slow path some of the SKBs are decrypted while others remain as
ciphertext.

The GRO and TCP layers must not coalesce decrypted and non-decrypted SKBs.
At the worst case a received TLS record consists of both plaintext
and ciphertext packets. These partially decrypted records must be
reencrypted, only to be decrypted.

The notable differences between SW KTLS and NIC offloaded TLS
implementations are as follows:
1. Partial decryption - Software must handle the case of a TLS record
that was only partially decrypted by HW. This can happen due to packet
reordering.
2. Resynchronization - tls_read_size calls the device driver to
resynchronize HW whenever it lost track of the TLS record framing in
the TCP stream.

The infrastructure should be extendable to support various NIC offload
implementations.  However it is currently written with the
implementation below in mind:
The NIC identifies packets that should be offloaded according to
the 5-tuple and the TCP sequence number. If these match and the
packet is decrypted and authenticated successfully, then a syndrome
is provided to software. Otherwise, the packet is unmodified.
Decrypted and non-decrypted packets aren't coalesced by the network stack,
and the KTLS layer decrypts and authenticates partially decrypted records.
The NIC provides an indication whenever a resync is required. The resync
operation is triggered by the KTLS layer while parsing TLS record headers.

Finally, we measure the performance obtained by running single stream
iperf with two Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz machines connected
back-to-back with Innova TLS (40Gbps) NICs. We compare TCP (upper bound)
and KTLS-Offload running both in Tx and Rx. The results show that the
performance of offload is comparable to TCP.

                          | Bandwidth (Gbps) | CPU Tx (%) | CPU rx (%)
TCP                       | 28.8             | 5          | 12
KTLS-Offload-Tx-Rx 	  | 28.6	     | 7          | 14

Paper: https://netdevconf.org/2.2/papers/pismenny-tlscrypto-talk.pdf
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Jul 16, 2018
2 parents cc98419 + b3ccf97 commit aea06eb
Show file tree
Hide file tree
Showing 25 changed files with 846 additions and 191 deletions.
37 changes: 37 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/accel/accel.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#ifndef __MLX5E_ACCEL_H__
#define __MLX5E_ACCEL_H__

#ifdef CONFIG_MLX5_ACCEL

#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include "en.h"

static inline bool is_metadata_hdr_valid(struct sk_buff *skb)
{
__be16 *ethtype;

if (unlikely(skb->len < ETH_HLEN + MLX5E_METADATA_ETHER_LEN))
return false;
ethtype = (__be16 *)(skb->data + ETH_ALEN * 2);
if (*ethtype != cpu_to_be16(MLX5E_METADATA_ETHER_TYPE))
return false;
return true;
}

static inline void remove_metadata_hdr(struct sk_buff *skb)
{
struct ethhdr *old_eth;
struct ethhdr *new_eth;

/* Remove the metadata from the buffer */
old_eth = (struct ethhdr *)skb->data;
new_eth = (struct ethhdr *)(skb->data + MLX5E_METADATA_ETHER_LEN);
memmove(new_eth, old_eth, 2 * ETH_ALEN);
/* Ethertype is already in its new place */
skb_pull_inline(skb, MLX5E_METADATA_ETHER_LEN);
}

#endif /* CONFIG_MLX5_ACCEL */

#endif /* __MLX5E_EN_ACCEL_H__ */
23 changes: 16 additions & 7 deletions drivers/net/ethernet/mellanox/mlx5/core/accel/tls.c
Original file line number Diff line number Diff line change
Expand Up @@ -37,17 +37,26 @@
#include "mlx5_core.h"
#include "fpga/tls.h"

int mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid)
int mlx5_accel_tls_add_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid,
bool direction_sx)
{
return mlx5_fpga_tls_add_tx_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, p_swid);
return mlx5_fpga_tls_add_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, p_swid,
direction_sx);
}

void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid)
void mlx5_accel_tls_del_flow(struct mlx5_core_dev *mdev, u32 swid,
bool direction_sx)
{
mlx5_fpga_tls_del_tx_flow(mdev, swid, GFP_KERNEL);
mlx5_fpga_tls_del_flow(mdev, swid, GFP_KERNEL, direction_sx);
}

int mlx5_accel_tls_resync_rx(struct mlx5_core_dev *mdev, u32 handle, u32 seq,
u64 rcd_sn)
{
return mlx5_fpga_tls_resync_rx(mdev, handle, seq, rcd_sn);
}

bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev)
Expand Down
26 changes: 17 additions & 9 deletions drivers/net/ethernet/mellanox/mlx5/core/accel/tls.h
Original file line number Diff line number Diff line change
Expand Up @@ -60,22 +60,30 @@ struct mlx5_ifc_tls_flow_bits {
u8 reserved_at_2[0x1e];
};

int mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid);
void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid);
int mlx5_accel_tls_add_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid,
bool direction_sx);
void mlx5_accel_tls_del_flow(struct mlx5_core_dev *mdev, u32 swid,
bool direction_sx);
int mlx5_accel_tls_resync_rx(struct mlx5_core_dev *mdev, u32 handle, u32 seq,
u64 rcd_sn);
bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev);
u32 mlx5_accel_tls_device_caps(struct mlx5_core_dev *mdev);
int mlx5_accel_tls_init(struct mlx5_core_dev *mdev);
void mlx5_accel_tls_cleanup(struct mlx5_core_dev *mdev);

#else

static inline int
mlx5_accel_tls_add_tx_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid) { return 0; }
static inline void mlx5_accel_tls_del_tx_flow(struct mlx5_core_dev *mdev, u32 swid) { }
static int
mlx5_accel_tls_add_flow(struct mlx5_core_dev *mdev, void *flow,
struct tls_crypto_info *crypto_info,
u32 start_offload_tcp_sn, u32 *p_swid,
bool direction_sx) { return -ENOTSUPP; }
static inline void mlx5_accel_tls_del_flow(struct mlx5_core_dev *mdev, u32 swid,
bool direction_sx) { }
static inline int mlx5_accel_tls_resync_rx(struct mlx5_core_dev *mdev, u32 handle,
u32 seq, u64 rcd_sn) { return 0; }
static inline bool mlx5_accel_is_tls_device(struct mlx5_core_dev *mdev) { return false; }
static inline u32 mlx5_accel_tls_device_caps(struct mlx5_core_dev *mdev) { return 0; }
static inline int mlx5_accel_tls_init(struct mlx5_core_dev *mdev) { return 0; }
Expand Down
20 changes: 5 additions & 15 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@

#include "en_accel/ipsec_rxtx.h"
#include "en_accel/ipsec.h"
#include "accel/accel.h"
#include "en.h"

enum {
Expand Down Expand Up @@ -346,19 +347,12 @@ mlx5e_ipsec_build_sp(struct net_device *netdev, struct sk_buff *skb,
}

struct sk_buff *mlx5e_ipsec_handle_rx_skb(struct net_device *netdev,
struct sk_buff *skb)
struct sk_buff *skb, u32 *cqe_bcnt)
{
struct mlx5e_ipsec_metadata *mdata;
struct ethhdr *old_eth;
struct ethhdr *new_eth;
struct xfrm_state *xs;
__be16 *ethtype;

/* Detect inline metadata */
if (skb->len < ETH_HLEN + MLX5E_METADATA_ETHER_LEN)
return skb;
ethtype = (__be16 *)(skb->data + ETH_ALEN * 2);
if (*ethtype != cpu_to_be16(MLX5E_METADATA_ETHER_TYPE))
if (!is_metadata_hdr_valid(skb))
return skb;

/* Use the metadata */
Expand All @@ -369,12 +363,8 @@ struct sk_buff *mlx5e_ipsec_handle_rx_skb(struct net_device *netdev,
return NULL;
}

/* Remove the metadata from the buffer */
old_eth = (struct ethhdr *)skb->data;
new_eth = (struct ethhdr *)(skb->data + MLX5E_METADATA_ETHER_LEN);
memmove(new_eth, old_eth, 2 * ETH_ALEN);
/* Ethertype is already in its new place */
skb_pull_inline(skb, MLX5E_METADATA_ETHER_LEN);
remove_metadata_hdr(skb);
*cqe_bcnt -= MLX5E_METADATA_ETHER_LEN;

return skb;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@
#include "en.h"

struct sk_buff *mlx5e_ipsec_handle_rx_skb(struct net_device *netdev,
struct sk_buff *skb);
struct sk_buff *skb, u32 *cqe_bcnt);
void mlx5e_ipsec_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe);

void mlx5e_ipsec_inverse_table_init(void);
Expand Down
69 changes: 51 additions & 18 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
Original file line number Diff line number Diff line change
Expand Up @@ -110,9 +110,7 @@ static int mlx5e_tls_add(struct net_device *netdev, struct sock *sk,
u32 caps = mlx5_accel_tls_device_caps(mdev);
int ret = -ENOMEM;
void *flow;

if (direction != TLS_OFFLOAD_CTX_DIR_TX)
return -EINVAL;
u32 swid;

flow = kzalloc(MLX5_ST_SZ_BYTES(tls_flow), GFP_KERNEL);
if (!flow)
Expand All @@ -122,18 +120,23 @@ static int mlx5e_tls_add(struct net_device *netdev, struct sock *sk,
if (ret)
goto free_flow;

ret = mlx5_accel_tls_add_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, &swid,
direction == TLS_OFFLOAD_CTX_DIR_TX);
if (ret < 0)
goto free_flow;

if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
struct mlx5e_tls_offload_context *tx_ctx =
struct mlx5e_tls_offload_context_tx *tx_ctx =
mlx5e_get_tls_tx_context(tls_ctx);
u32 swid;

ret = mlx5_accel_tls_add_tx_flow(mdev, flow, crypto_info,
start_offload_tcp_sn, &swid);
if (ret < 0)
goto free_flow;

tx_ctx->swid = htonl(swid);
tx_ctx->expected_seq = start_offload_tcp_sn;
} else {
struct mlx5e_tls_offload_context_rx *rx_ctx =
mlx5e_get_tls_rx_context(tls_ctx);

rx_ctx->handle = htonl(swid);
}

return 0;
Expand All @@ -147,30 +150,60 @@ static void mlx5e_tls_del(struct net_device *netdev,
enum tls_offload_ctx_dir direction)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
unsigned int handle;

if (direction == TLS_OFFLOAD_CTX_DIR_TX) {
u32 swid = ntohl(mlx5e_get_tls_tx_context(tls_ctx)->swid);
handle = ntohl((direction == TLS_OFFLOAD_CTX_DIR_TX) ?
mlx5e_get_tls_tx_context(tls_ctx)->swid :
mlx5e_get_tls_rx_context(tls_ctx)->handle);

mlx5_accel_tls_del_tx_flow(priv->mdev, swid);
} else {
netdev_err(netdev, "unsupported direction %d\n", direction);
}
mlx5_accel_tls_del_flow(priv->mdev, handle,
direction == TLS_OFFLOAD_CTX_DIR_TX);
}

static void mlx5e_tls_resync_rx(struct net_device *netdev, struct sock *sk,
u32 seq, u64 rcd_sn)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5e_tls_offload_context_rx *rx_ctx;

rx_ctx = mlx5e_get_tls_rx_context(tls_ctx);

netdev_info(netdev, "resyncing seq %d rcd %lld\n", seq,
be64_to_cpu(rcd_sn));
mlx5_accel_tls_resync_rx(priv->mdev, rx_ctx->handle, seq, rcd_sn);
atomic64_inc(&priv->tls->sw_stats.rx_tls_resync_reply);
}

static const struct tlsdev_ops mlx5e_tls_ops = {
.tls_dev_add = mlx5e_tls_add,
.tls_dev_del = mlx5e_tls_del,
.tls_dev_resync_rx = mlx5e_tls_resync_rx,
};

void mlx5e_tls_build_netdev(struct mlx5e_priv *priv)
{
u32 caps = mlx5_accel_tls_device_caps(priv->mdev);
struct net_device *netdev = priv->netdev;

if (!mlx5_accel_is_tls_device(priv->mdev))
return;

netdev->features |= NETIF_F_HW_TLS_TX;
netdev->hw_features |= NETIF_F_HW_TLS_TX;
if (caps & MLX5_ACCEL_TLS_TX) {
netdev->features |= NETIF_F_HW_TLS_TX;
netdev->hw_features |= NETIF_F_HW_TLS_TX;
}

if (caps & MLX5_ACCEL_TLS_RX) {
netdev->features |= NETIF_F_HW_TLS_RX;
netdev->hw_features |= NETIF_F_HW_TLS_RX;
}

if (!(caps & MLX5_ACCEL_TLS_LRO)) {
netdev->features &= ~NETIF_F_LRO;
netdev->hw_features &= ~NETIF_F_LRO;
}

netdev->tlsdev_ops = &mlx5e_tls_ops;
}

Expand Down
33 changes: 26 additions & 7 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,25 +43,44 @@ struct mlx5e_tls_sw_stats {
atomic64_t tx_tls_drop_resync_alloc;
atomic64_t tx_tls_drop_no_sync_data;
atomic64_t tx_tls_drop_bypass_required;
atomic64_t rx_tls_drop_resync_request;
atomic64_t rx_tls_resync_request;
atomic64_t rx_tls_resync_reply;
atomic64_t rx_tls_auth_fail;
};

struct mlx5e_tls {
struct mlx5e_tls_sw_stats sw_stats;
};

struct mlx5e_tls_offload_context {
struct tls_offload_context base;
struct mlx5e_tls_offload_context_tx {
struct tls_offload_context_tx base;
u32 expected_seq;
__be32 swid;
};

static inline struct mlx5e_tls_offload_context *
static inline struct mlx5e_tls_offload_context_tx *
mlx5e_get_tls_tx_context(struct tls_context *tls_ctx)
{
BUILD_BUG_ON(sizeof(struct mlx5e_tls_offload_context) >
TLS_OFFLOAD_CONTEXT_SIZE);
return container_of(tls_offload_ctx(tls_ctx),
struct mlx5e_tls_offload_context,
BUILD_BUG_ON(sizeof(struct mlx5e_tls_offload_context_tx) >
TLS_OFFLOAD_CONTEXT_SIZE_TX);
return container_of(tls_offload_ctx_tx(tls_ctx),
struct mlx5e_tls_offload_context_tx,
base);
}

struct mlx5e_tls_offload_context_rx {
struct tls_offload_context_rx base;
__be32 handle;
};

static inline struct mlx5e_tls_offload_context_rx *
mlx5e_get_tls_rx_context(struct tls_context *tls_ctx)
{
BUILD_BUG_ON(sizeof(struct mlx5e_tls_offload_context_rx) >
TLS_OFFLOAD_CONTEXT_SIZE_RX);
return container_of(tls_offload_ctx_rx(tls_ctx),
struct mlx5e_tls_offload_context_rx,
base);
}

Expand Down
Loading

0 comments on commit aea06eb

Please sign in to comment.