Skip to content

Commit

Permalink
Merge branch 'tc-taprio-offload-for-SJA1105-DSA'
Browse files Browse the repository at this point in the history
Vladimir Oltean says:

====================
tc-taprio offload for SJA1105 DSA

This is the third attempt to submit the tc-taprio offload model for
inclusion in the networking tree. The sja1105 switch driver will provide
the first implementation of the offload. Only the bare minimum is added:

- The offload model and a DSA pass-through
- The hardware implementation
- The interaction with the netdev queues in the tagger code
- Documentation

What has been removed from previous attempts is support for
PTP-as-clocksource in sja1105, as well as configuring the traffic class
for management traffic.  These will be added as soon as the offload
model is settled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Sep 16, 2019
2 parents 67e80b9 + 7c95afa commit db539ca
Show file tree
Hide file tree
Showing 17 changed files with 1,222 additions and 52 deletions.
90 changes: 90 additions & 0 deletions Documentation/networking/dsa/sja1105.rst
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,96 @@ enslaves eth0 and eth1 (the DSA master of the switch ports). This is because in
this mode, the switch ports beneath br0 are not capable of regular traffic, and
are only used as a conduit for switchdev operations.

Offloads
========

Time-aware scheduling
---------------------

The switch supports a variation of the enhancements for scheduled traffic
specified in IEEE 802.1Q-2018 (formerly 802.1Qbv). This means it can be used to
ensure deterministic latency for priority traffic that is sent in-band with its
gate-open event in the network schedule.

This capability can be managed through the tc-taprio offload ('flags 2'). The
difference compared to the software implementation of taprio is that the latter
would only be able to shape traffic originated from the CPU, but not
autonomously forwarded flows.

The device has 8 traffic classes, and maps incoming frames to one of them based
on the VLAN PCP bits (if no VLAN is present, the port-based default is used).
As described in the previous sections, depending on the value of
``vlan_filtering``, the EtherType recognized by the switch as being VLAN can
either be the typical 0x8100 or a custom value used internally by the driver
for tagging. Therefore, the switch ignores the VLAN PCP if used in standalone
or bridge mode with ``vlan_filtering=0``, as it will not recognize the 0x8100
EtherType. In these modes, injecting into a particular TX queue can only be
done by the DSA net devices, which populate the PCP field of the tagging header
on egress. Using ``vlan_filtering=1``, the behavior is the other way around:
offloaded flows can be steered to TX queues based on the VLAN PCP, but the DSA
net devices are no longer able to do that. To inject frames into a hardware TX
queue with VLAN awareness active, it is necessary to create a VLAN
sub-interface on the DSA master port, and send normal (0x8100) VLAN-tagged
towards the switch, with the VLAN PCP bits set appropriately.

Management traffic (having DMAC 01-80-C2-xx-xx-xx or 01-19-1B-xx-xx-xx) is the
notable exception: the switch always treats it with a fixed priority and
disregards any VLAN PCP bits even if present. The traffic class for management
traffic has a value of 7 (highest priority) at the moment, which is not
configurable in the driver.

Below is an example of configuring a 500 us cyclic schedule on egress port
``swp5``. The traffic class gate for management traffic (7) is open for 100 us,
and the gates for all other traffic classes are open for 400 us::

#!/bin/bash

set -e -u -o pipefail

NSEC_PER_SEC="1000000000"

gatemask() {
local tc_list="$1"
local mask=0

for tc in ${tc_list}; do
mask=$((${mask} | (1 << ${tc})))
done

printf "%02x" ${mask}
}

if ! systemctl is-active --quiet ptp4l; then
echo "Please start the ptp4l service"
exit
fi

now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
# Phase-align the base time to the start of the next second.
sec=$(echo "${now}" | gawk -F. '{ print $1; }')
base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"

tc qdisc add dev swp5 parent root handle 100 taprio \
num_tc 8 \
map 0 1 2 3 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S $(gatemask 7) 100000 \
sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
flags 2

It is possible to apply the tc-taprio offload on multiple egress ports. There
are hardware restrictions related to the fact that no gate event may trigger
simultaneously on two ports. The driver checks the consistency of the schedules
against this restriction and errors out when appropriate. Schedule analysis is
needed to avoid this, which is outside the scope of the document.

At the moment, the time-aware scheduler can only be triggered based on a
standalone clock and not based on PTP time. This means the base-time argument
from tc-taprio is ignored and the schedule starts right away. It also means it
is more difficult to phase-align the scheduler with the other devices in the
network.

Device Tree bindings and board design
=====================================

Expand Down
8 changes: 8 additions & 0 deletions drivers/net/dsa/sja1105/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -23,3 +23,11 @@ config NET_DSA_SJA1105_PTP
help
This enables support for timestamping and PTP clock manipulations in
the SJA1105 DSA driver.

config NET_DSA_SJA1105_TAS
bool "Support for the Time-Aware Scheduler on NXP SJA1105"
depends on NET_DSA_SJA1105
help
This enables support for the TTEthernet-based egress scheduling
engine in the SJA1105 DSA driver, which is controlled using a
hardware offload of the tc-tqprio qdisc.
4 changes: 4 additions & 0 deletions drivers/net/dsa/sja1105/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,7 @@ sja1105-objs := \
ifdef CONFIG_NET_DSA_SJA1105_PTP
sja1105-objs += sja1105_ptp.o
endif

ifdef CONFIG_NET_DSA_SJA1105_TAS
sja1105-objs += sja1105_tas.o
endif
6 changes: 6 additions & 0 deletions drivers/net/dsa/sja1105/sja1105.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@
*/
#define SJA1105_AGEING_TIME_MS(ms) ((ms) / 10)

#include "sja1105_tas.h"

/* Keeps the different addresses between E/T and P/Q/R/S */
struct sja1105_regs {
u64 device_id;
Expand Down Expand Up @@ -104,6 +106,7 @@ struct sja1105_private {
*/
struct mutex mgmt_lock;
struct sja1105_tagger_data tagger_data;
struct sja1105_tas_data tas_data;
};

#include "sja1105_dynamic_config.h"
Expand All @@ -120,6 +123,9 @@ typedef enum {
SPI_WRITE = 1,
} sja1105_spi_rw_mode_t;

/* From sja1105_main.c */
int sja1105_static_config_reload(struct sja1105_private *priv);

/* From sja1105_spi.c */
int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
sja1105_spi_rw_mode_t rw, u64 reg_addr,
Expand Down
8 changes: 8 additions & 0 deletions drivers/net/dsa/sja1105/sja1105_dynamic_config.c
Original file line number Diff line number Diff line change
Expand Up @@ -488,6 +488,8 @@ sja1105et_general_params_entry_packing(void *buf, void *entry_ptr,

/* SJA1105E/T: First generation */
struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_SCHEDULE] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS] = {0},
[BLK_IDX_L2_LOOKUP] = {
.entry_packing = sja1105et_dyn_l2_lookup_entry_packing,
.cmd_packing = sja1105et_l2_lookup_cmd_packing,
Expand Down Expand Up @@ -529,6 +531,8 @@ struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105ET_SIZE_MAC_CONFIG_DYN_CMD,
.addr = 0x36,
},
[BLK_IDX_SCHEDULE_PARAMS] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS_PARAMS] = {0},
[BLK_IDX_L2_LOOKUP_PARAMS] = {
.entry_packing = sja1105et_l2_lookup_params_entry_packing,
.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
Expand All @@ -552,6 +556,8 @@ struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {

/* SJA1105P/Q/R/S: Second generation */
struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_SCHEDULE] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS] = {0},
[BLK_IDX_L2_LOOKUP] = {
.entry_packing = sja1105pqrs_dyn_l2_lookup_entry_packing,
.cmd_packing = sja1105pqrs_l2_lookup_cmd_packing,
Expand Down Expand Up @@ -593,6 +599,8 @@ struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105PQRS_SIZE_MAC_CONFIG_DYN_CMD,
.addr = 0x4B,
},
[BLK_IDX_SCHEDULE_PARAMS] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS_PARAMS] = {0},
[BLK_IDX_L2_LOOKUP_PARAMS] = {
.entry_packing = sja1105et_l2_lookup_params_entry_packing,
.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
Expand Down
26 changes: 24 additions & 2 deletions drivers/net/dsa/sja1105/sja1105_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
#include <linux/if_ether.h>
#include <linux/dsa/8021q.h>
#include "sja1105.h"
#include "sja1105_tas.h"

static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len,
unsigned int startup_delay)
Expand Down Expand Up @@ -384,7 +385,9 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
/* Disallow dynamic changing of the mirror port */
.mirr_ptacu = 0,
.switchid = priv->ds->index,
/* Priority queue for link-local frames trapped to CPU */
/* Priority queue for link-local management frames
* (both ingress to and egress from CPU - PTP, STP etc)
*/
.hostprio = 7,
.mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A,
.mac_flt1 = SJA1105_LINKLOCAL_FILTER_A_MASK,
Expand Down Expand Up @@ -1380,7 +1383,7 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
* modify at runtime (currently only MAC) and restore them after uploading,
* such that this operation is relatively seamless.
*/
static int sja1105_static_config_reload(struct sja1105_private *priv)
int sja1105_static_config_reload(struct sja1105_private *priv)
{
struct sja1105_mac_config_entry *mac;
int speed_mbps[SJA1105_NUM_PORTS];
Expand Down Expand Up @@ -1711,6 +1714,9 @@ static int sja1105_setup(struct dsa_switch *ds)
*/
ds->vlan_filtering_is_global = true;

/* Advertise the 8 egress queues */
ds->num_tx_queues = SJA1105_NUM_TC;

/* The DSA/switchdev model brings up switch ports in standalone mode by
* default, and that means vlan_filtering is 0 since they're not under
* a bridge, so it's safe to set up switch tagging at this time.
Expand All @@ -1722,6 +1728,7 @@ static void sja1105_teardown(struct dsa_switch *ds)
{
struct sja1105_private *priv = ds->priv;

sja1105_tas_teardown(ds);
cancel_work_sync(&priv->tagger_data.rxtstamp_work);
skb_queue_purge(&priv->tagger_data.skb_rxtstamp_queue);
sja1105_ptp_clock_unregister(priv);
Expand Down Expand Up @@ -2051,6 +2058,18 @@ static bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
return true;
}

static int sja1105_port_setup_tc(struct dsa_switch *ds, int port,
enum tc_setup_type type,
void *type_data)
{
switch (type) {
case TC_SETUP_QDISC_TAPRIO:
return sja1105_setup_tc_taprio(ds, port, type_data);
default:
return -EOPNOTSUPP;
}
}

static const struct dsa_switch_ops sja1105_switch_ops = {
.get_tag_protocol = sja1105_get_tag_protocol,
.setup = sja1105_setup,
Expand Down Expand Up @@ -2083,6 +2102,7 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
.port_hwtstamp_set = sja1105_hwtstamp_set,
.port_rxtstamp = sja1105_port_rxtstamp,
.port_txtstamp = sja1105_port_txtstamp,
.port_setup_tc = sja1105_port_setup_tc,
};

static int sja1105_check_device_id(struct sja1105_private *priv)
Expand Down Expand Up @@ -2192,6 +2212,8 @@ static int sja1105_probe(struct spi_device *spi)
}
mutex_init(&priv->mgmt_lock);

sja1105_tas_setup(ds);

return dsa_register_switch(priv->ds);
}

Expand Down
Loading

0 comments on commit db539ca

Please sign in to comment.