Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Prevent index integer overflow in ptr_ring, from Jason Wang.

 2) Program mvpp2 multicast filter properly, from Mikulas Patocka.

 3) The bridge brport attribute file is write only and doesn't have a
    ->show() method, don't blindly invoke it. From Xin Long.

 4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil.

 5) Fix multiple definition issue with if_ether.h UAPI header, from
    Hauke Mehrtens.

 6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini
    Varadhan.

 7) Revert XDP redirect support from thunderx driver, it is not
    implemented properly. From Jesper Dangaard Brouer.

 8) Fix missing RTNL protection across some tipc operations, from Ying
    Xue.

 9) Return the correct IV bytes in the TLS getsockopt code, from Boris
    Pismenny.

10) Take tclassid into consideration properly when doing FIB rule
    matching. From Stefano Brivio.

11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom.

12) TUN driver doesn't align frags properly, and we can end up doing
    unaligned atomics on misaligned metadata. From Eric Dumazet.

13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from
    Subash Abhinov Kasiviswanathan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
  tg3: APE heartbeat changes
  mlxsw: spectrum_router: Do not unconditionally clear route offload indication
  net: qualcomm: rmnet: Fix possible null dereference in command processing
  net: qualcomm: rmnet: Fix warning seen with 64 bit stats
  net: qualcomm: rmnet: Fix crash on real dev unregistration
  sctp: remove the left unnecessary check for chunk in sctp_renege_events
  rxrpc: Work around usercopy check
  tun: fix tun_napi_alloc_frags() frag allocator
  udplite: fix partial checksum initialization
  skbuff: Fix comment mis-spelling.
  dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
  PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
  cxgb4: fix trailing zero in CIM LA dump
  cxgb4: free up resources of pf 0-3
  fib_semantics: Don't match route with mismatching tclassid
  NFC: llcp: Limit size of SDP URI
  tls: getsockopt return record sequence number
  tls: reset the crypto info if copy_from_user fails
  tls: retrun the correct IV in getsockopt
  docs: segmentation-offloads.txt: add SCTP info
  ...
Linus Torvalds committed Feb 19, 2018
2 parents 91ab883 + 506b0a3 commit 79c0ef3
Showing 52 changed files with 504 additions and 395 deletions.
38 changes: 34 additions & 4 deletions Documentation/networking/segmentation-offloads.txt
@@ -13,6 +13,7 @@ The following technologies are described:
* Generic Segmentation Offload - GSO
* Generic Receive Offload - GRO
* Partial Generic Segmentation Offload - GSO_PARTIAL
* SCTP accelleration with GSO - GSO_BY_FRAGS

TCP Segmentation Offload
========================
@@ -49,6 +50,10 @@ datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as TSO. However the IPv4 ID for
fragments should not increment as a single IPv4 datagram is fragmented.

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.

IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

@@ -83,10 +88,10 @@ SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types reflect the
fact that the outer header also requests to have a non-zero checksum
included in the outer header.

Finally there is SKB_GSO_REMCSUM which indicates that a given tunnel header
has requested a remote checksum offload. In this case the inner headers
will be left with a partial checksum and only the outer header checksum
will be computed.
Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
header has requested a remote checksum offload. In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.

Generic Segmentation Offload
============================
@@ -128,3 +133,28 @@ values for if the header was simply duplicated. The one exception to this
is the outer IPv4 ID field. It is up to the device drivers to guarantee
that the IPv4 ID field is incremented in the case that a given header does
not have the DF bit set.

SCTP accelleration with GSO
===========================

SCTP - despite the lack of hardware support - can still take advantage of
GSO to pass one large packet through the network stack, rather than
multiple small packets.

This requires a different approach to other offloads, as SCTP packets
cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
IP segments, padding respected. So unlike regular GSO, SCTP can't just
generate a big skb, set gso_size to the fragmentation point and deliver it
to IP layer.

Instead, the SCTP protocol layer builds an skb with the segments correctly
padded and stored as chained skbs, and skb_segment() splits based on those.
To signal this, gso_size is set to the special value GSO_BY_FRAGS.

Therefore, any code in the core networking stack must be aware of the
possibility that gso_size will be GSO_BY_FRAGS and handle that case
appropriately. (For size checks, the skb_gso_validate_*_len family of
helpers do this automatically.)

This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.
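
The size-check rule described in the SCTP section above can be sketched in userspace as follows. This is a toy model, not kernel code: `struct toy_skb`, `toy_gso_fits()` and `TOY_GSO_BY_FRAGS` are hypothetical names made up for illustration, and only the sentinel value 0xFFFF mirrors the kernel's GSO_BY_FRAGS definition. It assumes each pre-built segment's length is already known, where the kernel helpers would walk the skb's fragment list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "split along the pre-built segment boundaries".
 * The value matches the kernel's GSO_BY_FRAGS; the name is a stand-in. */
#define TOY_GSO_BY_FRAGS 0xFFFFu

/* Hypothetical, heavily simplified stand-in for an skb's GSO metadata. */
struct toy_skb {
	uint16_t gso_size;        /* payload bytes per segment, or the sentinel */
	uint16_t hdr_len;         /* header bytes repeated on every segment */
	const uint16_t *seg_len;  /* full length of each pre-built segment */
	size_t nr_segs;           /* number of entries in seg_len */
};

/* Return true if every resulting segment fits within max_len bytes. */
static bool toy_gso_fits(const struct toy_skb *skb, unsigned int max_len)
{
	/* Regular GSO: all segments share one size, so one comparison suffices. */
	if (skb->gso_size != TOY_GSO_BY_FRAGS)
		return (unsigned int)skb->hdr_len + skb->gso_size <= max_len;

	/* GSO_BY_FRAGS: gso_size is not a length, so check each segment. */
	for (size_t i = 0; i < skb->nr_segs; i++)
		if (skb->seg_len[i] > max_len)
			return false;
	return true;
}
```

Code that skips the sentinel test would read 0xFFFF as a 65535-byte segment size and reject a packet whose individual segments all fit, which is the kind of mistake the skb_gso_validate_*_len helpers are meant to prevent.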
35 changes: 24 additions & 11 deletions drivers/net/ethernet/broadcom/tg3.c
@@ -820,7 +820,7 @@ static int tg3_ape_event_lock(struct tg3 *tp, u32 timeout_us)

tg3_ape_unlock(tp, TG3_APE_LOCK_MEM);

udelay(10);
usleep_range(10, 20);
timeout_us -= (timeout_us > 10) ? 10 : timeout_us;
}

@@ -922,8 +922,8 @@ static int tg3_ape_send_event(struct tg3 *tp, u32 event)
if (!(apedata & APE_FW_STATUS_READY))
return -EAGAIN;

/* Wait for up to 1 millisecond for APE to service previous event. */
err = tg3_ape_event_lock(tp, 1000);
/* Wait for up to 20 millisecond for APE to service previous event. */
err = tg3_ape_event_lock(tp, 20000);
if (err)
return err;

@@ -946,6 +946,7 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)

switch (kind) {
case RESET_KIND_INIT:
tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG,
APE_HOST_SEG_SIG_MAGIC);
tg3_ape_write32(tp, TG3_APE_HOST_SEG_LEN,
@@ -962,13 +963,6 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)
event = APE_EVENT_STATUS_STATE_START;
break;
case RESET_KIND_SHUTDOWN:
/* With the interface we are currently using,
* APE does not track driver state. Wiping
* out the HOST SEGMENT SIGNATURE forces
* the APE to assume OS absent status.
*/
tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG, 0x0);

if (device_may_wakeup(&tp->pdev->dev) &&
tg3_flag(tp, WOL_ENABLE)) {
tg3_ape_write32(tp, TG3_APE_HOST_WOL_SPEED,
@@ -990,6 +984,18 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)
tg3_ape_send_event(tp, event);
}

static void tg3_send_ape_heartbeat(struct tg3 *tp,
unsigned long interval)
{
/* Check if hb interval has exceeded */
if (!tg3_flag(tp, ENABLE_APE) ||
time_before(jiffies, tp->ape_hb_jiffies + interval))
return;

tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
tp->ape_hb_jiffies = jiffies;
}

static void tg3_disable_ints(struct tg3 *tp)
{
int i;
@@ -7262,6 +7268,7 @@ static int tg3_poll_msix(struct napi_struct *napi, int budget)
}
}

tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
return work_done;

tx_recovery:
@@ -7344,6 +7351,7 @@ static int tg3_poll(struct napi_struct *napi, int budget)
}
}

tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
return work_done;

tx_recovery:
@@ -10732,7 +10740,7 @@ static int tg3_reset_hw(struct tg3 *tp, bool reset_phy)
if (tg3_flag(tp, ENABLE_APE))
/* Write our heartbeat update interval to APE. */
tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_INT_MS,
APE_HOST_HEARTBEAT_INT_DISABLE);
APE_HOST_HEARTBEAT_INT_5SEC);

tg3_write_sig_post_reset(tp, RESET_KIND_INIT);

@@ -11077,6 +11085,9 @@ static void tg3_timer(struct timer_list *t)
tp->asf_counter = tp->asf_multiplier;
}

/* Update the APE heartbeat every 5 seconds.*/
tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL);

spin_unlock(&tp->lock);

restart_timer:
@@ -16653,6 +16664,8 @@ static int tg3_get_invariants(struct tg3 *tp, const struct pci_device_id *ent)
pci_state_reg);

tg3_ape_lock_init(tp);
tp->ape_hb_interval =
msecs_to_jiffies(APE_HOST_HEARTBEAT_INT_5SEC);
}

/* Set up tp->grc_local_ctrl before calling
5 changes: 5 additions & 0 deletions drivers/net/ethernet/broadcom/tg3.h
@@ -2508,6 +2508,7 @@
#define TG3_APE_LOCK_PHY3 5
#define TG3_APE_LOCK_GPIO 7

#define TG3_APE_HB_INTERVAL (tp->ape_hb_interval)
#define TG3_EEPROM_SB_F1R2_MBA_OFF 0x10


@@ -3423,6 +3424,10 @@ struct tg3 {
struct device *hwmon_dev;
bool link_up;
bool pcierr_recovery;

u32 ape_hb;
unsigned long ape_hb_interval;
unsigned long ape_hb_jiffies;
};

/* Accessor macros for chip and asic attributes
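
The throttle in tg3_send_ape_heartbeat() in the tg3 hunks above relies on a wraparound-safe tick comparison. The following is a minimal userspace sketch of that idea, not the driver's code: `struct toy_hb`, `toy_time_before()` and `toy_send_heartbeat()` are hypothetical names, a plain `unsigned long` tick value stands in for jiffies, and the comparison assumes the usual two's-complement conversion from `unsigned long` to `long`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the heartbeat fields this commit adds to struct tg3. */
struct toy_hb {
	uint32_t count;      /* heartbeat counter, modelled on tp->ape_hb */
	unsigned long last;  /* tick of the last heartbeat, modelled on tp->ape_hb_jiffies */
};

/* True if tick a is earlier than tick b, even across counter wraparound.
 * Same idea as the kernel's time_before(): compare the signed difference. */
static bool toy_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Bump the heartbeat only if at least `interval` ticks have elapsed since
 * the previous one. Returns true when a heartbeat was recorded. */
static bool toy_send_heartbeat(struct toy_hb *hb, unsigned long now,
			       unsigned long interval)
{
	if (toy_time_before(now, hb->last + interval))
		return false;  /* too soon, skip this update */

	hb->count++;
	hb->last = now;
	return true;
}
```

In the hunks above, the timer path calls the helper with TG3_APE_HB_INTERVAL while the two NAPI poll paths pass TG3_APE_HB_INTERVAL << 1, so under this model the poll paths would only write a heartbeat once twice the interval has passed without one.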
2 changes: 2 additions & 0 deletions drivers/net/ethernet/cavium/common/cavium_ptp.c
@@ -75,6 +75,8 @@ EXPORT_SYMBOL(cavium_ptp_get);

void cavium_ptp_put(struct cavium_ptp *ptp)
{
if (!ptp)
return;
pci_dev_put(ptp->pdev);
}
EXPORT_SYMBOL(cavium_ptp_put);
110 changes: 26 additions & 84 deletions drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -67,11 +67,6 @@ module_param(cpi_alg, int, S_IRUGO);
MODULE_PARM_DESC(cpi_alg,
"PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)");

struct nicvf_xdp_tx {
u64 dma_addr;
u8 qidx;
};

static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx)
{
if (nic->sqs_mode)
@@ -507,29 +502,14 @@ static int nicvf_init_resources(struct nicvf *nic)
return 0;
}

static void nicvf_unmap_page(struct nicvf *nic, struct page *page, u64 dma_addr)
{
/* Check if it's a recycled page, if not unmap the DMA mapping.
* Recycled page holds an extra reference.
*/
if (page_ref_count(page) == 1) {
dma_addr &= PAGE_MASK;
dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
RCV_FRAG_LEN + XDP_HEADROOM,
DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
}
}

static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
struct rcv_queue *rq, struct sk_buff **skb)
{
struct xdp_buff xdp;
struct page *page;
struct nicvf_xdp_tx *xdp_tx = NULL;
u32 action;
u16 len, err, offset = 0;
u16 len, offset = 0;
u64 dma_addr, cpu_addr;
void *orig_data;

@@ -543,7 +523,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
cpu_addr = (u64)phys_to_virt(cpu_addr);
page = virt_to_page((void *)cpu_addr);

xdp.data_hard_start = page_address(page) + RCV_BUF_HEADROOM;
xdp.data_hard_start = page_address(page);
xdp.data = (void *)cpu_addr;
xdp_set_data_meta_invalid(&xdp);
xdp.data_end = xdp.data + len;
@@ -563,7 +543,18 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,

switch (action) {
case XDP_PASS:
nicvf_unmap_page(nic, page, dma_addr);
/* Check if it's a recycled page, if not
* unmap the DMA mapping.
*
* Recycled page holds an extra reference.
*/
if (page_ref_count(page) == 1) {
dma_addr &= PAGE_MASK;
dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
}

/* Build SKB and pass on packet to network stack */
*skb = build_skb(xdp.data,
@@ -576,28 +567,25 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
case XDP_TX:
nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len);
return true;
case XDP_REDIRECT:
/* Save DMA address for use while transmitting */
xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
xdp_tx->dma_addr = dma_addr;
xdp_tx->qidx = nicvf_netdev_qidx(nic, cqe_rx->rq_idx);

err = xdp_do_redirect(nic->pnicvf->netdev, &xdp, prog);
if (!err)
return true;

/* Free the page on error */
nicvf_unmap_page(nic, page, dma_addr);
put_page(page);
break;
default:
bpf_warn_invalid_xdp_action(action);
/* fall through */
case XDP_ABORTED:
trace_xdp_exception(nic->netdev, prog, action);
/* fall through */
case XDP_DROP:
nicvf_unmap_page(nic, page, dma_addr);
/* Check if it's a recycled page, if not
* unmap the DMA mapping.
*
* Recycled page holds an extra reference.
*/
if (page_ref_count(page) == 1) {
dma_addr &= PAGE_MASK;
dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
}
put_page(page);
return true;
}
@@ -1864,50 +1852,6 @@ static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
}
}

static int nicvf_xdp_xmit(struct net_device *netdev, struct xdp_buff *xdp)
{
struct nicvf *nic = netdev_priv(netdev);
struct nicvf *snic = nic;
struct nicvf_xdp_tx *xdp_tx;
struct snd_queue *sq;
struct page *page;
int err, qidx;

if (!netif_running(netdev) || !nic->xdp_prog)
return -EINVAL;

page = virt_to_page(xdp->data);
xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
qidx = xdp_tx->qidx;

if (xdp_tx->qidx >= nic->xdp_tx_queues)
return -EINVAL;

/* Get secondary Qset's info */
if (xdp_tx->qidx >= MAX_SND_QUEUES_PER_QS) {
qidx = xdp_tx->qidx / MAX_SND_QUEUES_PER_QS;
snic = (struct nicvf *)nic->snicvf[qidx - 1];
if (!snic)
return -EINVAL;
qidx = xdp_tx->qidx % MAX_SND_QUEUES_PER_QS;
}

sq = &snic->qs->sq[qidx];
err = nicvf_xdp_sq_append_pkt(snic, sq, (u64)xdp->data,
xdp_tx->dma_addr,
xdp->data_end - xdp->data);
if (err)
return -ENOMEM;

nicvf_xdp_sq_doorbell(snic, sq, qidx);
return 0;
}

static void nicvf_xdp_flush(struct net_device *dev)
{
return;
}

static int nicvf_config_hwtstamp(struct net_device *netdev, struct ifreq *ifr)
{
struct hwtstamp_config config;
@@ -1986,8 +1930,6 @@ static const struct net_device_ops nicvf_netdev_ops = {
.ndo_fix_features = nicvf_fix_features,
.ndo_set_features = nicvf_set_features,
.ndo_bpf = nicvf_xdp,
.ndo_xdp_xmit = nicvf_xdp_xmit,
.ndo_xdp_flush = nicvf_xdp_flush,
.ndo_do_ioctl = nicvf_ioctl,
};
