Skip to content

Commit

Permalink
enic: prevent waking up stopped tx queues over watchdog reset
Browse files Browse the repository at this point in the history
Recent months, our customer reported several kernel crashes all
preceding with following message:
NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out
Error message of one of those crashes:
BUG: unable to handle kernel paging request at ffffffffa007e090

After analyzing severl vmcores, I found that most of crashes are
caused by memory corruption. And all the corrupted memory areas
are overwritten by data of network packets. Moreover, I also found
that the tx queues were enabled over watchdog reset.

After going through the source code, I found that in enic_stop(),
the tx queues stopped by netif_tx_disable() could be woken up over
a small time window between netif_tx_disable() and the
napi_disable() by the following code path:
napi_poll->
  enic_poll_msix_wq->
     vnic_cq_service->
        enic_wq_service->
           netif_wake_subqueue(enic->netdev, q_number)->
              test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)
In turn, upper netowrk stack could queue skb to ENIC NIC though
enic_hard_start_xmit(). And this might introduce some race condition.

Our customer comfirmed that this kind of kernel crash doesn't occur over
90 days since they applied this patch.

Signed-off-by: Firo Yang <firo.yang@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
Firo Yang authored and David S. Miller committed Feb 12, 2020
1 parent b44beb8 commit 0f90522
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion drivers/net/ethernet/cisco/enic/enic_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -2013,10 +2013,10 @@ static int enic_stop(struct net_device *netdev)
napi_disable(&enic->napi[i]);

netif_carrier_off(netdev);
netif_tx_disable(netdev);
if (vnic_dev_get_intr_mode(enic->vdev) == VNIC_DEV_INTR_MODE_MSIX)
for (i = 0; i < enic->wq_count; i++)
napi_disable(&enic->napi[enic_cq_wq(enic, i)]);
netif_tx_disable(netdev);

if (!enic_is_dynamic(enic) && !enic_is_sriov_vf(enic))
enic_dev_del_station_addr(enic);
Expand Down

0 comments on commit 0f90522

Please sign in to comment.