Skip to content

Commit

Permalink
IB/mad: fix oops in cancel_mads
Browse files Browse the repository at this point in the history
We have seen the following OOPs in cancel_mads, when restarting opensm
multiple times:

    Call Trace:
      [<c010549b>] show_stack+0x9b/0xb0
      [<c01055ec>] show_registers+0x11c/0x190
      [<c01057cd>] die+0xed/0x160
      [<c031b966>] do_page_fault+0x3f6/0x5d0
      [<c010511f>] error_code+0x4f/0x60
      [<f8ac4e38>] cancel_mads+0x128/0x150 [ib_mad]
      [<f8ac2811>] unregister_mad_agent+0x11/0x130 [ib_mad]
      [<f8ac2a12>] ib_unregister_mad_agent+0x12/0x20 [ib_mad]
      [<f8b10f23>] ib_umad_close+0xf3/0x130 [ib_umad]
      [<c0162937>] __fput+0x187/0x1c0
      [<c01627a9>] fput+0x19/0x20
      [<c0160f7a>] filp_close+0x3a/0x60
      [<c0121ca8>] put_files_struct+0x68/0xa0
      [<c0103cf7>] do_signal+0x47/0x100
      [<c0103ded>] do_notify_resume+0x3d/0x40
      [<c0103f9e>] work_notifysig+0x13/0x25

We traced this back to local_completions unlocking mad_agent_priv->lock
while still keeping a pointer into local_list. A later call to
list_del(&local->completion_list) would then corrupt the list.

To fix this, remove the entry from local_list after looking it up but
before releasing mad_agent_priv->lock, to prevent cancel_mads from
finding and freeing it.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
  • Loading branch information
Michael S. Tsirkin authored and Roland Dreier committed Apr 2, 2006
1 parent f27f0a0 commit 37289ef
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion drivers/infiniband/core/mad.c
Original file line number Diff line number Diff line change
Expand Up @@ -2311,6 +2311,7 @@ static void local_completions(void *data)
local = list_entry(mad_agent_priv->local_list.next,
struct ib_mad_local_private,
completion_list);
list_del(&local->completion_list);
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
if (local->mad_priv) {
recv_mad_agent = local->recv_mad_agent;
Expand Down Expand Up @@ -2362,7 +2363,6 @@ static void local_completions(void *data)
&mad_send_wc);

spin_lock_irqsave(&mad_agent_priv->lock, flags);
list_del(&local->completion_list);
atomic_dec(&mad_agent_priv->refcount);
if (!recv)
kmem_cache_free(ib_mad_cache, local->mad_priv);
Expand Down

0 comments on commit 37289ef

Please sign in to comment.