IB/mad: Fix lock-lock-timer deadlock in RMPP code
Holding agent->lock across cancel_delayed_work() (which does
del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
possible lock-timer deadlocks if a consumer ever does something that
connects agent->lock to a lock taken in IRQ context (cf
http://marc.info/?l=linux-rdma&m=125243699026045).

Fix this by changing the list items to a new state "CANCELING" while
holding the lock, and then canceling the delayed work without holding
the lock.  If the delayed work runs after the lock is dropped, it will
see the state is CANCELING and return immediately, so the list will
stay stable while we traverse it with the lock not held.

Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
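
The mark-then-cancel pattern described in the message can be sketched in userspace C. This is only an illustrative model under stated assumptions, not the kernel code: `struct item`, `struct agent`, `mark_canceling()`, `cleanup_handler()` and `teardown()` are hypothetical names, a pthread mutex stands in for the `agent->lock` spinlock, and an array stands in for `agent->rmpp_list`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RMPP receive state and list entry. */
enum item_state { ITEM_ACTIVE, ITEM_COMPLETE, ITEM_CANCELING };

struct item {
	enum item_state state;
	bool on_list;	/* models membership in the agent's list */
	bool freed;	/* models the entry having been destroyed */
};

struct agent {
	pthread_mutex_t lock;	/* models agent->lock */
	struct item *items;
	size_t n_items;
};

/* Phase 1: flip every entry to CANCELING while holding the lock. */
static void mark_canceling(struct agent *a)
{
	size_t i;

	pthread_mutex_lock(&a->lock);
	for (i = 0; i < a->n_items; i++)
		a->items[i].state = ITEM_CANCELING;
	pthread_mutex_unlock(&a->lock);
	/* The lock is dropped here; work would be canceled after this point. */
}

/*
 * Models a delayed-work handler that runs after the lock was dropped.
 * Returns true if it unlinked and freed the entry, false if it backed off.
 */
static bool cleanup_handler(struct agent *a, struct item *it)
{
	pthread_mutex_lock(&a->lock);
	if (it->state == ITEM_CANCELING) {
		/* The canceling path owns this entry; leave the list alone. */
		pthread_mutex_unlock(&a->lock);
		return false;
	}
	it->on_list = false;
	pthread_mutex_unlock(&a->lock);
	it->freed = true;
	return true;
}

/* Phase 3: once no handler can still run, unlink and free everything. */
static void teardown(struct agent *a)
{
	size_t i;

	for (i = 0; i < a->n_items; i++) {
		assert(a->items[i].state == ITEM_CANCELING);
		a->items[i].on_list = false;
		a->items[i].freed = true;
	}
}
```

A caller would run `mark_canceling()`, then cancel and flush outstanding work without holding the lock, then call `teardown()`. Because a late `cleanup_handler()` returns without touching the list, the unlocked traversal in between sees a stable list.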
Roland Dreier authored and committed Sep 23, 2009
1 parent 86d7101 commit 0e442af
Showing 1 changed file with 13 additions and 4 deletions.
17 changes: 13 additions & 4 deletions drivers/infiniband/core/mad_rmpp.c
@@ -37,7 +37,8 @@
 enum rmpp_state {
 	RMPP_STATE_ACTIVE,
 	RMPP_STATE_TIMEOUT,
-	RMPP_STATE_COMPLETE
+	RMPP_STATE_COMPLETE,
+	RMPP_STATE_CANCELING
 };
 
 struct mad_rmpp_recv {
@@ -86,19 +87,23 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent)
 	unsigned long flags;
 
 	spin_lock_irqsave(&agent->lock, flags);
+	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
+		if (rmpp_recv->state != RMPP_STATE_COMPLETE)
+			ib_free_recv_mad(rmpp_recv->rmpp_wc);
+		rmpp_recv->state = RMPP_STATE_CANCELING;
+	}
+	spin_unlock_irqrestore(&agent->lock, flags);
+
 	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
 		cancel_delayed_work(&rmpp_recv->timeout_work);
 		cancel_delayed_work(&rmpp_recv->cleanup_work);
 	}
-	spin_unlock_irqrestore(&agent->lock, flags);
 
 	flush_workqueue(agent->qp_info->port_priv->wq);
 
 	list_for_each_entry_safe(rmpp_recv, temp_rmpp_recv,
 				 &agent->rmpp_list, list) {
 		list_del(&rmpp_recv->list);
-		if (rmpp_recv->state != RMPP_STATE_COMPLETE)
-			ib_free_recv_mad(rmpp_recv->rmpp_wc);
 		destroy_rmpp_recv(rmpp_recv);
 	}
 }
@@ -260,6 +265,10 @@ static void recv_cleanup_handler(struct work_struct *work)
 	unsigned long flags;
 
 	spin_lock_irqsave(&rmpp_recv->agent->lock, flags);
+	if (rmpp_recv->state == RMPP_STATE_CANCELING) {
+		spin_unlock_irqrestore(&rmpp_recv->agent->lock, flags);
+		return;
+	}
 	list_del(&rmpp_recv->list);
 	spin_unlock_irqrestore(&rmpp_recv->agent->lock, flags);
 	destroy_rmpp_recv(rmpp_recv);