Skip to content

Commit

Permalink
rtmutex: Detect changes in the pi lock chain
Browse files Browse the repository at this point in the history
When we walk the lock chain, we drop all locks after each step. So the
lock chain can change under us before we reacquire the locks. That's
harmless in principle as we just follow the wrong lock path. But it
can lead to a false positive in the dead lock detection logic:

T0 holds L0
T0 blocks on L1 held by T1
T1 blocks on L2 held by T2
T2 blocks on L3 held by T3
T4 blocks on L4 held by T4

Now we walk the chain

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> 
     lock T2 ->  adjust T2 ->  drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.

Brad tried to work around that in the deadlock detection logic itself,
but the more I looked at it the less I liked it, because it's crystal
ball magic after the fact.

We actually can detect a chain change very simple:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

So if we detect that T2 is now blocked on a different lock we stop the
chain walk. That's also correct in the following scenario:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T3 times out and drops L3
T2 acquires L3 and blocks on L4 now

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

We don't have to follow up the chain at that point, because T2
propagated our priority up to T4 already.

[ Folded a cleanup patch from peterz ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Brad Mouring <bmouring@ni.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de
Cc: stable@vger.kernel.org
  • Loading branch information
Thomas Gleixner committed Jun 7, 2014
1 parent 3d5c934 commit 8208498
Showing 1 changed file with 71 additions and 24 deletions.
95 changes: 71 additions & 24 deletions kernel/locking/rtmutex.c
Original file line number Diff line number Diff line change
Expand Up @@ -260,27 +260,36 @@ static void rt_mutex_adjust_prio(struct task_struct *task)
*/
int max_lock_depth = 1024;

static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
{
return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
}

/*
* Adjust the priority chain. Also used for deadlock detection.
* Decreases task's usage by one - may thus free the task.
*
* @task: the task owning the mutex (owner) for which a chain walk is probably
* needed
* @task: the task owning the mutex (owner) for which a chain walk is
* probably needed
* @deadlock_detect: do we have to carry out deadlock detection?
* @orig_lock: the mutex (can be NULL if we are walking the chain to recheck
* things for a task that has just got its priority adjusted, and
* is waiting on a mutex)
* @orig_lock: the mutex (can be NULL if we are walking the chain to recheck
* things for a task that has just got its priority adjusted, and
* is waiting on a mutex)
* @next_lock: the mutex on which the owner of @orig_lock was blocked before
* we dropped its pi_lock. Is never dereferenced, only used for
* comparison to detect lock chain changes.
* @orig_waiter: rt_mutex_waiter struct for the task that has just donated
* its priority to the mutex owner (can be NULL in the case
* depicted above or if the top waiter is gone away and we are
* actually deboosting the owner)
* @top_task: the current top waiter
* its priority to the mutex owner (can be NULL in the case
* depicted above or if the top waiter is gone away and we are
* actually deboosting the owner)
* @top_task: the current top waiter
*
* Returns 0 or -EDEADLK.
*/
static int rt_mutex_adjust_prio_chain(struct task_struct *task,
int deadlock_detect,
struct rt_mutex *orig_lock,
struct rt_mutex *next_lock,
struct rt_mutex_waiter *orig_waiter,
struct task_struct *top_task)
{
Expand Down Expand Up @@ -338,6 +347,18 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
if (orig_waiter && !rt_mutex_owner(orig_lock))
goto out_unlock_pi;

/*
* We dropped all locks after taking a refcount on @task, so
* the task might have moved on in the lock chain or even left
* the chain completely and blocks now on an unrelated lock or
* on @orig_lock.
*
* We stored the lock on which @task was blocked in @next_lock,
* so we can detect the chain change.
*/
if (next_lock != waiter->lock)
goto out_unlock_pi;

/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
Expand Down Expand Up @@ -422,11 +443,26 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
__rt_mutex_adjust_prio(task);
}

/*
* Check whether the task which owns the current lock is pi
* blocked itself. If yes we store a pointer to the lock for
* the lock chain change detection above. After we dropped
* task->pi_lock next_lock cannot be dereferenced anymore.
*/
next_lock = task_blocked_on_lock(task);

raw_spin_unlock_irqrestore(&task->pi_lock, flags);

top_waiter = rt_mutex_top_waiter(lock);
raw_spin_unlock(&lock->wait_lock);

/*
* We reached the end of the lock chain. Stop right here. No
* point to go back just to figure that out.
*/
if (!next_lock)
goto out_put_task;

if (!detect_deadlock && waiter != top_waiter)
goto out_put_task;

Expand Down Expand Up @@ -536,8 +572,9 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
{
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex_waiter *top_waiter = waiter;
unsigned long flags;
struct rt_mutex *next_lock;
int chain_walk = 0, res;
unsigned long flags;

/*
* Early deadlock detection. We really don't want the task to
Expand Down Expand Up @@ -569,20 +606,28 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
if (!owner)
return 0;

raw_spin_lock_irqsave(&owner->pi_lock, flags);
if (waiter == rt_mutex_top_waiter(lock)) {
raw_spin_lock_irqsave(&owner->pi_lock, flags);
rt_mutex_dequeue_pi(owner, top_waiter);
rt_mutex_enqueue_pi(owner, waiter);

__rt_mutex_adjust_prio(owner);
if (owner->pi_blocked_on)
chain_walk = 1;
raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
}
else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock))
} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
chain_walk = 1;
}

if (!chain_walk)
/* Store the lock on which owner is blocked or NULL */
next_lock = task_blocked_on_lock(owner);

raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
/*
* Even if full deadlock detection is on, if the owner is not
* blocked itself, we can avoid finding this out in the chain
* walk.
*/
if (!chain_walk || !next_lock)
return 0;

/*
Expand All @@ -594,8 +639,8 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,

raw_spin_unlock(&lock->wait_lock);

res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock, waiter,
task);
res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
next_lock, waiter, task);

raw_spin_lock(&lock->wait_lock);

Expand Down Expand Up @@ -644,8 +689,8 @@ static void remove_waiter(struct rt_mutex *lock,
{
int first = (waiter == rt_mutex_top_waiter(lock));
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex *next_lock = NULL;
unsigned long flags;
int chain_walk = 0;

raw_spin_lock_irqsave(&current->pi_lock, flags);
rt_mutex_dequeue(lock, waiter);
Expand All @@ -669,21 +714,21 @@ static void remove_waiter(struct rt_mutex *lock,
}
__rt_mutex_adjust_prio(owner);

if (owner->pi_blocked_on)
chain_walk = 1;
/* Store the lock on which owner is blocked or NULL */
next_lock = task_blocked_on_lock(owner);

raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
}

if (!chain_walk)
if (!next_lock)
return;

/* gets dropped in rt_mutex_adjust_prio_chain()! */
get_task_struct(owner);

raw_spin_unlock(&lock->wait_lock);

rt_mutex_adjust_prio_chain(owner, 0, lock, NULL, current);
rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);

raw_spin_lock(&lock->wait_lock);
}
Expand All @@ -696,6 +741,7 @@ static void remove_waiter(struct rt_mutex *lock,
void rt_mutex_adjust_pi(struct task_struct *task)
{
struct rt_mutex_waiter *waiter;
struct rt_mutex *next_lock;
unsigned long flags;

raw_spin_lock_irqsave(&task->pi_lock, flags);
Expand All @@ -706,12 +752,13 @@ void rt_mutex_adjust_pi(struct task_struct *task)
raw_spin_unlock_irqrestore(&task->pi_lock, flags);
return;
}

next_lock = waiter->lock;
raw_spin_unlock_irqrestore(&task->pi_lock, flags);

/* gets dropped in rt_mutex_adjust_prio_chain()! */
get_task_struct(task);
rt_mutex_adjust_prio_chain(task, 0, NULL, NULL, task);

rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
}

/**
Expand Down

0 comments on commit 8208498

Please sign in to comment.