Skip to content

Commit

Permalink
x86, vdso: Use asm volatile in __getcpu
Browse files Browse the repository at this point in the history
In Linux 3.18 and below, GCC hoists the lsl instructions in the
pvclock code all the way to the beginning of __vdso_clock_gettime,
slowing the non-paravirt case significantly.  For unknown reasons,
presumably related to the removal of a branch, the performance issue
is gone as of

e76b027 x86,vdso: Use LSL unconditionally for vgetcpu

but I don't trust GCC enough to expect the problem to stay fixed.

There should be no correctness issue, because the __getcpu calls in
__vdso_vlock_gettime were never necessary in the first place.

Note to stable maintainers: In 3.18 and below, depending on
configuration, gcc 4.9.2 generates code like this:

     9c3:       44 0f 03 e8             lsl    %ax,%r13d
     9c7:       45 89 eb                mov    %r13d,%r11d
     9ca:       0f 03 d8                lsl    %ax,%ebx

This patch won't apply as is to any released kernel, but I'll send a
trivial backported version if needed.

Fixes: 51c19b4 x86: vdso: pvclock gettime support
Cc: stable@vger.kernel.org # 3.8+
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
  • Loading branch information
Andy Lutomirski committed Dec 23, 2014
1 parent 394f56f commit 1ddf0b1
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions arch/x86/include/asm/vgtod.h
Original file line number Diff line number Diff line change
Expand Up @@ -80,9 +80,11 @@ static inline unsigned int __getcpu(void)

/*
* Load per CPU data from GDT. LSL is faster than RDTSCP and
* works on all CPUs.
* works on all CPUs. This is volatile so that it orders
* correctly wrt barrier() and to keep gcc from cleverly
* hoisting it out of the calling function.
*/
asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));

return p;
}
Expand Down

0 comments on commit 1ddf0b1

Please sign in to comment.