bcache: Delete some slower inline asm
Never saw a profile of bset_search_tree() where it wasn't bottlenecked
on memory until I got my new Haswell machine, but when I tried it there
it was suddenly burning 20% of the cpu in the inner loop on shrd...

Turns out, the version of shrd that takes 64 bit operands has a 9 cycle
latency. hah.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Kent Overstreet committed Nov 11, 2013
1 parent 28935ab commit 098fb25
Showing 1 changed file with 0 additions and 8 deletions.
8 changes: 0 additions & 8 deletions drivers/md/bcache/bset.c
@@ -481,16 +481,8 @@ static struct bkey *table_to_bkey(struct bset_tree *t, unsigned cacheline)

 static inline uint64_t shrd128(uint64_t high, uint64_t low, uint8_t shift)
 {
-#ifdef CONFIG_X86_64
-	asm("shrd %[shift],%[high],%[low]"
-	    : [low] "+Rm" (low)
-	    : [high] "R" (high),
-	      [shift] "ci" (shift)
-	    : "cc");
-#else
 	low >>= shift;
 	low |= (high << 1) << (63U - shift);
-#endif
 	return low;
 }
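
For readers following the hunk above, here is a minimal standalone sketch of the C path that remains after the asm is removed. It is not part of the commit. `shrd128_ref` is a hypothetical helper added only for comparison, and the sketch assumes `shift` stays in the range 0..63. The point it illustrates is why the code shifts in two steps: the more obvious `high << (64 - shift)` would shift a 64-bit value by 64 when `shift` is 0, which is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable form kept by the commit: returns the low 64 bits of the
 * 128-bit value (high:low) shifted right by `shift`.
 * Assumption for this sketch: 0 <= shift <= 63.
 */
static inline uint64_t shrd128(uint64_t high, uint64_t low, uint8_t shift)
{
	low >>= shift;
	/*
	 * (high << 1) << (63 - shift) equals high << (64 - shift) for
	 * shift in 1..63, and evaluates to 0 when shift == 0, so no
	 * single shift count ever reaches 64.
	 */
	low |= (high << 1) << (63U - shift);
	return low;
}

/*
 * Hypothetical reference (not in the kernel source): the same result
 * written with an explicit branch for shift == 0.
 */
static uint64_t shrd128_ref(uint64_t high, uint64_t low, uint8_t shift)
{
	if (shift == 0)
		return low;
	return (low >> shift) | (high << (64 - shift));
}
```

The branch-free form is what lets a compiler emit plain shifts and an OR here; whether that beats `shrd` on a given CPU is something to confirm by profiling, as the commit message describes.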
