Skip to content

Commit

Permalink
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Browse files Browse the repository at this point in the history
Pull another networking update from David Miller:
 "Small follow-up to the main merge pull from the other day:

  1) Alexander Duyck's DMA memory barrier patch set.

  2) cxgb4 driver fixes from Karen Xie.

  3) Add missing export of fixed_phy_register() to modules, from Mark
     Salter.

  4) DSA bug fixes from Florian Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
  net/macb: add TX multiqueue support for gem
  linux/interrupt.h: remove the definition of unused tasklet_hi_enable
  jme: replace calls to redundant function
  net: ethernet: davicom: Allow to select DM9000 for nios2
  net: ethernet: smsc: Allow to select SMC91X for nios2
  cxgb4: Add support for QSA modules
  libcxgbi: fix freeing skb prematurely
  cxgb4i: use set_wr_txq() to set tx queues
  cxgb4i: handle non-pdu-aligned rx data
  cxgb4i: additional types of negative advice
  cxgb4/cxgb4i: set the max. pdu length in firmware
  cxgb4i: fix credit check for tx_data_wr
  cxgb4i: fix tx immediate data credit check
  net: phy: export fixed_phy_register()
  fib_trie: Fix trie balancing issue if new node pushes down existing node
  vlan: Add ability to always enable TSO/UFO
  r8169:update rtl8168g pcie ephy parameter
  net: dsa: bcm_sf2: force link for all fixed PHY devices
  fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads
  r8169: Use dma_rmb() and dma_wmb() for DescOwn checks
  ...
  • Loading branch information
Linus Torvalds committed Dec 13, 2014
2 parents 5543798 + eea3e8f commit f96fe22
Show file tree
Hide file tree
Showing 35 changed files with 771 additions and 425 deletions.
42 changes: 42 additions & 0 deletions Documentation/memory-barriers.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1633,6 +1633,48 @@ There are some more advanced barrier functions:
operations" subsection for information on where to use these.


(*) dma_wmb();
(*) dma_rmb();

These are for use with consistent memory to guarantee the ordering
of writes or reads of shared memory accessible to both the CPU and a
DMA capable device.

For example, consider a device driver that shares memory with a device
and uses a descriptor status value to indicate if the descriptor belongs
to the device or the CPU, and a doorbell to notify it when new
descriptors are available:

if (desc->status != DEVICE_OWN) {
/* do not read data until we own descriptor */
dma_rmb();

/* read/modify data */
read_data = desc->data;
desc->data = write_data;

/* flush modifications before status update */
dma_wmb();

/* assign ownership */
desc->status = DEVICE_OWN;

/* force memory to sync before notifying device via MMIO */
wmb();

/* notify device of new descriptors */
writel(DESC_NOTIFY, doorbell);
}

The dma_rmb() allows us guarantee the device has released ownership
before we read the data from the descriptor, and he dma_wmb() allows
us to guarantee the data is written to the descriptor before the device
can see it now has ownership. The wmb() is needed to guarantee that the
cache coherent memory writes have completed before attempting a write to
the cache incoherent MMIO region.

See Documentation/DMA-API.txt for more information on consistent memory.

MMIO WRITE BARRIER
------------------

Expand Down
51 changes: 51 additions & 0 deletions arch/alpha/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,57 @@
#define rmb() __asm__ __volatile__("mb": : :"memory")
#define wmb() __asm__ __volatile__("wmb": : :"memory")

/**
* read_barrier_depends - Flush all pending reads that subsequents reads
* depend on.
*
* No data-dependent reads from memory-like regions are ever reordered
* over this barrier. All reads preceding this primitive are guaranteed
* to access memory (but not necessarily other CPUs' caches) before any
* reads following this primitive that depend on the data return by
* any of the preceding reads. This primitive is much lighter weight than
* rmb() on most CPUs, and is never heavier weight than is
* rmb().
*
* These ordering constraints are respected by both the local CPU
* and the compiler.
*
* Ordering is not guaranteed by anything other than these primitives,
* not even by data dependencies. See the documentation for
* memory_barrier() for examples and URLs to more information.
*
* For example, the following code would force ordering (the initial
* value of "a" is zero, "b" is one, and "p" is "&a"):
*
* <programlisting>
* CPU 0 CPU 1
*
* b = 2;
* memory_barrier();
* p = &b; q = p;
* read_barrier_depends();
* d = *q;
* </programlisting>
*
* because the read of "*q" depends on the read of "p" and these
* two reads are separated by a read_barrier_depends(). However,
* the following code, with the same initial values for "a" and "b":
*
* <programlisting>
* CPU 0 CPU 1
*
* a = 2;
* memory_barrier();
* b = 3; y = b;
* read_barrier_depends();
* x = a;
* </programlisting>
*
* does not enforce ordering, since there is no data dependency between
* the read of "a" and the read of "b". Therefore, on some CPUs, such
* as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
* in cases like this where there are no data dependencies.
*/
#define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")

#ifdef CONFIG_SMP
Expand Down
4 changes: 4 additions & 0 deletions arch/arm/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,10 +43,14 @@
#define mb() do { dsb(); outer_sync(); } while (0)
#define rmb() dsb()
#define wmb() do { dsb(st); outer_sync(); } while (0)
#define dma_rmb() dmb(osh)
#define dma_wmb() dmb(oshst)
#else
#define mb() barrier()
#define rmb() barrier()
#define wmb() barrier()
#define dma_rmb() barrier()
#define dma_wmb() barrier()
#endif

#ifndef CONFIG_SMP
Expand Down
3 changes: 3 additions & 0 deletions arch/arm64/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,9 @@
#define rmb() dsb(ld)
#define wmb() dsb(st)

#define dma_rmb() dmb(oshld)
#define dma_wmb() dmb(oshst)

#ifndef CONFIG_SMP
#define smp_mb() barrier()
#define smp_rmb() barrier()
Expand Down
51 changes: 51 additions & 0 deletions arch/blackfin/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,57 @@
# define mb() do { barrier(); smp_check_barrier(); smp_mark_barrier(); } while (0)
# define rmb() do { barrier(); smp_check_barrier(); } while (0)
# define wmb() do { barrier(); smp_mark_barrier(); } while (0)
/*
* read_barrier_depends - Flush all pending reads that subsequents reads
* depend on.
*
* No data-dependent reads from memory-like regions are ever reordered
* over this barrier. All reads preceding this primitive are guaranteed
* to access memory (but not necessarily other CPUs' caches) before any
* reads following this primitive that depend on the data return by
* any of the preceding reads. This primitive is much lighter weight than
* rmb() on most CPUs, and is never heavier weight than is
* rmb().
*
* These ordering constraints are respected by both the local CPU
* and the compiler.
*
* Ordering is not guaranteed by anything other than these primitives,
* not even by data dependencies. See the documentation for
* memory_barrier() for examples and URLs to more information.
*
* For example, the following code would force ordering (the initial
* value of "a" is zero, "b" is one, and "p" is "&a"):
*
* <programlisting>
* CPU 0 CPU 1
*
* b = 2;
* memory_barrier();
* p = &b; q = p;
* read_barrier_depends();
* d = *q;
* </programlisting>
*
* because the read of "*q" depends on the read of "p" and these
* two reads are separated by a read_barrier_depends(). However,
* the following code, with the same initial values for "a" and "b":
*
* <programlisting>
* CPU 0 CPU 1
*
* a = 2;
* memory_barrier();
* b = 3; y = b;
* read_barrier_depends();
* x = a;
* </programlisting>
*
* does not enforce ordering, since there is no data dependency between
* the read of "a" and the read of "b". Therefore, on some CPUs, such
* as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
* in cases like this where there are no data dependencies.
*/
# define read_barrier_depends() do { barrier(); smp_check_barrier(); } while (0)
#endif

Expand Down
25 changes: 12 additions & 13 deletions arch/ia64/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,26 +35,25 @@
* it's (presumably) much slower than mf and (b) mf.a is supported for
* sequential memory pages only.
*/
#define mb() ia64_mf()
#define rmb() mb()
#define wmb() mb()
#define read_barrier_depends() do { } while(0)
#define mb() ia64_mf()
#define rmb() mb()
#define wmb() mb()

#define dma_rmb() mb()
#define dma_wmb() mb()

#ifdef CONFIG_SMP
# define smp_mb() mb()
# define smp_rmb() rmb()
# define smp_wmb() wmb()
# define smp_read_barrier_depends() read_barrier_depends()

#else

# define smp_mb() barrier()
# define smp_rmb() barrier()
# define smp_wmb() barrier()
# define smp_read_barrier_depends() do { } while(0)

#endif

#define smp_rmb() smp_mb()
#define smp_wmb() smp_mb()

#define read_barrier_depends() do { } while (0)
#define smp_read_barrier_depends() do { } while (0)

#define smp_mb__before_atomic() barrier()
#define smp_mb__after_atomic() barrier()

Expand Down
19 changes: 10 additions & 9 deletions arch/metag/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,6 @@
#include <asm/metag_mem.h>

#define nop() asm volatile ("NOP")
#define mb() wmb()
#define rmb() barrier()

#ifdef CONFIG_METAG_META21

Expand Down Expand Up @@ -41,13 +39,13 @@ static inline void wr_fence(void)

#endif /* !CONFIG_METAG_META21 */

static inline void wmb(void)
{
/* flush writes through the write combiner */
wr_fence();
}
/* flush writes through the write combiner */
#define mb() wr_fence()
#define rmb() barrier()
#define wmb() mb()

#define read_barrier_depends() do { } while (0)
#define dma_rmb() rmb()
#define dma_wmb() wmb()

#ifndef CONFIG_SMP
#define fence() do { } while (0)
Expand Down Expand Up @@ -82,7 +80,10 @@ static inline void fence(void)
#define smp_wmb() barrier()
#endif
#endif
#define smp_read_barrier_depends() do { } while (0)

#define read_barrier_depends() do { } while (0)
#define smp_read_barrier_depends() do { } while (0)

#define set_mb(var, value) do { var = value; smp_mb(); } while (0)

#define smp_store_release(p, v) \
Expand Down
61 changes: 5 additions & 56 deletions arch/mips/include/asm/barrier.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,58 +10,6 @@

#include <asm/addrspace.h>

/*
* read_barrier_depends - Flush all pending reads that subsequents reads
* depend on.
*
* No data-dependent reads from memory-like regions are ever reordered
* over this barrier. All reads preceding this primitive are guaranteed
* to access memory (but not necessarily other CPUs' caches) before any
* reads following this primitive that depend on the data return by
* any of the preceding reads. This primitive is much lighter weight than
* rmb() on most CPUs, and is never heavier weight than is
* rmb().
*
* These ordering constraints are respected by both the local CPU
* and the compiler.
*
* Ordering is not guaranteed by anything other than these primitives,
* not even by data dependencies. See the documentation for
* memory_barrier() for examples and URLs to more information.
*
* For example, the following code would force ordering (the initial
* value of "a" is zero, "b" is one, and "p" is "&a"):
*
* <programlisting>
* CPU 0 CPU 1
*
* b = 2;
* memory_barrier();
* p = &b; q = p;
* read_barrier_depends();
* d = *q;
* </programlisting>
*
* because the read of "*q" depends on the read of "p" and these
* two reads are separated by a read_barrier_depends(). However,
* the following code, with the same initial values for "a" and "b":
*
* <programlisting>
* CPU 0 CPU 1
*
* a = 2;
* memory_barrier();
* b = 3; y = b;
* read_barrier_depends();
* x = a;
* </programlisting>
*
* does not enforce ordering, since there is no data dependency between
* the read of "a" and the read of "b". Therefore, on some CPUs, such
* as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
* in cases like this where there are no data dependencies.
*/

#define read_barrier_depends() do { } while(0)
#define smp_read_barrier_depends() do { } while(0)

Expand Down Expand Up @@ -127,20 +75,21 @@

#include <asm/wbflush.h>

#define wmb() fast_wmb()
#define rmb() fast_rmb()
#define mb() wbflush()
#define iob() wbflush()

#else /* !CONFIG_CPU_HAS_WB */

#define wmb() fast_wmb()
#define rmb() fast_rmb()
#define mb() fast_mb()
#define iob() fast_iob()

#endif /* !CONFIG_CPU_HAS_WB */

#define wmb() fast_wmb()
#define rmb() fast_rmb()
#define dma_wmb() fast_wmb()
#define dma_rmb() fast_rmb()

#if defined(CONFIG_WEAK_ORDERING) && defined(CONFIG_SMP)
# ifdef CONFIG_CPU_CAVIUM_OCTEON
# define smp_mb() __sync()
Expand Down
Loading

0 comments on commit f96fe22

Please sign in to comment.