Skip to content

Commit

Permalink
Merge tag 'rxrpc-rewrite-20160706' of git://git.kernel.org/pub/scm/li…
Browse files Browse the repository at this point in the history
…nux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Improve conn/call lookup and fix call number generation [ver #3]

I've fixed a couple of patch descriptions and excised the patch that
duplicated the connections list for reconsideration at a later date.

For reference, the excised patch is sitting on the rxrpc-experimental
branch of my git tree, based on top of the rxrpc-rewrite branch.  Diffing
it against yesterday's tag shows no differences.

Would you prefer the patch set to be emailed afresh instead of a git-pull
request?

David
---
Here's the next part of the AF_RXRPC rewrite.  The two main purposes of
this set are to fix the call number handling and to make use of RCU when
looking up the connection or call to pass a received packet to.

Important changes in this set include:

 (1) Avoidance of placing stack data into SG lists in rxkad so that kernel
     stacks can become vmalloc'd (Herbert Xu).

 (2) Calls cease pinning the connection they used as soon as possible,
     which allows the connection to be discarded sooner and allows the call
     channel on that connection to be reused earlier.

 (3) Make each call channel on a connection have a separate and independent
     call number space rather than having a shared number space for the
     connection.  Call numbers should increment monotonically per channel
     on the client, and the server should ignore a call with a lower call
     number for that channel than the latest it has seen.  The RESPONSE
     packet sets the minimum values of each call ID counter on a
     connection.

 (4) Look up calls by indexing the channel array on a connection rather
     than by keeping calls in an rbtree on that connection.  Also look up
     calls using the channel array rather than using a hashtable.

     The call hashtable can then be removed.

 (5) Call terminal statuses are cached in the channel array for the last
     call.  It is assumed that if we the server have seen call N, then the
     client no longer cares about call N-1 on the same channel.

     This will allow retransmission of the terminal status in future
     without the need to keep the rxrpc_call struct around.

 (6) Peer lookups are moved out of common connection handling code and into
     service connection handling code as client connections (a) must point
     to a peer before they can be used and (b) are looked up by a
     machine-unique connection ID directly, so we only need to look up the
     peer first if we're going to deal with a service call.

 (7) The reference count on a connection is held elevated by 1 whilst it is
     alive (ie. idle unused connections have a refcount of 1).  The reaper
     will attempt to change the refcount from 1->0 and skip if this cannot
     be done, whilst look ups only increment the refcount if it's non-zero.

     This makes the implementation of RCU lookups easier as we don't have
     to get a ref on the connection or a lock on the connection list to
     prevent a connection being reaped whilst we're contemplating queueing
     a packet that initiates a new service call upon it.

     If we need to get a connection, but there's a dead connection in the
     tree, we use rb_replace_node() to replace the dead one with a new one.

 (8) Use a seqlock to validate the walk over the service connection rbtree
     attached to a peer when it's being walked in RCU mode.

 (9) Make the incoming call/connection packet handling code use RCU mode
     and locks and make it only take a reference if the call/connection
     gets queued on a workqueue.

The intention is that the next set will introduce the connection lifetime
management and capacity limits to prevent clients from overloading the
server.

There are some fixes too:

 (1) Verifying that a packet coming in to a client connection came from the
     expected source.

 (2) Fix handling of connection failure in client call creation where we
     don't reinitialise the list linkage block and a second attempt to
     unlink the failed connection oopses and also we don't set the state
     correctly, which causes an assertion failure.

 (3) New service calls were being added to the socket's accept queue under
     the wrong lock.

Changes:

 (V2) In rxrpc_find_service_conn_rcu() initialised the sequence number to 0.

      Fixed the RCU handling in conn_service.c by introducing and using
      rb_replace_node_rcu() as an RCU-safe alternative in
      rxrpc_publish_service_conn().

      Modified and used rcu_dereference_raw() to avoid RCU sparse warnings
      in rxrpc_find_service_conn_rcu().

      Added in some missing RCU dereference wrappers.  It seems to be
      necessary to turn on CONFIG_PROVE_RCU_REPEATEDLY as well as
      CONFIG_SPARSE_RCU_POINTER to get the static __rcu annotation checking
      to happen.

      Fixed some other sparse warnings, including a missing ntohs() in
      jumbo packet processing.

 (V3) Fixed some commit descriptions.

      Excised the patch that duplicated the connection list to separate out
      the procfs list for reconsideration at a later date.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
David S. Miller committed Jul 9, 2016
2 parents 99a50bb + d440a1c commit cc3baec
Show file tree
Hide file tree
Showing 21 changed files with 1,031 additions and 1,062 deletions.
2 changes: 2 additions & 0 deletions include/linux/rbtree.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,8 @@ extern struct rb_node *rb_next_postorder(const struct rb_node *);
/* Fast replacement of a single node without remove/rebalance/add/rebalance */
extern void rb_replace_node(struct rb_node *victim, struct rb_node *new,
struct rb_root *root);
extern void rb_replace_node_rcu(struct rb_node *victim, struct rb_node *new,
struct rb_root *root);

static inline void rb_link_node(struct rb_node *node, struct rb_node *parent,
struct rb_node **rb_link)
Expand Down
13 changes: 13 additions & 0 deletions include/linux/rbtree_augmented.h
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,19 @@ __rb_change_child(struct rb_node *old, struct rb_node *new,
WRITE_ONCE(root->rb_node, new);
}

static inline void
__rb_change_child_rcu(struct rb_node *old, struct rb_node *new,
struct rb_node *parent, struct rb_root *root)
{
if (parent) {
if (parent->rb_left == old)
rcu_assign_pointer(parent->rb_left, new);
else
rcu_assign_pointer(parent->rb_right, new);
} else
rcu_assign_pointer(root->rb_node, new);
}

extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
void (*augment_rotate)(struct rb_node *old, struct rb_node *new));

Expand Down
8 changes: 6 additions & 2 deletions include/linux/rcupdate.h
Original file line number Diff line number Diff line change
Expand Up @@ -611,6 +611,12 @@ static inline void rcu_preempt_sleep_check(void)
rcu_dereference_sparse(p, space); \
((typeof(*p) __force __kernel *)(p)); \
})
#define rcu_dereference_raw(p) \
({ \
/* Dependency order vs. p above. */ \
typeof(p) ________p1 = lockless_dereference(p); \
((typeof(*p) __force __kernel *)(________p1)); \
})

/**
* RCU_INITIALIZER() - statically initialize an RCU-protected global variable
Expand Down Expand Up @@ -729,8 +735,6 @@ static inline void rcu_preempt_sleep_check(void)
__rcu_dereference_check((p), (c) || rcu_read_lock_sched_held(), \
__rcu)

#define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ needed? @@@*/

/*
* The tracing infrastructure traces RCU (we want that), but unfortunately
* some of the RCU checks causes tracing to lock up the system.
Expand Down
26 changes: 24 additions & 2 deletions lib/rbtree.c
Original file line number Diff line number Diff line change
Expand Up @@ -539,17 +539,39 @@ void rb_replace_node(struct rb_node *victim, struct rb_node *new,
{
struct rb_node *parent = rb_parent(victim);

/* Copy the pointers/colour from the victim to the replacement */
*new = *victim;

/* Set the surrounding nodes to point to the replacement */
__rb_change_child(victim, new, parent, root);
if (victim->rb_left)
rb_set_parent(victim->rb_left, new);
if (victim->rb_right)
rb_set_parent(victim->rb_right, new);
__rb_change_child(victim, new, parent, root);
}
EXPORT_SYMBOL(rb_replace_node);

void rb_replace_node_rcu(struct rb_node *victim, struct rb_node *new,
struct rb_root *root)
{
struct rb_node *parent = rb_parent(victim);

/* Copy the pointers/colour from the victim to the replacement */
*new = *victim;

/* Set the surrounding nodes to point to the replacement */
if (victim->rb_left)
rb_set_parent(victim->rb_left, new);
if (victim->rb_right)
rb_set_parent(victim->rb_right, new);

/* Set the parent's pointer to the new node last after an RCU barrier
* so that the pointers onwards are seen to be set correctly when doing
* an RCU walk over the tree.
*/
__rb_change_child_rcu(victim, new, parent, root);
}
EXPORT_SYMBOL(rb_replace_node);
EXPORT_SYMBOL(rb_replace_node_rcu);

static struct rb_node *rb_left_deepest_node(const struct rb_node *node)
{
Expand Down
1 change: 1 addition & 0 deletions net/rxrpc/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ af-rxrpc-y := \
conn_client.o \
conn_event.o \
conn_object.o \
conn_service.o \
input.o \
insecure.o \
key.o \
Expand Down
20 changes: 0 additions & 20 deletions net/rxrpc/af_rxrpc.c
Original file line number Diff line number Diff line change
Expand Up @@ -788,27 +788,7 @@ static void __exit af_rxrpc_exit(void)
proto_unregister(&rxrpc_proto);
rxrpc_destroy_all_calls();
rxrpc_destroy_all_connections();

ASSERTCMP(atomic_read(&rxrpc_n_skbs), ==, 0);

/* We need to flush the scheduled work twice because the local endpoint
* records involve a work item in their destruction as they can only be
* destroyed from process context. However, a connection may have a
* work item outstanding - and this will pin the local endpoint record
* until the connection goes away.
*
* Peers don't pin locals and calls pin sockets - which prevents the
* module from being unloaded - so we should only need two flushes.
*/
_debug("flush scheduled work");
flush_workqueue(rxrpc_workqueue);
_debug("flush scheduled work 2");
flush_workqueue(rxrpc_workqueue);
_debug("synchronise RCU");
rcu_barrier();
_debug("destroy locals");
ASSERT(idr_is_empty(&rxrpc_client_conn_ids));
idr_destroy(&rxrpc_client_conn_ids);
rxrpc_destroy_all_locals();

remove_proc_entry("rxrpc_conns", init_net.proc_net);
Expand Down
Loading

0 comments on commit cc3baec

Please sign in to comment.