Merge tag 'vfs-6.12-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "afs:

   - Fix setting of the server responding flag

   - Remove unused struct afs_address_list and afs_put_address_list()
     function

   - Fix infinite loop because of unresponsive servers

   - Ensure that afs_retry_request() function is correctly added to the
     afs_req_ops netfs operations table

  netfs:

   - Fix netfs_folio tracepoint handling to handle NULL mappings

   - Add missing folio_queue API documentation

   - Ensure that netfs_write_folio() correctly advances the iterator via
     iov_iter_advance()

   - Fix a dentry leak during concurrent cull and cookie lookup
     operations in cachefiles

  pidfs:

   - Correctly handle accessing another task's pid namespace"

* tag 'vfs-6.12-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix the netfs_folio tracepoint to handle NULL mapping
  netfs: Add folio_queue API documentation
  netfs: Advance iterator correctly rather than jumping it
  afs: Fix the setting of the server responding flag
  afs: Remove unused struct and function prototype
  afs: Fix possible infinite loop with unresponsive servers
  pidfs: check for valid pid namespace
  afs: Fix missing wire-up of afs_retry_request()
  cachefiles: fix dentry leak in cachefiles_open_file()
Linus Torvalds committed Sep 30, 2024
2 parents 2007d28 + f801850 commit a5f24c7
Showing 11 changed files with 410 additions and 24 deletions.
212 changes: 212 additions & 0 deletions Documentation/core-api/folio_queue.rst
@@ -0,0 +1,212 @@
.. SPDX-License-Identifier: GPL-2.0+

===========
Folio Queue
===========

:Author: David Howells <dhowells@redhat.com>

.. Contents:

 * Overview
 * Initialisation
 * Adding and removing folios
 * Querying information about a folio
 * Querying information about a folio_queue
 * Folio queue iteration
 * Folio marks
 * Lockless simultaneous production/consumption issues

Overview
========

The folio_queue struct forms a single segment in a segmented list of folios
that can be used to form an I/O buffer. As such, the list can be iterated over
using the ITER_FOLIOQ iov_iter type.

The publicly accessible members of the structure are::

	struct folio_queue {
		struct folio_queue *next;
		struct folio_queue *prev;
		...
	};

A pair of pointers are provided, ``next`` and ``prev``, that point to the
segments on either side of the segment being accessed. Whilst this is a
doubly-linked list, it is intentionally not a circular list; the outward
sibling pointers in terminal segments should be NULL.
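
As a rough illustration of the non-circular linkage, the sketch below walks a chain of segments until it reaches a NULL ``next`` pointer. It is a userspace stand-in, not kernel code: only the two documented members are modelled, and ``count_segments()`` and ``link_after()`` are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in: only the two documented public members are modelled. */
struct folio_queue {
	struct folio_queue *next;
	struct folio_queue *prev;
};

/* Hypothetical helper: count the segments from 'first' to the end. */
static size_t count_segments(const struct folio_queue *first)
{
	size_t n = 0;

	/* The list is not circular, so the walk stops at a NULL next pointer. */
	for (const struct folio_queue *p = first; p; p = p->next)
		n++;
	return n;
}

/* Hypothetical helper: link 'b' after 'a', where 'a' is the last segment. */
static void link_after(struct folio_queue *a, struct folio_queue *b)
{
	b->prev = a;
	b->next = NULL;
	a->next = b;
}
```

After linking three segments this way, the first segment's ``prev`` and the last segment's ``next`` both remain NULL, which is what terminates a walk in either direction.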

Each segment in the list also stores:

* an ordered sequence of folio pointers,
* the size of each folio and
* three 1-bit marks per folio,

but these should not be accessed directly, as the underlying data structure may
change; the access functions outlined below should be used instead.

The facility can be made accessible by::

	#include <linux/folio_queue.h>

and to use the iterator::

	#include <linux/uio.h>


Initialisation
==============

A segment should be initialised by calling::

	void folioq_init(struct folio_queue *folioq);

with a pointer to the segment to be initialised. Note that this will not
necessarily initialise all the folio pointers, so care must be taken to check
the number of folios added.


Adding and removing folios
==========================

Folios can be set in the next unused slot in a segment struct by calling one
of::

	unsigned int folioq_append(struct folio_queue *folioq,
				   struct folio *folio);

	unsigned int folioq_append_mark(struct folio_queue *folioq,
					struct folio *folio);

Both functions update the stored folio count, store the folio and note its
size. The second function also sets the first mark for the folio added. Both
functions return the number of the slot used. [!] Note that no attempt is made
to check that the capacity wasn't overrun and the list will not be extended
automatically.
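
Because the append calls neither check capacity nor extend the list, the caller has to do both. The following is a minimal userspace sketch of one way to do that, under loud assumptions: the struct layout, the capacity ``MODEL_NR_SLOTS``, the function bodies and the wrapper ``append_checked()`` are all invented for illustration, and ``malloc()`` stands in for whatever allocator real kernel code would use. Only the function names and signatures of ``folioq_init()``, ``folioq_full()`` and ``folioq_append()`` come from this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_SLOTS 4	/* assumed capacity, for illustration only */

struct folio { int id; };	/* dummy stand-in for the kernel's struct folio */

struct folio_queue {		/* userspace model, not the kernel layout */
	struct folio_queue *next;
	struct folio_queue *prev;
	struct folio *slots[MODEL_NR_SLOTS];
	unsigned int count;
};

static void folioq_init(struct folio_queue *folioq)
{
	folioq->next = NULL;
	folioq->prev = NULL;
	folioq->count = 0;	/* the slot pointers are deliberately left alone */
}

static bool folioq_full(struct folio_queue *folioq)
{
	return folioq->count >= MODEL_NR_SLOTS;
}

/* Like the documented call, this model does not check for overrun. */
static unsigned int folioq_append(struct folio_queue *folioq,
				  struct folio *folio)
{
	unsigned int slot = folioq->count++;

	folioq->slots[slot] = folio;
	return slot;
}

/*
 * Hypothetical caller-side wrapper: append to the last segment, adding a
 * fresh segment first if the current one is full.  Returns the segment that
 * received the folio, or NULL if allocation failed.
 */
static struct folio_queue *append_checked(struct folio_queue *tail,
					  struct folio *folio,
					  unsigned int *slot)
{
	if (folioq_full(tail)) {
		struct folio_queue *seg = malloc(sizeof(*seg));

		if (!seg)
			return NULL;
		folioq_init(seg);
		seg->prev = tail;
		tail->next = seg;
		tail = seg;
	}
	*slot = folioq_append(tail, folio);
	return tail;
}
```

With a capacity of four in this model, the fifth append lands in slot 0 of a newly linked segment rather than overrunning the first one.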

A folio can be excised by calling::

	void folioq_clear(struct folio_queue *folioq, unsigned int slot);

This clears the slot in the array and also clears all the marks for that folio,
but doesn't change the folio count - so future accesses of that slot must check
if the slot is occupied.
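
The consequence for consumers can be sketched as follows. This is a userspace model that assumes a cleared slot reads back as NULL; the struct layout, ``MODEL_NR_SLOTS`` and the helper ``occupied_slots()`` are hypothetical, and only one of the three marks is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_SLOTS 4	/* assumed capacity, for illustration only */

struct folio { int id; };	/* dummy stand-in for the kernel's struct folio */

struct folio_queue {		/* userspace model, not the kernel layout */
	struct folio *slots[MODEL_NR_SLOTS];
	unsigned long marks;	/* one bit per slot; a single mark is modelled */
	unsigned int count;
};

/* Modelled on the documented behaviour: empty the slot, keep the count. */
static void folioq_clear(struct folio_queue *folioq, unsigned int slot)
{
	folioq->slots[slot] = NULL;
	folioq->marks &= ~(1UL << slot);
}

/* Hypothetical helper: how many of the filled slots still hold a folio? */
static unsigned int occupied_slots(const struct folio_queue *folioq)
{
	unsigned int n = 0;

	for (unsigned int slot = 0; slot < folioq->count; slot++)
		if (folioq->slots[slot])	/* cleared slots are NULL in this model */
			n++;
	return n;
}
```

After clearing one of three filled slots, the stored count in this model is still three, so any loop bounded by the count has to test each slot before using it.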


Querying information about a folio
==================================

Information about the folio in a particular slot may be queried by the
following function::

	struct folio *folioq_folio(const struct folio_queue *folioq,
				   unsigned int slot);

If a folio has not yet been set in that slot, this may yield an undefined
pointer. The size of the folio in a slot may be queried with either of::

	unsigned int folioq_folio_order(const struct folio_queue *folioq,
					unsigned int slot);

	size_t folioq_folio_size(const struct folio_queue *folioq,
				 unsigned int slot);

The first function returns the size as an order and the second as a number of
bytes.
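
The two results are related by a shift: a folio of order n spans 2^n base pages. A small userspace sketch of that arithmetic follows, assuming 4 KiB base pages; ``MODEL_PAGE_SIZE`` and ``order_to_size()`` are names made up for this example rather than kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB base pages */

/* Hypothetical helper: convert a folio order into a size in bytes. */
static size_t order_to_size(unsigned int order)
{
	return MODEL_PAGE_SIZE << order;
}
```

Under that assumption an order-0 folio is 4096 bytes and an order-2 folio is 16384 bytes.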


Querying information about a folio_queue
========================================

Information may be retrieved about a particular segment with the following
functions::

	unsigned int folioq_nr_slots(const struct folio_queue *folioq);

	unsigned int folioq_count(struct folio_queue *folioq);

	bool folioq_full(struct folio_queue *folioq);

The first function returns the maximum capacity of a segment. It must not be
assumed that this won't vary between segments. The second returns the number
of folios added to a segment and the third is a shorthand to indicate if the
segment has been filled to capacity.

Note that the count and fullness are not affected by clearing folios from the
segment. These are more about indicating how many slots in the array have been
initialised, and it is assumed that slots won't get reused, but rather the
segment will get discarded as the queue is consumed.


Folio marks
===========

Folios within a queue can also have marks assigned to them. These marks can be
used to note information such as if a folio needs folio_put() calling upon it.
There are three marks available to be set for each folio.

The marks can be set by::

	void folioq_mark(struct folio_queue *folioq, unsigned int slot);
	void folioq_mark2(struct folio_queue *folioq, unsigned int slot);
	void folioq_mark3(struct folio_queue *folioq, unsigned int slot);

Cleared by::

	void folioq_unmark(struct folio_queue *folioq, unsigned int slot);
	void folioq_unmark2(struct folio_queue *folioq, unsigned int slot);
	void folioq_unmark3(struct folio_queue *folioq, unsigned int slot);

And the marks can be queried by::

	bool folioq_is_marked(const struct folio_queue *folioq, unsigned int slot);
	bool folioq_is_marked2(const struct folio_queue *folioq, unsigned int slot);
	bool folioq_is_marked3(const struct folio_queue *folioq, unsigned int slot);

The marks can be used for any purpose and are not interpreted by this API.
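
One possible use, following the folio_put() example mentioned above, is to set the first mark on folios whose reference the queue owner must drop and then honour that mark at cleanup. The sketch below is a userspace model only: the bitmask layout, the dummy refcount and ``put_marked_folios()`` are assumptions for illustration, and decrementing the counter merely stands in for folio_put().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_SLOTS 4	/* assumed capacity, for illustration only */

struct folio { int refcount; };	/* dummy: the model only tracks a refcount */

struct folio_queue {		/* userspace model, not the kernel layout */
	struct folio *slots[MODEL_NR_SLOTS];
	unsigned long marks;	/* one bit per slot for the first mark */
	unsigned int count;
};

static void folioq_mark(struct folio_queue *folioq, unsigned int slot)
{
	folioq->marks |= 1UL << slot;
}

static bool folioq_is_marked(const struct folio_queue *folioq,
			     unsigned int slot)
{
	return (folioq->marks & (1UL << slot)) != 0;
}

/* Hypothetical cleanup: drop a reference only on folios carrying the mark. */
static unsigned int put_marked_folios(struct folio_queue *folioq)
{
	unsigned int dropped = 0;

	for (unsigned int slot = 0; slot < folioq->count; slot++) {
		struct folio *folio = folioq->slots[slot];

		if (folio && folioq_is_marked(folioq, slot)) {
			folio->refcount--;	/* stands in for folio_put() */
			dropped++;
		}
	}
	return dropped;
}
```

Since the API does not interpret the marks, the meaning "needs a put" exists purely by convention between the code that sets the mark and the code that cleans up.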


Folio queue iteration
=====================

A list of segments may be iterated over using the I/O iterator facility using
an ``iov_iter`` iterator of ``ITER_FOLIOQ`` type. The iterator may be
initialised with::

	void iov_iter_folio_queue(struct iov_iter *i, unsigned int direction,
				  const struct folio_queue *folioq,
				  unsigned int first_slot, unsigned int offset,
				  size_t count);

The iterator may be told to start at a particular segment, slot and offset
within a queue. The iov iterator functions follow the ``next`` pointers when
advancing and the ``prev`` pointers when reverting, as needed.
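
To make the (segment, slot, offset) position concrete, here is a userspace model of advancing such a position by a byte count, crossing folio and segment boundaries along the way. It is not the kernel's iov_iter implementation: ``struct iter_pos``, ``model_advance()``, the per-slot ``sizes`` array and ``MODEL_NR_SLOTS`` are all hypothetical, and no barriers or error handling are modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_SLOTS 4	/* assumed capacity, for illustration only */

struct folio_queue {		/* userspace model, not the kernel layout */
	struct folio_queue *next;
	struct folio_queue *prev;
	size_t sizes[MODEL_NR_SLOTS];	/* size in bytes of the folio in each slot */
	unsigned int count;		/* number of slots filled */
};

/* Hypothetical position: which segment, which slot, how far into the folio. */
struct iter_pos {
	const struct folio_queue *seg;
	unsigned int slot;
	size_t offset;
};

/* Advance by 'bytes', following next pointers when a segment is used up. */
static struct iter_pos model_advance(struct iter_pos p, size_t bytes)
{
	while (bytes > 0) {
		if (p.slot >= p.seg->count) {
			if (!p.seg->next)
				break;	/* nothing further to move on to */
			p.seg = p.seg->next;
			p.slot = 0;
			p.offset = 0;
			continue;
		}

		size_t left = p.seg->sizes[p.slot] - p.offset;

		if (bytes < left) {
			p.offset += bytes;
			bytes = 0;
		} else {
			bytes -= left;	/* folio finished, go to the next slot */
			p.slot++;
			p.offset = 0;
		}
	}
	return p;
}
```

In this model, consuming everything leaves the position on the last segment with the slot number one past its final folio, so a later call can carry on if another segment has been linked in by then.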


Lockless simultaneous production/consumption issues
===================================================

If properly managed, the list can be extended by the producer at the head end
and shortened by the consumer at the tail end simultaneously without the need
to take locks. The ITER_FOLIOQ iterator inserts appropriate barriers to aid
with this.

Care must be taken when simultaneously producing and consuming a list. If the
last segment is reached and the folios it refers to are entirely consumed by
the IOV iterators, an iov_iter struct will be left pointing to the last segment
with a slot number equal to the capacity of that segment. The iterator will
try to continue on from this if there's another segment available when it is
used again, but care must be taken lest the segment got removed and freed by
the consumer before the iterator was advanced.

It is recommended that the queue always contain at least one segment, even if
that segment has never been filled or is entirely spent. This prevents the
head and tail pointers from collapsing.


API Function Reference
======================

.. kernel-doc:: include/linux/folio_queue.h
9 changes: 0 additions & 9 deletions fs/afs/afs_vl.h
@@ -134,13 +134,4 @@ struct afs_uvldbentry__xdr {
__be32 spares9;
};

-struct afs_address_list {
-refcount_t usage;
-unsigned int version;
-unsigned int nr_addrs;
-struct sockaddr_rxrpc addrs[];
-};
-
-extern void afs_put_address_list(struct afs_address_list *alist);
-
#endif /* AFS_VL_H */
1 change: 1 addition & 0 deletions fs/afs/file.c
@@ -420,6 +420,7 @@ const struct netfs_request_ops afs_req_ops = {
.begin_writeback = afs_begin_writeback,
.prepare_write = afs_prepare_write,
.issue_write = afs_issue_write,
+.retry_request = afs_retry_request,
};

static void afs_add_open_mmap(struct afs_vnode *vnode)
2 changes: 1 addition & 1 deletion fs/afs/fs_operation.c
@@ -201,7 +201,7 @@ void afs_wait_for_operation(struct afs_operation *op)
}
}

-if (op->call_responded)
+if (op->call_responded && op->server)
set_bit(AFS_SERVER_FL_RESPONDING, &op->server->flags);

if (!afs_op_error(op)) {
4 changes: 2 additions & 2 deletions fs/afs/fs_probe.c
@@ -506,10 +506,10 @@ int afs_wait_for_one_fs_probe(struct afs_server *server, struct afs_endpoint_sta
finish_wait(&server->probe_wq, &wait);

dont_wait:
-if (estate->responsive_set & ~exclude)
-return 1;
if (test_bit(AFS_ESTATE_SUPERSEDED, &estate->flags))
return 0;
+if (estate->responsive_set & ~exclude)
+return 1;
if (is_intr && signal_pending(current))
return -ERESTARTSYS;
if (timo == 0)
11 changes: 8 additions & 3 deletions fs/afs/rotate.c
@@ -632,8 +632,10 @@ bool afs_select_fileserver(struct afs_operation *op)
wait_for_more_probe_results:
error = afs_wait_for_one_fs_probe(op->server, op->estate, op->addr_tried,
!(op->flags & AFS_OPERATION_UNINTR));
-if (!error)
+if (error == 1)
goto iterate_address;
+if (!error)
+goto restart_from_beginning;

/* We've now had a failure to respond on all of a server's addresses -
* immediately probe them again and consider retrying the server.
@@ -644,10 +646,13 @@ bool afs_select_fileserver(struct afs_operation *op)
error = afs_wait_for_one_fs_probe(op->server, op->estate, op->addr_tried,
!(op->flags & AFS_OPERATION_UNINTR));
switch (error) {
-case 0:
+case 1:
op->flags &= ~AFS_OPERATION_RETRY_SERVER;
-trace_afs_rotate(op, afs_rotate_trace_retry_server, 0);
+trace_afs_rotate(op, afs_rotate_trace_retry_server, 1);
goto retry_server;
+case 0:
+trace_afs_rotate(op, afs_rotate_trace_retry_server, 0);
+goto restart_from_beginning;
case -ERESTARTSYS:
afs_op_set_error(op, error);
goto failed;
7 changes: 3 additions & 4 deletions fs/cachefiles/namei.c
@@ -595,14 +595,12 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
* write and readdir but not lookup or open).
*/
touch_atime(&file->f_path);
-dput(dentry);
return true;

check_failed:
fscache_cookie_lookup_negative(object->cookie);
cachefiles_unmark_inode_in_use(object, file);
fput(file);
-dput(dentry);
if (ret == -ESTALE)
return cachefiles_create_file(object);
return false;
@@ -611,7 +609,6 @@ static bool cachefiles_open_file(struct cachefiles_object *object,
fput(file);
error:
cachefiles_do_unmark_inode_in_use(object, d_inode(dentry));
-dput(dentry);
return false;
}

@@ -654,7 +651,9 @@ bool cachefiles_look_up_object(struct cachefiles_object *object)
goto new_file;
}

-if (!cachefiles_open_file(object, dentry))
+ret = cachefiles_open_file(object, dentry);
+dput(dentry);
+if (!ret)
return false;

_leave(" = t [%lu]", file_inode(object->file)->i_ino);
12 changes: 9 additions & 3 deletions fs/netfs/write_issue.c
@@ -317,6 +317,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
struct netfs_io_stream *stream;
struct netfs_group *fgroup; /* TODO: Use this with ceph */
struct netfs_folio *finfo;
+size_t iter_off = 0;
size_t fsize = folio_size(folio), flen = fsize, foff = 0;
loff_t fpos = folio_pos(folio), i_size;
bool to_eof = false, streamw = false;
@@ -472,7 +473,12 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
if (choose_s < 0)
break;
stream = &wreq->io_streams[choose_s];
-wreq->io_iter.iov_offset = stream->submit_off;
+
+/* Advance the iterator(s). */
+if (stream->submit_off > iter_off) {
+iov_iter_advance(&wreq->io_iter, stream->submit_off - iter_off);
+iter_off = stream->submit_off;
+}

atomic64_set(&wreq->issued_to, fpos + stream->submit_off);
stream->submit_extendable_to = fsize - stream->submit_off;
@@ -487,8 +493,8 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
debug = true;
}

-wreq->io_iter.iov_offset = 0;
-iov_iter_advance(&wreq->io_iter, fsize);
+if (fsize > iter_off)
+iov_iter_advance(&wreq->io_iter, fsize - iter_off);
atomic64_set(&wreq->issued_to, fpos + fsize);

if (!debug)
5 changes: 4 additions & 1 deletion fs/pidfs.c
@@ -120,6 +120,7 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
struct nsproxy *nsp __free(put_nsproxy) = NULL;
struct pid *pid = pidfd_pid(file);
struct ns_common *ns_common = NULL;
+struct pid_namespace *pid_ns;

if (arg)
return -EINVAL;
@@ -202,7 +203,9 @@ static long pidfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case PIDFD_GET_PID_NAMESPACE:
if (IS_ENABLED(CONFIG_PID_NS)) {
rcu_read_lock();
-ns_common = to_ns_common(get_pid_ns(task_active_pid_ns(task)));
+pid_ns = task_active_pid_ns(task);
+if (pid_ns)
+ns_common = to_ns_common(get_pid_ns(pid_ns));
rcu_read_unlock();
}
break;