ublk: document zero copy feature
Add words to explain how zero copy feature works, and why it has to be
trusted for handling IO read command.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250327095123.179113-8-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei authored and Jens Axboe committed Mar 28, 2025
1 parent ebf695f commit 1797020
Showing 1 changed file with 26 additions and 9 deletions.
35 changes: 26 additions & 9 deletions Documentation/block/ublk.rst
@@ -309,18 +309,35 @@ with specified IO tag in the command data:
``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
the server buffer (pages) read to the IO request pages.

Future development
==================

Zero copy
---------

Zero copy is a generic requirement for nbd, fuse or similar drivers. A
problem [#xiaoguang]_ Xiaoguang mentioned is that pages mapped to userspace
can't be remapped any more in kernel with existing mm interfaces. This can
occur when directing direct IO to ``/dev/ublkb*``. Also, he reported that
big requests (IO size >= 256 KB) may benefit a lot from zero copy.

ublk zero copy relies on io_uring's fixed kernel buffer, which provides
two APIs: `io_buffer_register_bvec()` and `io_buffer_unregister_bvec()`.

ublk adds the IO command `UBLK_IO_REGISTER_IO_BUF`, which calls
`io_buffer_register_bvec()` so that the ublk server can register the client
request buffer in the io_uring buffer table. The ublk server can then submit
io_uring IOs using the registered buffer index. The IO command
`UBLK_IO_UNREGISTER_IO_BUF` calls `io_buffer_unregister_bvec()` to
unregister the buffer. The buffer is guaranteed to be live between the calls
to `io_buffer_register_bvec()` and `io_buffer_unregister_bvec()`. Any
io_uring operation that supports this kind of kernel buffer holds one
reference to the buffer until the operation completes.
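
The lifetime rule described above can be pictured with a small userspace
model. This is only a sketch: `struct model_buf` and the `model_*()` helpers
are hypothetical names invented for illustration, and the real reference
handling lives inside io_uring. It assumes the buffer table slot holds one
reference and each in-flight operation holds one more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one registered buffer; not kernel code. */
struct model_buf {
	bool registered;	/* slot present in the buffer table */
	int refs;		/* table reference plus one per in-flight op */
};

/* Drop one reference; returns true when the buffer is finally released. */
static bool model_put(struct model_buf *b)
{
	assert(b->refs > 0);
	return --b->refs == 0;
}

/* Models UBLK_IO_REGISTER_IO_BUF: the table slot holds the first reference. */
static void model_register(struct model_buf *b)
{
	assert(!b->registered);
	b->registered = true;
	b->refs = 1;
}

/* Models an io_uring operation looking up the buffer by its index. */
static bool model_op_start(struct model_buf *b)
{
	if (!b->registered)
		return false;	/* the index no longer resolves to a buffer */
	b->refs++;
	return true;
}

/* Models completion of an operation started with model_op_start(). */
static bool model_op_complete(struct model_buf *b)
{
	return model_put(b);
}

/* Models UBLK_IO_UNREGISTER_IO_BUF: the slot is removed from the table. */
static bool model_unregister(struct model_buf *b)
{
	assert(b->registered);
	b->registered = false;
	return model_put(b);
}
```

In this model, unregistering while an operation is still in flight does not
release the buffer; the release happens when that last operation completes.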

A ublk server implementing zero copy or user copy has to have CAP_SYS_ADMIN
and be trusted, because the ublk server is responsible for making sure the IO
buffer is filled with data when handling a READ command. The ublk server also
has to return the correct result to the ublk driver, and that result has to
match the number of bytes filled into the IO buffer. Otherwise, an
uninitialized kernel IO buffer will be exposed to the client application.
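
As a minimal sketch of the rule above, a server could derive the result it
reports for a READ from what the backend actually filled, rather than from
the requested size. The helper name `read_result()` and its parameters are
hypothetical and not part of the ublk API.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: choose the result to report for a READ command.
 *
 * backend_ret: return value of the backend read (bytes filled or -errno)
 * req_bytes:   size of the READ request handed to the server
 */
static int read_result(ssize_t backend_ret, size_t req_bytes)
{
	assert(req_bytes <= INT_MAX);	/* the result must fit a signed 32-bit value */

	if (backend_ret < 0)
		return (int)backend_ret;	/* propagate the backend error */
	if ((size_t)backend_ret > req_bytes)
		return -EIO;			/* backend claims more than was requested */
	return (int)backend_ret;		/* report only the bytes actually filled */
}
```

Reporting `req_bytes` unconditionally after a short backend read would claim
that the unfilled tail of the buffer holds valid data, which is exactly the
exposure the paragraph above warns about.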

The ublk server needs to align the parameter in `struct ublk_param_dma_align`
with the backend for zero copy to work correctly.
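
The following sketch shows the kind of check that motivates matching the DMA
alignment to the backend. It assumes the alignment is expressed as a mask
(the value plus one is a power of two), in the same way as the block layer's
`dma_alignment` queue limit. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A mask is valid when mask + 1 is a power of two (0, 1, 3, ..., 511, ...). */
static bool dma_mask_valid(uint32_t mask)
{
	return (((uint64_t)mask + 1) & mask) == 0;
}

/*
 * Hypothetical check: would a buffer at 'addr' with length 'len' satisfy a
 * backend whose DMA alignment mask is 'mask'?
 */
static bool buf_dma_aligned(uint64_t addr, uint32_t len, uint32_t mask)
{
	assert(dma_mask_valid(mask));
	return ((addr | len) & mask) == 0;
}
```

If the ublk device advertises a looser mask than the backend requires, the
client may submit buffers that pass the first check but fail the second, and
with zero copy the server has no bounce buffer in which to fix them up.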

For best IO performance, the ublk server should align the segment parameters
in `struct ublk_param_segment` with the backend to avoid unnecessary IO
splits, which usually hurt io_uring performance.
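
To see why mismatched segment limits cause splits, here is a deliberately
simplified arithmetic model. It ignores segment boundary masks and merging,
and `backend_requests()` is a hypothetical helper, not how the block layer
actually computes splits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model: each segment is cut into pieces of at most
 * 'max_seg_size' bytes, and the pieces are grouped into backend requests
 * holding at most 'max_segs' pieces each.
 */
static uint64_t backend_requests(const uint32_t *seg_len, size_t nsegs,
				 uint32_t max_seg_size, uint16_t max_segs)
{
	uint64_t pieces = 0;
	size_t i;

	assert(max_seg_size > 0 && max_segs > 0);

	for (i = 0; i < nsegs; i++)
		pieces += ((uint64_t)seg_len[i] + max_seg_size - 1) / max_seg_size;

	return (pieces + max_segs - 1) / max_segs;
}
```

For example, two 64 KiB segments fit one backend request when the backend
accepts 64 KiB segments and two segments per request, but need two requests
in this model when the backend only accepts 32 KiB segments.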

References
==========