vsock/virtio: avoid queuing packets when intermediate queue is empty

When the driver needs to send new packets to the device, it always
queues the new sk_buffs into an intermediate queue (send_pkt_queue)
and schedules a worker (send_pkt_work) to then queue them into the
virtqueue exposed to the device.

This increases the chance of batching, but also introduces a lot of
latency into the communication. So we can optimize this path by
adding a fast path to be taken when there is no element in the
intermediate queue, there is space available in the virtqueue,
and no other process is sending packets (tx_lock is not held).
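
The three conditions above can be modelled in plain userspace C. This is
only a sketch under stated assumptions, not the kernel implementation:
struct model_vsock and every model_* name are hypothetical, an atomic_flag
stands in for the tx_lock mutex, and integer counters stand in for the
intermediate queue and the virtqueue.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum send_path { PATH_FAST, PATH_QUEUED };

struct model_vsock {
    atomic_flag tx_lock;     /* stands in for vsock->tx_lock */
    int send_pkt_queue_len;  /* packets waiting in the intermediate queue */
    int vq_free_slots;       /* free entries in the TX virtqueue */
    int vq_sent;             /* packets placed directly on the virtqueue */
};

/* Non-blocking lock attempt: returns true only if the lock was free. */
static bool model_trylock(atomic_flag *lock)
{
    return !atomic_flag_test_and_set(lock);
}

static void model_unlock(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}

/* Try to put one packet on the virtqueue without ever waiting.
 * Returns 0 on success, -EBUSY if another sender holds the lock,
 * -ENOSPC if the virtqueue has no free slot.
 */
static int model_send_fast_path(struct model_vsock *v)
{
    int ret = 0;

    if (!model_trylock(&v->tx_lock))
        return -EBUSY;

    if (v->vq_free_slots > 0) {
        v->vq_free_slots--;
        v->vq_sent++;
    } else {
        ret = -ENOSPC;
    }

    model_unlock(&v->tx_lock);
    return ret;
}

/* Take the fast path only when the intermediate queue is empty, so a new
 * packet can never overtake one that is already waiting; otherwise, or if
 * the fast path fails, fall back to the intermediate queue.
 */
static enum send_path model_send_pkt(struct model_vsock *v)
{
    assert(v != NULL);

    if (v->send_pkt_queue_len != 0 || model_send_fast_path(v) != 0) {
        v->send_pkt_queue_len++;  /* the worker would drain this later */
        return PATH_QUEUED;
    }
    return PATH_FAST;
}
```

In this model, model_send_pkt() returns PATH_FAST only when all three
conditions hold. If any one of them is violated, the packet is appended to
the intermediate queue, as before this change.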

The following benchmarks were run to check improvements in latency and
throughput. The test bed is a host with Intel i7-10700KF CPU @ 3.80GHz
and L1 guest running on QEMU/KVM with vhost process and all vCPUs
pinned individually to pCPUs.

- Latency
   Tool: Fio version 3.37-56
   Mode: pingpong (h-g-h)
   Test runs: 50
   Runtime-per-test: 50s
   Type: SOCK_STREAM

In the following fio benchmark (pingpong mode) the host sends
a payload to the guest and waits for the same payload back.

The fio process is pinned on both the host and the guest.

Before: Linux 6.9.8

Payload 64B:

        1st perc.   overall   99th perc.
Before  12.91       16.78     42.24       us
After    9.77       13.57     39.17       us

Payload 512B:

        1st perc.   overall   99th perc.
Before  13.35       17.35     41.52       us
After   10.25       14.11     39.58       us

Payload 4K:

        1st perc.   overall   99th perc.
Before  14.71       19.87     41.52       us
After   10.51       14.96     40.81       us

- Throughput
   Tool: iperf-vsock

The size represents the buffer length (-l) to read/write.
P represents the number of parallel streams.

P=1
        4K      64K     128K
Before  6.87    29.3    29.5    Gb/s
After   10.5    39.4    39.9    Gb/s

P=2
        4K      64K     128K
Before  10.5    32.8    33.2    Gb/s
After   17.8    47.7    48.5    Gb/s

P=4
        4K      64K     128K
Before  12.7    33.6    34.2    Gb/s
After   16.9    48.1    50.5    Gb/s
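
To compare the Before and After rows, the relative change can be computed
as (after - before) / before. A minimal sketch in C follows; percent_change
is a hypothetical helper name, and the values passed to it are meant to be
taken from the tables above.

```c
#include <assert.h>

/* Relative change of `after` with respect to `before`, in percent.
 * Positive means an increase (good for throughput), negative means a
 * decrease (good for latency). Hypothetical helper, for illustration only.
 */
static double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

For example, percent_change(6.87, 10.5) evaluates the P=1, 4K throughput
column (a gain of roughly 53%), and percent_change(16.78, 13.57) evaluates
the overall latency for the 64B payload (roughly 19% lower).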

The performance improvement is related to this optimization: I used
an eBPF kretprobe on virtio_transport_send_skb to check that each
packet was sent directly to the virtqueue.

Co-developed-by: Marco Pinna <marco.pinn95@gmail.com>
Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Message-Id: <20240730-pinna-v4-2-5c9179164db5@outlook.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Luigi Leonardi authored and Michael S. Tsirkin committed Sep 25, 2024
1 parent 26618da commit efcd71a
Showing 1 changed file with 35 additions and 4 deletions.
39 changes: 35 additions & 4 deletions net/vmw_vsock/virtio_transport.c
@@ -208,6 +208,28 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 		queue_work(virtio_vsock_workqueue, &vsock->rx_work);
 }
 
+/* Caller need to hold RCU for vsock.
+ * Returns 0 if the packet is successfully put on the vq.
+ */
+static int virtio_transport_send_skb_fast_path(struct virtio_vsock *vsock, struct sk_buff *skb)
+{
+	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_TX];
+	int ret;
+
+	/* Inside RCU, can't sleep! */
+	ret = mutex_trylock(&vsock->tx_lock);
+	if (unlikely(ret == 0))
+		return -EBUSY;
+
+	ret = virtio_transport_send_skb(skb, vq, vsock);
+	if (ret == 0)
+		virtqueue_kick(vq);
+
+	mutex_unlock(&vsock->tx_lock);
+
+	return ret;
+}
+
 static int
 virtio_transport_send_pkt(struct sk_buff *skb)
 {
@@ -231,11 +253,20 @@ virtio_transport_send_pkt(struct sk_buff *skb)
 		goto out_rcu;
 	}
 
-	if (virtio_vsock_skb_reply(skb))
-		atomic_inc(&vsock->queued_replies);
+	/* If send_pkt_queue is empty, we can safely bypass this queue
+	 * because packet order is maintained and (try) to put the packet
+	 * on the virtqueue using virtio_transport_send_skb_fast_path.
+	 * If this fails we simply put the packet on the intermediate
+	 * queue and schedule the worker.
+	 */
+	if (!skb_queue_empty_lockless(&vsock->send_pkt_queue) ||
+	    virtio_transport_send_skb_fast_path(vsock, skb)) {
+		if (virtio_vsock_skb_reply(skb))
+			atomic_inc(&vsock->queued_replies);
 
-	virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
-	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
+		virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
+		queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
+	}
 
 out_rcu:
 	rcu_read_unlock();