Skip to content

Commit

Permalink
nvme-rdma: Fix max_hw_sectors calculation
Browse files Browse the repository at this point in the history
By default, the NVMe/RDMA driver should support max io_size of 1MiB (or
upto the maximum supported size by the HCA). Currently, one will see that
/sys/class/block/<bdev>/queue/max_hw_sectors_kb is 1020 instead of 1024.

A non power of 2 value can cause performance degradation due to
unnecessary splitting of IO requests and unoptimized allocation units.

The number of pages per MR has been fixed here, so there is no longer any
need to reduce max_sectors by 1.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  • Loading branch information
Max Gurtovoy authored and Sagi Grimberg committed Sep 25, 2019
1 parent bc4f6e0 commit ff13c1b
Showing 1 changed file with 11 additions and 5 deletions.
16 changes: 11 additions & 5 deletions drivers/nvme/host/rdma.c
Original file line number Diff line number Diff line change
Expand Up @@ -427,7 +427,7 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
static int nvme_rdma_get_max_fr_pages(struct ib_device *ibdev)
{
return min_t(u32, NVME_RDMA_MAX_SEGMENTS,
ibdev->attrs.max_fast_reg_page_list_len);
ibdev->attrs.max_fast_reg_page_list_len - 1);
}

static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
Expand All @@ -437,7 +437,7 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
const int cq_factor = send_wr_factor + 1; /* + RECV */
int comp_vector, idx = nvme_rdma_queue_idx(queue);
enum ib_poll_context poll_ctx;
int ret;
int ret, pages_per_mr;

queue->device = nvme_rdma_find_get_device(queue->cm_id);
if (!queue->device) {
Expand Down Expand Up @@ -479,10 +479,16 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
goto out_destroy_qp;
}

/*
* Currently we don't use SG_GAPS MR's so if the first entry is
* misaligned we'll end up using two entries for a single data page,
* so one additional entry is required.
*/
pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev) + 1;
ret = ib_mr_pool_init(queue->qp, &queue->qp->rdma_mrs,
queue->queue_size,
IB_MR_TYPE_MEM_REG,
nvme_rdma_get_max_fr_pages(ibdev), 0);
pages_per_mr, 0);
if (ret) {
dev_err(queue->ctrl->ctrl.device,
"failed to initialize MR pool sized %d for QID %d\n",
Expand Down Expand Up @@ -820,8 +826,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
if (error)
goto out_stop_queue;

ctrl->ctrl.max_hw_sectors =
(ctrl->max_fr_pages - 1) << (ilog2(SZ_4K) - 9);
ctrl->ctrl.max_segments = ctrl->max_fr_pages;
ctrl->ctrl.max_hw_sectors = ctrl->max_fr_pages << (ilog2(SZ_4K) - 9);

blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);

Expand Down

0 comments on commit ff13c1b

Please sign in to comment.