Skip to content

Commit

Permalink
scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.
Browse files Browse the repository at this point in the history
Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI
to contain the exchanges Scatter/Gather List. This caps the number of SGL
elements that can be in the SGL. There are not extensions to extend the
list out of the 2 pages.

The G7 hardware adds a SGE type that allows the SGL to be vectored to a
different scatter/gather list segment. And that segment can contain a SGE
to go to another segment and so on.  The initial segment must still be
pre-registered for the XRI, but it can be a much smaller amount (256Bytes)
as it can now be dynamically grown.  This much smaller allocation can
handle the SG list for most normal I/O, and the dynamic aspect allows it to
support many MB's if needed.

The implementation creates a pool which contains "segments" and which is
initially sized to hold the initial small segment per xri. If an I/O
requires additional segments, they are allocated from the pool.  If the
pool has no more segments, the pool is grown based on what is now
needed. After the I/O completes, the additional segments are returned to
the pool for use by other I/Os. Once allocated, the additional segments are
not released under the assumption of "if needed once, it will be needed
again". Pools are kept on a per-hardware queue basis, which is typically
1:1 per cpu, but may be shared by multiple cpus.

The switch to the smaller initial allocation significantly reduces the
memory footprint of the driver (which only grows if large ios are
issued). Based on the several K of XRIs for the adapter, the 8KB->256B
reduction can conserve 32MBs or more.

It has been observed with per-cpu resource pools that allocating a resource
on CPU A, may be put back on CPU B. While the get routines are distributed
evenly, only a limited subset of CPUs may be handling the put routines.
This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because
all the resources are being put on a limited subset of CPUs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  • Loading branch information
James Smart authored and Martin K. Petersen committed Aug 20, 2019
1 parent e62245d commit d79c9e9
Show file tree
Hide file tree
Showing 9 changed files with 904 additions and 248 deletions.
5 changes: 5 additions & 0 deletions drivers/scsi/lpfc/lpfc.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ struct lpfc_sli2_slim;
cmnd for menlo needs nearly twice as for firmware
downloads using bsg */

#define LPFC_DEFAULT_XPSGL_SIZE 256
#define LPFC_MAX_SG_TABLESIZE 0xffff
#define LPFC_MIN_SG_SLI4_BUF_SZ 0x800 /* based on LPFC_DEFAULT_SG_SEG_CNT */
#define LPFC_MAX_BG_SLI4_SEG_CNT_DIF 128 /* sg element count for BlockGuard */
#define LPFC_MAX_SG_SEG_CNT_DIF 512 /* sg element count per scsi cmnd */
Expand Down Expand Up @@ -799,6 +801,7 @@ struct lpfc_hba {
/* HBA Config Parameters */
uint32_t cfg_ack0;
uint32_t cfg_xri_rebalancing;
uint32_t cfg_xpsgl;
uint32_t cfg_enable_npiv;
uint32_t cfg_enable_rrq;
uint32_t cfg_topology;
Expand Down Expand Up @@ -904,6 +907,7 @@ struct lpfc_hba {
wait_queue_head_t work_waitq;
struct task_struct *worker_thread;
unsigned long data_flags;
uint32_t border_sge_num;

uint32_t hbq_in_use; /* HBQs in use flag */
uint32_t hbq_count; /* Count of configured HBQs */
Expand Down Expand Up @@ -986,6 +990,7 @@ struct lpfc_hba {
struct dma_pool *lpfc_nvmet_drb_pool; /* data receive buffer pool */
struct dma_pool *lpfc_hbq_pool; /* SLI3 hbq buffer pool */
struct dma_pool *txrdy_payload_pool;
struct dma_pool *lpfc_cmd_rsp_buf_pool;
struct lpfc_dma_pool lpfc_mbuf_safety_pool;

mempool_t *mbox_mem_pool;
Expand Down
20 changes: 20 additions & 0 deletions drivers/scsi/lpfc/lpfc_hw4.h
Original file line number Diff line number Diff line change
Expand Up @@ -2050,6 +2050,23 @@ struct sli4_sge { /* SLI-4 */
uint32_t sge_len;
};

struct sli4_hybrid_sgl {
struct list_head list_node;
struct sli4_sge *dma_sgl;
dma_addr_t dma_phys_sgl;
};

struct fcp_cmd_rsp_buf {
struct list_head list_node;

/* for storing cmd/rsp dma alloc'ed virt_addr */
struct fcp_cmnd *fcp_cmnd;
struct fcp_rsp *fcp_rsp;

/* for storing this cmd/rsp's dma mapped phys addr from per CPU pool */
dma_addr_t fcp_cmd_rsp_dma_handle;
};

struct sli4_sge_diseed { /* SLI-4 */
uint32_t ref_tag;
uint32_t ref_tag_tran;
Expand Down Expand Up @@ -3449,6 +3466,9 @@ struct lpfc_sli4_parameters {
#define cfg_xib_SHIFT 4
#define cfg_xib_MASK 0x00000001
#define cfg_xib_WORD word19
#define cfg_xpsgl_SHIFT 6
#define cfg_xpsgl_MASK 0x00000001
#define cfg_xpsgl_WORD word19
#define cfg_eqdr_SHIFT 8
#define cfg_eqdr_MASK 0x00000001
#define cfg_eqdr_WORD word19
Expand Down
Loading

0 comments on commit d79c9e9

Please sign in to comment.