Skip to content

Commit

Permalink
---
Browse files Browse the repository at this point in the history
yaml
---
r: 104120
b: refs/heads/master
c: e86322f
h: refs/heads/master
v: v3
  • Loading branch information
J. Bruce Fields committed Jul 3, 2008
1 parent 4f2326a commit 61303a9
Show file tree
Hide file tree
Showing 26 changed files with 619 additions and 540 deletions.
2 changes: 1 addition & 1 deletion [refs]
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
---
refs/heads/master: 8948896c9e098c6fd31a6a698a598a7cbd7fa40e
refs/heads/master: e86322f611eef95aafaf726fd3965e5b211f1985
103 changes: 59 additions & 44 deletions trunk/Documentation/filesystems/nfs-rdma.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
################################################################################

Author: NetApp and Open Grid Computing
Date: April 15, 2008
Date: May 29, 2008

Table of Contents
~~~~~~~~~~~~~~~~~
Expand Down Expand Up @@ -60,39 +60,52 @@ Installation
The procedures described in this document have been tested with
distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

- Install nfs-utils-1.1.1 or greater on the client
- Install nfs-utils-1.1.2 or greater on the client

An NFS/RDMA mount point can only be obtained by using the mount.nfs
command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs
you are using, type:
An NFS/RDMA mount point can be obtained by using the mount.nfs command in
nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
version with support for NFS/RDMA mounts, but for various reasons we
recommend using nfs-utils-1.1.2 or greater). To see which version of
mount.nfs you are using, type:

> /sbin/mount.nfs -V
$ /sbin/mount.nfs -V

If the version is less than 1.1.1 or the command does not exist,
then you will need to install the latest version of nfs-utils.
If the version is less than 1.1.2 or the command does not exist,
you should install the latest version of nfs-utils.

Download the latest package from:

http://www.kernel.org/pub/linux/utils/nfs

Uncompress the package and follow the installation instructions.

If you will not be using GSS and NFSv4, the installation process
can be simplified by disabling these features when running configure:
If you will not need the idmapper and gssd executables (you do not need
these to create an NFS/RDMA enabled mount command), the installation
process can be simplified by disabling these features when running
configure:

> ./configure --disable-gss --disable-nfsv4
$ ./configure --disable-gss --disable-nfsv4

For more information on this see the package's README and INSTALL files.
To build nfs-utils you will need the tcp_wrappers package installed. For
more information on this see the package's README and INSTALL files.

After building the nfs-utils package, there will be a mount.nfs binary in
the utils/mount directory. This binary can be used to initiate NFS v2, v3,
or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4.
The standard technique is to create a symlink called mount.nfs4 to mount.nfs.
or v4 mounts. To initiate a v4 mount, the binary must be called
mount.nfs4. The standard technique is to create a symlink called
mount.nfs4 to mount.nfs.

NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed
This mount.nfs binary should be installed at /sbin/mount.nfs as follows:

$ sudo cp utils/mount/mount.nfs /sbin/mount.nfs

In this location, mount.nfs will be invoked automatically for NFS mounts
by the system mount commmand.

NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed
on the NFS client machine. You do not need this specific version of
nfs-utils on the server. Furthermore, only the mount.nfs command from
nfs-utils-1.1.1 is needed on the client.
nfs-utils-1.1.2 is needed on the client.

- Install a Linux kernel with NFS/RDMA

Expand Down Expand Up @@ -156,8 +169,8 @@ Check RDMA and NFS Setup
this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
card:

> modprobe ib_mthca
> modprobe ib_ipoib
$ modprobe ib_mthca
$ modprobe ib_ipoib

If you are using InfiniBand, make sure there is a Subnet Manager (SM)
running on the network. If your IB switch has an embedded SM, you can
Expand All @@ -166,18 +179,18 @@ Check RDMA and NFS Setup

If an SM is running on your network, you should see the following:

> cat /sys/class/infiniband/driverX/ports/1/state
$ cat /sys/class/infiniband/driverX/ports/1/state
4: ACTIVE

where driverX is mthca0, ipath5, ehca3, etc.

To further test the InfiniBand software stack, use IPoIB (this
assumes you have two IB hosts named host1 and host2):

host1> ifconfig ib0 a.b.c.x
host2> ifconfig ib0 a.b.c.y
host1> ping a.b.c.y
host2> ping a.b.c.x
host1$ ifconfig ib0 a.b.c.x
host2$ ifconfig ib0 a.b.c.y
host1$ ping a.b.c.y
host2$ ping a.b.c.x

For other device types, follow the appropriate procedures.

Expand All @@ -202,55 +215,57 @@ NFS/RDMA Setup
/vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
/vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)

The IP address(es) is(are) the client's IPoIB address for an InfiniBand HCA or the
cleint's iWARP address(es) for an RNIC.
The IP address(es) is(are) the client's IPoIB address for an InfiniBand
HCA or the cleint's iWARP address(es) for an RNIC.

NOTE: The "insecure" option must be used because the NFS/RDMA client does not
use a reserved port.
NOTE: The "insecure" option must be used because the NFS/RDMA client does
not use a reserved port.

Each time a machine boots:

- Load and configure the RDMA drivers

For InfiniBand using a Mellanox adapter:

> modprobe ib_mthca
> modprobe ib_ipoib
> ifconfig ib0 a.b.c.d
$ modprobe ib_mthca
$ modprobe ib_ipoib
$ ifconfig ib0 a.b.c.d

NOTE: use unique addresses for the client and server

- Start the NFS server

If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA transport module:
If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
kernel config), load the RDMA transport module:

> modprobe svcrdma
$ modprobe svcrdma

Regardless of how the server was built (module or built-in), start the server:
Regardless of how the server was built (module or built-in), start the
server:

> /etc/init.d/nfs start
$ /etc/init.d/nfs start

or

> service nfs start
$ service nfs start

Instruct the server to listen on the RDMA transport:

> echo rdma 2050 > /proc/fs/nfsd/portlist
$ echo rdma 2050 > /proc/fs/nfsd/portlist

- On the client system

If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA client module:
If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
kernel config), load the RDMA client module:

> modprobe xprtrdma.ko
$ modprobe xprtrdma.ko

Regardless of how the client was built (module or built-in), issue the mount.nfs command:
Regardless of how the client was built (module or built-in), use this
command to mount the NFS/RDMA server:

> /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050
$ mount -o rdma,port=2050 <IPoIB-server-name-or-address>:/<export> /mnt

To verify that the mount is using RDMA, run "cat /proc/mounts" and check the
"proto" field for the given mount.
To verify that the mount is using RDMA, run "cat /proc/mounts" and check
the "proto" field for the given mount.

Congratulations! You're using NFS/RDMA!
33 changes: 13 additions & 20 deletions trunk/fs/lockd/svc.c
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ EXPORT_SYMBOL(nlmsvc_ops);
static DEFINE_MUTEX(nlmsvc_mutex);
static unsigned int nlmsvc_users;
static struct task_struct *nlmsvc_task;
static struct svc_serv *nlmsvc_serv;
static struct svc_rqst *nlmsvc_rqst;
int nlmsvc_grace_period;
unsigned long nlmsvc_timeout;

Expand Down Expand Up @@ -194,20 +194,11 @@ lockd(void *vrqstp)

svc_process(rqstp);
}

flush_signals(current);
if (nlmsvc_ops)
nlmsvc_invalidate_all();
nlm_shutdown_hosts();

unlock_kernel();

nlmsvc_task = NULL;
nlmsvc_serv = NULL;

/* Exit the RPC thread */
svc_exit_thread(rqstp);

return 0;
}

Expand Down Expand Up @@ -254,16 +245,15 @@ int
lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
{
struct svc_serv *serv;
struct svc_rqst *rqstp;
int error = 0;

mutex_lock(&nlmsvc_mutex);
/*
* Check whether we're already up and running.
*/
if (nlmsvc_serv) {
if (nlmsvc_rqst) {
if (proto)
error = make_socks(nlmsvc_serv, proto);
error = make_socks(nlmsvc_rqst->rq_server, proto);
goto out;
}

Expand All @@ -288,26 +278,26 @@ lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
/*
* Create the kernel thread and wait for it to start.
*/
rqstp = svc_prepare_thread(serv, &serv->sv_pools[0]);
if (IS_ERR(rqstp)) {
error = PTR_ERR(rqstp);
nlmsvc_rqst = svc_prepare_thread(serv, &serv->sv_pools[0]);
if (IS_ERR(nlmsvc_rqst)) {
error = PTR_ERR(nlmsvc_rqst);
nlmsvc_rqst = NULL;
printk(KERN_WARNING
"lockd_up: svc_rqst allocation failed, error=%d\n",
error);
goto destroy_and_out;
}

svc_sock_update_bufs(serv);
nlmsvc_serv = rqstp->rq_server;

nlmsvc_task = kthread_run(lockd, rqstp, serv->sv_name);
nlmsvc_task = kthread_run(lockd, nlmsvc_rqst, serv->sv_name);
if (IS_ERR(nlmsvc_task)) {
error = PTR_ERR(nlmsvc_task);
svc_exit_thread(nlmsvc_rqst);
nlmsvc_task = NULL;
nlmsvc_serv = NULL;
nlmsvc_rqst = NULL;
printk(KERN_WARNING
"lockd_up: kthread_run failed, error=%d\n", error);
svc_exit_thread(rqstp);
goto destroy_and_out;
}

Expand Down Expand Up @@ -346,6 +336,9 @@ lockd_down(void)
BUG();
}
kthread_stop(nlmsvc_task);
svc_exit_thread(nlmsvc_rqst);
nlmsvc_task = NULL;
nlmsvc_rqst = NULL;
out:
mutex_unlock(&nlmsvc_mutex);
}
Expand Down
2 changes: 1 addition & 1 deletion trunk/fs/nfsd/lockd.c
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ nlm_fopen(struct svc_rqst *rqstp, struct nfs_fh *f, struct file **filp)
fh.fh_export = NULL;

exp_readlock();
nfserr = nfsd_open(rqstp, &fh, S_IFREG, MAY_LOCK, filp);
nfserr = nfsd_open(rqstp, &fh, S_IFREG, NFSD_MAY_LOCK, filp);
fh_put(&fh);
rqstp->rq_client = NULL;
exp_readunlock();
Expand Down
7 changes: 4 additions & 3 deletions trunk/fs/nfsd/nfs2acl.c
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,8 @@ static __be32 nfsacld_proc_getacl(struct svc_rqst * rqstp,
dprintk("nfsd: GETACL(2acl) %s\n", SVCFH_fmt(&argp->fh));

fh = fh_copy(&resp->fh, &argp->fh);
if ((nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP)))
nfserr = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_NOP);
if (nfserr)
RETURN_STATUS(nfserr);

if (argp->mask & ~(NFS_ACL|NFS_ACLCNT|NFS_DFACL|NFS_DFACLCNT))
Expand Down Expand Up @@ -107,7 +108,7 @@ static __be32 nfsacld_proc_setacl(struct svc_rqst * rqstp,
dprintk("nfsd: SETACL(2acl) %s\n", SVCFH_fmt(&argp->fh));

fh = fh_copy(&resp->fh, &argp->fh);
nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_SATTR);
nfserr = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_SATTR);

if (!nfserr) {
nfserr = nfserrno( nfsd_set_posix_acl(
Expand All @@ -134,7 +135,7 @@ static __be32 nfsacld_proc_getattr(struct svc_rqst * rqstp,
dprintk("nfsd: GETATTR %s\n", SVCFH_fmt(&argp->fh));

fh_copy(&resp->fh, &argp->fh);
return fh_verify(rqstp, &resp->fh, 0, MAY_NOP);
return fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_NOP);
}

/*
Expand Down
5 changes: 3 additions & 2 deletions trunk/fs/nfsd/nfs3acl.c
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ static __be32 nfsd3_proc_getacl(struct svc_rqst * rqstp,
__be32 nfserr = 0;

fh = fh_copy(&resp->fh, &argp->fh);
if ((nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP)))
nfserr = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_NOP);
if (nfserr)
RETURN_STATUS(nfserr);

if (argp->mask & ~(NFS_ACL|NFS_ACLCNT|NFS_DFACL|NFS_DFACLCNT))
Expand Down Expand Up @@ -101,7 +102,7 @@ static __be32 nfsd3_proc_setacl(struct svc_rqst * rqstp,
__be32 nfserr = 0;

fh = fh_copy(&resp->fh, &argp->fh);
nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_SATTR);
nfserr = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_SATTR);

if (!nfserr) {
nfserr = nfserrno( nfsd_set_posix_acl(
Expand Down
8 changes: 4 additions & 4 deletions trunk/fs/nfsd/nfs3proc.c
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ nfsd3_proc_getattr(struct svc_rqst *rqstp, struct nfsd_fhandle *argp,
SVCFH_fmt(&argp->fh));

fh_copy(&resp->fh, &argp->fh);
nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP);
nfserr = fh_verify(rqstp, &resp->fh, 0, NFSD_MAY_NOP);
if (nfserr)
RETURN_STATUS(nfserr);

Expand Down Expand Up @@ -242,7 +242,7 @@ nfsd3_proc_create(struct svc_rqst *rqstp, struct nfsd3_createargs *argp,
attr = &argp->attrs;

/* Get the directory inode */
nfserr = fh_verify(rqstp, dirfhp, S_IFDIR, MAY_CREATE);
nfserr = fh_verify(rqstp, dirfhp, S_IFDIR, NFSD_MAY_CREATE);
if (nfserr)
RETURN_STATUS(nfserr);

Expand Down Expand Up @@ -558,7 +558,7 @@ nfsd3_proc_fsinfo(struct svc_rqst * rqstp, struct nfsd_fhandle *argp,
resp->f_maxfilesize = ~(u32) 0;
resp->f_properties = NFS3_FSF_DEFAULT;

nfserr = fh_verify(rqstp, &argp->fh, 0, MAY_NOP);
nfserr = fh_verify(rqstp, &argp->fh, 0, NFSD_MAY_NOP);

/* Check special features of the file system. May request
* different read/write sizes for file systems known to have
Expand Down Expand Up @@ -597,7 +597,7 @@ nfsd3_proc_pathconf(struct svc_rqst * rqstp, struct nfsd_fhandle *argp,
resp->p_case_insensitive = 0;
resp->p_case_preserving = 1;

nfserr = fh_verify(rqstp, &argp->fh, 0, MAY_NOP);
nfserr = fh_verify(rqstp, &argp->fh, 0, NFSD_MAY_NOP);

if (nfserr == 0) {
struct super_block *sb = argp->fh.fh_dentry->d_inode->i_sb;
Expand Down
Loading

0 comments on commit 61303a9

Please sign in to comment.