Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux: (52 commits)
  knfsd: clear both setuid and setgid whenever a chown is done
  knfsd: get rid of imode variable in nfsd_setattr
  SUNRPC: Use unsigned loop and array index in svc_init_buffer()
  SUNRPC: Use unsigned index when looping over arrays
  SUNRPC: Update RPC server's TCP record marker decoder
  SUNRPC: RPC server still uses 2.4 method for disabling TCP Nagle
  NLM: don't let lockd exit on unexpected svc_recv errors (try #2)
  NFS: don't let nfs_callback_svc exit on unexpected svc_recv errors (try #2)
  Use a zero sized array for raw field in struct fid
  nfsd: use static memory for callback program and stats
  SUNRPC: remove svc_create_thread()
  nfsd: fix comment
  lockd: Fix stale nlmsvc_unlink_block comment
  NFSD: Strip __KERNEL__ testing from unexported header files.
  sunrpc: make token header values less confusing
  gss_krb5: consistently use unsigned for seqnum
  NFSD: Remove NFSv4 dependency on NFSv3
  SUNRPC: Remove PROC_FS dependency
  NFSD: Use "depends on" for PROC_FS dependency
  nfsd: move most of fh_verify to separate function
  ...
Linus Torvalds committed Apr 24, 2008
2 parents c328d54 + ca45625 commit 10c993a
Showing 43 changed files with 879 additions and 517 deletions.
252 changes: 252 additions & 0 deletions Documentation/filesystems/nfs-rdma.txt
@@ -0,0 +1,252 @@
################################################################################
# #
# NFS/RDMA README #
# #
################################################################################

Author: NetApp and Open Grid Computing
Date: February 25, 2008

Table of Contents
~~~~~~~~~~~~~~~~~
- Overview
- Getting Help
- Installation
- Check RDMA and NFS Setup
- NFS/RDMA Setup

Overview
~~~~~~~~

This document describes how to install and set up the Linux NFS/RDMA client
and server software.

The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server
was first included in the following release, Linux 2.6.25.

In our testing, we have obtained excellent performance results (full 10Gbit
wire bandwidth at minimal client CPU) under many workloads. The code passes
the full Connectathon test suite and operates over both InfiniBand and iWARP
RDMA adapters.

Getting Help
~~~~~~~~~~~~

If you get stuck, you can ask questions on the

nfs-rdma-devel@lists.sourceforge.net

mailing list.

Installation
~~~~~~~~~~~~

These instructions are a step-by-step guide to building a machine for
use with NFS/RDMA.

- Install an RDMA device

Any device supported by the drivers in drivers/infiniband/hw is acceptable.

Testing has been performed using several Mellanox-based IB cards, the
Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter.

- Install a Linux distribution and tools

The first kernel release to contain both the NFS/RDMA client and server was
Linux 2.6.25. Therefore, a distribution compatible with this and subsequent
Linux kernel releases should be installed.

The procedures described in this document have been tested with
distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

- Install nfs-utils-1.1.1 or greater on the client

An NFS/RDMA mount point can only be obtained by using the mount.nfs
command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs
you are using, type:

> /sbin/mount.nfs -V

If the version is less than 1.1.1 or the command does not exist,
then you will need to install the latest version of nfs-utils.
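
The version check above can be scripted. The following is only a sketch and
is not part of nfs-utils: the helper names nfs_utils_version and version_ge
are made up for illustration, and the parsing assumes that the output of
"mount.nfs -V" contains the string "nfs-utils" followed by a dotted version
number.

```shell
# Print the dotted version that follows "nfs-utils" in a line of text.
# Assumption: the line looks roughly like "mount.nfs (linux nfs-utils 1.1.1)".
nfs_utils_version() {
    printf '%s\n' "$1" | sed -n 's/.*nfs-utils[ -]\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if dotted version $1 is greater than or equal to dotted version $2.
# Components are compared numerically, so 1.10 is newer than 1.9.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            if ((x[i] + 0) > (y[i] + 0)) exit 0
            if ((x[i] + 0) < (y[i] + 0)) exit 1
        }
        exit 0
    }'
}

# Possible use on a client (not executed here):
#   ver=$(nfs_utils_version "$(/sbin/mount.nfs -V 2>&1)")
#   version_ge "$ver" 1.1.1 || echo "nfs-utils is older than 1.1.1"
```

If the command does not exist, nfs_utils_version prints nothing and the
comparison fails, which matches the "install the latest version" case above.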

Download the latest package from:

http://www.kernel.org/pub/linux/utils/nfs

Uncompress the package and follow the installation instructions.

If you will not be using GSS and NFSv4, the installation process
can be simplified by disabling these features when running configure:

> ./configure --disable-gss --disable-nfsv4

For more information on this see the package's README and INSTALL files.

After building the nfs-utils package, there will be a mount.nfs binary in
the utils/mount directory. This binary can be used to initiate NFS v2, v3,
or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4.
The standard technique is to create a symlink called mount.nfs4 to mount.nfs.
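
The symlink step can be sketched as follows. The function name
link_mount_nfs4 is hypothetical, and the directory argument is whatever
location holds your mount.nfs binary (the build tree's utils/mount
directory, or the installation directory).

```shell
# Create a mount.nfs4 symlink next to an existing mount.nfs binary.
# $1 = directory that holds mount.nfs.
link_mount_nfs4() {
    dir=$1
    [ -e "$dir/mount.nfs" ] || { echo "no mount.nfs in $dir" >&2; return 1; }
    # A relative target keeps the link valid if the directory is moved.
    ln -sf mount.nfs "$dir/mount.nfs4"
}
```

For example, "link_mount_nfs4 utils/mount" run from the top of the
nfs-utils build tree creates utils/mount/mount.nfs4.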

NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed
on the NFS client machine. You do not need this specific version of
nfs-utils on the server. Furthermore, only the mount.nfs command from
nfs-utils-1.1.1 is needed on the client.

- Install a Linux kernel with NFS/RDMA

The NFS/RDMA client and server are both included in the mainline Linux
kernel version 2.6.25 and later. This and other versions of the 2.6 Linux
kernel can be found at:

ftp://ftp.kernel.org/pub/linux/kernel/v2.6/

Download the sources and place them in an appropriate location.

- Configure the RDMA stack

Make sure your kernel configuration has RDMA support enabled. Under
Device Drivers -> InfiniBand support, update the kernel configuration
to enable InfiniBand support [NOTE: the option name is misleading. Enabling
InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)].

Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or
iWARP adapter support (amso, cxgb3, etc.).

If you are using InfiniBand, be sure to enable IP-over-InfiniBand support.

- Configure the NFS client and server

Your kernel configuration must also have NFS file system support and/or
NFS server support enabled. These and other NFS related configuration
options can be found under File Systems -> Network File Systems.
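
Before building, it can help to confirm that the options from the two
configuration steps above ended up in the .config file. The sketch below
assumes the usual .config format ("CONFIG_FOO=y", "CONFIG_FOO=m", or
"# CONFIG_FOO is not set"). The helper names and the list of symbols
checked are illustrative; adjust the list to your setup (for example,
INFINIBAND_IPOIB only matters for InfiniBand, and you may need only one
of NFS_FS and NFSD).

```shell
# Print the value of a kernel config symbol (y, m, or n) from a .config file.
# $1 = symbol without the CONFIG_ prefix, $2 = path to the .config file.
# A symbol that is absent or "is not set" is reported as n.
config_value() {
    val=$(sed -n "s/^CONFIG_$1=\([ym]\)\$/\1/p" "$2" | head -n 1)
    printf '%s\n' "${val:-n}"
}

# Report the symbols relevant to this README, one "SYMBOL=value" per line.
check_nfs_rdma_config() {
    for sym in INFINIBAND INFINIBAND_IPOIB SUNRPC NFS_FS NFSD; do
        printf '%s=%s\n' "$sym" "$(config_value "$sym" "$1")"
    done
}
```

Running "check_nfs_rdma_config .config" in the kernel source directory
prints one line per symbol.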

- Build, install, reboot

The NFS/RDMA code will be enabled automatically if NFS and RDMA
are turned on. The NFS/RDMA client and server are configured via the hidden
SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
value of SUNRPC_XPRT_RDMA will be:

- N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client
and server will not be built
- M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M;
in this case the NFS/RDMA client and server will be built as modules
- Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client
and server will be built into the kernel

Therefore, if you have followed the steps above and turned on NFS and RDMA,
the NFS/RDMA client and server will be built.
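
The three rules above amount to taking the "weaker" of the two values. As a
small illustration (the function name xprt_rdma_value is made up, and the
values are written in lowercase as they appear in a .config file):

```shell
# Compute the value of the hidden SUNRPC_XPRT_RDMA option from the values
# of SUNRPC ($1) and INFINIBAND ($2), each given as y, m, or n.
xprt_rdma_value() {
    case "$1$2" in
        yy)        echo y ;;   # both built in
        ym|my|mm)  echo m ;;   # both on, at least one is a module
        *)         echo n ;;   # at least one is off
    esac
}
```

For instance, "xprt_rdma_value y m" prints m: with SUNRPC built in and
INFINIBAND modular, the NFS/RDMA client and server are built as modules.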

Build a new kernel, install it, boot it.

Check RDMA and NFS Setup
~~~~~~~~~~~~~~~~~~~~~~~~

Before configuring the NFS/RDMA software, test your new kernel to ensure
that it is working correctly. In particular, verify that the RDMA stack
is functioning as expected and that standard NFS over TCP/IP and/or UDP/IP
is working properly.

- Check RDMA Setup

If you built the RDMA components as modules, load them at
this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
card:

> modprobe ib_mthca
> modprobe ib_ipoib

If you are using InfiniBand, make sure there is a Subnet Manager (SM)
running on the network. If your IB switch has an embedded SM, you can
use it. Otherwise, you will need to run an SM, such as OpenSM, on one
of your end nodes.

If an SM is running on your network, you should see the following:

> cat /sys/class/infiniband/driverX/ports/1/state
4: ACTIVE

where driverX is mthca0, ipath5, ehca3, etc.
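
On a host with several devices or ports, the same check can be run over
every port at once. This is a sketch only: the function name check_ib_ports
is hypothetical, and it assumes the sysfs layout shown above
(<device>/ports/<port>/state under /sys/class/infiniband).

```shell
# List every InfiniBand port under a sysfs root and flag the ones whose
# state file does not read "4: ACTIVE".
# $1 = sysfs root, defaulting to /sys/class/infiniband.
# Returns 0 only if at least one port exists and all ports are ACTIVE.
check_ib_ports() {
    root=${1:-/sys/class/infiniband}
    found=0
    bad=0
    for f in "$root"/*/ports/*/state; do
        [ -r "$f" ] || continue        # the glob did not match anything
        found=1
        state=$(cat "$f")
        echo "$f: $state"
        [ "$state" = "4: ACTIVE" ] || bad=1
    done
    [ "$found" -eq 1 ] && [ "$bad" -eq 0 ]
}
```

A port that is cabled but unused will also be reported as not ACTIVE, so
treat a non-zero return as a prompt to look at the listing rather than as
a definite failure.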

To further test the InfiniBand software stack, use IPoIB (this
assumes you have two IB hosts named host1 and host2):

host1> ifconfig ib0 a.b.c.x
host2> ifconfig ib0 a.b.c.y
host1> ping a.b.c.y
host2> ping a.b.c.x

For other device types, follow the appropriate procedures.

- Check NFS Setup

For the NFS components enabled above (client and/or server),
test their functionality over standard Ethernet using TCP/IP or UDP/IP.

NFS/RDMA Setup
~~~~~~~~~~~~~~

We recommend that you use two machines, one to act as the client and
one to act as the server.

One-time configuration:

- On the server system, configure the /etc/exports file and
start the NFS/RDMA server.

Exports entries with the following format have been tested:

/vol0 10.97.103.47(rw,async) 192.168.0.47(rw,async,insecure,no_root_squash)

Here the first IP address is the client's Ethernet address and the second
IP address is the client's IPoIB address.
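
To see which options an exports line grants to a particular client address,
the line can be split into its "client(options)" fields. The helper below is
a hypothetical sketch that handles only the simple format shown above (one
path followed by whitespace-separated client entries); it is not a full
exports(5) parser.

```shell
# Print the option list that an /etc/exports line gives to one client.
# $1 = the exports line, $2 = client address. Prints nothing if absent.
export_options_for() {
    printf '%s\n' "$1" | awk -v client="$2" '{
        for (i = 2; i <= NF; i++) {
            p = index($i, "(")
            if (p > 0 && substr($i, 1, p - 1) == client) {
                opts = substr($i, p + 1)
                sub(/\)$/, "", opts)      # drop the closing parenthesis
                print opts
            }
        }
    }'
}
```

Applied to the tested entry above, the IPoIB address 192.168.0.47 yields
"rw,async,insecure,no_root_squash" and the Ethernet address 10.97.103.47
yields "rw,async".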

Each time a machine boots:

- Load and configure the RDMA drivers

For InfiniBand using a Mellanox adapter:

> modprobe ib_mthca
> modprobe ib_ipoib
> ifconfig ib0 a.b.c.d

NOTE: use unique addresses for the client and server

- Start the NFS server

If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA transport module:

> modprobe svcrdma

Regardless of how the server was built (module or built-in), start the server:

> /etc/init.d/nfs start

or

> service nfs start

Instruct the server to listen on the RDMA transport:

> echo rdma 2050 > /proc/fs/nfsd/portlist
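
To confirm that the listener was added, the portlist file can be read back.
The sketch below assumes that reading the file returns one
"<transport> <port>" pair per line; the function name has_listener is made
up for illustration.

```shell
# Succeed if a portlist file contains a listener for the given transport
# and port.
# $1 = transport, $2 = port, $3 = file (default: /proc/fs/nfsd/portlist).
has_listener() {
    file=${3:-/proc/fs/nfsd/portlist}
    grep -q "^$1 $2\$" "$file"
}

# Possible use on the server (not executed here):
#   has_listener rdma 2050 || echo "server is not listening on rdma 2050"
```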

- On the client system

If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
load the RDMA client module:

> modprobe xprtrdma
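
Whether a transport module is loaded can be checked against /proc/modules,
where each line starts with the module name (without a .ko suffix). The
helper name module_loaded is hypothetical. Note that code built into the
kernel does not appear in /proc/modules, so this check only applies to
modular builds.

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
# $1 = module name without the .ko suffix,
# $2 = file to read (default: /proc/modules).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Possible use (not executed here):
#   module_loaded xprtrdma || modprobe xprtrdma    # client
#   module_loaded svcrdma  || modprobe svcrdma     # server
```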

Regardless of how the client was built (module or built-in), issue the mount.nfs command:

> /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050

To verify that the mount is using RDMA, run "cat /proc/mounts" and check the
"proto" field for the given mount.

Congratulations! You're using NFS/RDMA!
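
The /proc/mounts check described above can also be scripted. This sketch
assumes the usual /proc/mounts layout (device, mount point, file system
type, comma-separated options) and that the transport appears as a
"proto=" option; the function name mount_proto is made up for illustration.

```shell
# Print the proto= option of the NFS mount at a given mount point.
# $1 = mount point, $2 = mounts file (default: /proc/mounts).
# Prints nothing if the mount point is not an NFS mount.
mount_proto() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ {
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] ~ /^proto=/) print substr(opt[i], 7)
    }' "${2:-/proc/mounts}"
}

# Possible use on the client (not executed here):
#   [ "$(mount_proto /mnt)" = "rdma" ] && echo "/mnt is using NFS/RDMA"
```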
109 changes: 57 additions & 52 deletions fs/Kconfig
@@ -411,7 +411,7 @@ config JFS_STATISTICS
 to be made available to the user in the /proc/fs/jfs/ directory.

 config FS_POSIX_ACL
-# Posix ACL utility routines (for now, only ext2/ext3/jfs/reiserfs)
+# Posix ACL utility routines (for now, only ext2/ext3/jfs/reiserfs/nfs4)
 #
 # NOTE: you can implement Posix ACLs without these helpers (XFS does).
 # Never use this symbol for ifdefs.
@@ -1694,75 +1694,80 @@ config NFSD
 select LOCKD
 select SUNRPC
 select EXPORTFS
-select NFSD_V2_ACL if NFSD_V3_ACL
 select NFS_ACL_SUPPORT if NFSD_V2_ACL
-select NFSD_TCP if NFSD_V4
-select CRYPTO_MD5 if NFSD_V4
-select CRYPTO if NFSD_V4
-select FS_POSIX_ACL if NFSD_V4
-select PROC_FS if NFSD_V4
-select PROC_FS if SUNRPC_GSS
-help
-If you want your Linux box to act as an NFS *server*, so that other
-computers on your local network which support NFS can access certain
-directories on your box transparently, you have two options: you can
-use the self-contained user space program nfsd, in which case you
-should say N here, or you can say Y and use the kernel based NFS
-server. The advantage of the kernel based solution is that it is
-faster.
-
-In either case, you will need support software; the respective
-locations are given in the file <file:Documentation/Changes> in the
-NFS section.
-
-If you say Y here, you will get support for version 2 of the NFS
-protocol (NFSv2). If you also want NFSv3, say Y to the next question
-as well.
-
-Please read the NFS-HOWTO, available from
-<http://www.tldp.org/docs.html#howto>.
-
-To compile the NFS server support as a module, choose M here: the
-module will be called nfsd. If unsure, say N.
+help
+Choose Y here if you want to allow other computers to access
+files residing on this system using Sun's Network File System
+protocol. To compile the NFS server support as a module,
+choose M here: the module will be called nfsd.
+
+You may choose to use a user-space NFS server instead, in which
+case you can choose N here.
+
+To export local file systems using NFS, you also need to install
+user space programs which can be found in the Linux nfs-utils
+package, available from http://linux-nfs.org/. More detail about
+the Linux NFS server implementation is available via the
+exports(5) man page.
+
+Below you can choose which versions of the NFS protocol are
+available to clients mounting the NFS server on this system.
+Support for NFS version 2 (RFC 1094) is always available when
+CONFIG_NFSD is selected.
+
+If unsure, say N.

 config NFSD_V2_ACL
 bool
 depends on NFSD

 config NFSD_V3
-bool "Provide NFSv3 server support"
+bool "NFS server support for NFS version 3"
 depends on NFSD
 help
-If you would like to include the NFSv3 server as well as the NFSv2
-server, say Y here. If unsure, say Y.
+This option enables support in your system's NFS server for
+version 3 of the NFS protocol (RFC 1813).
+
+If unsure, say Y.

 config NFSD_V3_ACL
-bool "Provide server support for the NFSv3 ACL protocol extension"
+bool "NFS server support for the NFSv3 ACL protocol extension"
 depends on NFSD_V3
+select NFSD_V2_ACL
 help
-Implement the NFSv3 ACL protocol extension for manipulating POSIX
-Access Control Lists on exported file systems. NFS clients should
-be compiled with the NFSv3 ACL protocol extension; see the
-CONFIG_NFS_V3_ACL option. If unsure, say N.
+Solaris NFS servers support an auxiliary NFSv3 ACL protocol that
+never became an official part of the NFS version 3 protocol.
+This protocol extension allows applications on NFS clients to
+manipulate POSIX Access Control Lists on files residing on NFS
+servers. NFS servers enforce POSIX ACLs on local files whether
+this protocol is available or not.
+
+This option enables support in your system's NFS server for the
+NFSv3 ACL protocol extension allowing NFS clients to manipulate
+POSIX ACLs on files exported by your system's NFS server. NFS
+clients which support the Solaris NFSv3 ACL protocol can then
+access and modify ACLs on your NFS server.
+
+To store ACLs on your NFS server, you also need to enable ACL-
+related CONFIG options for your local file systems of choice.
+
+If unsure, say N.

 config NFSD_V4
-bool "Provide NFSv4 server support (EXPERIMENTAL)"
-depends on NFSD && NFSD_V3 && EXPERIMENTAL
+bool "NFS server support for NFS version 4 (EXPERIMENTAL)"
+depends on NFSD && PROC_FS && EXPERIMENTAL
+select NFSD_V3
+select FS_POSIX_ACL
 select RPCSEC_GSS_KRB5
 help
-If you would like to include the NFSv4 server as well as the NFSv2
-and NFSv3 servers, say Y here. This feature is experimental, and
-should only be used if you are interested in helping to test NFSv4.
-If unsure, say N.
+This option enables support in your system's NFS server for
+version 4 of the NFS protocol (RFC 3530).

-config NFSD_TCP
-bool "Provide NFS server over TCP support"
-depends on NFSD
-default y
-help
-If you want your NFS server to support TCP connections, say Y here.
-TCP connections usually perform better than the default UDP when
-the network is lossy or congested. If unsure, say Y.
+To export files using NFSv4, you need to install additional user
+space programs which can be found in the Linux nfs-utils package,
+available from http://linux-nfs.org/.
+
+If unsure, say N.

 config ROOT_NFS
 bool "Root file system on NFS"