
Reduce nfs max_session_slots from 64 to 4 #93

Merged
merged 1 commit into master on May 24, 2019

Conversation

donald
Collaborator

@donald donald commented May 24, 2019

The default number of session slots (nfs/max_session_slots) is 64. This
is the maximum number of parallel requests an nfs client can have
outstanding to an nfs server (per mount). Because we have 64 nfsd threads
on our servers, a single client is able to dominate all nfsd threads of
a server.

There seems to be a bad interaction between nfs and readahead. The
readahead size is set to max_session_slots-1 and is doubled when
posix_fadvise(.., POSIX_FADV_SEQUENTIAL) is used, which "cp" and "cat"
do. Although the exact mechanism is not yet fully understood, the
readahead I/O requests seem to be handled by multiple nfsd threads in
parallel, which block each other while competing for the same
block device. Because all nfsd threads are used up (by a single client
doing a single copy), other nfs clients stall, and we've even seen them
being pushed over their lease timeouts and losing their state.

Reduce this value via modprobe.d to avoid overload situations.
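
For illustration, a minimal modprobe.d entry could look like the following; the file name nfs-client.conf is an assumption, not taken from this repository:

    # /etc/modprobe.d/nfs-client.conf  (hypothetical file name)
    # Limit the nfs client to 4 outstanding requests per mount
    options nfs max_session_slots=4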

This parameter can also be read and written on a running system via
/sys/module/nfs/parameters/max_session_slots if nfs is loaded. Changes
apply to future mounts only.
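
As a quick check on a running system (an illustrative shell session; the values are examples):

    # read the current value
    cat /sys/module/nfs/parameters/max_session_slots

    # lower it; only mounts created after this point pick up the new value
    echo 4 > /sys/module/nfs/parameters/max_session_slots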

@pmenzel pmenzel merged commit 3c1cd23 into master May 24, 2019
@donald
Collaborator Author

donald commented May 24, 2019

Looks like my measurements were wrong. nfsd usage is the same, even if max_session_slots is reduced. :-(

donald added a commit that referenced this pull request Jun 28, 2019
PR #93 did not work as expected.

Further analysis showed that the parallelism of page I/O
(which includes buffered reads) is not limited by nfs.max_session_slots
but by sunrpc.tcp_max_slot_table_entries.

The default value is 65536 (RPC_MAX_SLOT_TABLE_LIMIT).

This parameter can be modified on the running system via
/proc/sys/sunrpc/tcp_max_slot_table_entries and is applied to
new server connections.

Experiment (on simsalabimbambasaladusaladim) :

    # preload server cache
    ssh mrtorgue cat /project/stresser/files/random_10G.dat \> /dev/null

    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m11.429s, ca. 14 nfsd visible active in `top` on server

    echo 4 > /proc/sys/sunrpc/tcp_max_slot_table_entries
    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m12.546s, 4 nfsd visible active in `top` on server

    echo 65536 > /proc/sys/sunrpc/tcp_max_slot_table_entries
    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m11.937s, ca. 16 nfsd visible active in `top` on server

Reduce sunrpc.tcp_max_slot_table_entries to 4 instead of nfs.max_session_slots
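
A sketch of how the sunrpc limit could be made persistent across reboots via sysctl.d; the file name is hypothetical, and the key is only present once the sunrpc module is loaded:

    # /etc/sysctl.d/90-sunrpc-slots.conf  (hypothetical file name)
    # Cap RPC slots per TCP connection; applies to new server connections only
    sunrpc.tcp_max_slot_table_entries = 4

The file can be applied immediately with `sysctl --system`; existing server connections keep their previous limit.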