
Reduce sunrpc.tcp_max_slot_table_entries #94

Merged
merged 1 commit into from
Jul 1, 2019

Conversation

@donald
Collaborator

donald commented Jun 28, 2019

PR #93 did not work as expected.

Further analysis showed that the parallelism of page I/O
(which includes buffered reads) is not limited by nfs.max_session_slots
but by sunrpc.tcp_max_slot_table_entries.

The default value is 65536 (RPC_MAX_SLOT_TABLE_LIMIT).

This parameter can be modified for the running system via
/proc/sys/sunrpc/tcp_max_slot_table_entries and is applied to
new server connections.
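The value can also be pinned at boot with a module option, so it does not have to be echoed into /proc on every start; a minimal sketch, assuming sunrpc is built as a module (the modprobe.d file name is arbitrary):

    # /etc/modprobe.d/sunrpc.conf  (file name assumed, any modprobe.d file works)
    # limit the RPC slot table before the first NFS mount
    options sunrpc tcp_max_slot_table_entries=4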

Experiment (on simsalabimbambasaladusaladim):

    # preload server cache
    ssh mrtorgue cat /project/stresser/files/random_10G.dat \> /dev/null

    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m11.429s, ca. 14 nfsd visibly active in `top` on the server

    echo 4 > /proc/sys/sunrpc/tcp_max_slot_table_entries
    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m12.546s, 4 nfsd visibly active in `top` on the server

    echo 65536 > /proc/sys/sunrpc/tcp_max_slot_table_entries
    umount /project/stresser
    echo 3 > /proc/sys/vm/drop_caches
    time cat /project/stresser/files/random_10G.dat > /dev/null

    # result: 0m11.937s, ca. 16 nfsd visibly active in `top` on the server

Reduce sunrpc.tcp_max_slot_table_entries to 4 instead of nfs.max_session_slots.

@pmenzel
Contributor

pmenzel commented Jun 28, 2019

So why 4?

What resources does sunrpc.tcp_max_slot_table_entries depend on?

@donald
Collaborator Author

donald commented Jun 28, 2019

  • As small as possible without causing bad effects. 4 doesn't seem to change the user experience very much; see the timings above. Also, I've rebooted my workstation with it and don't "feel" a disadvantage. However, we still need to test an active cluster system with many processes doing `grep xx a>b` or some such nonsense, too. Hence the "need more testing" label.

  • I don't understand the other question.

@donald
Collaborator Author

donald commented Jun 29, 2019

So let's start with 8 then?

@wwwutz
Contributor

wwwutz commented Jun 29, 2019

I think 4 is OK; we'll see what happens. Don't forget there might be > 20 clients accessing one server.
The question should not be "why 4" but "hey, my research revealed 4 is a bad value, please change it to 8".
Four is a good value.

@donald
Collaborator Author

donald commented Jul 1, 2019

/proc/sys/sunrpc/tcp_max_slot_table_entries is not available on every kernel. (CONFIG_SUNRPC_DEBUG is probably required, which we don't have on old kernels.)
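Where the sysctl is missing, the same limit should still be reachable through the module parameter; a minimal sketch, assuming sunrpc is loaded as a module (paths not verified on the old kernels):

    # check whether the sysctl exists
    ls /proc/sys/sunrpc/tcp_max_slot_table_entries 2>/dev/null

    # fall back to the module parameter (same setting, different path)
    cat /sys/module/sunrpc/parameters/tcp_max_slot_table_entries
    echo 4 > /sys/module/sunrpc/parameters/tcp_max_slot_table_entries

    # either way, only new server connections pick up the value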
