[NET]: Supporting UDP-Lite (RFC 3828) in Linux
This is a revision of the previously submitted patch, which alters the way
files are organized and compiled in the following manner:

    * UDP and UDP-Lite now use separate object files
    * source file dependencies resolved via header files
      net/ipv{4,6}/udp_impl.h
    * order of inclusion files in udp.c/udplite.c adapted accordingly

[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)

This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:

    * generic routines are all located in net/ipv4/udp.c
    * UDP-Lite specific routines are in net/ipv4/udplite.c
    * MIB/statistics support in /proc/net/snmp and /proc/net/udplite
    * shared API with extensions for partial checksum coverage

[NET/IPv6]: Extension for UDP-Lite over IPv6

It extends the existing UDPv6 code base with support for UDP-Lite in the
same manner as per UDPv4. In particular,

    * UDPv6 generic and shared code is in net/ipv6/udp.c
    * UDP-Litev6 specific extensions are in net/ipv6/udplite.c
    * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
    * support for IPV6_ADDRFORM
    * aligned the coding style of protocol initialisation with af_inet6.c
    * made the error handling in udpv6_queue_rcv_skb consistent;
      to return `-1' on error on all error cases
    * consolidation of shared code

[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support

The UDP-Lite patch further provides

    * API documentation for UDP-Lite
    * basic xfrm support
    * basic netfilter support for IPv4 and IPv6 (LOG target)

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Gerrit Renker authored and David S. Miller committed Dec 3, 2006
1 parent 6051e2f, commit ba4e58e
Showing 28 changed files with 1,442 additions and 403 deletions.

@@ -0,0 +1,281 @@
===========================================================================
                    The UDP-Lite protocol (RFC 3828)
===========================================================================


UDP-Lite is a Standards-Track IETF transport protocol whose characteristic
is a variable-length checksum. This has advantages for transport of multimedia
(video, VoIP) over wireless networks, as partly damaged packets can still be
fed into the codec instead of being discarded due to a failed checksum test.

This file briefly describes the existing kernel support and the socket API.
For in-depth information, you can consult:

   o The UDP-Lite Homepage: http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/
     From here you can also download some example application source code.

   o The UDP-Lite HOWTO on
     http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/UDP-Lite-HOWTO.txt

   o The Wireshark UDP-Lite Wiki (with capture files):
     http://wiki.wireshark.org/Lightweight_User_Datagram_Protocol

   o The Protocol Spec, RFC 3828, http://www.ietf.org/rfc/rfc3828.txt


I) APPLICATIONS

Several applications have been ported successfully to UDP-Lite. Ethereal
(now called Wireshark) has UDP-Litev4/v6 support by default. The tarball on

   http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz

has source code for several v4/v6 client-server and network testing examples.

Porting applications to UDP-Lite is straightforward: only socket level and
IPPROTO need to be changed; senders additionally set the checksum coverage
length (default = header length = 8). Details are in the next section.


II) PROGRAMMING API

UDP-Lite provides a connectionless, unreliable datagram service and hence
uses the same socket type as UDP. In fact, porting from UDP to UDP-Lite is
very easy: simply add `IPPROTO_UDPLITE' as the last argument of the socket(2)
call so that the statement looks like:

    s = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE);

or, respectively,

    s = socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDPLITE);

With just the above change you are able to run UDP-Lite services or connect
to UDP-Lite servers. The kernel will assume that you are not interested in
using partial checksum coverage and so emulate UDP mode (full coverage).

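As a minimal sketch of the socket(2) calls above, the helper below opens a
UDP-Lite socket for either address family. The name open_udplite_socket() is
hypothetical, and the fallback definition of IPPROTO_UDPLITE is an assumption
for systems whose <netinet/in.h> does not provide it yet.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Assumption: fall back to the RFC 3828 protocol number if libc lacks it. */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE 136
#endif

/*
 * Open a UDP-Lite socket for PF_INET or PF_INET6 (hypothetical helper).
 * Returns the descriptor, or -1 with errno set, for example when the
 * running kernel has no UDP-Lite support.
 */
int open_udplite_socket(int family)
{
    int s, saved_errno;

    assert(family == PF_INET || family == PF_INET6);

    s = socket(family, SOCK_DGRAM, IPPROTO_UDPLITE);
    if (s < 0) {
        saved_errno = errno;
        perror("socket(IPPROTO_UDPLITE)");
        errno = saved_errno;    /* keep errno intact for the caller */
    }
    return s;
}
```

The returned descriptor can then be used with sendto(2) and recvfrom(2) in the
same way as a UDP socket.
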

Making use of the partial checksum coverage facilities requires setting a
single socket option, which takes an integer specifying the coverage length:

  * Sender checksum coverage: UDPLITE_SEND_CSCOV

    For example,

        int val = 20;
        setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(int));

    sets the checksum coverage length to 20 bytes (12 bytes of data + 8 bytes
    of header). Of each packet only the first 20 bytes (plus the pseudo-header)
    will be checksummed. This is useful for RTP applications which have a
    12-byte base header.

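The sender-side call can be wrapped as in the following sketch, which derives
the coverage value from the number of payload bytes to protect. The helper
names and UDPLITE_HDR_LEN are hypothetical; the fallback constants are the
values listed in the `mini' header file of section III.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Assumption: use the documented values when system headers lack them. */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE     136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE         136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV  10
#endif

#define UDPLITE_HDR_LEN     8   /* hypothetical name: UDP-Lite header size */

/* Coverage value protecting the header plus `protected_payload' bytes. */
int send_coverage_for(int protected_payload)
{
    assert(protected_payload >= 0);
    return UDPLITE_HDR_LEN + protected_payload;
}

/* Ask the kernel to checksum only the header and the leading payload bytes.
 * Returns the setsockopt(2) result: 0 on success, -1 on error. */
int set_send_coverage(int s, int protected_payload)
{
    int val = send_coverage_for(protected_payload);

    return setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(val));
}
```

For the RTP case described above, set_send_coverage(s, 12) requests a coverage
of 20 bytes.
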

  * Receiver checksum coverage: UDPLITE_RECV_CSCOV

    This option is the receiver-side analogue. It is truly optional, i.e. not
    required to enable traffic with partial checksum coverage. Its function is
    that of a traffic filter: when enabled, it instructs the kernel to drop
    all packets which have a coverage _less_ than this value. For example, if
    RTP and UDP headers are to be protected, a receiver can enforce that only
    packets with a minimum coverage of 20 are admitted:

        int min = 20;
        setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &min, sizeof(int));

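A receiver-side counterpart might look like the sketch below. The helper name
set_min_recv_coverage() is hypothetical, and the fallback constants are the
values from the `mini' header file of section III.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Assumption: use the documented values when system headers lack them. */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE     136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE         136
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV  11
#endif

/* Install a minimum-coverage filter of `min_cov' bytes on socket `s'.
 * Returns the setsockopt(2) result: 0 on success, -1 on error. */
int set_min_recv_coverage(int s, int min_cov)
{
    assert(min_cov >= 0);
    return setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV,
                      &min_cov, sizeof(min_cov));
}
```

Calling set_min_recv_coverage(s, 20) before recvfrom(2) corresponds to the
RTP example above.
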

The calls to getsockopt(2) are analogous. Since UDP-Lite is an extension and
not a stand-alone protocol, all socket options known from UDP can be used in
exactly the same manner as before, e.g. UDP_CORK or UDP_ENCAP.

A detailed discussion of UDP-Lite checksum coverage options is in section IV.

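To illustrate the analogous getsockopt(2) usage, the sketch below reads back a
coverage setting. The helper name get_coverage() is hypothetical; the fallback
constants are the values from the `mini' header file of section III.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Assumption: use the documented values when system headers lack them. */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE     136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE         136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV  10
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV  11
#endif

/* Read UDPLITE_SEND_CSCOV or UDPLITE_RECV_CSCOV into *val.
 * Returns the getsockopt(2) result: 0 on success, -1 on error. */
int get_coverage(int s, int optname, int *val)
{
    socklen_t len = sizeof(*val);

    assert(val != NULL);
    assert(optname == UDPLITE_SEND_CSCOV || optname == UDPLITE_RECV_CSCOV);
    return getsockopt(s, SOL_UDPLITE, optname, val, &len);
}
```
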

III) HEADER FILES

The socket API requires support through header files in /usr/include:

  * /usr/include/netinet/in.h
    to define IPPROTO_UDPLITE

  * /usr/include/netinet/udplite.h
    for UDP-Lite header fields and protocol constants

For testing purposes, the following can serve as a `mini' header file:

    #define IPPROTO_UDPLITE       136
    #define SOL_UDPLITE           136
    #define UDPLITE_SEND_CSCOV     10
    #define UDPLITE_RECV_CSCOV     11

Ready-made header files for various distros are in the UDP-Lite tarball.

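A guarded variant of the `mini' header is sketched below, so that it does not
clash with system headers that already define the constants. The file name
udplite_compat.h and the include guard are hypothetical; the numeric values
are the ones listed above.

```c
/* udplite_compat.h (hypothetical file name) */
#ifndef UDPLITE_COMPAT_H
#define UDPLITE_COMPAT_H

#include <assert.h>         /* static_assert (C11) */
#include <netinet/in.h>     /* may already provide IPPROTO_UDPLITE */

#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE     136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE         136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV  10
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV  11
#endif

/* Socket level and protocol number share the same value. */
static_assert(IPPROTO_UDPLITE == SOL_UDPLITE,
              "UDP-Lite socket level must match the protocol number");

#endif /* UDPLITE_COMPAT_H */
```
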

IV) KERNEL BEHAVIOUR WITH REGARD TO THE VARIOUS SOCKET OPTIONS

To enable debugging messages, the log level needs to be set to 8, as most
messages use the KERN_DEBUG level (7).

1) Sender Socket Options

  If the sender specifies a value of 0 as coverage length, the module
  assumes full coverage and transmits a packet with a coverage length
  of 0 and the corresponding checksum. If the sender specifies a
  coverage < 8 and different from 0, the kernel assumes 8 as default
  value. Finally, if the specified coverage length exceeds the packet
  length, the packet length is used instead as coverage length.

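The sender rules above can be summarised in a small model. This is a sketch of
the documented behaviour only, not the kernel code; effective_send_coverage()
and UDPLITE_HDR_LEN are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define UDPLITE_HDR_LEN 8   /* hypothetical name: UDP-Lite header size */

/*
 * requested:  value passed with UDPLITE_SEND_CSCOV
 * packet_len: UDP-Lite header plus payload, in bytes
 * Returns the number of bytes the checksum ends up covering.
 */
size_t effective_send_coverage(int requested, size_t packet_len)
{
    assert(packet_len >= UDPLITE_HDR_LEN);

    if (requested == 0)                 /* 0 means full coverage */
        return packet_len;
    if (requested < UDPLITE_HDR_LEN)    /* too small: fall back to 8 */
        requested = UDPLITE_HDR_LEN;
    if ((size_t)requested > packet_len) /* clamp to the packet length */
        return packet_len;
    return (size_t)requested;
}
```

For a 1032 byte packet, a request of 0 yields 1032 covered bytes, a request of
3 yields 8, and a request of 20 yields 20.
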

2) Receiver Socket Options

  The receiver specifies the minimum value of the coverage length it
  is willing to accept. A value of 0 here indicates that the receiver
  always wants the whole of the packet covered. In this case, all
  partially covered packets are dropped and an error is logged.

  It is not possible to specify illegal values (negative values or
  non-zero values below 8); in these cases the default of 8 is assumed.

  All packets arriving with a coverage value less than the specified
  threshold are discarded; these events are also logged.

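The receiver rules can be modelled in the same spirit. The sketch follows a
literal reading of the text above and is not the kernel code; in particular,
how the kernel treats fully covered packets that are shorter than the
threshold is not modelled. The name receiver_accepts() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * min_cov: value set with UDPLITE_RECV_CSCOV
 * pkt_cov: number of bytes covered by the checksum of the incoming packet
 * pkt_len: UDP-Lite header plus payload of the incoming packet
 * Returns true if the packet passes the coverage filter.
 */
bool receiver_accepts(int min_cov, size_t pkt_cov, size_t pkt_len)
{
    assert(pkt_cov <= pkt_len);

    if (min_cov == 0)               /* 0: only fully covered packets pass */
        return pkt_cov == pkt_len;
    if (min_cov < 8)                /* illegal value: default of 8 */
        min_cov = 8;
    return pkt_cov >= (size_t)min_cov;
}
```
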

3) Disabling the Checksum Computation

  On both sender and receiver, checksumming will always be performed
  and cannot be disabled using SO_NO_CHECK. Thus

      setsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK,  ... );

  will always be ignored, while the value of

      getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...);

  is meaningless (as in TCP). Packets with a zero checksum field are
  illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded.

4) Fragmentation

  The checksum computation respects both buffer size and MTU. The size
  of UDP-Lite packets is determined by the size of the send buffer. The
  minimum size of the send buffer is 2048 (defined as SOCK_MIN_SNDBUF
  in include/net/sock.h); the default value is configurable as
  net.core.wmem_default or via setting the SO_SNDBUF socket(7)
  option. The upper bound for the send buffer is determined
  by net.core.wmem_max.

  Given a payload size larger than the send buffer size, UDP-Lite will
  split the payload into several individual packets, filling up the
  send buffer size in each case.

  The precise packet size also depends on the interface MTU. The
  interface MTU, in turn, may trigger IP fragmentation. In this case,
  the generated UDP-Lite packet is split into several IP packets, of
  which only the first one contains the L4 header.

  The send buffer size has implications on the checksum coverage length.
  Consider the following example:

    Payload: 1536 bytes          Send Buffer:     1024 bytes
    MTU:     1500 bytes          Coverage Length:  856 bytes

  UDP-Lite will ship the 1536 bytes in two separate packets:

    Packet 1: 1024 payload + 8 byte header + 20 byte IP header = 1052 bytes
    Packet 2:  512 payload + 8 byte header + 20 byte IP header =  540 bytes

  The checksum coverage extends over the UDP-Lite header and 848 bytes
  of the payload in the first packet; the second packet is fully
  covered. Note that for the second packet, the coverage length exceeds
  the packet length. The kernel always re-adjusts the coverage length to
  the packet length in such cases.

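The arithmetic of this example can be reproduced with a short model. It is a
sketch of the splitting rule as described in this section, not kernel code;
struct pkt, split_payload() and UDPLITE_HDR_LEN are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define UDPLITE_HDR_LEN 8   /* hypothetical name: UDP-Lite header size */

struct pkt {
    size_t payload;         /* payload bytes carried by this packet */
    size_t udplite_len;     /* payload plus UDP-Lite header */
    size_t covered;         /* bytes covered by the checksum */
};

/*
 * Split `payload' bytes into packets holding at most `sndbuf' payload bytes
 * each and apply the requested coverage `cov' to every packet.
 * Returns the number of entries written to `out' (at most `max').
 */
size_t split_payload(size_t payload, size_t sndbuf, size_t cov,
                     struct pkt *out, size_t max)
{
    size_t n = 0;

    assert(sndbuf > 0 && out != NULL);
    if (cov != 0 && cov < UDPLITE_HDR_LEN)
        cov = UDPLITE_HDR_LEN;

    while (payload > 0 && n < max) {
        size_t chunk = payload < sndbuf ? payload : sndbuf;
        size_t len = chunk + UDPLITE_HDR_LEN;

        out[n].payload = chunk;
        out[n].udplite_len = len;
        /* 0 means full coverage; larger values are clamped to the length */
        out[n].covered = (cov == 0 || cov > len) ? len : cov;
        payload -= chunk;
        n++;
    }
    return n;
}
```

With payload 1536, send buffer 1024 and coverage 856, the model yields two
packets of 1032 and 520 bytes (before the IP header), covered over 856 and 520
bytes respectively.
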

  As an example of what happens when one UDP-Lite packet is split into
  several tiny fragments, consider the following:

    Payload: 1024 bytes          Send buffer size: 1024 bytes
    MTU:      300 bytes          Coverage length:   575 bytes

    +-+-----------+--------------+--------------+--------------+
    |8|    272    |      280     |     280      |     192      |
    +-+-----------+--------------+--------------+--------------+
                280            560            840           1032
                                  ^
    *****checksum coverage*********

  The UDP-Lite module generates one 1032 byte packet (1024 + 8 byte
  header). According to the interface MTU, this is split into 4 IP
  packets (up to 280 bytes of IP payload + 20 byte IP header each; the
  last one carries the remaining 192 bytes). The kernel module sums the
  contents of the entire first two packets, plus 15 bytes of the third
  packet, before releasing the fragments to the IP module.

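How the coverage length is distributed over the fragments can be checked with
the following sketch. It only models the arithmetic of the example above;
coverage_per_fragment() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * dgram_len:    UDP-Lite header plus payload
 * frag_payload: bytes of the datagram that fit into one IP fragment
 * cov:          checksum coverage length (0 = full coverage)
 * covered[i] receives the number of checksummed bytes in fragment i.
 * Returns the number of fragments (at most max_frags).
 */
size_t coverage_per_fragment(size_t dgram_len, size_t frag_payload,
                             size_t cov, size_t *covered, size_t max_frags)
{
    size_t n = 0;

    assert(frag_payload > 0 && covered != NULL);
    if (cov == 0 || cov > dgram_len)
        cov = dgram_len;

    while (dgram_len > 0 && n < max_frags) {
        size_t data = dgram_len < frag_payload ? dgram_len : frag_payload;

        covered[n] = cov < data ? cov : data;
        cov -= covered[n];
        dgram_len -= data;
        n++;
    }
    return n;
}
```

For the 1032 byte packet, 280 bytes per fragment and a coverage of 575, the
result is four fragments with 280, 280, 15 and 0 checksummed bytes.
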

  To see the analogous case for IPv6 fragmentation, consider a link
  MTU of 1280 bytes and a write buffer of 3356 bytes. If the checksum
  coverage is less than 1232 bytes (MTU minus IPv6/fragment header
  lengths), only the first fragment needs to be considered. When using
  larger checksum coverage lengths, each eligible fragment needs to be
  checksummed. Suppose we have a checksum coverage of 3062. The buffer
  of 3356 bytes (3364 bytes including the 8 byte UDP-Lite header) will
  be split into the following fragments:

    Fragment 1: 1280 bytes carrying 1232 bytes of UDP-Lite data
    Fragment 2: 1280 bytes carrying 1232 bytes of UDP-Lite data
    Fragment 3:  948 bytes carrying  900 bytes of UDP-Lite data

  The first two fragments have to be checksummed in full; of the last
  fragment only 598 (= 3062 - 2*1232) bytes are checksummed.

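The IPv6 figures can be derived in the same way once the per-fragment header
overhead (40 byte IPv6 header plus 8 byte fragment header) is taken into
account. The sketch below models only this arithmetic; struct frag and
ipv6_fragment_layout() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define IPV6_HDR_LEN        40
#define IPV6_FRAG_HDR_LEN   8

struct frag {
    size_t wire_len;    /* fragment size including IPv6 + fragment header */
    size_t data_len;    /* UDP-Lite bytes carried by the fragment */
    size_t covered;     /* of which covered by the checksum */
};

/* dgram_len counts the UDP-Lite header plus the payload.
 * Returns the number of fragments written to `out' (at most `max'). */
size_t ipv6_fragment_layout(size_t dgram_len, size_t mtu, size_t cov,
                            struct frag *out, size_t max)
{
    const size_t overhead = IPV6_HDR_LEN + IPV6_FRAG_HDR_LEN;
    size_t room, n = 0;

    assert(mtu > overhead && out != NULL);
    /* non-final fragments carry a multiple of 8 bytes */
    room = (mtu - overhead) & ~(size_t)7;
    if (cov == 0 || cov > dgram_len)
        cov = dgram_len;

    while (dgram_len > 0 && n < max) {
        size_t data = dgram_len < room ? dgram_len : room;

        out[n].data_len = data;
        out[n].wire_len = data + overhead;
        out[n].covered = cov < data ? cov : data;
        cov -= out[n].covered;
        dgram_len -= data;
        n++;
    }
    return n;
}
```

Called with a datagram length of 3364, an MTU of 1280 and a coverage of 3062,
the model produces the three fragments listed above, with 598 checksummed
bytes in the last one.
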

  While it is important that such cases are dealt with correctly, they
  are (annoyingly) rare: UDP-Lite is designed for optimising multimedia
  performance over wireless (or generally noisy) links and thus smaller
  coverage lengths are to be expected.


V) UDP-LITE RUNTIME STATISTICS AND THEIR MEANING

Exceptional and error conditions are logged to syslog at the KERN_DEBUG
level. Live statistics about UDP-Lite are available in /proc/net/snmp
and can (with newer versions of netstat) be viewed using

    netstat -svu

This displays UDP-Lite statistics variables, whose meaning is as follows.

  InDatagrams:     Total number of received datagrams.

  NoPorts:         Number of packets received for an unknown port.
                   These cases are counted separately (not as InErrors).

  InErrors:        Number of erroneous UDP-Lite packets. Errors include:
                     * internal socket queue receive errors
                     * packet too short (less than 8 bytes or stated
                       coverage length exceeds received length)
                     * xfrm4_policy_check() returned with error
                     * application has specified larger min. coverage
                       length than that of incoming packet
                     * checksum coverage violated
                     * bad checksum

  OutDatagrams:    Total number of sent datagrams.

These statistics derive from the UDP MIB (RFC 2013).

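An application can also read these counters directly. The sketch below parses
them from a stream such as /proc/net/snmp. It assumes that the UDP-Lite
counters appear as a pair of lines prefixed with "UdpLite:" (a line of names
followed by a line of values) and that the first four values are in the order
listed above; struct udplite_stats and read_udplite_stats() are hypothetical
names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct udplite_stats {
    unsigned long in_datagrams;
    unsigned long no_ports;
    unsigned long in_errors;
    unsigned long out_datagrams;
};

/*
 * Scan `fp' for the second line starting with "UdpLite:" (the values line)
 * and store its first four counters in `st'.
 * Returns 0 on success, -1 if the lines are missing or malformed.
 */
int read_udplite_stats(FILE *fp, struct udplite_stats *st)
{
    static const char prefix[] = "UdpLite:";
    char line[512];
    int seen_names = 0;

    assert(fp != NULL && st != NULL);
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
            continue;
        if (!seen_names) {      /* first match is the line of names */
            seen_names = 1;
            continue;
        }
        if (sscanf(line + sizeof(prefix) - 1, "%lu %lu %lu %lu",
                   &st->in_datagrams, &st->no_ports,
                   &st->in_errors, &st->out_datagrams) == 4)
            return 0;
        return -1;
    }
    return -1;
}
```

A caller would typically open /proc/net/snmp with fopen(3), pass the stream to
read_udplite_stats() and close it afterwards.
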

VI) IPTABLES

There is packet match support for UDP-Lite as well as support for the LOG target.
If you copy and paste the following line into /etc/protocols,

    udplite 136     UDP-Lite        # UDP-Lite [RFC 3828]

then

    iptables -A INPUT -p udplite -j LOG

will produce logging output to syslog. Dropping and rejecting packets also works.


VII) MAINTAINER ADDRESS

The UDP-Lite patch was developed at

    University of Aberdeen
    Electronics Research Group
    Department of Engineering
    Fraser Noble Building
    Aberdeen AB24 3UE; UK

The current maintainer is Gerrit Renker, <gerrit@erg.abdn.ac.uk>. Initial
code was developed by William Stanislaus, <william@erg.abdn.ac.uk>.