Network Working Group V. Kashyap
INTERNET DRAFT Sequent Computer Systems
Expiration date: 9 August 1998 9 Feb 1998
Modification in Datagram Too Big message
<draft-kashyap-too-big-00.txt>
Status of this Memo
This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six
months, and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet Drafts
as reference material or to cite them other than as ``work in
progress''.
To learn the current status of any Internet Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet Drafts
shadow directories on ftp.is.co.za (Africa), nic.nordu.net
(Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
Coast), or ftp.isi.edu (US West Coast).
This memo provides information for the Internet community. This
memo does not specify an Internet standard of any kind.
Distribution of this memo is unlimited.
Abstract
This memo describes a small modification to the 'Datagram Too Big'
message for both IPv4 (ICMPv4) and IPv6 (ICMPv6). The modification
is intended to reduce the resources that large servers consume when
implementing the Path MTU discovery process.
Table of Contents
1. Introduction
2. Changes to Datagram Too Big message
2.1 IPv6 Datagram Too Big message
2.2 IPv4 Datagram Too Big message
3. Router specification
4. Impact on host implementation of Path MTU Discovery
5. Security Considerations
6. References
7. Author's Address
1. Introduction
When one IP host has a large amount of data to send to another
host, the data is transmitted as a series of IP datagrams. It is
usually preferable that these datagrams be of the largest size that
does not require fragmentation anywhere along the path from the
source to the destination. This datagram size is referred to as the
Path MTU (PMTU), and it is equal to the minimum of the MTUs of each
hop in the path.
A host has no direct means of knowing the PMTU of an arbitrary path
in advance. This shortcoming is overcome by the Path MTU discovery
process outlined in [1] and [2], which also define the Datagram Too
Big message.
With the current specification of Datagram Too Big, the source host
learns only that there is a bottleneck somewhere in the path. It
cannot aggregate this information to share it with other connections
(unless they are to the same destination). Thus the source host has
to cache the path information on a per-host basis. Any
representation of the path may be used, but in all the current
implementations of Path MTU discovery that I am aware of, the path
information is kept as a routing table entry. The receipt of a
Datagram Too Big message causes a routing table entry to be created
for the destination host.
Any reduction in the size of this table reduces the resources used
to keep this information and to search through the large routing
table.
The modification suggested in this document aims at the following
advantages:
. Aggregation of paths having the same PMTU.
. Reduction in the resources used to store the Path MTU.
. Faster MTU updates for all connections on a host, rather than
each connection discovering the MTU independently.
. On deletion of a network route on the host, each of the
derived routes/paths has to be deleted too. With fewer such
cache entries this takes less time, and simpler algorithms
can be used.
. Utilities such as netstat need to list a smaller set of
routes.
. Routing protocols need to exchange smaller tables and/or need
not weed through a large set of derived routes.
2. Changes to Datagram Too Big message
2.1 IPv6 'Datagram too Big'
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Ver  |  Pri  |                  Flow Label                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                        Source Address                         |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                      Destination Address                      |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Type: 2    |   mask code   |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                Maximum Transmission Unit Size                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The mask code translates to the prefix mask as:
prefix mask = -1 << mask code
Since the addresses are 128 bits long, the mask code takes values
from 0 to 128 and fits in the 8 bits of the otherwise unused ICMP
code octet.
The route/path for the Path MTU process is identified by the final
destination address ANDed with the prefix mask. The path/route may
also include the flow id [3], but that does not affect this
discussion. It is yet another component of the path identification.
A mask code of 0 implies a mask of all 1s, i.e. a host route. This
is exactly the same behaviour as the current definition. A value of
128 implies that the router used its default route. An indication
of the default route does not provide any information, though, so
it can be considered equivalent to the host route case.
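The mask computation and path identification described above can be
sketched as follows. This is only an illustration under stated
assumptions, not part of the proposal: the function names
ipv6_prefix_mask and ipv6_path_key are hypothetical, and the
addresses in the usage note are taken from the 2001:db8::/32
documentation range.

```python
import ipaddress

ALL_ONES_128 = (1 << 128) - 1  # a 128-bit "-1"


def ipv6_prefix_mask(mask_code: int) -> int:
    """Return the 128-bit prefix mask, i.e. -1 << mask code."""
    if not 0 <= mask_code <= 128:
        raise ValueError("mask code must be in the range 0..128")
    # 128 (default route) carries no information, so it is treated
    # like 0 (all 1s, host route), as the text above suggests.
    if mask_code == 128:
        mask_code = 0
    return (ALL_ONES_128 << mask_code) & ALL_ONES_128


def ipv6_path_key(destination: str, mask_code: int):
    """Identify the path as (destination AND mask, prefix length)."""
    mask = ipv6_prefix_mask(mask_code)
    dest = int(ipaddress.IPv6Address(destination))
    prefix_len = bin(mask).count("1")
    return ipaddress.IPv6Address(dest & mask), prefix_len
```

With these definitions, ipv6_path_key("2001:db8:1:2::5", 64) yields
the prefix 2001:db8:1:2:: with a prefix length of 64, while a mask
code of 0 or 128 yields the full destination address with a prefix
length of 128.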
If the ICMP message is received in response to a datagram sent to a
multicast address, the prefix mask information may not be useful.
2.2 IPv4 'Datagram too Big'
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Type = 3    |   Code = 4    |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           mask code           |         Next-Hop MTU          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Internet Header + 64 bits of Original Datagram Data      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The mask code is defined as follows:
Code   Mask              Comment
----   ---------------   --------------------
   0   255.255.255.255   host route (default)
  32   0.0.0.0           default route
  31   128.0.0.0
   .
   .
   .
   8   255.255.255.0
   .
   .
   .
   1   255.255.255.254
The table above is derived from:
mask = -1 << mask code
The route/path MTU cache entry is identified by the final
destination address ANDed with the mask.
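As a worked example of the formula above, the following sketch
derives the IPv4 mask from a mask code and forms the cache
identifier. The names ipv4_mask and ipv4_cache_key are hypothetical,
and 192.0.2.77 is an address from a documentation range, chosen only
for illustration.

```python
import ipaddress


def ipv4_mask(mask_code: int) -> ipaddress.IPv4Address:
    """Return the 32-bit mask, i.e. -1 << mask code."""
    if not 0 <= mask_code <= 32:
        raise ValueError("mask code must be in the range 0..32")
    return ipaddress.IPv4Address((0xFFFFFFFF << mask_code) & 0xFFFFFFFF)


def ipv4_cache_key(destination: str, mask_code: int):
    """Identify the cache entry as (destination AND mask, mask)."""
    mask = ipv4_mask(mask_code)
    dest = int(ipaddress.IPv4Address(destination))
    return ipaddress.IPv4Address(dest & int(mask)), mask


# Reproduce the rows of the table above.
for code in (0, 32, 31, 8, 1):
    print(code, ipv4_mask(code))
```

For instance, ipv4_cache_key("192.0.2.77", 8) identifies the entry by
192.0.2.0 with mask 255.255.255.0, so every destination in that
prefix maps to the same entry.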
3. Router specification
The router, in addition to returning the next-hop MTU (as is done
now), returns the mask code. The mask code, as defined in section 2,
informs the end host of the net mask used by the router in
determining the path to the final destination.
IPv4
Currently routers return 0 in the 16 bits used by the mask code.
Hence a message from a router implementing the existing ICMP
Datagram Too Big message will be interpreted exactly as now, i.e.
the host creates a host specific route (or Path MTU cache entry).
IPv6
Currently the 8 bits of the ICMP code are unused and set to 0.
Hence a message from a router implementing the existing ICMP
Datagram Too Big message will be interpreted exactly as now, i.e.
the host creates a host specific route (or Path MTU cache entry).
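How a host might read the mask code out of a received message can be
sketched as below, following the layouts in section 2. The parser
names are hypothetical, the input is assumed to start at the ICMP
header, and checksum verification is deliberately left out to keep
the sketch short.

```python
import struct


def parse_ipv4_too_big(icmp: bytes):
    """Return (mask_code, next_hop_mtu), or None for other messages."""
    if len(icmp) < 8:
        return None
    # type(8) code(8) checksum(16) mask code(16) next-hop MTU(16)
    icmp_type, code, _cksum, mask_code, mtu = struct.unpack(
        "!BBHHH", icmp[:8])
    if icmp_type != 3 or code != 4:
        return None
    return mask_code, mtu


def parse_ipv6_too_big(icmp: bytes):
    """Return (mask_code, mtu), or None for other messages."""
    if len(icmp) < 8:
        return None
    # type(8) mask code(8) checksum(16) MTU(32)
    icmp_type, mask_code, _cksum, mtu = struct.unpack("!BBHI", icmp[:8])
    if icmp_type != 2:
        return None
    return mask_code, mtu
```

A message from a router that implements only the existing definition
parses to a mask code of 0, which the host treats as a host route,
so no special case is needed for such routers.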
4. Impact on host implementation of Path MTU Discovery
When a host receives a Datagram Too Big message for a connection,
it has no way of knowing whether the whole subnet the peer belongs
to is behind the bottleneck. As a result the host is forced to
create a path entry specifically for the peer. This information
cannot be shared with a connection to another host, which may be on
the same subnetwork and behind the same bottleneck.
With the modification suggested in section 2, such sharing of
information becomes possible.
4.1 Case 1
Consider two different connections, Ca to Cx and Cb to Cy.
+----+      +------+  M1  +------+      +------+     +----+
| Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
| Cb |      |      |      |      |----->|      |--+  +____+
+____+      +------+      +------+      +------+  |
                                                  |
                                                  |
                                                  V
                                               +------+
                                               |  Cy  |
                                               +------+
                           fig 1
The first MTU reduction occurs on the path from Ra to Rb. Instead
of creating a separate per-path route for each of the two
connections, the host may let both connections use the same route
(or any other cache that stores the PMTU). Currently the host has
to create two entries, one for each connection.
4.2 Case 2
A routing change occurs such that the PMTU for Cb is M2.
+----+      +------+  M1  +------+      +------+     +----+
| Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
| Cb |      |      |      |      |      |      |--+  +____+
+____+      +------+      +------+      +------+  |
                                                  |
                                                  |M2
                                                  V
                                               +------+
                                               |  Cy  |
                                               +------+
                           fig 2
A Datagram Too Big message will be received for the connection Cb
to Cy. The host can use the information in the 'mask code' to
create a more specific route.
The above two cases can be extended to thousands of connections on
the two paths considered.
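Cases 1 and 2 can be illustrated with a small Path MTU cache keyed
by prefix and searched by longest match. This is a sketch under
assumptions, not a description of any existing implementation: the
PmtuCache class is hypothetical, the addresses come from
documentation ranges, and mask codes 0 and 32 are both stored as
host entries, following section 2.

```python
import ipaddress


class PmtuCache:
    """Hypothetical Path MTU cache keyed by destination prefix."""

    def __init__(self):
        self._entries = {}  # IPv4Network -> PMTU

    def __len__(self):
        return len(self._entries)

    def update(self, destination: str, mask_code: int, mtu: int):
        # 0 (existing routers) and 32 (default route) mean host route.
        host_bits = 0 if mask_code in (0, 32) else mask_code
        network = ipaddress.ip_network(
            f"{destination}/{32 - host_bits}", strict=False)
        self._entries[network] = mtu

    def lookup(self, destination: str):
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self._entries if addr in net]
        if not matches:
            return None
        # The most specific (longest) prefix wins.
        return self._entries[max(matches, key=lambda net: net.prefixlen)]


cache = PmtuCache()
# Case 1: one message for Cx (192.0.2.10) also covers Cy (192.0.2.20).
cache.update("192.0.2.10", 8, 1400)
# Case 2: a later message for Cy adds a more specific entry.
cache.update("192.0.2.20", 4, 1000)
```

After the first update a single entry serves both destinations.
After the second, lookups for Cy match the more specific prefix
while lookups for Cx continue to use the original entry.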
4.3 Case 3
+----+      +------+      +------+      +------+ M1  +----+
| Ca |----->|  Ra  |----->|  Rb  |----->|  Rc  |---->| Cx |
| Cb |      |      |      |      |      |      |--+  +____+
+____+      +------+      +------+      +------+  |
                                                  |
                                                  |
  Ha                                              V
                                               +------+
                                               |  Cy  |
                                               +------+
                                                  Hy
                           fig 3
The route to Cx from Rc is determined by the route x.y.255.255, but
the path to Cy is determined by x.y.z.255 (considering an IPv4
internetwork). The PMTU to Cx is determined by M1 at Rc, but the
PMTU to Cy is the same as the first-hop MTU all the way from the
host Ha.
If the connection Ca to Cx is made first, Ha will have an entry
corresponding to the path from Ha to the network x.y.255.255. This
will cause the connection to Cy to use the same information as
determined for the connection to Cx. A similar situation can occur
if the topology changes from that of fig 1 to that of fig 3 while
the connections Ca-Cx and Cb-Cy are active.
Since the information is shared between the connections, it is
possible that at the 10 minute interval (as suggested in [1] and
[2]) host Ha may fail to determine the increased Path MTU to Hy.
This problem can be avoided if the end hosts implement the
following policy:
. On a new connection, use a non-PMTU-discovered route/path.
. At every probe time (if the host has data to send), use the
original route. This will cause a rediscovery of the paths.
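The two policy rules above might be expressed as follows. This is a
sketch only: the function choose_mtu, its parameters and the
per-connection probe timestamp are assumptions made for
illustration, and the 600 second constant simply restates the
10 minute interval mentioned above.

```python
PROBE_INTERVAL = 600.0  # seconds, the 10 minute interval


def choose_mtu(first_hop_mtu, cached_pmtu, is_new_connection,
               last_probe, now, has_data):
    """Return (MTU to send with, updated time of last probe)."""
    # Rule 1: a new connection starts on the original route, so it
    # discovers its own path instead of inheriting a shared entry.
    if is_new_connection or cached_pmtu is None:
        return first_hop_mtu, now
    # Rule 2: at probe time, and only if there is data to send,
    # fall back to the original route to rediscover the path.
    if has_data and now - last_probe >= PROBE_INTERVAL:
        return first_hop_mtu, now
    # Otherwise keep using the shared, discovered PMTU.
    return min(cached_pmtu, first_hop_mtu), last_probe
```

In the topology of fig 3, a new connection to Cy would therefore
start with the first-hop MTU even though an entry learned for Cx
covers the same prefix, and an established connection would retry
the first-hop MTU once the probe interval has passed.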
4.4 Case 4
If the Datagram Too Big message returns the code indicating that
the router used the default route, it may be taken as equivalent to
an indication of the use of a host route.
If the information returned indicates a more general route than the
route that was used by the host, then the information must be
discarded and the message treated as if it carried a host route
mask. The local routing information is received using a routing
protocol or set up by an administrator and must not be overridden.
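The checks in this case can be sketched as a small validation step
applied before the cache is updated. The function name
effective_mask_code is hypothetical, and the sketch assumes that
the local route is expressed in the same mask code form, so that a
larger code means a more general route.

```python
def effective_mask_code(reported: int, local_route: int,
                        default_code: int = 32) -> int:
    """Return the mask code the host should actually apply.

    reported     mask code carried in the Datagram Too Big message
    local_route  mask code of the local route used to send
    default_code 32 for IPv4, 128 for IPv6
    """
    if reported == default_code:
        return 0  # default route: treat as a host route
    if reported > local_route:
        return 0  # more general than the local route: discard
    return reported  # equal or more specific: accept as reported
```

For a local route with a mask code of 16, a reported code of 24 or
32 falls back to a host route (0), while a reported code of 8 is
accepted unchanged.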
5. Security considerations
If the Datagram Too Big message returns a more general route than
was used by the host, the indication is taken as equivalent to the
host route mask. This prevents the host from being fed faulty
network information. The host may, however, be sent Datagram Too
Big messages indicating the default route. The end host will then
end up creating host routes instead of subnet routes. This is no
different from what happens now. A code that indicates a more
precise route does not have any effect on the flow of data or on
the path MTU information related to the path.
6. References
[1] J. Mogul, S. Deering, "Path MTU Discovery", RFC 1191,
November 1990.
[2] J. McCann, S. Deering, J. Mogul, "Path MTU Discovery for IP
version 6", RFC 1981, August 1996.
[3] S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6)
Specification", RFC 1883, December 1995.
[4] A. Conta, S. Deering, "Internet Control Message Protocol
(ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification",
RFC 1885, December 1995.
7. Author's Address
Vivek Kashyap
Sequent Computer Systems, Inc.
15450, SW Koll Parkway
Beaverton, OR 97006
ph. 503 - 578 3422
email: viv@sequent.com