Network Working Group R. Bush
Internet-Draft Internet Initiative Japan
Intended status: Standards Track O. Maennel
Expires: September 10, 2009 Deutsche Telekom Laboratories
J. Zorz
go6.si
S. Bellovin
Columbia University
L. Cittadini
Universita' Roma Tre
March 9, 2009
The A+P Approach to the IPv4 Address Shortage
draft-ymbk-aplusp-03
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be modified,
and derivative works of it may not be created, and it may not be
published except as an Internet-Draft.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 10, 2009.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
Bush, et al. Expires September 10, 2009 [Page 1]
Internet-Draft A+P Addressing Extension March 2009
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Abstract
We are facing the exhaustion of the IANA IPv4 free IP address pool.
Unfortunately, IPv6 is not yet deployed widely enough to fully
replace IPv4, and it is unrealistic to expect that this is going to
change before we run out of IPv4 addresses. Letting hosts seamlessly
communicate in an IPv4-world without assigning a unique globally
routable IPv4 address to each of them is a challenging problem.
This draft discusses the possibility of address sharing by treating
some of the port number bits as part of an extended IPv4 address
(Address plus Port, or A+P). Instead of assigning a single IPv4
address to a device, we propose to extend the address by "stealing"
bits from the port number in the TCP/UDP header, leaving the
applications a reduced range of ports. This means assigning the same
IP address to different clients (e.g., CPEs, mobile phones), each with its own
port-range. In the face of IPv4 address exhaustion, the need for
addresses is stronger than the need to be able to address thousands
of applications on a single host. If address translation is needed,
the end-user should be in control of the translation process - not
some smart boxes in the core.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Why Large-Scale-NATs are Harmful . . . . . . . . . . . . . 4
2. Design Constraints and Assumptions . . . . . . . . . . . . . . 6
2.1. Design constraints . . . . . . . . . . . . . . . . . . . . 6
2.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7
3. Overview of the A+P Solution . . . . . . . . . . . . . . . . . 8
3.1. Signaling . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2. Address realm . . . . . . . . . . . . . . . . . . . . . . 11
3.3. Reasons for allowing multiple A+P gateways . . . . . . . . 13
4. Deployment Scenarios . . . . . . . . . . . . . . . . . . . . . 15
4.1. A+P for Broadband Providers . . . . . . . . . . . . . . . 15
4.2. A+P for Mobile Providers . . . . . . . . . . . . . . . . . 15
4.3. A+P from provider networks perspective . . . . . . . . . . 16
4.4. Dynamic allocation of port ranges . . . . . . . . . . . . 18
4.5. Example of A+P-forwarded packets . . . . . . . . . . . . . 20
4.6. Forwarding of standard packets . . . . . . . . . . . . . . 24
4.7. Handling ICMP . . . . . . . . . . . . . . . . . . . . . . 24
4.8. Limitations of the A+P approach . . . . . . . . . . . . . 25
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
6. Security Considerations . . . . . . . . . . . . . . . . . . . 26
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 26
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 26
8.1. Normative References . . . . . . . . . . . . . . . . . . . 26
8.2. Informative References . . . . . . . . . . . . . . . . . . 26
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27
1. Introduction
This document addresses the imminent IPv4 address space exhaustion.
Very soon there will not be enough IPv4 space allocatable to
customers of broadband or mobile providers, while IPv6 is not widely
enough deployed to migrate to an IPv6-only world. Many large
Internet Service Providers (ISPs) face the problem that their
networks' customer edges are so large that it will soon not be
possible anymore to provide each customer with a single IPv4 address.
Therefore ISPs have to devise something more ingenious. Although
undesirable, address sharing is inevitable.
To allow end-to-end connectivity between IPv4 speaking applications
we propose to "steal" some bits from the UDP/TCP header and use them
for addressing devices. Assuming we could limit the applications'
port addressing to 8 (or 4) bits, we can increase the effective size
of an IPv4 address by 8 (or 12) additional bits. In this scenario,
256 (or 4096) customers could be multiplexed on the same IPv4
address, while allowing them a fixed range of 256 (or 16) ports.
Customers that require larger port-ranges could dynamically request
additional blocks, depending on their contract. We call this
"extended addressing" or "A+P" (Address Plus Port) addressing. The
main advantage of A+P is that it preserves the Internet "end-to-end"
paradigm by not translating (at least some ports of) an IP address.
With NAT this end-to-end connectivity is broken. As long as the
customer chooses to run a NAT on his/her own premises, this is the
customer's choice. That choice disappears in the face of the looming
IPv4 address exhaustion, where so-called Large-Scale-NATs (LSNs)
might be deployed within the provider's network, outside the control
of the customer.
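The arithmetic behind this trade-off can be sketched as follows. This
is only an illustration: the function name split_port_space is
hypothetical and not part of the proposal, and the sketch assumes that
the 16-bit TCP/UDP port field is split at a fixed bit boundary, with
the bits not left to the applications extending the address.

```python
PORT_BITS = 16  # width of the TCP/UDP port field


def split_port_space(app_port_bits):
    """Return (customers_per_address, ports_per_customer).

    app_port_bits is the number of port bits left to the applications;
    the remaining port bits extend the IPv4 address.
    """
    if not 0 <= app_port_bits <= PORT_BITS:
        raise ValueError("app_port_bits must be between 0 and 16")
    address_extension_bits = PORT_BITS - app_port_bits
    return 2 ** address_extension_bits, 2 ** app_port_bits


# The two splits discussed in the text: 8 and 4 application port bits.
for bits in (8, 4):
    customers, ports = split_port_space(bits)
    print(f"{bits} port bits: {customers} customers, {ports} ports each")
```

Whatever the split, the product of the two values stays at 65536, so
every additional customer multiplexed on an address comes at the cost
of ports available to each customer.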
1.1. Why Large-Scale-NATs are Harmful
Various forms of NATs will be installed at various levels and places
in the IPv4-Internet to achieve the necessary address compression.
This document argues that end-customers must not be locked behind a
walled garden without any control over the translation, and that it
is therefore essential to create mechanisms to "bypass" a NAT and
keep the control with the end-user:
"Carrier grade" is a euphemism for centralized. More semantics move
to the core of the network. This is bad in and of itself. Net-heads
call it "telco-think" because it is the telco model of smarts in the
core as opposed to the Internet model of a simple, just-forward-
packets core, with smart edges. It also places the provider in a
position where the user is trapped behind unchangeable applications
and policies. This is the opposite of the "end-to-end" model of the
Internet.
With the smarts at the edges, one can easily field new protocols
between consenting end-points by "just" tweaking the NATs at the
corresponding Customer Premises Equipment (CPE), even adding
application layer gateways (ALGs) if they are needed. However, LSNs
do not build an Internet walled garden at the edges; they build it by
restricting the core.
With LSNs in the core, customers wanting new application protocols
that require cooperation from the NAT have to beg for help from the
broadband providers' engineers and lawyers, and from all other users
of the large-scale-NATs. It is feared that all new application
protocols have to go through the carrier-loving lawyers to be allowed
to be handled by the NATs in their core. Today's NATs are typically
mitigated by ALGs over which the customer has some degree of control,
e.g. port forwarding or UPnP. However, this is not expected to work
anymore with LSNs. LSN proposals admit that it is not expected that
applications that require specific port assignment or port mapping
from the NAT box will keep working
[I-D.durand-softwire-dual-stack-lite]. This is the ultimate horror
the NAT-haters fear, and, in this case, they are not all that wrong.
We believe this is not an option and that the end-user must have the
ability to control its own ALGs. So, if someone wants to deploy a
new application, they can talk to the broadband providers' lawyers or
run new disruptive technology over HTTP; we can pick our poison. And
if the NAT is not where the customer can directly control it, i.e.,
it is anywhere back in the provider's network, then the provider
controls what the user can control, i.e. it is not really under user
control. We do not wish to deal with the case where the provider has
to decide whether to allow Skype v42 when they themselves provide a
competing VoIP product.
And remember that as IPv6 deploys, if we want to have one Internet,
i.e. IPv4 nodes talking freely with IPv6 nodes, then translation
must be done somewhere. The challenge is to figure out a scheme that
does this for these large networks. We believe it should be done at
the customer edge, not in the core.
Another issue with LSN is scalability. ISPs face a tension in the
placement of LSNs within their network: placing them close to the
core aggregates as many customers as possible, but too much
aggregation creates a massive state problem. To reduce the state,
the placement ends up somewhere closer to the edge, where the
benefits are somewhat limited. It is not clear how an LSN should
maintain per-session state in a scalable manner. State for
improperly terminated sessions could remain stale for some time.
The LSN hence trades aggregation against the amount of state that
needs to be kept, which makes optimally placing an LSN a hard
engineering problem.
In addition, NATs frequently need to initiate translation for
secondary port numbers. This may be a decision based on packet
inspection (e.g., looking for PORT commands in FTP [RFC0959]
sessions), or it may rely on explicit signaling from the end host via
protocols such as UPnP. Either way, LSNs pose a security threat
and/or an administrative nightmare.
The issue is proper authentication of such requests. Most UPnP
devices do not implement appropriate security features. Even if they
did, there would be no way to administer the security mechanism.
Every end-user device would have to have a secret corresponding to
some authentication field in the LSN. End users will not set these
up properly; providers do not want to maintain such a database.
Decisions made based on packet inspection are just as problematic. A
request from one customer could easily ask for a port to be opened
for another customer's address, similar to the Java-based attack
described by Martin et al. in [Martin-Java].
Furthermore, with LSNs, tracing hackers, spammers and other criminals
will be impossible unless all the connection-based mapping
information is recorded and stored. This would raise concerns not
only for law enforcement services, but also for privacy advocates.
2. Design Constraints and Assumptions
The problem of address space shortage is first felt by providers with
a very large end-user customer base, such as broadband providers and
mobile-service providers. Though the cases and requirements are
slightly different, they share many commonalities. In the following
we will develop a set of overall design constraints.
2.1. Design constraints
We regard several constraints as important for our design:
1) End-to-end is under customer control. Customers shall have
the possibility to send/receive packets unmodified and deploy
new application protocols at will. IPv4 address exhaustion
is no excuse to break the Internet's end-to-end paradigm.
2) End-to-end transparency through multiple intermediate
devices. Multiple gateways should be able to operate in
sequence along one data path without interfering with each
other.
3) Incremental deployability and backward compatibility. The
approaches shall be transparent to unaware users. Devices or
existing applications shall be able to work without
modification. Emergence of new applications shall not be
limited.
4) Automatic configuration/administration. There should be no
need for customers to call the ISP and tell them that they
are operating their own A+P-gateway devices. Customers/
mobile phone users are NOT supposed to look up assigned ports
manually on websites and then configure them on devices or
applications.
5) "Double-NAT" shall be avoided. Based on Constraint 2,
multiple gateway devices might be present in a path, and once
one has done some translation, those packets should not be
re-translated.
6) Legal traceability. ISPs must be able to provide the
identity of a customer from the knowledge of the IPv4 public
address and the port. This should have the lowest impact
possible on the storage and the ISP. We assume that NATs on
customer premises do not pose much of a problem, while
provider NATs need to keep additional logs.
7) IPv6 deployment should be encouraged.
While we acknowledge that A+P works in an IPv4-only environment
(e.g., [I-D.boucadair-port-range]) we strongly believe that IPv6 is
the long-term solution to the problem, and that A+P should be
considered only as an intermediate hack towards an IPv6-only world.
We therefore assume in constraint 7 that the ISP has migrated to a
dual-stack core and A+P can use IPv6 as a transport inside the
network. This ensures that A+P will not be a hindrance to the
introduction of IPv6.
Constraints 2 and 5 are important: while many techniques have been
deployed to allow applications to work through a NAT, traversing
cascaded NATs is crucial if NATs are being deployed in the core of a
provider network.
2.2. Terminology
The A+P idea can be split into three distinct functionalities:
encaps/decaps, NAT, and signaling functionalities.
Encaps/decaps functionality: is used to forward port-restricted A+P-
packets over intermediate legacy devices. The encapsulation
functionality takes an IPv4 packet, looks up the IP and TCP/UDP
headers, and puts the packet into the appropriate tunnel. The state
needed to perform this action is comparable to a forwarding table.
The decapsulation device SHOULD check if the source address and port
of packets coming out of the tunnel are legitimate (e.g., see
[BCP38]). Based on the result of such a check, the packet MAY be
forwarded untranslated, it MAY be discarded or MAY be NATed.
Network Address Translation (NAT) functionality: is used to connect
legacy end-hosts. Unless upgraded, end-hosts or end-systems are not
aware of A+P restrictions and therefore assume a full IP address.
The NAT functionality is performing any address or port translation,
including application-level-gateways (ALGs). The state that has to
be kept to implement this functionality is the mapping for which
external addresses and ports have been mapped to which internal
addresses and ports.
Signaling functionality: is used to let A+P-aware devices learn
which ports are assigned to be passed through
untranslated and what will happen to packets outside the assigned
port-range (e.g., could be NATed or discarded). In addition, the
signaling functionality is used to dynamically increase/decrease the
requested port-range.
A+P address realm: a public routable IPv4 address that is port
restricted (A+P). Forwarding of packets is done based on the IPv4
address and the TCP/UDP port numbers. When this draft talks about
"A+P packets" it is assumed that those packets pass untranslated.
Private address realm: IPv4 addresses that are not globally routed.
Ideally they should be taken from the [RFC1918] range. However, this
draft does not make such an assumption. We regard as private address
space any IPv4 address that needs to be translated in order to gain
global connectivity, irrespective of whether it falls in [RFC1918]
space or not.
3. Overview of the A+P Solution
The core architectural elements of the A+P solution are three
separate and independent functionalities: the NAT functionality, the
encaps/decaps functionality, and the signaling functionality. The
NAT functionality is similar to a NAT as we know it today: it
performs a translation between two different address realms. When
the external realm is public IPv4 address space, we assume that the
translation is many-to-one, in order to multiplex many customers on a
single public IPv4 address. The only difference from a traditional
NAT (Figure 1) is that the translator might only be able to use a
restricted range of ports when mapping multiple internal addresses
onto an external one, i.e., the external address realm might be port-
restricted.
"internal-side" "external-side"
+-----+
internal | N | external
address <---| A |---> address
realm | T | realm
+-----+
Traditional NAT
Figure 1
The encaps/decaps functionality, on the other hand, consists of the
capability of establishing a tunnel with another endpoint providing
the same functionality. This implies some form of signaling to
establish a tunnel. This can be viewed as integrated with DHCP or as
a separate service. Section 3.1 discusses the constraints of this
signaling function. The established tunnel can be an encapsulation
in IPv6, a layer-2 tunnel, or some other form of softwire. Note that
the presence of a tunnel allows for intermediate legacy devices
between the two endpoints.
Two or more devices which provide the encaps/decaps functionality and
are linked by tunnels form an A+P subsystem. The function of each
gateway is to encapsulate and decapsulate respectively. Figure 2
depicts the simplest possible A+P subsystem, that is, two devices
providing the encaps/decaps functionality.
+------------------------------------+
port-restricted | +----------+ tunnel +----------+ | external
address realm --|-| gateway |==========| gateway |-|-- address
| +----------+ +----------+ | realm
+------------------------------------+
A+P subsystem
A simple A+P subsystem
Figure 2
Within an A+P subsystem, the external address realm is extended by
"stealing" bits from the port number. Each device is assigned one
address from the external realm and a range of port numbers. Hence,
devices which are part of an A+P subsystem can communicate with the
external address realm without the need for address translation (i.e.,
preserving end-to-end packet integrity): an A+P packet originated
from within the A+P subsystem can be simply forwarded over tunnels up
to the endpoint, where it gets decapsulated and routed in the
external realm. On the other hand, packets that are originated from
outside the A+P subsystem need to be translated, since they belong to
different realms. For this reason, at least one of the two edges of
the A+P subsystem MUST provide the NAT functionality. It is up to
the provider to decide where to place the NAT functionality.
Hence, the design of A+P is deliberately agnostic to where packets in
transit will be translated, provided that the translation happens
exactly once (Constraint 5).
3.1. Signaling
Certain information needs to be available on all the gateways in the
A+P subsystem. We propose to deploy a signaling protocol such as
[I-D.boucadair-dhc-port-range] or
[I-D.bajko-v6ops-port-restricted-ipaddr-assign] to distribute it.
The information that needs to be shared is the following:
o a set of public IPv4 addresses,
o for each IPv4 address a set of allocated port-ranges (port-set),
o the tunneling technology to be used (e.g., "IPv6-encapsulation")
o addresses of the tunnel endpoints (e.g., IPv6 address of tunnel
endpoints)
o whether or not NAT functionality is provided by the gateway
o a device identification number and some authentication
mechanisms
o a version number and some reserved bits for future use.
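As a rough illustration, the items listed above could be grouped into
a single record like the one below. This is a sketch under stated
assumptions, not a wire format: the class name SignalingInfo, all
field names, and the validate() helper are hypothetical, and the
referenced signaling drafts define their own encodings.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Tuple

PortRange = Tuple[int, int]  # inclusive (first_port, last_port)


@dataclass
class SignalingInfo:
    """Hypothetical container for the information listed above."""
    port_sets: Dict[str, List[PortRange]]  # public IPv4 address -> port-set
    tunnel_technology: str                 # e.g., "IPv6-encapsulation"
    tunnel_endpoints: List[str]            # e.g., IPv6 addresses
    provides_nat: bool                     # does this gateway offer NAT?
    device_id: int
    auth_token: bytes                      # stand-in for an authentication mechanism
    version: int = 1
    reserved: int = 0                      # reserved bits for future use

    def validate(self):
        """Raise ValueError if an address or a port range is malformed."""
        for address, ranges in self.port_sets.items():
            ipaddress.IPv4Address(address)  # raises a ValueError subclass if invalid
            for first, last in ranges:
                if not 0 <= first <= last <= 65535:
                    raise ValueError(f"bad port range {first}-{last}")


# Example values only; the addresses are illustrative.
example = SignalingInfo(
    port_sets={"12.0.0.1": [(2560, 3071)]},
    tunnel_technology="IPv6-encapsulation",
    tunnel_endpoints=["2001:db8::1", "2001:db8::2"],
    provides_nat=True,
    device_id=1,
    auth_token=b"",
)
example.validate()
```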
Note that the functions of encapsulation and decapsulation have been
separated from the NATing functionality. However, to accommodate
legacy hosts, NATing must be provided at some point in the path;
therefore the availability or absence of NATing must be communicated
in the signaling, as A+P is agnostic about NAT placement.
3.2. Address realm
Each gateway within the A+P subsystem manages a certain portion of
A+P address space, that is, a portion of IPv4 space which is extended
borrowing bits from the port number. This address space may be a
single, port-restricted IPv4 address. The gateway MAY use its
managed A+P address space for several purposes:
o Allocate a sub-portion of the A+P address space to other
authenticated A+P gateways in the A+P subsystem (referred to as
delegation). We call the allocated sub-portion delegated address
space.
o Exchange (untranslated) packets with the external address realm.
For this to work, such packets MUST use source address and port
belonging to the non-delegated address space.
Note that if the gateway is also capable of performing the NAT
functionality, it MAY translate packets arriving on an internal
interface which are outside of its managed A+P address space into
non-delegated address space.
An A+P gateway ("A") accepts incoming connections from other A+P
gateways ("B"). Upon connection establishment (provided appropriate
authentication), B would "ask" A for delegation of an A+P address.
In turn, A will inform B about its public IPv4 address, and will
delegate a portion of its port-range to B. In addition, A will also
negotiate the encaps/decaps functionality with B (e.g., let B know
the address of the decaps device/other-end-point of the tunnel).
This could be implemented, for example, via a DHCP-like solution.
In general the following rule applies: A sub-portion of the managed
A+P address space is delegated as long as devices below ask for it,
otherwise private IPv4 is provided to support legacy hosts.
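The delegation rule can be sketched as follows. This is a minimal
illustration, not a specification: the class name Gateway, its
attributes, and the first-fit allocation policy are assumptions made
for the example, since the draft does not prescribe how a gateway
picks the sub-range it hands out.

```python
class Gateway:
    """Hypothetical A+P gateway managing one port-restricted address."""

    def __init__(self, public_ip, first_port, last_port):
        self.public_ip = public_ip
        self.free = [(first_port, last_port)]  # non-delegated ranges
        self.delegated = {}                    # child id -> (first, last)

    def delegate(self, child_id, size):
        """First-fit: carve `size` consecutive ports out of the free space.

        Returns (public_ip, (first, last)), or None when no free range is
        large enough; the caller then falls back to private IPv4.
        """
        for i, (first, last) in enumerate(self.free):
            if last - first + 1 >= size:
                block = (first, first + size - 1)
                if block[1] == last:
                    del self.free[i]               # range fully consumed
                else:
                    self.free[i] = (first + size, last)
                self.delegated[child_id] = block
                return self.public_ip, block
        return None


a = Gateway("12.0.0.1", 0, 65535)
print(a.delegate("B", 512))
```

A real allocator would also have to reclaim ranges when a device
releases them and to keep part of the non-delegated space for the
gateway's own NAT functionality.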
private +-----+ +-----+ public
address ---| B |==========| A |--- Internet
realm +-----+ +-----+
Address space realm of A:
public IPv4 address = 12.0.0.1
port range = 0-65535
Address space realm of B:
public IPv4 address = 12.0.0.1
port range = 2560-3071
Figure 3
Figure 3 illustrates a sample configuration. Note that A could
actually consist of three different devices: one that handles
signaling requests from B; one device that performs encapsulation and
decapsulation; and, if provided, one device that performs NATing
functionality (e.g., LSN). Packet forwarding is assumed to work in
the following way. In the "out-bound" case, a packet arrives from
the private address realm at B. As stated above, B has two options:
it can either apply or not apply the NAT function. The decision
depends upon the specific configuration and/or the capabilities of A
and B. Note that NAT functionality is required to support legacy
hosts; however, this can be done at either of the two devices A or B.
The term NAT refers to translating the packet into the managed A+P
address (B has address 12.0.0.1 and ports 2560-3071 in the example
above). We then have two options:
1) B NATs the packet. The translated packet is then tunneled to A.
A recognizes that the packet has already been translated, because
the source address and port match the address space delegated to
B. A decapsulates the packet and releases it into the public
Internet.
2) B does not NAT the packet. The untranslated packet is then
tunneled to A. A recognizes that the packet has not been
translated, so A forwards the packet to a co-located NATing
device, which translates the packet and routes it in the public
Internet. This device, e.g., an LSN, has to store the mapping
between the source port used to NAT and the tunnel where the
packet came from, in order to correctly route the reply. Note
that A cannot use a port number from the range that has been
delegated to B. As a consequence A has to assign a part of its
non-delegated address space to the NATing functionality.
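The two out-bound options can be sketched from A's point of view as
follows. This is an illustration under assumptions, not the draft's
algorithm: the class GatewayA and its attribute names are
hypothetical, port allocation in the co-located NAT is reduced to
"take the next free port", and a packet that carries the public
address with a port outside the delegated range is discarded here,
although the text allows such a packet to be NATed instead.

```python
class GatewayA:
    """Hypothetical model of gateway A with a co-located NAT (LSN)."""

    def __init__(self, public_ip, lsn_ports):
        self.public_ip = public_ip
        self.delegations = {}            # tunnel id -> (first, last) delegated
        self.lsn_free = list(lsn_ports)  # non-delegated ports kept for the LSN
        self.lsn_map = {}                # external port -> (tunnel, ip, port)
        self.lsn_reverse = {}            # (tunnel, ip, port) -> external port

    def outbound(self, tunnel, src_ip, src_port):
        """Return (action, source address, source port) after decapsulation."""
        first, last = self.delegations[tunnel]
        if src_ip == self.public_ip:
            if first <= src_port <= last:
                # Option 1: already translated at B, forward unchanged.
                return ("forward", src_ip, src_port)
            return ("discard", None, None)  # assumed policy for bad sources
        # Option 2: untranslated packet, hand it to the co-located NAT.
        key = (tunnel, src_ip, src_port)
        if key not in self.lsn_reverse:
            if not self.lsn_free:
                return ("discard", None, None)
            ext_port = self.lsn_free.pop(0)
            self.lsn_reverse[key] = ext_port
            self.lsn_map[ext_port] = key    # remembers the tunnel for replies
        return ("nat", self.public_ip, self.lsn_reverse[key])


a = GatewayA("12.0.0.1", range(1024, 1028))
a.delegations["B"] = (2560, 3071)
print(a.outbound("B", "12.0.0.1", 2600))   # translated at B
print(a.outbound("B", "10.0.0.5", 40000))  # translated at A's LSN
```

Note that the ports handed to the LSN are taken from outside the range
delegated to B, which is what keeps the two translation points from
colliding.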
"Inbound" packets are handled in the following way: a packet from the
public realm arrives at A. A analyzes the destination port number to
understand whether the packet needs to be NATed or not.
1) If the destination port number belongs to the range that A
delegated to B, then A tunnels the packet to B. B can now NAT the
packet using its stored mapping and forward the translated packet
in the private domain.
2) If the destination port number is from the address space of the
LSN, then A passes the packet on to the co-located LSN which uses
the stored mapping to NAT the packet into the private address
realm of B. The appropriate tunnel is stored as well in the
mapping of the initial NAT. The LSN then encapsulates the packet
to B, which decapsulates it and normally routes it within its
private realm.
3) Finally, if the destination port number falls neither in a
delegated range nor in the address range of the LSN, A
discards the packet. If the packet is passed to the LSN, but no
mapping can be found, the LSN discards the packet.
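The three in-bound cases can be summarized in a short sketch. It is
only an illustration of the decision at A: the function name
classify_inbound and the shape of its arguments are assumptions made
for the example.

```python
def classify_inbound(dst_port, delegations, lsn_ports, lsn_map):
    """Decide what A does with an in-bound packet.

    delegations: tunnel id -> (first, last) port range delegated by A
    lsn_ports:   ports A reserved for its co-located LSN
    lsn_map:     external port -> stored mapping (tunnel, private ip, port)
    """
    # Case 1: the port was delegated, so tunnel the packet untranslated.
    for tunnel, (first, last) in delegations.items():
        if first <= dst_port <= last:
            return ("tunnel", tunnel)
    # Case 2: the port belongs to the LSN, which needs a stored mapping.
    if dst_port in lsn_ports:
        mapping = lsn_map.get(dst_port)
        if mapping is None:
            return ("discard", None)  # passed to the LSN, no mapping found
        return ("lsn", mapping)
    # Case 3: neither delegated nor an LSN port.
    return ("discard", None)


delegations = {"B": (2560, 3071)}
lsn_ports = range(1024, 2048)
lsn_map = {1024: ("B", "10.0.0.5", 40000)}
print(classify_inbound(2600, delegations, lsn_ports, lsn_map))
print(classify_inbound(1024, delegations, lsn_ports, lsn_map))
print(classify_inbound(80, delegations, lsn_ports, lsn_map))
```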
3.3. Reasons for allowing multiple A+P gateways
Since each device in the A+P subsystem provides the encaps/decaps
functionality, new devices can establish tunnels and become in turn
part of the A+P subsystem. As noted above, being part of the A+P
subsystem implies the capability of talking to the external address
realm without any translation. In particular, as described in the
previous section, a device X in the A+P subsystem can be reached from
the external domain by simply using the public IPv4 address and a
port which has been delegated to X. Figure 4 shows an example where
three devices are connected in a chain. In other words, A+P
signaling can be used to extend end-to-end connectivity to the
devices which are in the A+P subsystem. This allows A+P-aware
applications (or OSes) running on end hosts to enter the A+P
subsystem and exploit untranslated connectivity.
There are two modes for end-hosts to gain end-to-end connectivity.
The first one is having end-hosts perform the NAT function (along
with the encaps/decaps function which is required to join the A+P
subsystem). This option works in a similar way to the NAT-in-the-
host trick employed by virtualization software such as VMware, where
the guest operating system is connected via a NAT to the host
operating system. The second mode is applications that autonomously
ask for an A+P address and use it to join the A+P subsystem. This
capability is necessary for some applications that require end-to-end
connectivity (e.g., applications that need to be contacted from
outside).
+---------+ +---------+ +---------+
internal | gateway | | gateway | | gateway | external
realm --| 1 |======| 2 |======| 3 |-- realm
+---------+ +---------+ +---------+
An A+P subsystem with multiple devices
Figure 4
Whatever the reasons might be, the Internet was built on the paradigm
that end-to-end connectivity is possible. A+P keeps this possible at
a time when address shortage forces ISPs to use NATs at various
levels. In this sense, A+P can be regarded as a way to bypass NATs.
+---+ (customer2)
|A+P|-* +---+
+---+ \ NAT|A+P|-*
\ +---+ |
\ | forward if in-range
+---+ \+---+ +---+ /
|A+P|------|A+P|----|A+P|----
+---+ /+---+ +---+ \
/ NAT if necessary
/ (cust1) (prov. (e.g., provider NAT)
+---+ / router)
|A+P|-*
+---+
A complex A+P subsystem
Figure 5
Figure 5 depicts a complex scenario, where the A+P subsystem is
composed of multiple devices organized in a hierarchy. Each A+P
gateway decapsulates the packet and then re-encapsulates it into
the next tunnel. A packet can either be NATed when it enters the A+P
subsystem, or at intermediate devices, or when it exits the A+P
subsystem. This could be for example a gateway installed within the
provider's network, together with an LSN (a large-scale-NAT provided
by the provider). Then each customer operates its own CPE. However,
behind the CPE applications might also be A+P-aware and run their own
A+P-gateways, which enables them to have end-to-end connectivity.
One limitation applies if "delayed translation" is used (e.g.,
translation at the LSN instead of the CPE). If devices using
"delayed translation" want to talk to each other they SHOULD use A+P
addresses or out-of-band addressing.
4. Deployment Scenarios
4.1. A+P for Broadband Providers
Large broadband providers do not have enough IPv4 address space to
provide every customer with a single IP address. The natural
solution is sharing a single IP address among many customers.
Multiplexing
customers is usually accomplished by allocating different port
numbers to different customers somewhere within the network of the
provider.
In this document we use the following terms and assumptions:
1. Customer Premises Equipment (CPE), e.g., a cable/DSL modem.
2. Provider Edge Router (PE), also known as customer aggregation
router.
3. Port Range Router (PRR), edge behind which A+P addresses are
used.
4. Provider Border Router (BR), the provider's edge to other providers.
5. Network Core Routers (Core), provider routers which are not at
the edge.
It is expected that the CPE can be upgraded or replaced to support
A+P encaps/decaps functionality. Ideally the CPE also provides
NATing functionality. Further, it is expected that at least another
component in the ISP network provides the same functionality, and
hence is able to establish an A+P subsystem with the CPE. This
device is referred to as A+P border router or port-range router
(PRR), and could be located close to the PE router. The core of the
network MUST support the tunneling protocol (which SHOULD be IPv6, as
per Constraint 7). In addition, we do not want to restrict any
initiative of customers, who might want to run an A+P-capable network
behind their CPE. To satisfy both Constraints 1 and 3, unmodified
legacy hosts should keep working seamlessly, while upgraded/new end-
systems should be given the opportunity to exploit enhanced features.
4.2. A+P for Mobile Providers
In the case of a mobile service provider the situation is slightly
different. The A+P border is assumed to be the gateway (e.g., GGSN/
PDN GW of 3GPP, or ASN GW of WiMAX). The need to extend the address
is not within the provider network, but on the edge between the
mobile phone devices and the base-station. While desirable, IPv6
connectivity may or may not be providable.
For mobile providers we use the following terms and assumptions:
1. Provider Network (PN)
2. Gateway (GW)
3. Mobile Phone device (phone)
4. Devices behind phone, e.g., laptop computer connecting via phone
to Internet.
We expect that the gateway has many IPv4 addresses and is always in
the data path of the packets. Transport between the gateway and the
phone devices is assumed to be an end-to-end layer-2 tunnel. We
assume that the phone as well as the gateway can be upgraded to
support A+P. However, some applications running on the phone or on
devices behind the phone (such as laptop computers connecting via the
phone) are not necessarily expected to be upgraded. Again, while we
do not expect that devices behind the phone will be A+P-aware or
upgraded, we also do not want to hinder their evolution. In this
sense the mobile phone is comparable to the CPE in the broadband
provider case, and the gateway to the PRR/LSN box in the network of
the broadband provider.
4.3. A+P from the provider network's perspective
ISPs suffering from IPv4 address space exhaustion are interested in
achieving a high address space compression ratio. In this respect,
an A+P subsystem allows much more flexibility than traditional NATs:
the NAT can be placed at the customer, and/or in the provider
network. In addition hosts or applications can request ports and
thus have untranslated end-to-end connectivity.
            +------------------------------+
    private | +------+  A+P-in  +--------+ |  dual-stacked
(RFC1918) --|-| CPE  |==-IPv6-==|  PRR   |-|-- network
      space | +------+  tunnel  +--------+ |  (public addresses)
            |    ^              +--------+ |
            |    |  IPv6-only   |  LSN   | |
            |    |  network     +--------+ |
            +----+----------------- ^ -----+
                 |                  |
            on customer      within provider
       premises and control      network
A simple A+P subsystem example
Figure 6
Consider the deployment scenario in Figure 6, where an A+P subsystem
is formed between the CPE and a port-range router (PRR) within the
ISP core network. The PRR is placed somewhere within the ISP's
network, preferably close to the customer edge, and forms the border
from which point on packets are forwarded based on address and port.
The provider MAY deploy an LSN co-located with the PRR: in this case
packets that have not been translated by the CPE will be handed to
the LSN and NATed. In such a configuration, the ISP allows the
customer to freely decide whether the translation is done at the CPE
or at the LSN. In order to establish the A+P subsystem, the CPE will
be configured automatically (e.g., via a signaling protocol that
conforms to the requirements stated above).
Note that the CPE in the example above is only provisioned with an
IPv6 address on the external interface.
+------------ IPv6-only transport ------------+
| +---------------+ |              |          |
| |A+P-application| |  +--------+  |  +-----+ |  dual-stacked
| |  on end-host  |=|==| CPE w/ |==|==| PRR |-|-- network
| +---------------+ |  +--------+  |  +-----+ |  (public addresses)
+---------------+   |  +--------+  |  +-----+ |
  private IPv4 <-*--+->|  NAT   |  |  | LSN | |
  address space   \ |  +--------+  |  +-----+ |
  for legacy       +|--------------|----------+
  hosts             |              |
                    |              |
   end-host with    |  CPE device  | provider
   upgraded         |  on customer | network
   application      |  premises    |
An extended A+P subsystem with end-host running A+P-aware
applications
Figure 7
Figure 7 shows an example of how an upgraded application running on a
legacy end-host can connect. The legacy host is provisioned with a
private IPv4 address allocated from the CPE. Any packet sent from
the legacy host will be NATed either at the CPE (if configured to do
so), or at the LSN (if available).
An A+P-aware application running on the end-host MAY use the
signaling described in Section 3.1 to connect to the A+P-subsystem.
Hence, the application will be delegated some space in the A+P
address realm, and will be able to contact the external realm (i.e.,
the public Internet) without the need for translation.
Note that, under A+P signaling, the NATs are optional. However, if
neither the CPE nor the PRR provides NAT functionality, then it will
not be possible to connect legacy end-hosts.
4.4. Dynamic allocation of port ranges
Allocating a fixed range of ports of the same size to every CPE may
exhaust the ports that the NAT in a CPE needs in order to operate: a
customer who has several hosts behind the CPE and uses NAT to
communicate with the Internet can use up any given restricted range
of allocated ports. This is a perfect recipe for upsetting the more
demanding customers. A mechanism for dynamic allocation of port
ranges allows the ISP to achieve two goals: a more efficient
compression ratio of the number of customers sharing one IPv4
address and, on the other hand, not limiting the more demanding
customers on their communication to/from Internet.
The following mechanism applies to the NAT functionality in the CPE
only. If a customer has an arrangement with the ISP for well-known
ports (WKP), and the PRR allocates a WKP range to this CPE, that range
is used for end-to-end communication with a server behind the CPE
that has a public IP address or, if the customer so configures, for
inbound NAT (1:1 or port forwarding). This function has a fixed range
of ports and is not considered in the dynamic allocation mechanism.
On the other hand, if the customer configures the NAT function to
access the Internet from a private address pool behind the CPE, the
mechanism is applied automatically. The NAT already keeps track of
its translation tables, so only a small "daemon" needs to be
developed and implemented by the CPE manufacturer to keep track of
the allocated port ranges and how many ports are in use. When usage
reaches 90%, the dynamic allocation daemon signals to the PRR the
need for additional ports. A downside of this mechanism is that the
port allocation to a CPE might grow quite large without an additional
mechanism that returns unused port ranges to the PRR's pool. This
may be fixed by forcing the NAT to allocate ports for translation
sequentially and to reuse released ports for new requests, so that
the use of ports is controlled and unfragmented ranges can be
returned to the pool. Another, not so pretty, way is to reset the
additional allocations to 0 every 24 hours and leave only the first
allocation. Any additional allocations still needed would be
requested again by the mechanism within a very short time, so the
customer is unlikely to notice the event.
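The bookkeeping "daemon" described above can be sketched roughly as
follows. This is a minimal illustration only, not a specified
interface: the class name PortRangeTracker, its methods, and the
example port numbers are hypothetical. Only the 90% threshold, the
sequential allocation, and the return of unused additional ranges are
taken from the text.

```python
from dataclasses import dataclass, field

USAGE_THRESHOLD = 0.9  # the "90% usage" trigger mentioned in the text


@dataclass
class PortRangeTracker:
    """Hypothetical CPE-side bookkeeping for delegated port ranges."""
    ranges: list = field(default_factory=list)  # (first, last), inclusive
    in_use: set = field(default_factory=set)    # ports bound by the NAT

    def capacity(self) -> int:
        return sum(last - first + 1 for first, last in self.ranges)

    def allocate(self):
        """Return the lowest free port (sequential allocation), or None."""
        for first, last in self.ranges:
            for port in range(first, last + 1):
                if port not in self.in_use:
                    self.in_use.add(port)
                    return port
        return None

    def release(self, port: int) -> None:
        self.in_use.discard(port)

    def needs_more_ports(self) -> bool:
        """True when the daemon should ask the PRR for another range."""
        cap = self.capacity()
        return cap == 0 or len(self.in_use) / cap >= USAGE_THRESHOLD

    def returnable_ranges(self) -> list:
        """Additional ranges (never the first) with no port in use."""
        return [r for r in self.ranges[1:]
                if not any(r[0] <= p <= r[1] for p in self.in_use)]
```

Because the lowest free port is always chosen first, released ports
are reused before higher ranges are touched, which tends to leave the
most recently added ranges empty and therefore returnable.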
The mechanism would prefer to allocate additional port ranges from
the same IP address as the initial allocation. If it is not possible
to allocate an additional port range from the same IP address, then
the mechanism can allocate a port range from another IP address
within the same subnet. With every additional port range allocation,
the PRR updates its routing table and sends packets arriving for the
allocated ports on that IP address to the tunnel that ends on the CPE
which requested, and was allocated, that additional port range. The
mechanism for allocating additional port ranges may be part of the
normal signaling that is used to authenticate the CPE to the ISP.
The ISP controls the dynamic allocation of port ranges by the PRR by
setting the initial allocation size and the maximum number of
allocations per CPE, or the maximum allocations per subscription,
depending on subscription level. There is a general observation that
the more demanding customer uses around 1024 ports when communicating
heavily. So, for example, a first suggestion would be 512 ports
initially and then dynamic allocations of ranges of 512 ports, up to
a maximum of 6 additional allocations. The maximum number of
allocations should prevent a single customer from acting in a
destructive manner, in case they
become infected. The maximum number of allocations can also be
fine-tuned with a parameter that limits how many allocations a user
can request per time frame. If this is used, misbehaving applications
are limited in the damage they can do; for example, one additional
allocation per minute would considerably slow a port-requesting
storm.
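As an illustration of the limits discussed above, the following
sketch shows how a PRR might enforce them. It is only a sketch under
stated assumptions: the class AllocationPolicy and its parameter
names are hypothetical, and the defaults (512-port ranges, 6
additional allocations, one additional allocation per minute) are
simply the example values suggested in the text.

```python
import time


class AllocationPolicy:
    """Hypothetical PRR-side limits on port-range allocations per CPE."""

    def __init__(self, range_size=512, max_additional=6, min_interval=60.0):
        self.range_size = range_size          # ports per allocation
        self.max_additional = max_additional  # on top of the initial range
        self.min_interval = min_interval      # seconds between extra grants
        self._granted = {}                    # cpe_id -> extra ranges granted
        self._last = {}                       # cpe_id -> time of last grant

    def max_ports(self) -> int:
        """Upper bound on ports one CPE can hold: initial plus extras."""
        return self.range_size * (1 + self.max_additional)

    def may_grant(self, cpe_id, now=None) -> bool:
        """Decide on a request for one additional range and record it."""
        now = time.monotonic() if now is None else now
        if self._granted.get(cpe_id, 0) >= self.max_additional:
            return False  # per-CPE ceiling reached
        last = self._last.get(cpe_id)
        if last is not None and now - last < self.min_interval:
            return False  # too soon after the previous grant
        self._granted[cpe_id] = self._granted.get(cpe_id, 0) + 1
        self._last[cpe_id] = now
        return True
```

With the example values, a single CPE can hold at most 512 * 7 = 3584
ports, and needs at least five minutes after its first additional
grant to reach that ceiling.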
Note that there is no minimum request size. This is because A+P-
aware applications running on end-hosts MAY request a single port (or
a few ports) for the CPE to be contacted on (e.g., VoIP clients
register a public IP and a single delegated port from the CPE, and
accept incoming calls on that port). The implementation on the CPE
or PRR will dictate how to handle such requests for smaller blocks:
for example, half of the available blocks might be used for
"block-allocations", 1/6 for single port requests, and the rest for
NATing.
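The example split above leaves 1 - 1/2 - 1/6 = 1/3 of the blocks for
NATing. The following sketch works this out for a given number of
blocks. The function name split_blocks and the block counts used with
it are hypothetical; the fractions are the ones from the example.

```python
def split_blocks(total_blocks: int) -> dict:
    """Split blocks as in the example: 1/2, 1/6, and the remainder."""
    block_allocations = total_blocks // 2
    single_port = total_blocks // 6
    nat = total_blocks - block_allocations - single_port
    return {
        "block_allocations": block_allocations,
        "single_port": single_port,
        "nat": nat,
    }
```

For instance, a CPE holding 12 blocks would reserve 6 for block
allocations, 2 for single port requests, and 4 for NATing.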
4.5. Example of A+P-forwarded packets
This section provides a detailed example of A+P setup and
configuration, and of how packets flow from an end-host behind an
A+P-upgraded provider to any host in the IPv4 Internet and how the
return packets flow back. The following example discusses the
situation of an A+P-unaware end-host, where the NATing is done at the
CPE. Figure 8 illustrates how the CPE receives an IPv4 packet from
the end-user device. We first describe the case where the CPE has
been configured to provide the NAT functionality (e.g., by the
customer through a website, or via automatic signaling). In the
following, we call a packet which is translated at the CPE an
A+P-forwarded packet, in analogy with the port-forwarding function
employed in today's CPEs. Upon receiving a packet from the internal
interface, the CPE NATs it and forwards it to the PRR. The NAT on
the CPE is assumed to store the 5-tuple (source_IPv4, source_port,
destination_IPv4, destination_port, tunnel-interface).
When the PRR receives the A+P-forwarded packet, it decapsulates the
inner IPv4 packet and checks the source address and port. If the
source address and port match the CPE's A+P address, then the PRR
simply routes the decapsulated packet. This is always the case for
A+P-forwarded packets. Otherwise, the PRR assumes that the packet is
not A+P-forwarded and passes it to the LSN function, which in turn
NATs the packet and then releases it into the Internet.
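The PRR's decision for outgoing packets can be sketched as a simple
lookup. This is an illustration only: the DELEGATIONS table, the
function classify_outbound, and the idea of keying the table on the
tunnel source address are assumptions made for the sketch. The
address 12.0.0.3 with ports 100-200 and the tunnel endpoint a::2 are
the values used in the figures of this section.

```python
from ipaddress import IPv4Address

# Hypothetical delegation table: CPE tunnel endpoint (IPv6 address) ->
# (shared IPv4 address, first port, last port), ports inclusive.
DELEGATIONS = {
    "a::2": (IPv4Address("12.0.0.3"), 100, 200),
}


def classify_outbound(tunnel_src: str, src_ip: str, src_port: int) -> str:
    """Return "route" for an A+P-forwarded packet, "lsn" otherwise."""
    delegation = DELEGATIONS.get(tunnel_src)
    if delegation is not None:
        addr, first, last = delegation
        # The inner source must match the A+P address of this CPE.
        if IPv4Address(src_ip) == addr and first <= src_port <= last:
            return "route"
    # Anything else is treated as not yet translated.
    return "lsn"
```

A packet already translated by the CPE, such as one from 12.0.0.3
port 100, is routed as is, while a packet that still carries the
private source 10.0.0.2 is handed to the LSN function.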
Figure 8 shows the packet flow for an outgoing A+P-forwarded packet.
                +-----------+
                |   Host    |
                +-----+-----+
                   |  |  10.0.0.2
   IPv4 datagram 1 |  |
                   |  |
                   v  |  10.0.0.1
            +---------|---------+
            |CPE      |         |
            +--------|||--------+
                   | |||  a::2
                   | |||  12.0.0.3 (100-200)
    IPv6 datagram 2| |||
                   | |||<-IPv4-in-IPv6
                   | |||
              -----|-|||-------
            /      | |||        \
           |  ISP access network |
            \      | |||        /
              -----|-|||-------
                   | |||
                   v |||  a::1
            +--------|||--------+
            |PRR     |||        |
            +---------|---------+
                   |  |  12.0.0.1
   IPv4 datagram 3 |  |
              -----|--|--------
            /      |  |         \
           |    ISP network /    |
            \      Internet     /
              -----|--|--------
                   |  |
                   v  |  128.0.0.1
                +-----+-----+
                | IPv4 Host |
                +-----------+
Figure 8: Forwarding of Outgoing A+P-forwarded Packets
+-----------------+--------------+-----------------------------+
| Datagram        | Header field | Contents                    |
+-----------------+--------------+-----------------------------+
| IPv4 datagram 1 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 10.0.0.2                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 8000                        |
| --------------- | ------------ | --------------------------- |
| IPv6 datagram 2 | IPv6 Dst     | a::1                        |
|                 | IPv6 Src     | a::2                        |
|                 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 12.0.0.3                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 100                         |
| --------------- | ------------ | --------------------------- |
| IPv4 datagram 3 | IPv4 Dst     | 128.0.0.1                   |
|                 | IPv4 Src     | 12.0.0.3                    |
|                 | TCP Dst      | 80                          |
|                 | TCP Src      | 100                         |
+-----------------+--------------+-----------------------------+
Datagram header contents
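The header rewriting for the outgoing direction can be replayed in a
few lines of code. The sketch below is illustrative only: the types
IPv4Packet and IPv6Tunnel and the functions cpe_outbound and
prr_outbound are hypothetical names, and the model ignores
checksums, TTL, and fragmentation. The addresses and ports are those
of the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IPv4Packet:
    src: str
    sport: int
    dst: str
    dport: int


@dataclass(frozen=True)
class IPv6Tunnel:
    src: str           # IPv6 address of the CPE
    dst: str           # IPv6 address of the PRR
    inner: IPv4Packet  # encapsulated IPv4 packet


def cpe_outbound(pkt, public_ip, public_port, cpe_v6, prr_v6):
    """NAT the source to the A+P address, then encapsulate in IPv6."""
    translated = replace(pkt, src=public_ip, sport=public_port)
    return IPv6Tunnel(src=cpe_v6, dst=prr_v6, inner=translated)


def prr_outbound(tunnel):
    """Decapsulate; the inner packet is routed without translation."""
    return tunnel.inner


datagram1 = IPv4Packet("10.0.0.2", 8000, "128.0.0.1", 80)
datagram2 = cpe_outbound(datagram1, "12.0.0.3", 100, "a::2", "a::1")
datagram3 = prr_outbound(datagram2)
```

Note that the only translation happens at the CPE; datagram 3 is
identical to the inner packet of datagram 2.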
An incoming packet undergoes the reverse process. When the PRR
receives an IPv4 packet on an external interface, it first checks
whether the destination address and port number fall in a delegated
range. If they do, the PRR tunnels the packet unmodified to the CPE
that holds the delegation. If they do not, the packet is handed to
the LSN to check whether a mapping is available.
Figure 9 shows how an incoming packet is forwarded, under the
assumption that the port number matches the port range which was
delegated to the CPE.
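The incoming check at the PRR amounts to a lookup on destination
address and port. The following is a minimal sketch under stated
assumptions: the function route_inbound and the layout of the
delegations mapping are hypothetical, and a linear scan is used only
for clarity. The second delegation (ports 201-300 to a::3) is an
invented neighbor sharing the same address.

```python
def route_inbound(dst_ip: str, dst_port: int, delegations: dict):
    """Return ("tunnel", cpe) for delegated space, else ("lsn", None).

    delegations maps (IPv4 address, first port, last port) to the IPv6
    tunnel endpoint of the CPE holding that range; ports are inclusive
    and ranges are assumed not to overlap.
    """
    for (addr, first, last), cpe in delegations.items():
        if dst_ip == addr and first <= dst_port <= last:
            return ("tunnel", cpe)
    return ("lsn", None)


# Two customers sharing 12.0.0.3; the second entry is hypothetical.
delegations = {
    ("12.0.0.3", 100, 200): "a::2",
    ("12.0.0.3", 201, 300): "a::3",
}
```

With this table, a packet to 12.0.0.3 port 100 is tunneled unmodified
to a::2, while a packet to a port outside every delegated range is
left to the LSN.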
                +-----------+
                |   Host    |
                +-----+-----+
                   ^  |  10.0.0.2
   IPv4 datagram 3 |  |
                   |  |
                   |  |  10.0.0.1
            +---------|---------+
            |CPE      |         |
            +--------|||--------+
                   ^ |||  a::2
                   | |||  12.0.0.3 (100-200)
    IPv6 datagram 2| |||
                   | |||<-IPv4-in-IPv6
                   | |||
              -----|-|||-------
            /      | |||        \
           |  ISP access network |
            \      | |||        /
              -----|-|||-------
                   | |||
                   | |||  a::1
            +--------|||--------+
            |PRR     |||        |
            +---------|---------+
                   ^  |  12.0.0.1
   IPv4 datagram 1 |  |
              -----|--|--------
            /      |  |         \
           |    ISP network /    |
            \      Internet     /
              -----|--|--------
                   |  |
                   |  |  128.0.0.1
                +-----+-----+
                | IPv4 Host |
                +-----------+
Figure 9: Forwarding of Incoming A+P-forwarded Packets
+-----------------+--------------+-----------------------------+
| Datagram        | Header field | Contents                    |
+-----------------+--------------+-----------------------------+
| IPv4 datagram 1 | IPv4 Dst     | 12.0.0.3                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 100                         |
|                 | TCP Src      | 80                          |
| --------------- | ------------ | --------------------------- |
| IPv6 datagram 2 | IPv6 Dst     | a::2                        |
|                 | IPv6 Src     | a::1                        |
|                 | IPv4 Dst     | 12.0.0.3                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 100                         |
|                 | TCP Src      | 80                          |
| --------------- | ------------ | --------------------------- |
| IPv4 datagram 3 | IPv4 Dst     | 10.0.0.2                    |
|                 | IPv4 Src     | 128.0.0.1                   |
|                 | TCP Dst      | 8000                        |
|                 | TCP Src      | 80                          |
+-----------------+--------------+-----------------------------+
Datagram header contents
Note that datagram 1 travels untranslated up to the CPE; thus the
customer has the same control over the translation as today with a
home gateway that offers customizable port-forwarding.
4.6. Forwarding of standard packets
Packets for which the CPE does not have a corresponding port
forwarding rule are tunneled to the PRR which provides the LSN
function. We underline that the LSN MUST NOT use the delegated space
for NATting. See [I-D.durand-softwire-dual-stack-lite] for network
diagrams which illustrate the packet flow in this case.
4.7. Handling ICMP
ICMP is problematic for all NATs, because it lacks port numbers. A+P
routing exacerbates the problem.
Most ICMP messages fall into one of two categories: error reports, or
ECHO/ECHO reply (commonly known as "ping"). For error reports, the
offending packet header is embedded within the ICMP packet; NAT
devices can then rewrite that portion and route the packet to the
actual destination host. This functionality will remain the same
with A+P; however, the PRR will need to examine the embedded header
to extract the port number, while the A+P gateway will do the
necessary rewriting.
ECHO and ECHO reply are more problematic. For ECHO, the A+P gateway
device must rewrite the "Identifier" and perhaps "Sequence Number"
fields in the ICMP request, treating them as if they were port
numbers. This way, the BR can build the correct A+P address for the
returning ECHO replies, so they can be correctly routed back to the
appropriate host in the same way as TCP/UDP packets. (Pings
originated from an external domain/legacy Internet towards an A+P
device are not supported.)
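The Identifier rewriting for ECHO can be sketched in the same way as
a port mapping. This is an illustration under stated assumptions: the
class EchoIdentifierMap is hypothetical, only the "Identifier" field
is mapped (the text leaves open whether "Sequence Number" is also
rewritten), and the range 100-200 is the example delegation used
earlier in this section.

```python
class EchoIdentifierMap:
    """Hypothetical table mapping ICMP Echo Identifiers into a
    delegated port range, as a NAT does for TCP/UDP ports."""

    def __init__(self, first: int, last: int):
        self._free = list(range(first, last + 1))
        self._out = {}   # (inside host, original id) -> mapped id
        self._back = {}  # mapped id -> (inside host, original id)

    def outbound(self, host: str, identifier: int) -> int:
        """Identifier to place in the outgoing ECHO request."""
        key = (host, identifier)
        if key not in self._out:
            if not self._free:
                raise RuntimeError("no free identifier in delegated range")
            mapped = self._free.pop(0)
            self._out[key] = mapped
            self._back[mapped] = key
        return self._out[key]

    def inbound(self, mapped: int):
        """Original (host, identifier) for an ECHO reply, or None."""
        return self._back.get(mapped)
```

Since the rewritten Identifier lies inside the delegated range, a
returning ECHO reply can be forwarded with the same range lookup that
is used for TCP and UDP ports.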
4.8. Limitations of the A+P approach
One limitation that A+P shares with any other IP address-sharing
mechanism is the availability of well-known ports. Services run by
customers that share the same IP address are distinguished only by
the port number. As a consequence, it is impossible for two
customers who share the same IP address to run services on the same
port (e.g., port 80). Unfortunately, working around this limitation
usually implies application-specific hacks (e.g., HTTP and HTTPS
virtual hosting), discussion of which is out of the scope of this
document. Of course, a provider might charge more for giving a
customer the well-known port range, 0..1023, thus allowing the
customer to provide externally available services. Many applications
require the availability of well-known ports. Those applications are
not expected to work in an A+P environment unless they can adapt to
work with different ports. However, such applications do not work
behind today's NATs either.
Another problem which is common to all kinds of NATs is the
coexistence with IPsec. In fact, a NAT which also translates port
numbers prevents AH and ESP from functioning properly, both in tunnel
and in transport mode. In this respect, we stress that, since an A+P
subsystem exhibits the same external behavior as a NAT, well-known
workarounds (such as [RFC3715]) can be employed.
Port randomization is also somewhat compromised in the A+P solution.
Because the CPE can randomize ports only within the port range that
is allocated to it, randomness is more limited than in the scenario
where the full range of ports is available for randomization. We can
assume that the CPE gets its port range either from the ephemeral
range (49152-65535) or from the "registered ports" range
(1024-49151). Both ranges can be used for randomization; see
[I-D.ietf-tsvwg-port-randomization] for more details.
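To give a feeling for the loss of randomness, the sketch below picks
a random source port inside a delegated range and computes how many
bits of unpredictability a range offers. It is an illustration only:
the function names random_port and entropy_bits are hypothetical, and
uniform selection among free ports is just one simple strategy, not
necessarily one of the algorithms in the cited draft.

```python
import math
import secrets


def random_port(first: int, last: int, in_use=frozenset()) -> int:
    """Pick a free port uniformly at random from [first, last]."""
    free = [p for p in range(first, last + 1) if p not in in_use]
    if not free:
        raise RuntimeError("port range exhausted")
    return secrets.choice(free)


def entropy_bits(first: int, last: int) -> float:
    """Bits of unpredictability when every port in the range is free."""
    return math.log2(last - first + 1)
```

The full ephemeral range 49152-65535 offers 14 bits, while a
delegated range of 512 ports offers only 9 bits, and fewer as ports
become occupied.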
5. IANA Considerations
This document makes no request of IANA.
Note to RFC Editor: this section may be removed on publication as an
RFC.
6. Security Considerations
7. Acknowledgments
The authors wish to thank especially (in alphabetical order) Gabor
Bajko, Remi Despres, Alain Durand, Pierre Levis, and Teemu Savolainen
for their close collaboration on the development of the A+P approach.
We also thank David Ward for review, constructive criticism, and
interminable questions; Cullen Jennings for discussion and review of
fragmentation; and Dave Thaler for useful criticism on "stackable"
A+P gateways. We would also like to thank the following persons for
their feedback on earlier versions of this work: Bernhard Ager, Rob
Austein, Gert Doering, Dino Farinacci, Russ Housley, and Ruediger
Volk.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
8.2. Informative References
[BCP38] Ferguson, P. and D. Senie, "Network Ingress Filtering:
Defeating Denial of Service Attacks which employ IP Source
Address Spoofing", BCP 38, May 2000.
[I-D.bajko-v6ops-port-restricted-ipaddr-assign]
Bajko, G. and T. Savolainen, "Port Restricted IP Address
Assignment",
draft-bajko-v6ops-port-restricted-ipaddr-assign-02 (work
in progress), November 2008.
[I-D.boucadair-dhc-port-range]
Boucadair, M., Grimault, J., Levis, P., and A.
Villefranque, "DHCP Options for Conveying Port Mask and
Port Range Router IP Address",
draft-boucadair-dhc-port-range-01 (work in progress),
October 2008.
[I-D.boucadair-port-range]
Boucadair, M., Levis, P., Bajko, G., and T. Savolainen,
"IPv4 Connectivity Access in the Context of IPv4 Address
Exhaustion", draft-boucadair-port-range-01 (work in
progress), January 2009.
[I-D.durand-softwire-dual-stack-lite]
Durand, A., Droms, R., Haberman, B., and J. Woodyatt,
"Dual-stack lite broadband deployments post IPv4
exhaustion", draft-durand-softwire-dual-stack-lite-01
(work in progress), November 2008.
[I-D.ietf-tsvwg-port-randomization]
Larsen, M. and F. Gont, "Port Randomization",
draft-ietf-tsvwg-port-randomization-02 (work in progress),
August 2008.
[Martin-Java]
Martin, D., Rajagopalan, S., and A. Rubin, "Blocking Java
Applets at the Firewall", Proceedings of the Internet
Society Symposium on Network and Distributed System
Security, pp. 16-26, 1997.
[RFC0959] Postel, J. and J. Reynolds, "File Transfer Protocol",
STD 9, RFC 959, October 1985.
[RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and
E. Lear, "Address Allocation for Private Internets",
BCP 5, RFC 1918, February 1996.
[RFC3715] Aboba, B. and W. Dixon, "IPsec-Network Address Translation
(NAT) Compatibility Requirements", RFC 3715, March 2004.
Authors' Addresses
Randy Bush
Internet Initiative Japan
5147 Crystal Springs
Bainbridge Island, Washington 98110
US
Phone: +1 206 780 0431 x1
Email: randy@psg.com
Olaf Maennel
Deutsche Telekom Laboratories
Ernst-Reuter-Platz 7
Berlin 10587
Germany
Phone: +3727120686
Email: o@maennel.net
Jan Zorz
go6.si
Frankovo naselje 165
Skofja Loka 4220
Slovenia
Phone: +38659042000
Email: jan@go6.si
Steven M. Bellovin
Columbia University
1214 Amsterdam Avenue
MC 0401
New York, NY 10027
US
Phone: +1 212 939 7149
Email: bellovin@acm.org
Luca Cittadini
Universita' Roma Tre
via della Vasca Navale, 79
Rome, 00146
Italy
Phone: +39 06 5733 3215
Email: luca.cittadini@gmail.com