Independent Submission                                       M. Stenberg
Internet-Draft                                                  O. Troan
Expires: March 24, 2007                              cisco Systems, Inc.
                                                      September 20, 2006


      IPv6 Prefix Delegation routing state maintenance approaches
                 draft-stenberg-pd-route-maintenance-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 24, 2007.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   The maintenance of Prefix Delegation (PD) routing state is an issue
   that people have discussed in the IETF DHC WG, and there have been
   drafts on the topic.  However, as the pros and cons of the different
   routing state maintenance solutions have not been examined
   thoroughly, this text attempts to shed some light on both the actual
   problem and the various alternative solutions.





Stenberg & Troan         Expires March 24, 2007                 [Page 1]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


1.  Introduction

   A prefix delegation deployment consists of Requesting Routers (RR),
   Delegating Routers (DR) and possibly a backend provisioning system
   (see Figure 1).  The delegated prefix has to be routed in the
   network.  This document explores various alternatives for how the
   route for the delegated prefix can be injected in the network, and
   how the routing state can be maintained.

   /~~~~~~~~~\
   | Network |
   \~~~~~~~~~/
    |
    |-----------------------------------------
    |                                         \
    | +-----------------------------+-------------------+
    | | Backend provisioning system | DR 4 (integrated) |
    | +-----------------------------+-------------------+
    |    |                       |             |
    | +------+                +------+     +------+
    |-| DR 1 |                | DR 3 |     | RR 3 |
    \ +------+                +------+     +------+
     --- | --------------------/ |
      +------+                +------+
      | DR 2 |                | RR 2 |
      +------+                +------+
         |
      +------+
      | RR 1 |
      +------+

   Figure 1: Possible prefix delegation deployment cases.

   Prefix delegation is a stateful protocol.  The RR needs to maintain
   state so that it can sub-delegate prefixes to downstream links.  The
   RR maintains soft-state which can be recovered by redoing the prefix
   request (for example, using Dynamic Host Configuration Protocol for
   IPv6 (DHCPv6) [1] with the Prefix Delegation options defined in [2]).

   If the DR should do route injection on behalf of the RR, it needs to
   maintain state.  The backend provisioning system must maintain a list
   of prefixes delegated, as a prefix delegation is a long-lived entity
   (lifetime of a customer relationship, as in months or years).  The
   backend system and DR might run on the same router.  This document
   focuses on the case where the backend system and DR are separate and
   the DR has little or no persistent storage.  Therefore the DR 4 case
   (in Figure 1) is not covered here, as it is trivial - the backend has
   nonvolatile storage for the prefixes, and it can re-inject the routes



Stenberg & Troan         Expires March 24, 2007                 [Page 2]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


   when the integrated DR 4 restarts.

   The DR's routing state that needs to be maintained can be divided
   into two distinct categories: local routing state (that is, a local
   RIB entry containing the next-hop and the interface the assigned
   prefix is connected to), and global (AS-wide) routing state which
   requires advertising the route via a routing protocol.

   Advertising a route per delegation from the DR can be avoided if
   there is an aggregate prefix covering the delegation.  This requires
   stringent address allocation procedures and prohibits an RR from
   moving to a different DR.







































Stenberg & Troan         Expires March 24, 2007                 [Page 3]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


2.  Different approaches for maintaining routing state

   As any router (or the backend system for that matter) can go offline
   and come back up later, it is necessary for the system to recover
   from these intermittent failures.

   The problem is how to delegate responsibility for route maintenance
   to one (or more) of the three components of the system, and letting
   it take care of maintaining the required routing state in place for
   the RR's prefix.

2.1.  Backend provisioning system responsible for routing state

   Considering the backend provisioning system is the only component in
   the system that actually requires significant amount of nonvolatile
   storage, from data system point of view it would be ideal to have the
   backend provisioning system responsible for maintaining the routing
   state as well.

   It would mean that the backend provisioning system should, when DR
   restarts, (securely) re-inject the local or global routing state into
   the DRs.

   In practise, this is infeasible:

   o  There is no standard way of detecting when the DR is restarted.

   o  Redundancy of DR, or the links between DR and backend system,
      makes it difficult for the backend system to judge the state of
      the DR accurately without significant extra configuration data
      about the deployed configuration.

   o  None of the current routing protocols are suitable for altering
      remote router's local routing information, and therefore some
      protocol development would be in order for this approach to be
      usable.  There are also security implications with this solution.

   o  Lack of scalability; the benefit of having 'backend' provisioning
      system disappears as it will need to take care of maintaining
      routes of every one of its DRs.

   o  The backend may lack the information to identify the DR to the
      routing system.  With multiple DRs, if the delegation protocol
      does not contain everything needed to re-inject the route later on
      to the specific DR, it won't work.  For example, DHCPv6 does not
      uniquely identify the relays.  And if interim DRs do not have
      backend provisioning system-addressible addresses, there is a
      problem.  All DRs may not have global unicast addresses, and this



Stenberg & Troan         Expires March 24, 2007                 [Page 4]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


      is problematic especially in configurations spanning multiple
      administrative domains.

   Having considered the backend provisioning system as the responsible
   component, it is clearly NOT the way to go.  That leaves the DR and
   RR components.

2.2.  Delegating router responsible for routing state

   As the DR is part of the local routing infrastructure, placing the
   responsibility for routing state in the DR seems sensible.  With that
   design decision, the next problem is _when_ the routing information
   is updated after the DR restarts:

2.2.1.  Approach 1: On-demand lease query

   In the on-demand lease query case as defined in [3], the routing
   state maintenance problem is assumed to be local, and therefore the
   DR will receive packets both from the network at large as well as the
   RR even after a loss of local state caused by a restart.

   When traffic arrives to DR either from the RR, or from the network to
   the DR for a prefix without local routing information, the DR will
   perform lease query, acquire the allocated prefix, and update the
   routing information appropriately.

   This approach, while simple to specify, has some major issues:

   o  It depends on the aggregated prefix to cause the inbound traffic
      to wind up in the DR.  This assumption may not be valid, depending
      on the address assignment policies of the organization.
      Geographical or network topological hierarchical address
      assignment at large seems to be a failure, and it is unclear if
      all deployments can really implement this.

   o  It requires the incoming traffic both from the RR and the network,
      for which no route exists, to trigger the lease query.  This has
      two negative side effects: it requires support from the fast path
      hardware in the DR, and potentially causes large amount of
      spurious requests to the backend provisioning system (up to the
      desired rate that is considered harmful to the system).

   o  It requires simulated ordering of the unordered transaction
      stream, to ensure that the routing state is maintained correctly.
      The DR cannot be argued to be particularly stateless anymore.






Stenberg & Troan         Expires March 24, 2007                 [Page 5]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


2.2.2.  Approach 2: Anticipatory lease query

   Anticipatory, or bulk lease query, solves the routing state problem
   by requesting ALL prefix information from the backend provisioning
   system at the DR restart time.  There are two different ways: The
   first approach is asynchronous, that is, the old state is fetched
   while handling the delegation requests, requiring synchronization
   algorithm between the bulk data retrieved from the backend system,
   and the requests served during that.  For synchronization, some sort
   of ordering of the transaction stream is needed.  The second
   alternative is synchronous: the bulk query is performed first, and
   only then the RRs' requests are handled.

   Bulk query has several advantages over the on-demand case:

   o  No need for triggering based on either inbound or outbound traffic
      for the prefix.

   o  If DR handles the query synchronously, we can avoid the ordering
      of the transaction stream and the associated complexity rising
      from it.

   o  Given reasonable TCP transport scheme, the transfer of the state
      is more efficient than the on-demand case in terms of total number
      of packets.

   o  Does not require changes to fast path hardware, as no new triggers
      are needed from the traffic.  Instead, simple additional code in
      the system initialization is enough.

   But, unfortunately it has also some disadvantages:

   o  It causes more uneven load on the backend provisioning system than
      the on-demand case.  If the prefix is not being actively used at
      the time, it will not cause traffic in the on-demand case, but it
      will in the bulk case.

   o  Synchronization is non-trivial if the DR serves RR requests during
      the bulk retrieval of the data.

   o  Doesn't work very well with virtual interfaces - it is hard to
      retrieve state at boot time if the interfaces themselves get up
      only at some point, and with their transient nature mapping a DUID
      to individual customers is difficult.







Stenberg & Troan         Expires March 24, 2007                 [Page 6]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


2.2.3.  Approach 3: Persistent storage

   It is possible for the DR to store the route information to be
   injected either locally, or on some adjacent storage node.  The clear
   advantage of this is the lack of traffic on the wire.

   Unfortunately, it has also some problems - the data being possibly
   outdated due to lack of synchronization, and the management overhead
   when the customers for example move around would be significant.

   However, in most deployment scenarios persistent storage at or near
   all routers is not desirable or possible in the first place, so this
   is listed simply for the sake of completeness.

2.3.  Requesting router responsible for routing state

   The most interested party in the routing state of the given prefix is
   the RR itself; therefore, giving the responsibility for maintaining
   its routing state to it seemed to be idea worth considering.

   Due to the operators wariness of the systems not under their direct
   control, even with the RR responsible for maintenance of the state,
   the real route injection should be handled by the DR.

   The nice thing about some of the RR-oriented solutions is that they
   can be deployed without any changes to the rest of the
   infrastructure.

2.3.1.  Approach 1: Layer-2 detection of link state

   If the RR implementation gets notifications about the state of the
   link layer, it can actually detect the state of the network link
   going down and coming back up; performing reconfiguration to ensure
   that the routing state is still up seems like a trivial solution in
   this case.

   This solution can be the best one when operating over connection-
   oriented media (PPPoE, L2TP) but it doesn't work on say, Ethernet
   without direct connection between the RR and DR.

2.3.2.  Approach 2: Keepalive

   If the RR doesn't have L2 way of detecting DR being restarted, it can
   maintain a keepalive mechanism using, for example, Bidirectional
   Forwarding Detection (BFD - [4]) to send self-addressed echo packets
   to the DR and waiting for their replies.  The implementations SHOULD
   do this only if there is no traffic from the network within a desired
   period of time - see IPv6 Neighbor Unreachability Detection (NUD)'s



Stenberg & Troan         Expires March 24, 2007                 [Page 7]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


   definition of forward progress detection as a way to send keepalives
   only when truly necessary in [5].

   Assuming sub-second round-trips (reasonable assumption in most modern
   network environments), the longest factor for the determining the
   keepalive timeout is the recovery speed of the DR (by orders of
   magnitude), as it can take from some seconds (hot standby) to minutes
   (non-HA restart, or cold standby with huge configurations).  The
   initial keepalive timer should be some fraction of the highest delay
   in the system, that is, the DR recovery time.  The subsequent retries
   if no reply is received within reasonable timeframe should be
   calculated based on the link delay, and jitter, to ensure that the
   reply is unlikely to be coming back by the time the keepalive message
   is re-sent.

   As far as overhead is concerned, assuming the cold standby/restart
   taking minute(s), with a keepalive per 60 seconds for example, the
   QoS would remain roughly same as with faster intervals (as the DR
   going down would cause interruption in the routing in the order of
   minute(s) in any case).  This value would cause overhead of 0.017pps
   per RR, and it is unlikely to be the straw that breaks camel's back
   for the DR.  With any traffic, even NUD packets should outnumber the
   keepalive traffic.

   As far as resource utilization is concerned, this solution involves
   only routing plane of the RR, the data link between RR and DR, and
   fast path of DR which bounces the packets back.  Therefore it can be
   argued to be fairly lightweight general-purpose solution.

2.3.3.  Approach 3: Short lifetimes

   The current best practice for maintaining the routing state is to set
   short configuration lifetimes (DHCP T1/T2 values).  It causes extra
   traffic and load on the whole DHCP infrastructure.  That is because
   during every reconfiguration, even with the DR constantly up and
   running, the backend system is queried.  The transaction involves all
   three components.  Due to that, every RR will cause constant load on
   the backend system itself over the time, making the solution simply
   not scale well.

2.3.4.  Approach 4: Routing protocol to the requesting router

   The final RR-based approach consists of the RR actually running a
   routing protocol; this way, the RR router can simply advertise the
   prefix as it receives it, and everything just works.  Or not, as it
   may be.

   The downside is the security, or complete lack of it.  The DR



Stenberg & Troan         Expires March 24, 2007                 [Page 8]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


   accepting arbitrary RR-advertised prefixes (assuming no state at the
   DR) should be acceptable only if DR and RR are within the same
   administrative domain.  For that case, this is probably the cleanest
   solution of all.

   If administrative boundaries are crossed, the DR will not take prefix
   advertisement at face value.  The DR will have extra overhead of
   checking the backend provisioning system for AAA purposes before
   actually doing anything with the prefix.  This can imply look-up for
   validity using the prefix and the interface the advertisement came
   from, including the DUID or some other identifier within the route
   advertisement message, or using some real AAA mechanism if the
   routing protocol supports one.  If minimal changes to the routing
   protocol implementation are desired, it is also possible to ignore
   the advertisement itself, and just trigger on-demand lease query,
   thereby using the routing protocol just as an alternative keepalive
   mechanism the with most of the logic shoved in DR instead of RR.

                     /~~~~~~~~~~\
                     |  Network |
                     \~~~~~~~~~~/
                         |   |
                ---------/   \---------
               /                       \
   +---------------------+   +---------------------+
   | Delegating router 1 |   | Delegating router 2 |
   +---------------------+   +---------------------+
               \                       /
                -----------+-----------
                           |
                +---------------------+
                | Requesting router   |
                +---------------------+

   Figure 2: Multihomed deployment.

   There is also a case where the RR HAS to run its own routing
   protocol; in multihomed situation like Figure 2, with the same
   routable prefix advertised via two different DRs, there is no other
   practical way to get the system working.  Of course, static route
   configuration is always an alternative but it is seldom desirable.

   The routing-protocol-based solutions all require a significant level
   of trust between RR and DR; regrettably the current routing protocols
   are not designed with AAA (or security for that matter) in mind, and
   therefore when crossing administrative boundaries, the alternatives
   are either using them as-is as a hint that something needs to be
   done, or significantly extending the protocols in the AAA direction.



Stenberg & Troan         Expires March 24, 2007                 [Page 9]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


   Adding extra complexity to the DR's routing protocol implementation
   or configuration is not desirable in general.

   Finally, the current prefix delegation solution (DHCPv6 PD) does not
   provide the information about which routing protocol to use, and
   there is no routing protocol auto-negotiation protocol.  Therefore
   the auto-configuration of the RR with arbitrary routing protocol
   cannot be done currently.











































Stenberg & Troan         Expires March 24, 2007                [Page 10]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


3.  Security Considerations

   The backend-oriented solution detailed in Section 2.1 implies a
   significant level of trust between the DR and the backend
   provisioning system.  The system's configuration is simpler if the
   backend provisioning system can inject arbitrary routes to the DR,
   but allowing injection of routes for only specific sub-prefixes of a
   specific prefix is considerably more secure solution.  Unfortunately
   it requires advance configuration of the prefix(es) involved.

   The delegating router-based solutions detailed in Section 2.2 do not
   have any security issues, assuming the delegation protocol itself is
   secured, or can be assumed to be used only within a trusted network.

   The requesting router-based solutions in Section 2.3, even
   incorrectly implemented, at most just cause extra load to the DR.  As
   noted in Section 2.3.4, even when running routing protocol from the
   RR, ideally the DR should consider the advertisements only a hint at
   best if not part of the same adminstrative domain.  This may not be
   ideal if the routing protocol information should be propagated as-is
   onward, as in the the multihoming cases.  Unfortunately, those cases
   also most likely cross administrative boundaries (the requesting
   router being part of one domain, and connected to delegating routers
   in most likely more than one), the providers will not most likely
   trust the routing protocol to be used as-is at the delegating
   routers, and their complexity will increase due to the required AAA/
   policy checks.  This is a potential security risk in a critical part
   of the network infrastructure.























Stenberg & Troan         Expires March 24, 2007                [Page 11]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


4.  IANA Considerations

   As this document is informational in nature and only summarizes
   current best practices, it does not require action from IANA.















































Stenberg & Troan         Expires March 24, 2007                [Page 12]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


5.  Summary

   The backend provisioning system should not be assigned the
   responsibility for the maintenance of the route.  As seen in
   Section 2.1, that approach has significant obstacles without any
   clear benefits.

   If the link layer state can be used to detect the (potential) restart
   of delegating router, the requesting router-based simple
   reconfiguration described in Section 2.3.1 seems to be the best
   choice.

   When link layer state is not available, there is no clear 'best'
   solution.  The tradeoff seems to be between increasing the complexity
   of the delegation protocol and the delegating router/backend system
   (as described in the lease query cases in Section 2.2.1 and
   Section 2.2.2), decreasing scalability of the system significantly by
   using low lifetimes for configuration (as described in
   Section 2.3.3), or small overhead of the keepalive (as described in
   Section 2.3.2).

   Only in multihoming cases, given some extensions to the current
   prefix delegation protocol, should routing protocol on the requesting
   router be considered, as described in Section 2.3.4.  Multihoming
   solution itself is challenging to do securely, as noted in Section 3,
   due to lack of AAA support in routing protocols.

6.  References

   [1]  Droms, R., Bound, J., Volz, B., Lemon, T., Perkins, C., and M.
        Carney, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)",
        RFC 3315, July 2003.

   [2]  Troan, O. and R. Droms, "IPv6 Prefix Options for Dynamic Host
        Configuration Protocol (DHCP) version 6", RFC 3633,
        December 2003.

   [3]  Brzozowski, J., "DHCPv6 Leasequery",
        draft-ietf-dhc-dhcvp6-leasequery-00 (work in progress),
        August 2006.

   [4]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection",
        draft-ietf-bfd-base-05 (work in progress), June 2006.

   [5]  Narten, T., Nordmark, E., and W. Simpson, "Neighbor Discovery
        for IP Version 6 (IPv6)", RFC 2461, December 1998.





Stenberg & Troan         Expires March 24, 2007                [Page 13]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


Appendix A.  Acknowledgements

   Thanks to Bernie Volz for feedback during writing of the document.
















































Stenberg & Troan         Expires March 24, 2007                [Page 14]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


Authors' Addresses

   Markus Stenberg
   cisco Systems, Inc.
   Shinjuku Mitsui Building, 2-1-1, Nishi-Shinjuku
   Shinjuku-Ku, Tokyo-to  1630409
   JP

   Email: mstenber@cisco.com


   Ole Troan
   cisco Systems, Inc.
   Shinjuku Mitsui Building, 2-1-1, Nishi-Shinjuku
   Shinjuku-Ku, Tokyo-to  1630409
   JP

   Email: ot@cisco.com

































Stenberg & Troan         Expires March 24, 2007                [Page 15]


Internet-Draft    IPv6 PD route maintenance approaches    September 2006


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.




Stenberg & Troan         Expires March 24, 2007                [Page 16]