Network Working Group                                      S. Vinapamula
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                            S. Sivakumar
Expires: April 23, 2015                                    Cisco Systems
                                                            M. Boucadair
                                                          France Telecom
                                                                T. Reddy
                                                                   Cisco
                                                        October 20, 2014
Application-Initiated Flow High Availability Awareness through PCP
draft-vinapamula-flow-ha-05
Abstract
This document specifies a mechanism for a host to signal via Port
Control Protocol (PCP) which connections should be protected against
network failures. These connections will be elected to be subject to
high availability mechanisms enabled at the network side.
This approach assumes that applications/users have more visibility
about sensitive connections rather than any heuristic that can be
enabled at the network side to guess which connections should be
secured.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 23, 2015.
Vinapamula, et al. Expires April 23, 2015 [Page 1]
Internet-Draft HA through PCP October 2014
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Issues with the existing implementations
3. CHECKPOINT-REQUIRED PCP Option
3.1. Format
3.2. Behavior
4. Typical Usage Examples
5. Signaling HA for other Network Functions
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. References
9.1. Normative References
9.2. Informative References
1. Introduction
Internet service continuity is critical in Service Provider
environments and in Enterprise networks. To achieve it, most
Service Providers deploy active-backup systems. These systems not
only preserve service continuity during failover, but also allow
hitless or minimal-hit upgrades of both software and hardware, and
help achieve the desired level of service continuity compliance.
Some network functions maintain per-connection state that is used to
process subsequent packets of that connection. For such connections
to continue on the backup when the active system fails, the
corresponding state has to be check-pointed on the backup. Network
Address and Port Translation (NAPT) is one such network function: it
maintains state for every connection.
Heuristics based on the protocol, mapping lifetime, etc. are used at
the network side to elect which connections are subject to High
Availability (HA) mechanisms. This document advocates an
application-initiated approach that allows applications/users to
signal to the network which of their connections are critical.
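PCP-initiated signaling is expected to be more accurate than
heuristics deployed at the network side, because the application or
user, not the network, knows which connections are critical.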
This document specifies how PCP can be extended to signal which
connections should be subject to HA mechanisms. This document does
not make any assumption about the PCP-controlled device that will
make use of the signals issued by PCP clients. These devices are
likely to be flow-aware.
The proposed approach is aligned with current networking trends
advocating open network APIs to interact with applications/services.
The policy decision-making process at the network side can be
enriched with information signaled by applications, for instance
using PCP.
2. Issues with the existing implementations
In a high availability (HA) deployment, it is expensive in terms of
memory, CPU, and other resources to checkpoint the state of all
connections. Also, check-pointing may not be required for all
connections because not all connections are critical. The challenge
is then to identify which connections to checkpoint.
Typically, this is addressed by identifying long-lived connections
and check-pointing to the backup the state of only those connections
that have lived long enough.
However, this approach has the following issues:
1. It is hard for a network to identify/guess which connection is
(business) critical. This characterization is mainly subscriber-
specific: a flow can be sensitive for one user (User#1) while it is
not for another (User#2). Furthermore, this characterization can
vary over time: a flow can be sensitive at hour X, while it is not
later.
2. Heuristics are not deterministic.
3. A connection that could potentially be long-lived would face
disruption in service on failure of the active system if it had
not yet lived long enough to be check-pointed.
4. A connection may be critical without being long-lived, such as a
short Voice over IP (VoIP) conversation.
5. Similarly, not every long-lived connection needs to be critical:
for example, a free-service connection of a hosted service need
not be check-pointed, while a paid-service connection has to be
check-pointed.
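The limitations listed above can be illustrated with a small sketch.
It models a purely age-based heuristic and checks which connections
would already be check-pointed when the active system fails. The
threshold value, the function name, and the sample connections are
assumptions made for illustration only; they do not describe any
particular implementation.

```python
# Illustrative only: an age-based heuristic with a hypothetical threshold.
CHECKPOINT_AGE_THRESHOLD = 300  # seconds; assumed value for illustration


def survives_failover(age_at_failure: float,
                      threshold: float = CHECKPOINT_AGE_THRESHOLD) -> bool:
    """Return True if the heuristic had check-pointed the connection
    by the time the active system failed."""
    return age_at_failure >= threshold


# (description, age in seconds when the active system fails, critical?)
connections = [
    ("short VoIP call", 90, True),
    ("future long-lived session, still young", 120, True),
    ("long-lived free-service stream", 3600, False),
]

for name, age, critical in connections:
    protected = survives_failover(age)
    print(f"{name}: critical={critical}, protected={protected}")
```

In this sample, both critical connections are lost at failover while
the non-critical one is protected, which is the mismatch that issues
3 to 5 describe.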
3. CHECKPOINT-REQUIRED PCP Option
3.1. Format
This proposal is based on the assumption that an application or user
is the best judge of which of its connections are critical.
An application/user may indicate the desire for check-pointing
through a PCP client, using the CHECKPOINT_REQUIRED option described
in Figure 1.
The entry to be backed up is indicated by the content of the MAP or
PEER message.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Option Code=TBA|   Reserved    |         Option Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Option Name: CHECKPOINT_REQUIRED
Number: <TBA>
Purpose: Indicate if an entry needs to be check-pointed.
Valid for Opcodes: MAP, PEER
Length: 0.
May appear in: request, response.
Maximum occurrences: 1.
Figure 1: CHECKPOINT_REQUIRED PCP Option
The description of the fields is as follows:
o Option Code: To be assigned by IANA.
o Reserved: This field is initialized as specified in Section 7.3 of
[RFC6887].
o Option Length: 0. This means no data is included in the option.
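As an illustration of the format above, the following sketch encodes
and decodes the 4-octet option header. The Option Code is still to be
assigned by IANA, so the value used here is a hypothetical placeholder
(chosen with the high bit set, which RFC 6887 uses to mark an option
as optional to process). The Reserved field is set to zero on the
assumption that this is what Section 7.3 of [RFC6887] requires on
transmission. The function and constant names are illustrative.

```python
import struct
from typing import Tuple

# Hypothetical placeholder: the real Option Code is to be assigned by IANA.
CHECKPOINT_REQUIRED_PLACEHOLDER = 0xE0

# Option Code (8 bits), Reserved (8 bits), Option Length (16 bits),
# all in network byte order.
OPTION_HEADER = struct.Struct("!BBH")


def encode_checkpoint_required(
        code: int = CHECKPOINT_REQUIRED_PLACEHOLDER) -> bytes:
    """Build the option: header only, Reserved = 0, Option Length = 0."""
    return OPTION_HEADER.pack(code, 0, 0)


def decode_option_header(data: bytes) -> Tuple[int, int]:
    """Return (option_code, option_length) from the first 4 octets."""
    code, _reserved, length = OPTION_HEADER.unpack_from(data)
    return code, length


wire = encode_checkpoint_required()
print(wire.hex())  # e0000000
```

Because the Option Length is 0, the option adds exactly four octets
to a MAP or PEER message.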
It was tempting to include additional fields in the option, but this
would lead to a more complex design that is not justified, e.g.:
o Define a dedicated field to indicate a priority level. This
priority would be used by the PCP server as a hint when
processing a request with a CHECKPOINT_REQUIRED option.
Nevertheless, an application may systematically choose to set the
priority level to the highest value so that it increases its
chance of being serviced.
o Return a more granular failure error code to the requesting PCP
client. Nevertheless, this would require extra processing at both
the PCP client and server sides for handling the various error
codes without any guarantee for the PCP client to have its
mappings check-pointed.
An application or user can use this option to indicate that one or
more of its connections are critical and disruption is not desired.
Doing so will trigger check-pointing of state to the backup.
Communication between application/user and PCP client is
implementation-specific.
3.2. Behavior
Support for the CHECKPOINT_REQUIRED option by PCP servers and PCP
clients is optional. This option (Code TBA; see Figure 1) MAY be
included in a PCP MAP/PEER request to indicate a connection is to be
protected against network failures.
The PCP client includes a CHECKPOINT_REQUIRED option in a MAP or PEER
request to signal that the corresponding mapping is to be protected.
A PCP server MAY ignore the CHECKPOINT_REQUIRED option sent to it by
a PCP client (e.g., if it does not support the option or if it is
configured to ignore it). To signal that it has not accepted the
option, a PCP server simply does not include the CHECKPOINT_REQUIRED
option in the response. If the PCP client does not receive a
CHECKPOINT_REQUIRED option in a response to a request enclosing a
CHECKPOINT_REQUIRED option, this means the PCP server does not
support the option or it is configured to ignore it.
If the CHECKPOINT_REQUIRED option is not included in the PCP client
request, the PCP server does not include the CHECKPOINT_REQUIRED
option in the associated response. This is mainly because there is
no valid motivation that would justify a PCP server notifying a PCP
client about its reliability decision.
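From the PCP client's point of view, the rules above reduce to
checking whether the option it sent is echoed in the response. The
sketch below is one possible way to express that check; the enum and
function names are hypothetical. Treating an unsolicited option in a
response as "not requested" is an assumption, since the text only
states that a server does not send it.

```python
from enum import Enum


class CheckpointOutcome(Enum):
    NOT_REQUESTED = "not requested"
    ACCEPTED = "accepted"
    NOT_HONORED = "not honored"  # unsupported, ignored, or not granted


def interpret_response(requested: bool, echoed: bool) -> CheckpointOutcome:
    """Classify the outcome of a MAP/PEER exchange.

    requested: the client put CHECKPOINT_REQUIRED in its request.
    echoed:    the option is present in the server's response.
    """
    if not requested:
        # Assumption: an unsolicited option in the response is disregarded.
        return CheckpointOutcome.NOT_REQUESTED
    if echoed:
        return CheckpointOutcome.ACCEPTED
    return CheckpointOutcome.NOT_HONORED
```

Note that the client cannot distinguish a server that does not
support the option from one that is configured to ignore it; both
cases yield the same outcome.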
When the PCP server receives a CHECKPOINT_REQUIRED option, the PCP
server checks if it can honor this request depending on whether
resources are available for check-pointing. If there are no
resources available for check-pointing, but there are resources
available to honor the MAP/PEER request, a response is sent back to
the PCP client without including the CHECKPOINT_REQUIRED option
(i.e., the request is processed as any MAP/PEER request that does not
convey a CHECKPOINT_REQUIRED option). If check-pointing resources
are still available and the quota for this PCP client is not reached,
the PCP server tags the corresponding entry as eligible for the HA
mechanism and sends back the CHECKPOINT_REQUIRED option in the
positive response to the PCP client.
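The server-side decision described above can be sketched as follows.
This is a minimal model, not an implementation: the class and
attribute names are hypothetical, the MAP/PEER request itself is
assumed to succeed, and a client that has reached its quota is
assumed to be handled in the same way as a lack of check-pointing
resources (the mapping is created, but the option is not echoed).

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PcpServerSketch:
    checkpoint_slots: int   # free check-pointing resources
    client_quota: int       # max check-pointed entries per PCP client
    used_by_client: Dict[str, int] = field(default_factory=dict)

    def process(self, client: str, wants_checkpoint: bool) -> Tuple[bool, bool]:
        """Return (mapping_created, option_in_response)."""
        if not wants_checkpoint:
            return True, False
        used = self.used_by_client.get(client, 0)
        if self.checkpoint_slots <= 0 or used >= self.client_quota:
            # Processed like a request that does not carry the option.
            return True, False
        # Tag the entry as eligible for the HA mechanism.
        self.checkpoint_slots -= 1
        self.used_by_client[client] = used + 1
        return True, True


server = PcpServerSketch(checkpoint_slots=1, client_quota=1)
print(server.process("client-a", True))  # (True, True)
print(server.process("client-a", True))  # (True, False)
```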
To update the check-pointing behavior of a mapping maintained by the
PCP server, the PCP client generates a PCP MAP/PEER renewal request
that either includes a CHECKPOINT_REQUIRED option, to indicate that
the mapping has to be check-pointed, or omits it, to indicate that
the mapping need not be check-pointed anymore.
Upon receipt of the PCP request, the PCP server performs the same
operations it uses to validate any MAP/PEER request that updates an
existing mapping. If the validation checks pass, the PCP
server updates the check-point flag associated with that mapping
accordingly (i.e., it is set if a CHECKPOINT_REQUIRED option was
included in the update request, or cleared if no
CHECKPOINT_REQUIRED option was included), and returns
the response to the PCP client.
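The renewal behavior can be sketched as a flag that simply follows
the presence of the option in each validated renewal. The mapping
key, the dictionary standing in for the server's mapping table, and
the function name are hypothetical. Leaving the flag unchanged when
validation fails is an assumption, as is ignoring resource and quota
checks at renewal time.

```python
from typing import Dict, Tuple

# (internal address, internal port, protocol); illustrative key only
MappingKey = Tuple[str, int, str]

# Stand-in for the check-point flag kept with each mapping entry.
checkpoint_flags: Dict[MappingKey, bool] = {}


def apply_renewal(key: MappingKey, option_present: bool,
                  valid: bool = True) -> bool:
    """Update the check-point flag on a MAP/PEER renewal.

    Returns the flag's value after processing. If validation fails,
    the flag is left as it was (an assumption).
    """
    if valid:
        checkpoint_flags[key] = option_present
    return checkpoint_flags.get(key, False)


key = ("192.0.2.10", 5060, "udp")
apply_renewal(key, option_present=True)    # flag is set
apply_renewal(key, option_present=False)   # flag is cleared
```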
What information to checkpoint and how to checkpoint it are out of
scope of this document and are left to implementations. Also, the
decision by users/applications to request check-pointing in a PCP
request may be automatic, semi-automatic, or require human
intervention. This behavior is also left to application
implementations.
It is RECOMMENDED to checkpoint state on the backup for honored
requests before a response is sent to the PCP client.
4. Typical Usage Examples
Some examples are provided below for illustration purposes:
1. Disruption in a Voice over Internet Protocol (VoIP) connection is
not desired: An application that initiates or receives VoIP flows
can include the CHECKPOINT_REQUIRED option in its associated
MAP/PEER message(s) to indicate to the PCP server that the
corresponding entry must be protected against failure.
2. Similarly, disruption in media streaming (e.g., video-on-demand
(VoD)) is not desired: The PCP client, on behalf of the media
service, uses the CHECKPOINT_REQUIRED option when initiating a
mapping request, and may then mark individual connection(s)
associated with that mapping through PEER requests, depending on
whether the connection is from a paid subscriber or from a free
subscriber. Check-pointing a mapping therefore does not
automatically check-point its connections.
5. Signaling HA for other Network Functions
In conjunction with NAT, other network functions that maintain
per-connection state, such as a stateful firewall, may register with
the PCP server and may then be triggered to check-point their
respective state for that connection.
6. Security Considerations
PCP-related security considerations are discussed in [RFC6887].
The CHECKPOINT_REQUIRED option can be used by an attacker to identify
critical flows. This issue is mitigated if the network on which the
PCP messages are to be sent is fully trusted. Means to defend
against attackers who can intercept packets between the PCP server
and the PCP client should be enabled. In some deployments, access
control lists (ACLs) can be installed on the PCP client, PCP server,
and the network between them, so that only communications between
trusted PCP elements are allowed. If the networking environment
between the PCP client and PCP server is not secure, means to protect
the content of PCP messages from exposure (e.g., DTLS [RFC6347]) are
recommended.
A network device can always override the end-user signaling, i.e.,
what is signaled by the PCP client, if the instructions conflict
with the network policies.
There is a risk that every PCP client may wish to checkpoint every
connection, which can potentially overload the system. Administrators
SHOULD restrict, on a per-PCP-client basis, the number of connections
that can be elected to be backed up and the rate of check-pointing.
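One possible way to enforce such restrictions is sketched below: a
per-client cap on the number of check-pointed entries combined with a
sliding-window cap on the rate of new check-pointing requests. The
class, its parameters, and the choice of a sliding window are
assumptions made for illustration; this document does not mandate any
particular limiting technique.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class CheckpointLimiter:
    """Per-client entry cap plus a sliding-window rate cap."""

    def __init__(self, max_entries: int, max_per_window: int, window: float):
        self.max_entries = max_entries        # check-pointed entries per client
        self.max_per_window = max_per_window  # accepted requests per window
        self.window = window                  # window length in seconds
        self.entries: Dict[str, int] = defaultdict(int)
        self.recent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        """Return True and record the request if the client is within limits."""
        recent = self.recent[client]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if self.entries[client] >= self.max_entries:
            return False
        if len(recent) >= self.max_per_window:
            return False
        recent.append(now)
        self.entries[client] += 1
        return True

    def release(self, client: str) -> None:
        """Call when a check-pointed mapping expires or its flag is cleared."""
        if self.entries[client] > 0:
            self.entries[client] -= 1


limiter = CheckpointLimiter(max_entries=2, max_per_window=1, window=10.0)
limiter.allow("client-a", now=0.0)    # True
limiter.allow("client-a", now=1.0)    # False: rate cap reached
limiter.allow("client-a", now=10.0)   # True
limiter.allow("client-a", now=30.0)   # False: entry cap reached
```

A request that is refused by such a limiter would be processed like a
request that does not carry the CHECKPOINT_REQUIRED option.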
7. IANA Considerations
The following PCP Option Code is to be allocated in the
optional-to-process range (the registry is maintained at
http://www.iana.org/assignments/pcp-parameters):
CHECKPOINT_REQUIRED set to TBA (see Section 3.1)
8. Acknowledgements
Thanks to Reinaldo Penno, Stuart Cheshire, and Dave Thaler for their
comments.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC6887] Wing, D., Cheshire, S., Boucadair, M., Penno, R., and P.
Selkirk, "Port Control Protocol (PCP)", RFC 6887, April
2013.
9.2. Informative References
[RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
Security Version 1.2", RFC 6347, January 2012.
Authors' Addresses
Suresh Vinapamula
Juniper Networks
1194 North Mathilda Avenue
Sunnyvale, CA 94089
USA
Phone: +1 408 936 5441
EMail: sureshk@juniper.net
Senthil Sivakumar
Cisco Systems
7100-8 Kit Creek Road
Research Triangle Park, NC 27760
USA
Phone: +1 919 392 5158
EMail: ssenthil@cisco.com
Mohamed Boucadair
France Telecom
Rennes 35000
France
EMail: mohamed.boucadair@orange.com
Tirumaleswar Reddy
Cisco Systems, Inc.
Cessna Business Park, Varthur Hobli
Sarjapur Marathalli Outer Ring Road
Bangalore, Karnataka 560103
India
EMail: tireddy@cisco.com