HTTP Working Group J. C. Mogul, DECWRL
Internet-Draft 6 January 1997
Expires: 15 July 1997
Forcing HTTP/1.1 proxies to revalidate responses
draft-mogul-http-revalidate-00.txt
STATUS OF THIS MEMO
This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."
To learn the current status of any Internet-Draft, please
check the "1id-abstracts.txt" listing contained in the
Internet-Drafts Shadow Directories on ftp.is.co.za
(Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific
Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US
West Coast).
Distribution of this document is unlimited. Please send
comments to the HTTP working group at
<http-wg@cuckoo.hpl.hp.com>. Discussions of the working
group are archived at
<URL:http://www.ics.uci.edu/pub/ietf/http/>. General
discussions about HTTP and the applications which use HTTP
should take place on the <www-talk@w3.org> mailing list.
ABSTRACT
The HTTP/1.1 specification [1] currently defines a
``proxy-revalidate'' Cache-control directive, which forces
a proxy to revalidate a stale response before using it in a
reply. There is no mechanism defined that forces a proxy,
but not an end-client, to revalidate a fresh response. The
lack of such a mechanism is due to an error in drafting
RFC2068, and appears to create problems for use of the
Authorization header, the Digest Access Authentication
extension [2], the State Management Mechanism [3], and
several other proposed extensions. This document discusses
the problem and several possible solutions, and proposes to
add a new ``proxy-maxage'' directive as the best available
solution.
Mogul [Page 1]
Internet-Draft HTTP proxy revalidation 6 January 1997 18:22
TABLE OF CONTENTS
1 Introduction
2 Problems with proxy-revalidate
3 Possible alternatives
3.1 Alternatives not requiring changes to RFC2068
3.2 Alternatives that require changes to RFC2068
4 Proposed solution
5 Security Considerations
6 Acknowledgements
7 References
8 Author's address
1 Introduction
HTTP/1.1 introduces a ``Cache-control'' header to allow origin
servers and clients to impose fine-grained control over the operation
of HTTP caches. One important aspect of HTTP caching is whether a
cache should ``revalidate'' a cached response with the origin server,
before using the response as a cache hit. In cases where the use of
an invalid cache entry could lead to serious error, such as the
violation of an authentication policy, or incorrect behavior of an
online shopping application, proper revalidation could be crucial.
On the other hand, caching can yield significant performance
benefits, and so we want to make caching as effective as possible.
Note: HTTP caches normally revalidate a cached response by
sending a conditional GET to the origin server. This may be
done using the ``If-none-match'' request header or the
``If-modified-since'' request header. If the server would
return the same response as the cached response, the server may
reply with a status code of 304 (Not Modified). While this
does involve a message exchange, by avoiding the transmission
of the entity body, a revalidation is often much cheaper than
an unconditional retrieval.
Regarding the terms ``fresh'' and ``stale'': a cached response
is considered to be fresh if its current age is less than its
maximum allowed age. A cached response is stale otherwise.
Normally, only stale responses need to be revalidated; a fresh
response is inherently usable without revalidation, until it
reaches its maximum age.
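As an illustration only, the two notes above can be sketched in
Python. The function names are hypothetical, ages are assumed to be
whole seconds, and the conditional request is assumed to carry a
single entity tag; real If-none-match handling is more involved.

```python
from typing import Optional


def is_fresh(current_age: int, max_age: int) -> bool:
    """Fresh means: current age is less than the maximum allowed age."""
    return current_age < max_age


def needs_revalidation(current_age: int, max_age: int) -> bool:
    """Normally, only stale responses need to be revalidated."""
    return not is_fresh(current_age, max_age)


def origin_reply_status(current_etag: str,
                        if_none_match: Optional[str]) -> int:
    """Toy origin-server decision for a conditional GET.

    Returns 304 (Not Modified) when the cached entity tag still matches,
    so that no entity body has to be sent; otherwise returns 200.
    """
    if if_none_match is not None and if_none_match == current_etag:
        return 304
    return 200
```

A cache holding a response whose age has reached its maximum age would
call needs_revalidation(), send the conditional GET, and keep using
its stored entity body if the reply status is 304.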
During the design of HTTP/1.1, it was realized that different
revalidation policies might be applied to end-client caches (e.g., in
browsers) and to intermediate proxy caches. For example, a proxy
cache might be shared between multiple users (raising security
considerations), or it might be operated by someone whose interests
in reducing transmission costs do not coincide with the interest of
the ultimate client or origin server in preserving certain kinds of
application semantics.
It was also realized that, in some cases, it might be
appropriate to configure a cache to be ``loose'' in its behavior for
stale responses. That is, in such a situation, the cache might
return a stale response without revalidating it. This might be done,
for example, if the network connection between the cache and the
origin server is not working, or if the cost or delay for
revalidation is prohibitively high.
There is an obvious potential contradiction between the occasional
requirement for strict revalidation of certain responses, and the
occasional desire to allow loose operation of some HTTP caches.
HTTP/1.1 resolves this by allowing (although not encouraging) loose
operation as the default, but by providing a protocol mechanism for
origin servers or end-clients to insist on mandatory strict operation
when necessary. This is done using the ``Cache-control'' header,
which can carry a number of cache-control directives. In particular,
these directives are defined in section 14.9 of RFC2068:
- max-age=NNN
Sets the maximum age for this response to NNN seconds. By
itself, does not force strict revalidation behavior.
- no-cache
Prevents any caching of this response.
- private
Prevents any caching by a shared cache.
- must-revalidate
Requires that an HTTP/1.1 cache revalidate the response
before using it, if the response is stale.
- proxy-revalidate
Requires that an HTTP/1.1 proxy cache revalidate the
response before using it, if the response is stale; does
not affect an end-client cache.
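The directive summary above can be modelled with a short Python
sketch. This is a simplification under stated assumptions, not the
RFC2068 algorithm: the function names are hypothetical, a proxy is
treated as a shared cache, the Expires header and heuristic expiration
are not modelled (a missing max-age is treated as 0), and the
``loose'' flag stands for a cache configured to return stale
responses.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-control value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def may_use_without_revalidation(value: str, current_age: int,
                                 is_proxy: bool, loose: bool = False) -> bool:
    """May a cache answer from its entry without asking the origin server?"""
    d = parse_cache_control(value)
    if "no-cache" in d:
        return False                      # no caching at all
    if is_proxy and "private" in d:
        return False                      # no caching by a shared cache
    # Assumption: without max-age the entry is treated as already stale.
    max_age = int(d["max-age"]) if d.get("max-age") is not None else 0
    if current_age < max_age:
        return True                       # fresh: usable as is
    # Stale: strict directives forbid use; otherwise a loose cache may use it.
    strict = "must-revalidate" in d or (is_proxy and "proxy-revalidate" in d)
    return loose and not strict
```

Note that may_use_without_revalidation("max-age=10, proxy-revalidate",
5, is_proxy=True) is true under this model: proxy-revalidate only
takes effect once the response is stale.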
2 Problems with proxy-revalidate
The fundamental problem with proxy-revalidate, as defined in RFC2068,
is that it does not require a proxy cache to revalidate a fresh
response before using it. However, there are several circumstances
in which it is desirable or necessary to force a proxy cache to
revalidate a response that, to an end-client cache, would appear to
be fresh.
In section 14.8 of RFC2068, defining the ``Authorization'' header,
this language appears:
1. If the response includes the "proxy-revalidate"
Cache-Control directive, the cache MAY use that response in
replying to a subsequent request, but a proxy cache MUST
first revalidate it with the origin server, using the
request-headers from the new request to allow the origin
server to authenticate the new request.
While this could be read as modifying the definition of
proxy-revalidate from section 14.9.4, it was in fact not intended as
a modification. Rather, the author of these two sections of RFC2068
(me!) failed to notice the conflicting intentions of these two uses
of proxy-revalidate.
In section 2.1.2 of RFC2069 [2] (the specification of the Digest
Access Authentication extension to HTTP/1.1), this language appears:
Implementors should be aware of how authenticated
transactions interact with proxy caches. The HTTP/1.1
protocol specifies that when a shared cache (see section
13.10 of [2]) has received a request containing an
Authorization header and a response from relaying that
request, it MUST NOT return that response as a reply to any
other request, unless one of two Cache-control (see section
14.9 of [2]) directives was present in the response. If the
original response included the ``must-revalidate''
Cache-control directive, the cache MAY use the entity of that
response in replying to a subsequent request, but MUST first
revalidate it with the origin server, using the request
headers from the new request to allow the origin server to
authenticate the new request. Alternatively, if the original
response included the ``public'' Cache-control directive, the
response entity MAY be returned in reply to any subsequent
request.
This discussion appears to be in error, since its implication that
``must-revalidate'' always MUST cause a revalidation does not cover
the case of apparently fresh responses. In fact, discussion with one
of the authors of RFC2069 has confirmed that he understood that
RFC2068 had provided a ``proxy must revalidate even if fresh''
directive, which it does not.
In section 4.2.3 of RFCXXXX [3] (the specification of the State
Management Mechanism for HTTP/1.1), this language appears:
The origin server should send [one of] the following
additional HTTP/1.1 response headers, depending on
circumstances:
* To suppress caching of a private document in shared caches:
Cache-control: private.
* To allow caching of a document and require that it be
validated before returning it to the client: Cache-control:
must-revalidate.
* To allow caching of a document, but to require that proxy
caches (not user agent caches) validate it before returning
it to the client: Cache-control: proxy-revalidate.
* To allow caching of a document and request that it be
validated before returning it to the client (by
``pre-expiring'' it): Cache-control: max-age=0. Not all
caches will revalidate the document in every case.
Here again there seems to be a (false) assumption that
must-revalidate and proxy-revalidate cause revalidation even of fresh
responses.
Finally, the proposed Hit-Metering extension to HTTP/1.1 [4] depends
on a mechanism whereby an origin server can require proxy caches to
revalidate a response before every use, without requiring end-client
caches to do the same thing (which would be prohibitively
inefficient).
In summary, there is a clear need for a Cache-control mechanism that
allows an origin server to specify that a proxy must always
revalidate a response, while allowing end-clients to cache it without
revalidation (perhaps for a limited period).
3 Possible alternatives
3.1 Alternatives not requiring changes to RFC2068
Assuming that we do want a mechanism that allows an origin server to
specify that a proxy must always revalidate a response, while
allowing end-clients to cache it without revalidation, we could
certainly do this by modifying the HTTP/1.1 specification proposed in
RFC2068. Would it be possible to do this without modifying RFC2068,
possibly by using a combination of existing Cache-control directives
to approximate the desired behavior?
One solution would simply be to use ``Cache-control: private''. This
would preserve any necessary semantics (because it would prevent any
and all proxy caching of the response). However, it is much less
efficient; because ``private'' prevents a shared cache from even
storing the response, it cannot do a conditional request for
subsequent references. Hence, this approach would lead to much
unnecessary transmission of entity bodies when we could be using 304
(Not Modified) responses.
Another approach would be to use ``Cache-control: proxy-revalidate,
max-age=0''. This allows proxies to store the response and forces
them to revalidate it on every reference. However, it also implies
that strict end-user caches should revalidate on every reference as
well, which could cause even more unnecessary traffic than
``Cache-control: private'' would.
In short, there does not appear to be a way to use the existing
RFC2068 mechanisms to preserve both the necessary semantics and
optimal cache performance.
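The comparison in this section can be restated as a small Python
sketch. The function and key names are hypothetical, and the model is
deliberately coarse: it assumes a strict end-client cache, treats a
proxy as a shared cache, and looks only at the directives discussed
here.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-control value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def summarize(value: str) -> dict:
    """Coarse summary of how existing RFC2068 directives treat each cache."""
    d = parse_cache_control(value)
    proxy_may_store = "private" not in d and "no-cache" not in d
    pre_expired = d.get("max-age") == "0"
    strict_for_proxy = "proxy-revalidate" in d or "must-revalidate" in d
    return {
        # Can the proxy keep a copy, and so send conditional requests?
        "proxy_may_store": proxy_may_store,
        # Is the proxy forced to revalidate on every reference?
        "proxy_revalidates_every_use":
            proxy_may_store and pre_expired and strict_for_proxy,
        # Does a strict end-client cache also revalidate every time?
        "strict_client_revalidates_every_use": pre_expired,
    }


# The behavior this document is looking for.
DESIRED = {
    "proxy_may_store": True,
    "proxy_revalidates_every_use": True,
    "strict_client_revalidates_every_use": False,
}
```

Under these assumptions, neither summarize("private") nor
summarize("proxy-revalidate, max-age=0") equals DESIRED: the first
loses the stored copy at the proxy, and the second also forces the
end-client to revalidate.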
3.2 Alternatives that require changes to RFC2068
Several proposals have been made for modifications to RFC2068 to
resolve this problem.
We could redefine proxy-revalidate to mean ``always revalidate, even
if the response is fresh.'' However, this would leave us either with
no way to allow strict caches to use a response while it is fresh, or
with no way to force loose caches to revalidate certain responses.
The other proposals all involve adding one new Cache-control
directive, while preserving the current meaning of the existing
proxy-revalidate directive:
- proxy-mustcheck
This would mean that a proxy, but not an end-client, would
have to revalidate the response even if it is fresh.
- proxy-maxage=NNN
This would mean defining separate maximum ages for proxy
caches and for end-client caches. The existing max-age (or
Expires) value would continue to apply to end-client
caches, and would continue to apply to proxy caches if the
proxy-maxage directive were not present. However, if
proxy-maxage is present, then it would override the max-age
(or Expires) limit for proxies, but would be ignored by
end-clients.
- agent-maxage=NNN
This would also mean defining separate maximum ages for
proxy caches and for end-client caches. The existing
max-age (or Expires) value would continue to apply to proxy
caches, and would continue to apply to end-client caches if
the agent-maxage directive were not present. However, if
agent-maxage is present, then it would override the max-age
(or Expires) limit for end-clients, but would be ignored by
proxy caches.
One option would be to make either ``proxy-maxage'' or
``agent-maxage'' always strict: that is, they would imply that a
proxy or end-client, respectively, would be required to revalidate a
stale response. Alternatively, they could be combined with a
``must-revalidate'' directive to force strict behavior, but would
otherwise allow loose behavior.
Each of these proposals would solve the existing problem, would be
simple to specify, and would probably not require significant
implementation complexity or overhead. However, we should probably
choose just one of these options; what are the relative merits?
The proxy-mustcheck approach is clearly the simplest, but gives up
the possibility of separate control over proxy and end-client
expiration times. Since this orthogonality could potentially be
useful, it seems preferable to adopt the proxy-maxage or
agent-maxage proposals.
If one assumes that this change could be made to the HTTP/1.1
specification before the permanent deployment of any HTTP/1.1
proxies, there at first seems to be no obvious reason to prefer one
to the other. That is, this header
Cache-control: max-age=10,proxy-maxage=3
and this one
Cache-control: max-age=3,agent-maxage=10
both express the same semantics in the same number of bytes.
However, if we also adopt the rule that proxy-maxage implies the
presence of proxy-revalidate, then in order to express the semantics
of
Cache-control: max-age=10,proxy-maxage=3
the origin server would have to send
Cache-control: max-age=3,agent-maxage=10,proxy-revalidate
which is somewhat more expensive.
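The equivalence and the cost difference claimed above can be checked
with a short Python sketch. The helper names are hypothetical; the
sketch assumes every header carries a max-age directive and models the
optional rule that proxy-maxage implies proxy-revalidate with a flag.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-control value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def effective_max_age(value: str, is_proxy: bool) -> int:
    """Maximum age seen by a proxy cache or by an end-client cache."""
    d = parse_cache_control(value)
    if is_proxy and "proxy-maxage" in d:
        return int(d["proxy-maxage"])     # overrides max-age for proxies
    if not is_proxy and "agent-maxage" in d:
        return int(d["agent-maxage"])     # overrides max-age for end-clients
    return int(d["max-age"])              # assumption: always present


def proxy_is_strict(value: str, maxage_implies_revalidate: bool = True) -> bool:
    """Must a proxy revalidate once the response is stale?"""
    d = parse_cache_control(value)
    if "must-revalidate" in d or "proxy-revalidate" in d:
        return True
    return maxage_implies_revalidate and "proxy-maxage" in d


A = "max-age=10,proxy-maxage=3"
B = "max-age=3,agent-maxage=10"
C = "max-age=3,agent-maxage=10,proxy-revalidate"
```

Here A and B give the same ages to both kinds of cache and have the
same length, but only A makes the proxy strict; C restores the
strictness at the cost of the extra ``,proxy-revalidate'' (17 bytes).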
Also, if we do adopt the Hit-metering proposal [4], the proxy-maxage
approach seems preferable, because it would allow the necessary
header rewriting to be accomplished by simple addition of a
directive, rather than more elaborate rewriting. For example, if the
origin server sends a hit-metered response with
Cache-control: max-age=10
then it would be rewritten (at the appropriate proxy, if necessary;
see [4] for details) as
Cache-control: max-age=10,proxy-maxage=0
using the proxy-maxage alternative, but would have to be rewritten as
Cache-control: max-age=0,agent-maxage=10
using the agent-maxage proposal.
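The difference between the two rewriting styles can be sketched in
Python. These helpers are hypothetical and reproduce only the two
example headers above; the actual rewriting rules are those given in
[4], which this sketch does not attempt to implement.

```python
def rewrite_with_proxy_maxage(value: str) -> str:
    """proxy-maxage alternative: append one directive, touch nothing else."""
    return value + ",proxy-maxage=0"


def rewrite_with_agent_maxage(value: str) -> str:
    """agent-maxage alternative: find max-age, move its value, then zero it."""
    rewritten = []
    old_max_age = None
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        if sep and name.strip().lower() == "max-age":
            old_max_age = arg.strip()
            rewritten.append("max-age=0")   # proxies now see a stale response
        else:
            rewritten.append(item)
    if old_max_age is None:
        # Assumption: this sketch only handles headers that carry max-age.
        raise ValueError("no max-age directive to rewrite")
    rewritten.append("agent-maxage=" + old_max_age)
    return ",".join(rewritten)
```

The first helper never has to parse the existing header, which is the
sense in which the proxy-maxage alternative needs only the ``simple
addition of a directive''.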
On the other hand, if we cannot make the necessary change to the
specification before the deployment of HTTP/1.1 proxies, then the
agent-maxage proposal is somewhat safer in terms of semantics. That
is, if HTTP/1.1 proxies are deployed that do not understand the
proxy-maxage directive, the use of agent-maxage will not cause these
proxies to skip the intended revalidation. This is because the
responses will presumably carry a ``max-age=0'' directive, and so will
not appear to be fresh to these proxies.
Unfortunately, if we fail to change the specification before the
permanent deployment of HTTP/1.1 end-clients, then we may face a
performance problem with the use of agent-maxage: clients that do
not understand this new directive might do many more revalidations
than necessary, and so cause excessive network and server loading, as
well as unnecessary delays.
Ultimately, therefore, it would be best if we made this specification
change before any permanent deployment of HTTP/1.1 proxies or
clients. If we do so, then it seems more efficient to use the
proxy-maxage mechanism.
4 Proposed solution
The HTTP/1.1 specification in RFC2068 should be changed, in section
14.9 (Cache-Control), in the following ways:
- The grammar for cache-response-directive should include a
new alternative:
| "proxy-maxage" "=" delta-seconds
There is no need for a corresponding change to the grammar
for cache-request-directive.
- Section 14.9.3 should include, after the second paragraph
(which starts with ``If a response includes ...''), this
new paragraph:
If a response includes a proxy-maxage directive,
then for a proxy cache (but not for an end-client
cache), the maximum age specified by this directive
overrides the maximum age specified by either the
max-age directive or the Expires header. The
proxy-maxage directive also implies the semantics
of the proxy-revalidate directive (see section
14.9.4), i.e., that the proxy MUST NOT use the
entry after it becomes stale to respond to a
subsequent request without first revalidating it
with the origin server. The proxy-maxage directive
is always ignored by an end-client.
- In section 13.4, the list of response headers and
directives implicitly allowing cachability should include
``proxy-maxage'' after ``max-age''.
- In section 14.8 (Authorization), this paragraph:
1. If the response includes the "proxy-revalidate"
Cache-Control directive, the cache MAY use that
response in replying to a subsequent request, but a
proxy cache MUST first revalidate it with the
origin server, using the request-headers from the
new request to allow the origin server to
authenticate the new request.
should become
1. If the response includes the "proxy-revalidate"
Cache-Control directive, the cache MAY use that
response in replying to a subsequent request, but
if the response is stale, a proxy cache MUST first
revalidate it with the origin server, using the
request-headers from the new request to allow the
origin server to authenticate the new request.
This paragraph
2. If the response includes the "must-revalidate"
Cache-Control directive, the cache MAY use that
response in replying to a subsequent request, but
all caches MUST first revalidate it with the origin
server, using the request-headers from the new
request to allow the origin server to authenticate
the new request.
should become
2. If the response includes the "must-revalidate"
Cache-Control directive, the cache MAY use that
response in replying to a subsequent request, but
if the response is stale, all caches MUST first
revalidate it with the origin server, using the
request-headers from the new request to allow the
origin server to authenticate the new request.
Additionally, RFC2069 [2] and RFCXXXX [3] should probably be modified
to suggest the use of ``proxy-maxage=0'' and/or ``max-age=0,
must-revalidate'' to force proxies to revalidate a response.
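A minimal Python sketch of the proposed semantics follows. It is an
illustration under stated assumptions, not normative text: the
function names are hypothetical, the Expires header is not modelled (a
missing max-age is treated as 0), and the ``loose'' flag stands for a
cache configured to return stale responses when permitted.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-control value into {directive: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives


def delta_seconds(arg) -> int:
    """Accept only a non-empty run of ASCII digits, as in delta-seconds."""
    if arg is None or not (arg.isascii() and arg.isdigit()):
        raise ValueError("expected delta-seconds, got %r" % (arg,))
    return int(arg)


def may_use_without_revalidation(value: str, current_age: int,
                                 is_proxy: bool, loose: bool = False) -> bool:
    """Apply the proposed proxy-maxage rule to one cached response."""
    d = parse_cache_control(value)
    # Assumption: without max-age the entry is treated as already stale.
    limit = delta_seconds(d["max-age"]) if "max-age" in d else 0
    strict = "must-revalidate" in d or (is_proxy and "proxy-revalidate" in d)
    if is_proxy and "proxy-maxage" in d:
        limit = delta_seconds(d["proxy-maxage"])  # overrides max-age
        strict = True                             # implies proxy-revalidate
    # An end-client never looks at proxy-maxage.
    if current_age < limit:
        return True
    return loose and not strict
```

With ``Cache-control: max-age=10,proxy-maxage=0'', a proxy in this
model always revalidates, while an end-client may use the response
without revalidation until it is 10 seconds old.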
5 Security Considerations
The proposed Digest Access Authentication extension [2] depends upon
a mechanism to force proxies to always revalidate certain responses.
Whether or not the proposal in this document is adopted, the Digest
Access Authentication extension requires modification to reflect the
option chosen (unless the HTTP/1.1 specification is revised to make
``proxy-revalidate'' apply to fresh as well as to stale responses).
6 Acknowledgements
Several people contributed to my understanding of this issue,
including Koen Holtman, Paul Leach, Ingrid Melve, and Anselm
Baird-Smith. However, the proposal in this document is my fault
alone.
7 References
1. Roy T. Fielding, Jim Gettys, Jeffrey C. Mogul, Henrik Frystyk
Nielsen, and Tim Berners-Lee. Hypertext Transfer Protocol --
HTTP/1.1. RFC 2068, HTTP Working Group, January, 1997.
2. J. Franks, P. Hallam-Baker, J. Hostetler, P. Leach, A. Luotonen,
E. Sink, L. Stewart. An Extension to HTTP: Digest Access
Authentication. RFC 2069, HTTP Working Group, January, 1997.
3. D. Kristol, L. Montulli. HTTP State Management Mechanism. RFC
XXXX, HTTP Working Group, January, 1997.
draft-ietf-http-state-mgmt-05.txt; approved by the IESG, not yet
assigned an RFC number.
4. J. Mogul and P. Leach. Simple Hit-Metering for HTTP. Internet
Draft draft-mogul-http-hit-metering-01.txt, HTTP Working Group,
December, 1996. This is a work in progress.
8 Author's address
Jeffrey C. Mogul
Western Research Laboratory
Digital Equipment Corporation
250 University Avenue
Palo Alto, California, 94305, USA
Email: mogul@wrl.dec.com