HTTP Working Group                                   J. C. Mogul, DECWRL
Internet-Draft                                            6 January 1997
Expires: 15 July 1997


            Forcing HTTP/1.1 proxies to revalidate responses

                   draft-mogul-http-revalidate-00.txt


STATUS OF THIS MEMO

        This document is an Internet-Draft. Internet-Drafts are
        working documents of the Internet Engineering Task Force
        (IETF), its areas, and its working groups. Note that other
        groups may also distribute working documents as
        Internet-Drafts.

        Internet-Drafts are draft documents valid for a maximum of
        six months and may be updated, replaced, or obsoleted by
        other documents at any time. It is inappropriate to use
        Internet-Drafts as reference material or to cite them other
        than as "work in progress."

        To learn the current status of any Internet-Draft, please
        check the "1id-abstracts.txt" listing contained in the
        Internet-Drafts Shadow Directories on ftp.is.co.za
        (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific
        Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US
        West Coast).

        Distribution of this document is unlimited.  Please send
        comments to the HTTP working group at
        <http-wg@cuckoo.hpl.hp.com>.  Discussions of the working
        group are archived at
        <URL:http://www.ics.uci.edu/pub/ietf/http/>.  General
        discussions about HTTP and the applications which use HTTP
        should take place on the <www-talk@w3.org> mailing list.


ABSTRACT

        The HTTP/1.1 specification [1] currently defines a
        ``proxy-revalidate'' Cache-control directive, which forces
        a proxy to revalidate a stale response before using it in a
        reply.  There is no mechanism defined that forces a proxy,
        but not an end-client, to revalidate a fresh response.  The
        lack of such a mechanism is due to an error in drafting
        RFC2068, and appears to create problems for use of the
        Authorization header, the Digest Access Authentication
        extension [2], the State Management Mechanism [3], and
        several other proposed extensions.  This document discusses
        the problem and several possible solutions, and proposes to
        add a new ``proxy-maxage'' directive as the best available
        solution.
Mogul                                                           [Page 1]


Internet-Draft          HTTP proxy revalidation     6 January 1997 18:22


                           TABLE OF CONTENTS

1 Introduction
2 Problems with proxy-revalidate
3 Possible alternatives
     3.1 Alternatives not requiring changes to RFC2068
     3.2 Alternatives that require changes to RFC2068
4 Proposed solution
5 Security Considerations
6 Acknowledgements
7 References
8 Author's address


1 Introduction

   HTTP/1.1 introduces a ``Cache-control'' header to allow origin
   servers and clients to impose fine-grained control over the operation
   of HTTP caches.  One important aspect of HTTP caching is whether a
   cache should ``revalidate'' a cached response with the origin server,
   before using the response as a cache hit.  In cases where the use of
   an invalid cache entry could lead to serious error, such as the
   violation of an authentication policy, or incorrect behavior of an
   online shopping application, proper revalidation could be crucial.
   On the other hand, caching can yield significant performance
   benefits, and so we want to make caching as effective as possible.

      Note: HTTP caches normally revalidate a cached response by
      sending a conditional GET to the origin server.  This may be
      done using the ``If-none-match'' request header or the
      ``If-modified-since'' request header.  If the server would
      return the same response as the cached response, the server may
      reply with a status code of 304 (Not Modified).  While this
      does involve a message exchange, by avoiding the transmission
      of the entity body, a revalidation is often much cheaper than
      an unconditional retrieval.

      Regarding the terms ``fresh'' and ``stale'': a cached response
      is considered to be fresh if its current age is less than its
      maximum allowed age.  A cached response is stale otherwise.
      Normally, only stale responses need to be revalidated; a fresh
      response is inherently usable without revalidation, until it
      reaches its maximum age.
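
   As a minimal sketch of the definitions above (the function names are
   illustrative and do not come from RFC2068), the fresh/stale test and
   its usual consequence could be written as:

```python
def is_fresh(current_age: int, max_age: int) -> bool:
    """A response is fresh while its current age is below its maximum age."""
    return current_age < max_age


def needs_revalidation(current_age: int, max_age: int) -> bool:
    """Normally, only a stale response has to be revalidated before use."""
    return not is_fresh(current_age, max_age)
```

   Both ages are assumed to be in seconds.  A response whose age equals
   its maximum age is treated as stale, following the ``less than''
   wording above.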

   During the design of HTTP/1.1, it was realized that different
   revalidation policies might be applied to end-client caches (e.g., in
   browsers) and to intermediate proxy caches.  For example, a proxy
   cache might be shared between multiple users (raising security
   considerations), or it might be operated by someone whose interests
   in reducing transmission costs do not coincide with the interest of
   the ultimate client or origin server in preserving certain kinds of
   application semantics.

   It was also realized that in some cases it might be appropriate to
   configure a cache to be ``loose'' in its behavior for stale
   responses.  That is, in such a situation, the cache might return a
   stale response without revalidating it.  This might be done, for
   example, if the network connection between the cache and the origin
   server is not working, or if the cost or delay for revalidation is
   prohibitively high.

   There is an obvious potential contradiction between the occasional
   requirement for strict revalidation of certain responses, and the
   occasional desire to allow loose operation of some HTTP caches.
   HTTP/1.1 resolves this by allowing (although not encouraging) loose
   operation as the default, but by providing a protocol mechanism for
   origin servers or end-clients to insist on mandatory strict operation
   when necessary.  This is done using the ``Cache-control'' header,
   which can carry a number of cache-control directives.  In particular,
   these directives are defined in section 14.9 of RFC2068:

      - max-age=NNN
        Sets the maximum age for this response to NNN seconds.  By
        itself, does not force strict revalidation behavior.

      - no-cache
        Prevents any caching of this response.

      - private
        Prevents any caching by a shared cache.

      - must-revalidate
        Requires that an HTTP/1.1 cache revalidate the response
        before using it, if the response is stale.

      - proxy-revalidate
        Requires that an HTTP/1.1 proxy cache revalidate the
        response before using it, if the response is stale; does
        not affect an end-client cache.
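
   To make the header syntax concrete, here is a rough sketch of how a
   cache might split a Cache-control value into its directives.  The
   function name is hypothetical, and the parsing is simplified: it
   assumes that no directive argument is a quoted string containing a
   comma.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-control value into {directive-name: argument-or-None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directive names are compared case-insensitively in this sketch.
        directives[name.strip().lower()] = arg.strip() if sep else None
    return directives
```

   For example, parse_cache_control("max-age=10, proxy-revalidate")
   yields a mapping with "max-age" bound to "10" and
   "proxy-revalidate" bound to None.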


2 Problems with proxy-revalidate

   The fundamental problem with proxy-revalidate, as defined in RFC2068,
   is that it does not require a proxy cache to revalidate a fresh
   response before using it.  However, there are several circumstances
   in which it is desirable or necessary to force a proxy cache to
   revalidate a response that, to an end-client cache, would appear to
   be fresh.
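
   The gap can be illustrated with a small sketch of the revalidation
   rule as this document reads RFC2068.  The function and parameter
   names are assumptions made for illustration only.

```python
def rfc2068_requires_revalidation(directives: set, is_proxy: bool,
                                  fresh: bool) -> bool:
    """Whether a cache is required to revalidate a response before using it."""
    if fresh:
        # Neither must-revalidate nor proxy-revalidate applies to a
        # fresh response; this is the problem described above.
        return False
    if "must-revalidate" in directives:
        return True
    return is_proxy and "proxy-revalidate" in directives
```

   With directives = {"proxy-revalidate"}, a proxy holding a fresh
   response gets False, so the origin server has no way to force the
   check short of making the response stale for end-clients as well.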

   In section 14.8 of RFC2068, defining the ``Authorization'' header,
   this language appears:

       1. If the response includes the "proxy-revalidate"
       Cache-Control directive, the cache MAY use that response in
       replying to a subsequent request, but a proxy cache MUST
       first revalidate it with the origin server, using the
       request-headers from the new request to allow the origin
       server to authenticate the new request.

   While this could be read as modifying the definition of
   proxy-revalidate from section 14.9.4, it was in fact not intended as
   a modification.  Rather, the author of these two sections of RFC2068
   (me!) failed to notice the conflicting intentions of these two uses
   of proxy-revalidate.

   In section 2.1.2 of RFC2069 [2] (the specification of the Digest
   Access Authentication extension to HTTP/1.1), this language appears:

       Implementors should be aware of how authenticated
       transactions interact with proxy caches.  The HTTP/1.1
       protocol specifies that when a shared cache (see section
       13.10 of [2]) has received a request containing an
       Authorization header and a response from relaying that
       request, it MUST NOT return that response as a reply to any
       other request, unless one of two Cache-control (see section
       14.9 of [2]) directives was present in the response.  If the
       original response included the ``must-revalidate''
       Cache-control directive, the cache MAY use the entity of that
       response in replying to a subsequent request, but MUST first
       revalidate it with the origin server, using the request
       headers from the new request to allow the origin server to
       authenticate the new request.  Alternatively, if the original
       response included the ``public'' Cache-control directive, the
       response entity MAY be returned in reply to any subsequent
       request.

   This discussion appears to be in error, since its implication that
   ``must-revalidate'' always MUST cause a revalidation does not cover
   the case of apparently fresh responses.  In fact, discussion with one
   of the authors of RFC2069 has confirmed that he understood that
   RFC2068 had provided a ``proxy must revalidate even if fresh''
   directive, which it does not.

   In section 4.2.3 of RFCXXXX [3] (the specification of the State
   Management Mechanism for HTTP/1.1), this language appears:

       The origin server should send [one of] the following
       additional HTTP/1.1 response headers, depending on
       circumstances:

       * To suppress caching of a private document in shared caches:
       Cache-control: private.

       * To allow caching of a document and require that it be
       validated before returning it to the client: Cache-control:
       must-revalidate.

       * To allow caching of a document, but to require that proxy
       caches (not user agent caches) validate it before returning
       it to the client:  Cache-control: proxy-revalidate.

       * To allow caching of a document and request that it be
       validated before returning it to the client (by
       ``pre-expiring'' it):  Cache-control: max-age=0.  Not all
       caches will revalidate the document in every case.

   Here again there seems to be a (false) assumption that
   must-revalidate and proxy-revalidate cause revalidation even of fresh
   responses.

   Finally, the proposed Hit-Metering extension to HTTP/1.1 [4] depends
   on a mechanism whereby an origin server can require proxy caches to
   revalidate a response before every use, without requiring end-client
   caches to do the same thing (which would be prohibitively
   inefficient).

   In summary, there is a clear need for a Cache-control mechanism that
   allows an origin server to specify that a proxy must always
   revalidate a response, while allowing end-clients to cache it without
   revalidation (perhaps for a limited period).


3 Possible alternatives

3.1 Alternatives not requiring changes to RFC2068
   Assuming that we do want a mechanism that allows an origin server to
   specify that a proxy must always revalidate a response, while
   allowing end-clients to cache it without revalidation, we could
   certainly do this by modifying the HTTP/1.1 specification proposed in
   RFC2068.  Would it be possible to do this without modifying RFC2068,
   possibly by using a combination of existing Cache-control directives
   to approximate the desired behavior?

   One solution would simply be to use ``Cache-control: private''.  This
   would preserve any necessary semantics (because it would prevent any
   and all proxy caching of the response).  However, it is much less
   efficient: because ``private'' prevents a shared cache from even
   storing the response, the cache cannot make a conditional request
   for subsequent references.  Hence, this approach would lead to much
   unnecessary transmission of entity bodies when we could be using 304
   (Not Modified) responses.

   Another approach would be to use ``Cache-control:  proxy-revalidate,
   max-age=0''.  This allows proxies to store the response and forces
   them to revalidate it on every reference.  However, it also implies
   that strict end-user caches should revalidate on every reference as
   well, which could cause even more unnecessary traffic than
   ``Cache-control: private'' would.

   In short, there does not appear to be a way to use the existing
   RFC2068 mechanisms to preserve both the necessary semantics and
   optimal cache performance.
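
   The trade-off between the two workarounds can be summarized in a
   short sketch.  The Behavior record and both function names are
   hypothetical, and the end-client is assumed to be a strict cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    proxy_may_store: bool              # can the proxy keep a copy for 304s?
    proxy_revalidates_each_use: bool   # the semantics we need
    client_revalidates_each_use: bool  # the cost we want to avoid


def workaround_behavior(cache_control: str) -> Behavior:
    """Model how a header built from existing directives behaves."""
    directives = {}
    for item in cache_control.split(","):
        name, _, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg
    stored = "private" not in directives
    always_stale = directives.get("max-age") == "0"
    strict_proxy = ("proxy-revalidate" in directives
                    or "must-revalidate" in directives)
    return Behavior(
        proxy_may_store=stored,
        proxy_revalidates_each_use=stored and always_stale and strict_proxy,
        client_revalidates_each_use=always_stale,
    )


def is_ideal(b: Behavior) -> bool:
    """Proxy stores and always revalidates; end-client need not."""
    return (b.proxy_may_store and b.proxy_revalidates_each_use
            and not b.client_revalidates_each_use)
```

   Under this model, neither "private" nor "proxy-revalidate,
   max-age=0" satisfies is_ideal: the first gives up proxy storage, and
   the second forces end-clients to revalidate as well.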

3.2 Alternatives that require changes to RFC2068
   Several proposals have been made for modifications to RFC2068 to
   resolve this problem.

   We could redefine proxy-revalidate to mean ``always revalidate, even
   if the response is fresh.''  However, this would leave us either with
   no way to allow strict caches to use a response while it is fresh, or
   with no way to force loose caches to revalidate certain responses.

   The other proposals all involve adding one new Cache-control
   directive, while preserving the current meaning of the existing
   proxy-revalidate directive:

      - proxy-mustcheck
        This would mean that a proxy, but not an end-client, would
        have to revalidate the response even if it is fresh.

      - proxy-maxage=NNN
        This would mean defining separate maximum ages for proxy
        caches and for end-client caches.  The existing max-age (or
        Expires) value would continue to apply to end-client
        caches, and would continue to apply to proxy caches if the
        proxy-maxage directive were not present.  However, if
        proxy-maxage is present, then it would override the max-age
        (or Expires) limit for proxies, but would be ignored by
        end-clients.

      - agent-maxage=NNN
        This would also mean defining separate maximum ages for
        proxy caches and for end-client caches.  The existing
        max-age (or Expires) value would continue to apply to proxy
        caches, and would continue to apply to end-client caches if
        the agent-maxage directive were not present.  However, if
        agent-maxage is present, then it would override the max-age
        (or Expires) limit for end-clients, but would be ignored by
        proxy caches.
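
   The override rules for the two maxage proposals can be sketched as a
   single function.  The names are illustrative, and Expires handling
   is left out: the max_age argument stands for whatever limit max-age
   or Expires would otherwise supply.

```python
from typing import Optional


def effective_max_age(max_age: int, is_proxy: bool,
                      proxy_maxage: Optional[int] = None,
                      agent_maxage: Optional[int] = None) -> int:
    """Return the maximum age (seconds) that applies to this kind of cache."""
    if is_proxy and proxy_maxage is not None:
        return proxy_maxage   # proxy-maxage overrides max-age for proxies
    if not is_proxy and agent_maxage is not None:
        return agent_maxage   # agent-maxage overrides max-age for end-clients
    return max_age            # otherwise the existing limit applies
```

   For instance, with max_age=10 and proxy_maxage=3, a proxy gets a
   limit of 3 seconds while an end-client keeps the 10-second limit.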

   One option would be to make either ``proxy-maxage'' or
   ``agent-maxage'' always strict: that is, they would imply that a
   proxy or end-client, respectively, would be required to revalidate a
   stale response.  Alternatively, they could be combined with a
   ``must-revalidate'' directive to force strict behavior, but would
   otherwise allow loose behavior.

   Each of these proposals would solve the existing problem, would be
   simple to specify, and would probably not require significant
   implementation complexity or overhead.  However, we should probably
   choose just one of these options; what are the relative merits?

   The proxy-mustcheck approach is clearly the simplest, but gives up
   the possibility of separate control over proxy and end-client
   expiration times.  Since this orthogonality could potentially be
   useful, it seems preferable to adopt either the proxy-maxage or the
   agent-maxage proposal.

   If one assumes that this change could be made to the HTTP/1.1
   specification before the permanent deployment of any HTTP/1.1
   proxies, there at first seems to be no obvious reason to prefer one
   to the other.  That is, this header

      Cache-control: max-age=10,proxy-maxage=3

   and this one

      Cache-control: max-age=3,agent-maxage=10

   both express the same semantics in the same number of bytes.

   However, if we also adopt the rule that proxy-maxage implies the
   presence of proxy-revalidate, then in order to express the semantics
   of

      Cache-control: max-age=10,proxy-maxage=3

   the origin server would have to send

      Cache-control: max-age=3,agent-maxage=10,proxy-revalidate

   which is somewhat more expensive.
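
   The byte counts behind this comparison can be checked with a few
   lines of code.  This is only a worked calculation on the header
   values shown above; the helper name is made up for the purpose.

```python
def value_length(cache_control_value: str) -> int:
    """Number of bytes in the header value when sent as ASCII."""
    return len(cache_control_value.encode("ascii"))


PROXY_FORM = "max-age=10,proxy-maxage=3"
AGENT_FORM = "max-age=3,agent-maxage=10"
AGENT_FORM_STRICT = "max-age=3,agent-maxage=10,proxy-revalidate"

# Extra bytes needed when proxy-revalidate must be spelled out.
extra_bytes = value_length(AGENT_FORM_STRICT) - value_length(PROXY_FORM)
```

   The first two values are 25 bytes each.  The third is 42 bytes, so
   the agent-maxage form costs 17 more bytes per response in this
   example.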

   Also, if we do adopt the Hit-metering proposal [4], the proxy-maxage
   approach seems preferable, because it would allow the necessary
   header rewriting to be accomplished by simple addition of a
   directive, rather than more elaborate rewriting.  For example, if the
   origin server sends a hit-metered response with

      Cache-control: max-age=10

   then it would be rewritten (at the appropriate proxy, if necessary;
   see [4] for details) as

      Cache-control: max-age=10,proxy-maxage=0

   using the proxy-maxage alternative, but would have to be rewritten as

      Cache-control: max-age=0,agent-maxage=10

   using the agent-maxage proposal.
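
   The difference in rewriting effort might look like the following
   sketch.  Both function names are hypothetical, the details of when a
   proxy performs the rewrite are in [4], and the second function
   assumes that the value contains a max-age directive.

```python
import re


def rewrite_with_proxy_maxage(value: str) -> str:
    """Force proxy revalidation by appending a single directive."""
    return value + ",proxy-maxage=0"


def rewrite_with_agent_maxage(value: str) -> str:
    """Force proxy revalidation by moving the max-age limit to agent-maxage."""
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", value)
    if match is None:
        raise ValueError("no max-age directive to rewrite")
    original_limit = match.group(1)
    # Replace the existing max-age argument with 0 ...
    rewritten = value[:match.start(1)] + "0" + value[match.end(1):]
    # ... and carry the original limit over to the end-client directive.
    return rewritten + ",agent-maxage=" + original_limit
```

   The proxy-maxage version never has to parse or modify what the
   origin server sent, which is the simplicity argued for above.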

   On the other hand, if we cannot make the necessary change to the
   specification before the deployment of HTTP/1.1 proxies, then the
   agent-maxage proposal is somewhat safer in terms of semantics.  That
   is, if HTTP/1.1 proxies are deployed that do not understand the
   proxy-maxage directive, the use of agent-maxage will not cause these
   proxies to avoid revalidating fresh responses.  This is because the
   responses will presumably carry a ``max-age=0'' directive, and so
   will not appear to be fresh to these proxies.

   Unfortunately, if we fail to change the specification before the
   permanent deployment of HTTP/1.1 end-clients, then we may face a
   performance problem with the use of agent-maxage:  clients that do
   not understand this new directive might do many more revalidations
   than necessary, and so cause excessive network and server loading, as
   well as unnecessary delays.
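
   Both deployment risks can be seen by modelling an implementation
   that predates the new directives.  This sketch assumes that such an
   implementation ignores directives it does not recognize and takes
   its limit from max-age alone; the names are illustrative.

```python
def legacy_limit(cache_control: str) -> int:
    """Freshness limit seen by a cache that understands only max-age."""
    for item in cache_control.split(","):
        name, _, arg = item.strip().partition("=")
        if name.lower() == "max-age":
            return int(arg)
    raise ValueError("no max-age directive present")


# Intended meaning of both: proxies always revalidate, end-clients
# may use the response for 10 seconds.
PROXY_MAXAGE_FORM = "max-age=10,proxy-maxage=0"
AGENT_MAXAGE_FORM = "max-age=0,agent-maxage=10"
```

   A legacy proxy sees a limit of 10 for the proxy-maxage form, and so
   would wrongly treat the response as fresh, but sees 0 for the
   agent-maxage form.  A legacy end-client also sees 0 for the
   agent-maxage form, and so revalidates on every reference even
   though 10 seconds were allowed.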

   Ultimately, therefore, it would be best if we made this specification
   change before any permanent deployment of HTTP/1.1 proxies or
   clients.  If we do so, then it seems more efficient to use the
   proxy-maxage mechanism.


4 Proposed solution

   The HTTP/1.1 specification in RFC2068 should be changed, in section
   14.9 (Cache-Control), in the following ways:

      - The grammar for cache-response-directive should include a
        new alternative:

                            | "proxy-maxage" "=" delta-seconds

        There is no need for a corresponding change to the grammar
        for cache-request-directive.

      - Section 14.9.3 should include, after the second paragraph
        (which starts with ``If a response includes ...''), this
        new paragraph:

            If a response includes a proxy-maxage directive,
            then for a proxy cache (but not for an end-client
            cache), the maximum age specified by this directive
            overrides the maximum age specified by either the
            max-age directive or the Expires header.  The
            proxy-maxage directive also implies the semantics
            of the proxy-revalidate directive (see section
            14.9.4), i.e., that the proxy MUST NOT use the
            entry after it becomes stale to respond to a
            subsequent request without first revalidating it
            with the origin server.  The proxy-maxage directive
            is always ignored by an end-client.

      - In section 13.4, the list of response headers and
        directives implicitly allowing cachability should include
        ``proxy-maxage'' after ``max-age''.

      - In section 14.8 (Authorization), this paragraph:

            1. If the response includes the "proxy-revalidate"
            Cache-Control directive, the cache MAY use that
            response in replying to a subsequent request, but a
            proxy cache MUST first revalidate it with the
            origin server, using the request-headers from the
            new request to allow the origin server to
            authenticate the new request.

        should become

            1. If the response includes the "proxy-revalidate"
            Cache-Control directive, the cache MAY use that
            response in replying to a subsequent request, but
            if the response is stale, a proxy cache MUST first
            revalidate it with the origin server, using the
            request-headers from the new request to allow the
            origin server to authenticate the new request.

        This paragraph

            2. If the response includes the "must-revalidate"
            Cache-Control directive, the cache MAY use that
            response in replying to a subsequent request, but
            all caches MUST first revalidate it with the origin
            server, using the request-headers from the new
            request to allow the origin server to authenticate
            the new request.

        should become

            2. If the response includes the "must-revalidate"
            Cache-Control directive, the cache MAY use that
            response in replying to a subsequent request, but
            if the response is stale, all caches MUST first
            revalidate it with the origin server, using the
            request-headers from the new request to allow the
            origin server to authenticate the new request.
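
   Taken together, the proposed changes suggest proxy logic along the
   following lines.  This is a sketch under stated assumptions, not
   normative text: the function names and return strings are invented
   here, Expires is not modelled, and a response with no usable limit
   is simply treated as stale.

```python
import re
from typing import Optional

# delta-seconds is a run of decimal digits.
_LIMIT = re.compile(r"^(max-age|proxy-maxage)=(\d+)$")


def freshness_limit(cache_control: str, is_proxy: bool) -> Optional[int]:
    """Maximum age in seconds for this cache, or None if none is given."""
    limits = {}
    for item in cache_control.split(","):
        m = _LIMIT.match(item.strip().lower())
        if m:
            limits[m.group(1)] = int(m.group(2))
    if is_proxy and "proxy-maxage" in limits:
        return limits["proxy-maxage"]   # overrides max-age for proxies
    return limits.get("max-age")        # end-clients ignore proxy-maxage


def proxy_action(cache_control: str, current_age: int) -> str:
    """Decide what a proxy cache does with a stored response."""
    names = {item.strip().partition("=")[0].lower()
             for item in cache_control.split(",")}
    limit = freshness_limit(cache_control, is_proxy=True)
    if limit is not None and current_age < limit:
        return "use"
    # proxy-maxage implies the semantics of proxy-revalidate.
    strict = names & {"proxy-maxage", "proxy-revalidate", "must-revalidate"}
    return "revalidate" if strict else "revalidate-or-loose"
```

   With "max-age=10,proxy-maxage=0", proxy_action returns "revalidate"
   at any age, while an end-client calling freshness_limit with
   is_proxy=False still obtains the 10-second limit.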

   Additionally, RFC2069 [2] and RFCXXXX [3] should probably be modified
   to suggest the use of ``proxy-maxage=0'' and/or ``max-age=0,
   must-revalidate'' to force proxies to revalidate a response.


5 Security Considerations

   The proposed Digest Access Authentication extension [2] depends upon
   a mechanism to force proxies to always revalidate certain responses.
   Whether or not the proposal in this document is adopted, the Digest
   Access Authentication extension requires modification to reflect the
   option chosen (unless the HTTP/1.1 specification is revised to make
   ``proxy-revalidate'' apply to fresh as well as to stale responses).


6 Acknowledgements

   Several people contributed to my understanding of this issue,
   including Koen Holtman, Paul Leach, Ingrid Melve, and Anselm
   Baird-Smith.  However, the proposal in this document is my fault
   alone.


7 References

   1.  Roy T. Fielding, Jim Gettys, Jeffrey C. Mogul, Henrik Frystyk
   Nielsen, and Tim Berners-Lee.  Hypertext Transfer Protocol --
   HTTP/1.1.  RFC 2068, HTTP Working Group, January, 1997.

   2.  J. Franks, P. Hallam-Baker, J. Hostetler, P. Leach, A. Luotonen,
   E. Sink, L. Stewart.  An Extension to HTTP: Digest Access
   Authentication.  RFC 2069, HTTP Working Group, January, 1997.

   3.  D. Kristol, L. Montulli.  HTTP State Management Mechanism.  RFC
   XXXX, HTTP Working Group, January, 1997.
   draft-ietf-http-state-mgmt-05.txt; approved by the IESG, not yet
   assigned an RFC number.

   4.  J. Mogul and P. Leach.  Simple Hit-Metering for HTTP.  Internet
   Draft draft-mogul-http-hit-metering-01.txt, HTTP Working Group,
   December, 1996.  This is a work in progress.


8 Author's address

   Jeffrey C. Mogul
   Western Research Laboratory
   Digital Equipment Corporation
   250 University Avenue
   Palo Alto, California, 94305, USA
   Email: mogul@wrl.dec.com