draft-ietf-http-hit-metering-00

HTTP Working Group                                 Jeffrey Mogul, DECWRL
Internet-Draft                                  Paul J. Leach, Microsoft
Expires: 22 July 1997                                    21 January 1997


                      Simple Hit-Metering for HTTP
                           Preliminary Draft

                  draft-ietf-http-hit-metering-00.txt


STATUS OF THIS MEMO

        This document is an Internet-Draft. Internet-Drafts are
        working documents of the Internet Engineering Task Force
        (IETF), its areas, and its working groups. Note that other
        groups may also distribute working documents as
        Internet-Drafts.

        Internet-Drafts are draft documents valid for a maximum of
        six months and may be updated, replaced, or obsoleted by
        other documents at any time. It is inappropriate to use
        Internet-Drafts as reference material or to cite them other
        than as "work in progress."

        To learn the current status of any Internet-Draft, please
        check the "1id-abstracts.txt" listing contained in the
        Internet-Drafts Shadow Directories on ftp.is.co.za
        (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific
        Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US
        West Coast).

        Distribution of this document is unlimited.  Please send
        comments to the HTTP working group at
        <http-wg@cuckoo.hpl.hp.com>.  Discussions of the working
        group are archived at
        <URL:http://www.ics.uci.edu/pub/ietf/http/>.  General
        discussions about HTTP and the applications which use HTTP
        should take place on the <www-talk@w3.org> mailing list.


ABSTRACT

        This draft proposes a simple extension to HTTP, using a new
        ``Meter'' header, to permit a limited form of demographic
        information (colloquially called ``hit-counts'') to be
        reported by caches to origin servers, in a more efficient
        manner than the ``cache-busting'' techniques currently
        used.  It also permits an origin server to control the
        number of times a cache uses a cached response, and
        outlines a technique that origin servers can use to capture
        referral information without ``cache-busting.''



Mogul, Leach                                                    [Page 1]


Internet-Draft       Hit-Metering for HTTP (DRAFT) 21 January 1997 12:06


                           TABLE OF CONTENTS

1 Introduction
     1.1 Goals, non-goals, and limitations
     1.2 Brief summary of the design
2 Overview
     2.1 Discussion
3 Design concepts
     3.1 Implementation of the "metering subtree"
     3.2 Format of the Meter header
     3.3 Negotiation of hit-metering and usage-limiting
     3.4 Transmission of usage reports
     3.5 When to send usage reports
     3.6 Subdivision of usage-limits
4 Analysis
     4.1 What about "Network Computers"?
     4.2 Why max-uses is not a Cache-control directive
5 Specification
     5.1 Specification of Meter header and directives
     5.2 Abbreviations for Meter directives
     5.3 Counting rules
          5.3.1 Counting rules for hit-metering
          5.3.2 Counting rules for usage-limiting
          5.3.3 Equivalent algorithms are allowed
     5.4 Counting rules: interaction with Range requests
     5.5 Implementation by non-caching proxies
6 Expressing or approximating the "proxy-mustcheck" directive
7 Examples
     7.1 Example of a complete set of exchanges
     7.2 Protecting against HTTP/1.0 proxies
     7.3 More elaborate examples
8 Interactions with varying resources
9 A Note on Capturing Referrals
10 Security Considerations
11 Revision history
     11.1 draft-mogul-http-hit-metering-01.txt
     11.2 draft-mogul-http-hit-metering-00.txt
12 Acknowledgements
13 References
14 Authors' addresses


1 Introduction

   For a variety of reasons, content providers want to be able to
   collect information on the frequency with which their content is
   accessed. This desire leads to some of the "cache-busting" done by
   existing servers (exactly how much is unknown).  This kind of
   cache-busting is done not for the purpose of maintaining transparency
   or security properties, but simply to collect demographic
   information.  It has also been pointed out that some cache-busting is
   done so that different advertising images appear on the same page
   (i.e., each retrieval of the page sees a different ad).

   One model that this proposal tries to support is one reasonably
   similar to that of publishers of hard-copy publications: such
   publishers (try to) report to their advertisers how many people read
   an issue of a publication at least once; they don't (try to) report
   how many times a reader re-reads an issue. They do this by counting
   copies published and then estimating, for their publication, how many
   people on average read a single copy at least once. The key
   point is that the results aren't exact, but are still useful. Another
   model is that of coding inquiries in such a way that the advertiser
   can tell which publication produced the inquiry.

1.1 Goals, non-goals, and limitations
   HTTP/1.1 already allows origin servers to prevent caching of
   responses, and we have evidence that at least some of the time, this
   is being done for the sole purpose of collecting counts of the number
   of accesses of specific pages.  Some of this evidence is inferred
   from the study of proxy traces; some is based on explicit statements
   of the intention of the operators of Web servers.  We take no
   position on whether the information collected this way is of use to
   the people who collect it; the fact is that they want to collect it,
   or already do so.

   Our goal in this proposal is to provide an optional performance
   optimization for this use of HTTP/1.1.

   Our proposal is:

      - Optional: no server or proxy is required to implement it.

      - Proxy-centered: there is no involvement on the part of
        end-client implementations.

      - Solely a performance optimization: it provides no
        information or functionality that is not already available
        in HTTP/1.1.  Our intention is to improve performance
        overall, and reduce latency for almost all interactions; we
        do not purport to reduce latency for every single HTTP
        interaction.

      - Best-efforts: it does not guarantee the accuracy of the
        reported information, although it does provide accurate
        results in the absence of persistent network failures or
        host crashes.

      - Neutral with respect to privacy: it reveals to servers no
        information about clients that is not already available
        through the existing features of HTTP/1.1.


   To the extent that any part of this specification conflicts with
   these criteria, we would consider that to be a bug, and will
   undertake to resolve this when it is brought to our attention.

   Our goals do not include:

      - Solving the entire problem of efficiently obtaining
        extensive information about requests made via proxies.

      - Improving the protection of user privacy (although our
        proposal may reduce the transfer of user-specific
        information to servers, it does not prevent it).

      - Preventing or encouraging the use of log-exchange
        mechanisms.

      - Avoiding all forms of "cache-busting", or even all
        cache-busting done for gathering counts.

   We recognize certain potential limitations of our design:

      - If it is not deployed widely in both proxies and servers,
        it will provide little benefit.

      - It may, by partially solving the hit-counting problem,
        reduce the pressure to adopt (hypothetical) more complete
        solutions.

      - Even if widely deployed, it might not be widely used, and
        so might not significantly improve performance.

   We do not believe that these potential limitations are problems in
   reality.

1.2 Brief summary of the design
   This section is included for people not wishing to read the entire
   document; it is not a specification for the proposed design, and
   over-simplifies many aspects of the design.

   Our goal is to eliminate the need for origin servers to use
   "cache-busting" techniques, when this is done just for the purpose of
   counting the number of users of a resource.  (Cache-busting includes
   techniques such as setting immediate Expiration dates, or sending
   "Cache-control: private" in each response.)

   We add a new "Meter" header to HTTP; the header is always protected
   by the "Connection" header, and so is always hop-by-hop.  This
   mechanism allows us to construct a "metering subtree", which is a
   connected subtree of proxies, rooted at an origin server.  Only those
   proxies that explicitly volunteer to join in the metering subtree for
   a resource participate in hit-metering, but those proxies that do
   volunteer are required to make their best effort to provide accurate
   counts.  When a hit-metered response is forwarded outside of the
   metering subtree, the forwarding proxy adds "Cache-control:
   proxy-mustcheck", so that other proxies (outside the metering
   subtree) are forced to forward all requests to a server in the
   metering subtree.

      ---------
      NOTE: the HTTP/1.1 specification does NOT define a
      "proxy-mustcheck" Cache-control directive.  We use this name as
      a placeholder for a directive meaning "proxies must revalidate
      this response even if fresh," which is not currently defined in
      HTTP/1.1.  In section 6 we describe several alternatives for
      expressing or approximating this placeholder; see also [2].
      ---------

   The Meter header carries zero or more directives, similar to the way
   that the Cache-control header carries directives.  Proxies may use
   certain Meter directives to volunteer to do hit-metering for a
   resource.  If a proxy does volunteer, the server may use certain
   directives to require that a response be hit-metered.  Finally,
   proxies use a "count" Meter directive to report the accumulated hit
   counts.

   The Meter mechanism can also be used by a server to limit the number
   of uses that a cache may make of a cached response, before
   revalidating it.

   The full specification includes complete rules for counting "uses" of
   a response (e.g., non-conditional GETs) and "reuses" (conditional
   GETs).  These rules ensure that the results are entirely consistent
   in all cases, except when systems or networks fail.


2 Overview

   The design described in this document introduces several new features
   to HTTP:

      - Hit-metering: allows an origin server to obtain reasonably
        accurate counts of the number of clients using a resource
        instance via a proxy cache, or a hierarchy of proxy caches.

      - Usage-limiting: allows an origin server to control the
        number of times a cached response may be used by a proxy
        cache, or a hierarchy of proxy caches, before revalidation
        with the origin server.

   These new non-mandatory features require minimal new protocol
   support, no change in protocol version, relatively little overhead in
   message headers, and no additional network round-trips in any
   critical path.

   The primary goal of hit-metering and usage-limiting is to obviate the
   need for an origin server to send "Cache-control: proxy-mustcheck"
   with responses for resources whose value is not likely to change
   immediately.  In other words, in cases where the only reason for
   contacting the origin server on every request that might otherwise be
   satisfied by a proxy cache entry is to allow the server to collect
   demographic information or to control the number of times a cache
   entry is used, the extension proposed here will avoid a significant
   amount of unnecessary network traffic and latency.

   This design introduces one new ``Meter'' header, which is used both
   in HTTP request messages and HTTP response messages.  The Meter
   header is used to transmit a number of directives and reports.  In
   particular, all negotiation of the use of hit-metering and usage
   limits is done using this header.  No other changes to the existing
   HTTP/1.1 specification [1] are proposed in this document.

   This design also introduces several new concepts:

      1. The concepts of a "use" of a cache entry, which is when a
         proxy returns its entity-body in response to a conditional
         or non-conditional request, and the "reuse" of a cache
         entry, which is when a proxy returns a 304 (Not Modified)
         response to a conditional request which is satisfied by
         that cache entry.

      2. The concept of a hit-metered resource, for which proxy
         caches make a best-effort attempt to report accurate
         counts of uses and/or reuses to the origin server.

      3. The concept of a usage-limited resource, for which the
         origin server expects proxy caches to limit the number of
         uses and/or reuses.

   The new Meter directives and reports interact to allow proxy caches
   and servers to cooperate in the collection of demographic data.  The
   goal is a best-efforts approximation of the true number of uses
   and/or reuses, not a guaranteed exact count.

   The new Meter directives also allow a server to bound the inaccuracy
   of a particular hit-count, by bounding the number of uses between
   reports.  It can also, for example, bound the number of times the
   same ad is shown because of caching.

   We also identify a way to use server-driven content negotiation (the
   Vary header) that allows an HTTP origin server to flexibly separate
   requests into categories and count requests by category (see section
   8).  Implementation of such categorized hit counting is likely to
   be a very small modification to most implementations of Vary; some
   implementations may not require any modification at all.


2.1 Discussion
   Mapping this onto the publishing model, a proxy cache would increment
   the use-count for a cache entry once for each unconditional GET done
   for the entry, and once for each conditional GET that results in
   sending a copy of the entry to update a client's invalid cached copy.
   Conditional GETs that result in 304 (Not Modified) are not included
   in the use-count, because they do not result in a new user seeing the
   page, but instead signify a repeat view by a user that had seen it
   before.  However, 304 responses are counted in the reuse-count.
   HEADs are not counted at all, because their responses do not contain
   an entity-body.
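
   As a rough illustration of these counting rules, the following
   Python sketch classifies one response served from a cache entry.
   The names HitCounters and record_response are hypothetical, and
   the sketch assumes that a response carrying an entity-body has
   status 200; the formal rules in section 5.3 take precedence.

```python
from dataclasses import dataclass


@dataclass
class HitCounters:
    """Per-cache-entry counters (hypothetical names, not from the spec)."""
    uses: int = 0
    reuses: int = 0


def record_response(counters: HitCounters, method: str, status: int) -> None:
    """Update the counters for one response served from a cache entry.

    method: the request method ("GET", "HEAD", ...).
    status: the status code the proxy returns to its client.
    """
    if method != "GET":
        # HEAD responses carry no entity-body; other methods are not counted.
        return
    if status == 304:
        # Conditional GET answered with Not Modified: a repeat view.
        counters.reuses += 1
    elif status == 200:
        # Entity-body sent (unconditional GET, or a conditional GET that
        # replaced the client's invalid copy): counted as a use.
        counters.uses += 1
```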

   The Meter directives apply only to shared proxy caches, not to
   end-client (or other single-user) caches.  Single-user caches should
   not use Meter, because their hits will be automatically counted as a
   result of the unconditional GET with which they first fetch the page,
   from either the origin-server or from a proxy cache.  Their
   subsequent conditional GETs do not result in a new user seeing the
   page.

      ---------
      Note: this means that the reuse-count does not include reuses
      done locally to the end-client.  While there are some reasons
      to want to collect such information, especially for research
      into user behavior patterns, we believe that the reasons
      against doing so (network overheads, additional client
      complexity, and possible privacy issues) are stronger.
      However, we encourage further discussion of this issue.
      ---------

   The mechanism specified here counts GETs; other methods either do not
   result in a page for the user to read, aren't cached, or are
   "written-through" and so can be directly counted by the origin
   server. (If, in the future, a "cachable POST" came into existence,
   whereby the entity-body in the POST request was used to select a
   cached response, then such POSTs would have to be treated just like
   GETs.)

   In the case of multiple caches along a path, a proxy cache does the
   obvious summation when it receives a use-count or reuse-count in a
   request from another cache.
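
   The summation can be sketched as follows.  This is a minimal
   illustration, not part of the design: the class EntryCounts and its
   method names are hypothetical, and it assumes that counts are reset
   once they have been reported, since a report covers only the period
   since the previous report.

```python
class EntryCounts:
    """Use/reuse totals a proxy keeps for one cache entry (illustrative)."""

    def __init__(self) -> None:
        self.uses = 0
        self.reuses = 0

    def add_child_report(self, uses: int, reuses: int) -> None:
        # Fold a report received from a downstream cache into our totals.
        self.uses += uses
        self.reuses += reuses

    def take_report(self) -> tuple:
        # Return the totals to send upstream and start a new period.
        report = (self.uses, self.reuses)
        self.uses = 0
        self.reuses = 0
        return report
```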


3 Design concepts

   In order to allow the introduction of hit-metering and usage-limiting
   without requiring a protocol revision, and to ensure a reasonably
   close approximation of accurate counts, the negotiation of metering
   and usage-limiting is done hop-by-hop, not end-to-end.  If one
   considers the "tree" of proxies that receive, store, and forward a
   specific response, the intent of this design is that within some
   (possibly null) "metering subtree", rooted at the origin server, all
   proxies are using the hit-metering and/or usage-limiting requested by
   the origin server.

   Proxies at the leaves of this subtree will insert a "Cache-control:
   proxy-mustcheck" directive, which forces all other proxies (below
   this subtree) to check with a leaf of the metering subtree on every
   request.  However, it does not prevent them from storing and using
   the response, if the revalidation succeeds.

   No proxy is required to implement hit-metering or usage-limiting.
   However, any proxy that transmits the Meter header in a request MUST
   implement every requirement of this specification, without exception
   or amendment.

   This is a conservative design, which may sometimes fail to take
   advantage of hit-metering support in proxies outside the metering
   subtree.  However, we believe that without a conservative design,
   managers of origin servers with requirements for accurate information
   will not take advantage of any hit-metering proposal.

   The hit-metering/usage-limiting mechanism is designed to avoid any
   extra network round-trips in the critical path of any client request,
   and (as much as possible) to avoid excessively lengthening HTTP
   messages.

   The Meter header is used to transmit both negotiation information and
   numeric information.

   A formal specification for the Meter header appears in section 5; the
   following discussion uses an informal approach to improve clarity.

3.1 Implementation of the "metering subtree"
   The "metering subtree" approach is implemented in a simple,
   straightforward way by defining the new "Meter" header as one that
   MUST always be protected by a Connection header in every request or
   response.  I.e., if the Meter header is present in an HTTP message,
   that message:

      1. MUST contain "Connection: meter", and MUST be handled
         according to the HTTP/1.1 specification of the Connection
         header.

      2. MUST NOT be sent in response to a request from a client
         whose version number is less than HTTP/1.1.

      3. MUST NOT be accepted from a client whose version number is
         less than HTTP/1.1.

   The reason for the latter two restrictions is to protect against
   proxies that might not properly implement the Connection header.
   Otherwise, a subtree that includes an HTTP/1.0 proxy might
   erroneously appear to be a metering subtree.
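
   A receiver might check these three conditions roughly as in the
   Python sketch below.  The function name and the representation of
   headers (a dict keyed by lower-cased header names) are assumptions
   made for illustration only.

```python
def meter_header_acceptable(headers: dict, http_version: tuple) -> bool:
    """Decide whether a Meter header in a received message may be honored.

    headers: maps lower-cased header names to their values.
    http_version: the sender's version as a tuple, e.g. (1, 1).
    """
    if "meter" not in headers:
        return True  # no Meter header, so nothing to check
    if http_version < (1, 1):
        # Older senders might not implement Connection properly.
        return False
    tokens = [t.strip().lower()
              for t in headers.get("connection", "").split(",")]
    # The Meter header must be protected by "Connection: meter".
    return "meter" in tokens
```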

      ---------
      Note: We believe that in order for the Connection header
      mechanism to function correctly, a system receiving an HTTP/1.0
      (or lower-version) message that includes a Connection header
      must act as if this header, and all of the headers it protects,
      ought to have been removed from the message by an intermediate
      proxy.

      Although the current draft of the HTTP/1.1 specification does
      not specifically require this behavior, we believe that it is
      implied.  Otherwise, one could not depend on the stated
      property (section 14.10) that the protected options ``MUST NOT
      be communicated by proxies over further connections.''  We
      suggest that this be clarified in a subsequent draft of the
      HTTP/1.1 specification.

      We do not, in any way, propose a modification of the
      specification of the Connection header.
      ---------

   From the point of view of an origin server, the proxies in a metering
   subtree work together to obey usage limits and to maintain accurate
   usage counts.  When an origin server specifies a usage limit, a proxy
   in the metering subtree may subdivide this limit among its children
   in the subtree as it sees fit.  Similarly, when a proxy in the
   subtree receives a usage report, it ensures that the hits represented
   by this report are summed properly and reported to the origin server.

   When a proxy forwards a hit-metered or usage-limited response to a
   client (proxy or end-client) not in the metering subtree, it MUST
   omit the Meter header, and it MUST add "Cache-control:
   proxy-mustcheck" to the response.
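
   The header rewriting this implies could look like the sketch below.
   The function name is hypothetical, headers are modeled as a simple
   dict, and "proxy-mustcheck" is the placeholder directive discussed
   in section 1.2, not a directive defined by HTTP/1.1.

```python
def forward_outside_subtree(headers: dict) -> dict:
    """Return response headers suitable for a client outside the subtree."""
    # Omit the Meter header itself.
    out = {k: v for k, v in headers.items() if k.lower() != "meter"}

    # Drop the "meter" token from Connection, since the header it
    # protected is gone.
    conn_key = next((k for k in out if k.lower() == "connection"), None)
    if conn_key is not None:
        tokens = [t.strip() for t in out[conn_key].split(",") if t.strip()]
        tokens = [t for t in tokens if t.lower() != "meter"]
        if tokens:
            out[conn_key] = ", ".join(tokens)
        else:
            del out[conn_key]

    # Add the placeholder directive unless it is already present.
    cc_key = next((k for k in out if k.lower() == "cache-control"), None)
    if cc_key is None:
        out["Cache-control"] = "proxy-mustcheck"
    else:
        present = [t.strip().lower() for t in out[cc_key].split(",")]
        if "proxy-mustcheck" not in present:
            out[cc_key] = out[cc_key] + ", proxy-mustcheck"
    return out
```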

      ---------
      Design question: alternatively, we could specify that the
      origin server is responsible for adding "Cache-control:
      proxy-mustcheck" to the response, and that a proxy in the
      metering subtree should ignore this directive, unless it has
      exhausted one of the usage limits.  This would get the proxies
      out of the business of adding headers to responses, but it
      would increase the number of bytes in the response from the
      origin server.
      ---------

3.2 Format of the Meter header
   The Meter header is used to carry zero or more directives.  Multiple
   Meter headers may occur in an HTTP message, but according to the
   rules in section 4.2 of the HTTP/1.1 specification [1], they may be
   combined into a single header (and should be so combined, to reduce
   overhead).

   For example, the following sequence of Meter headers

       Meter: max-uses=3
       Meter: max-reuses=10
       Meter: do-report

   may be expressed as

       Meter: max-uses=3, max-reuses=10, do-report
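
   A sender could perform this combination with a few lines of code.
   The Python sketch below is illustrative only; the function name is
   hypothetical and it takes just the header values (the text after
   "Meter:").

```python
def combine_meter_headers(values: list) -> str:
    """Merge the values of several Meter headers into one header value."""
    directives = []
    for value in values:
        for directive in value.split(","):
            directive = directive.strip()
            if directive:  # skip empty values and stray commas
                directives.append(directive)
    return ", ".join(directives)
```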

3.3 Negotiation of hit-metering and usage-limiting
   An origin server that wants to collect hit counts for a resource, by
   simply forcing all requests to bypass any proxy caches, would respond
   to requests on the resource with "Cache-control: proxy-mustcheck".
   (An origin server wishing to prevent HTTP/1.0 proxies from improperly
   caching the response could also send both "Expires: <now>", to
   prevent such caching, and "Cache-control: max-age=NNNN", to allow
   newer proxies to cache the response).

   The purpose of the Meter header is to obviate the need for
   "Cache-control: proxy-mustcheck" within a metering subtree.  Thus,
   any proxy may negotiate the use of hit-metering and/or usage-limiting
   with the next-hop server.  If this server is the origin server, or is
   already part of a metering subtree (rooted at the origin server),
   then it may complete the negotiation, thereby extending the metering
   subtree to include the new proxy.

   To start the negotiation, a proxy sends its request with one of the
   following Meter directives:

   will-report-and-limit
                   indicates that the proxy is willing and able to
                   return usage reports and will obey any usage-limits.

   wont-report     indicates that the proxy will obey usage-limits but
                   will not send usage reports.

   wont-limit      indicates that the proxy will not obey usage-limits
                   but will send usage reports.

   A proxy willing to neither obey usage-limits nor send usage reports
   MUST NOT transmit a Meter header in the request.

   By definition, an empty Meter header:

       Meter:

   is equivalent to "Meter: will-report-and-limit", and so, by the
   definition of the Connection header (see section 14.10 of the
   HTTP/1.1 specification [1]), a request that contains

       Connection: Meter

   and no explicit Meter header is equivalent to a request that contains

       Connection: Meter
       Meter: will-report-and-limit

   This makes the default case more efficient.

   These request directives ("wont-report", "wont-limit", and
   "will-report-and-limit" in both its explicit and implicit forms)
   apply to all subsequent requests made on the given transport
   connection.

      ---------
      Note: one way for a server to implement the ``connection-long''
      nature of these three directives is to associate two flag bits
      (``will report'' and ``will limit'') with each transport
      connection from a client, which are initially cleared when the
      connection is established.  Receipt of the "wont-limit"
      directive sets only the ``will report'' bit; receipt of the
      "wont-report" directive sets only the ``will limit'' bit;
      receipt of "will-report-and-limit" or of an empty Meter request
      header sets both bits.
      ---------
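
   The per-connection state could be kept as in the following Python
   sketch.  It uses the request directive names from the list earlier
   in this section (will-report-and-limit, wont-report, wont-limit);
   the class and attribute names are hypothetical, and handling of
   other directives (such as "count") is deliberately left out.

```python
from typing import Optional


class ConnectionMeterState:
    """Two flag bits kept by a server for one transport connection."""

    def __init__(self) -> None:
        # Both bits are cleared when the connection is established.
        self.will_report = False
        self.will_limit = False

    def on_request(self, connection_header: str,
                   meter_header: Optional[str]) -> None:
        tokens = [t.strip().lower() for t in connection_header.split(",")]
        if "meter" not in tokens:
            return  # no protected Meter header: flags are unchanged
        directives = [d.strip().lower()
                      for d in (meter_header or "").split(",") if d.strip()]
        # An absent or empty Meter header means will-report-and-limit.
        if not directives or "will-report-and-limit" in directives:
            self.will_report = True
            self.will_limit = True
        if "wont-report" in directives:
            self.will_limit = True   # obeys limits, sends no reports
        if "wont-limit" in directives:
            self.will_report = True  # sends reports, ignores limits
```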

   An origin server that is not interested in metering or usage-limiting
   the requested resource simply ignores the Meter header.

   If the server wants the proxy to do hit-metering and/or
   usage-limiting, its response should include one or more of the
   following Meter directives:

   For hit-metering:

   do-report       specifies that the proxy MUST send usage reports to
                   the server.

   dont-report     specifies that the proxy SHOULD NOT send usage
                   reports to the server.

   By definition, an empty Meter header in a response, or any Meter
   header that does not contain "dont-report", means "Meter: do-report";
   this makes a common case more efficient.

   For usage-limiting:

   max-uses=NNN    sets an upper limit of NNN "uses" of the response,
                   not counting its immediate forwarding to the
                   requesting end-client, for all proxies in the
                   following subtree taken together.

   max-reuses=NNN  sets an upper limit of NNN "reuses" of the response
                   for all proxies in the following subtree taken
                   together.

   When a proxy has exhausted its allocation of "uses" or "reuses" for a
   cache entry, it MUST revalidate the cache entry (using a conditional
   request) before returning it in a response.  (The proxy SHOULD use
   this revalidation message to send a usage report, if one was
   requested and it is time to send it.  See sections 3.4 and 3.5.)

   These Meter response-directives apply only to the specific response
   that they are attached to.

      ---------
      Note that the limit on "uses" set by the max-uses directive
      does not include the use of the response to satisfy the
      end-client request that caused the proxy's request to the
      server.  This counting rule supports the notion of a
      cache-initiated prefetch: a cache may issue a prefetch request,
      receive a max-uses=0 response, store that response, and then
      return that response (without revalidation) when a client makes
      an actual request for the resource.  However, each such
      response may be used at most once in this way, so the origin
      server maintains precise control over the number of actual
      uses.
      ---------
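
   One way a proxy might track its allocation for a cache entry is
   sketched below in Python.  The class and method names are
   hypothetical; None is used here to mean "no limit was set", and
   the one use that is exempt from max-uses is modeled as a flag that
   is consumed by the first use of the entry.

```python
from typing import Optional


class LimitedEntry:
    """Usage-limit bookkeeping for one cached response (illustrative)."""

    def __init__(self, max_uses: Optional[int] = None,
                 max_reuses: Optional[int] = None) -> None:
        self.uses_left = max_uses
        self.reuses_left = max_reuses
        # The use that satisfies the triggering (or first, if
        # prefetched) client request is not charged against max-uses.
        self.initial_forward_pending = True

    def try_use(self) -> bool:
        """True if the entry may be returned without revalidation."""
        if self.initial_forward_pending:
            self.initial_forward_pending = False
            return True
        if self.uses_left is None:
            return True
        if self.uses_left > 0:
            self.uses_left -= 1
            return True
        return False  # allocation exhausted: revalidate first

    def try_reuse(self) -> bool:
        """True if a 304 may be returned without revalidation."""
        if self.reuses_left is None:
            return True
        if self.reuses_left > 0:
            self.reuses_left -= 1
            return True
        return False
```

   With max-uses=0, such an entry can be returned exactly once before
   the proxy has to revalidate it, which matches the prefetch case
   described in the note above.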

   A proxy receiving a Meter header in a response MUST either obey it,
   or it MUST revalidate the corresponding cache entry on every access.
   (I.e., if it chooses not to obey the Meter header in a response, it
   MUST act as if the response included "Cache-control:
   proxy-mustcheck".)

      ---------
      Note: a proxy that has not sent the Meter header in a request
      during the given transport connection, and which has therefore
      not volunteered to honor Meter directives in a response, is not
      required to honor them.  If, in this situation, the server does
      send a Meter header in a response, this is a protocol error.
      However, based on the robustness principle, the proxy may
      choose to interpret the Meter header as an implicit request to
      include "Cache-control: proxy-mustcheck" when it forwards the
      response, since this preserves the apparent intention of the
      server.
      ---------

   A proxy that receives the Meter header in a request may ignore it
   only to the extent that this is consistent with its own duty to the
   next-hop server.  If the received Meter header is inconsistent, or no
   Meter header is received and the next-hop server has requested any
   metering or limiting, then the proxy MUST add "Cache-control:
   proxy-mustcheck" to all responses it sends for the resource.  (A
   proxy SHOULD NOT add or change the Expires header or max-age
   Cache-control directive.)

      ---------
      For example, if proxy A receives a GET request from proxy B for
      URL X with "Connection: Meter", but proxy A's cached response
      for URL X does not include any Meter directives, then proxy A may
      ignore the metering offer from proxy B.

      However, if proxy A has previously told the origin server
      "Meter: wont-limit" (implying will-report), and the cached
      response contains "Meter: do-report", and proxy B's request
      includes "Meter:  wont-report", then proxy B's offer is
      inconsistent with proxy A's duty to the origin server.
      Therefore, in this case proxy A must add "Cache-control:
      proxy-mustcheck" when it returns the cached response to proxy
      B, and must not include a Meter header in this response.
      ---------
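
   The consistency rule above can be sketched in Python.  This is a
   minimal illustration and not part of the specification: the
   function name and its arguments are hypothetical, and the proxy's
   duty to its next-hop server is reduced to two booleans.

```python
def must_add_proxy_mustcheck(duty_report, duty_limit, child_offer):
    """Decide whether a proxy must add "Cache-control: proxy-mustcheck".

    duty_report / duty_limit: whether the next-hop server has asked this
    proxy to report usage / obey usage-limits for the cached response.
    child_offer: set of meter-request-directives received from the
    requesting proxy, or None if its request carried no Meter header.
    """
    if not (duty_report or duty_limit):
        # No duty toward the server: the child's offer may be ignored.
        return False
    if child_offer is None:
        # No offer at all, but the server asked for metering or limiting.
        return True
    child_reports = "wont-report" not in child_offer
    child_limits = "wont-limit" not in child_offer
    # The offer is inconsistent if it falls short of either duty.
    return (duty_report and not child_reports) or (duty_limit and not child_limits)
```

   With the second example above (a duty to report, and an offer of
   "wont-report"), the sketch returns True.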

   If a server does not want to use the Meter mechanism, and will not
   want to use it any time soon, it may send this directive:

   wont-ask        recommends that the proxy SHOULD NOT send any Meter
                   directives to this server.

   The proxy SHOULD remember this fact for up to 24 hours.  This avoids
   virtually all unnecessary overheads for servers that do not wish to
   use or support the Meter header.  (This directive also implies
   ``dont-report''.)

3.4 Transmission of usage reports
   To transmit a usage report, a proxy sends the following Meter header
   in a request on the appropriate resource:

       Meter: count=NNN/MMM

   The first integer indicates the count of uses of the cache entry
   since the last report; the second integer indicates the count of
   reuses of the entry (see section 5.3 for rules on counting uses and
   reuses).  The transmission of a "count" directive in a request with
   no other Meter directive is also defined as an implicit transmission
   of a "will-report-and-limit" directive, to optimize the common case.
   (A proxy not willing to honor usage-limits would send "Meter:
   count=NNN/MMM, wont-limit" for its reports.)

   Note that when a proxy forwards a client's request and receives a
   response, the response that the proxy sends immediately to the
   requesting client is not counted as a "use".  I.e., the reported
   count is the number of times the cache entry was used, and not the
   number of times that the response was used.

   A proxy SHOULD NOT transmit "Meter: count=0/0", since this conveys no
   useful information.
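
   As an illustration of the report format in this section, the
   following Python sketch builds the header line.  The function name
   and the "will_limit" parameter are hypothetical; the sketch simply
   suppresses a 0/0 report, as recommended above.

```python
def usage_report_header(uses, reuses, will_limit=True):
    """Return the Meter header line for a usage report, or None.

    A report of 0/0 conveys nothing and is suppressed.  A bare "count"
    directive implicitly means "will-report-and-limit", so "wont-limit"
    is appended only by a proxy that will not honor usage-limits.
    """
    if uses < 0 or reuses < 0:
        raise ValueError("counts must be non-negative")
    if uses == 0 and reuses == 0:
        return None
    directives = ["count=%d/%d" % (uses, reuses)]
    if not will_limit:
        directives.append("wont-limit")
    return "Meter: " + ", ".join(directives)
```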

   Usage reports MUST always be transmitted as part of a conditional
   request (such as a GET or HEAD), since the information in the
   conditional header (e.g., If-Modified-Since or If-None-Match) is
   required for the origin server to know which instance of a resource
   is being counted.  Proxies forwarding usage reports up the metering
   subtree MUST NOT change the contents of the conditional header, since
   otherwise this would result in incorrect counting.

   A usage report MUST NOT be transmitted as part of a forwarded request
   that includes multiple entity tags in an If-None-Match or If-Match
   header.
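
   The two restrictions above (a report needs a conditional request,
   and must not ride on a request naming several entity tags) might be
   checked as in the following Python sketch.  The function name, the
   lower-case header dictionary, and the choice of which headers make a
   request "conditional" are assumptions made for illustration.

```python
import re

# Matches one entity tag, e.g. "xyzzy" or W/"xyzzy" (weak).
_ENTITY_TAG = re.compile(r'(?:W/)?"[^"]*"')

# Assumed set of conditional headers; the text names these three.
CONDITIONAL_HEADERS = ("if-modified-since", "if-none-match", "if-match")


def may_carry_usage_report(headers):
    """headers: dict mapping lower-case header names to their values."""
    if not any(name in headers for name in CONDITIONAL_HEADERS):
        return False  # not a conditional request
    for name in ("if-none-match", "if-match"):
        if len(_ENTITY_TAG.findall(headers.get(name, ""))) > 1:
            return False  # several entity tags: the instance is ambiguous
    return True
```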

      ---------
      Note: a proxy that offers its willingness to do hit-metering
      (report usage) must count both uses and reuses.  It is not
      possible to negotiate the reporting of one but not the other.
      ---------

3.5 When to send usage reports
   A proxy that has offered to send usage reports to its parent in the
   metering subtree MUST send a usage report in each of these
   situations:

      1. When it forwards a conditional GET on the resource
         instance on behalf of one of its clients (if the GET is
         conditional on at most one entity-tag).

      2. When it forwards a conditional HEAD on the resource
         instance on behalf of one of its clients.

      3. When it must generate a conditional GET to satisfy a
         client request because the max-uses limit has been
         exceeded.

      4. When it removes the corresponding non-zero hit-count entry
         from its storage for any reason including:

            - the proxy needs the storage space for another
              hit-count entry.

            - the proxy is not able to store more than one response
              per resource, and a request forwarded on behalf of a
              client has resulted in the receipt of a new response
              (one with a different entity-tag or last-modified
              time).

         Note that a cache might continue to store hit-count
         information even after having deleted the body of the
         response, so it is not necessary to report the hit-count
         when deleting the body; it is only necessary to report it
         if the proxy is about to "forget" a non-zero value.

   (Section 5.3 explains how hit-counts become zero or non-zero.)

   If the usage report is being sent because the proxy is about to
   remove the hit-count entry from its storage:

      - The proxy MUST send the report as part of a conditional
        HEAD request on the resource instance.

      - The proxy is not required to retry the HEAD request if it
        fails (this is a best-efforts design).

      - The proxy is not required to serialize any other operation
        on the completion of this request.
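
   A final report of this kind might be assembled as in the Python
   sketch below.  The function and parameter names are hypothetical,
   the use of If-None-Match assumes the cached response carried an
   entity tag, and a proxy talking to another proxy would normally put
   an absolute URI on the request line.

```python
def final_report_request(host, path, entity_tag, uses, reuses):
    """Build a conditional HEAD request carrying a final usage report.

    Returns the request as a string, or None when both counts are zero
    (nothing needs to be reported before the entry is forgotten).
    """
    if uses == 0 and reuses == 0:
        return None
    lines = [
        "HEAD %s HTTP/1.1" % path,
        "Host: %s" % host,
        "If-None-Match: %s" % entity_tag,  # identifies the instance counted
        "Connection: Meter",               # Meter is protected by Connection
        "Meter: count=%d/%d" % (uses, reuses),
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

   Because the design is best-efforts, a caller of such a function
   would send the request once and not retry on failure.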

      ---------
      Note: proxy implementors are strongly encouraged to batch
      several HEAD-based reports to the same server, when possible,
      over a single persistent connection, to reduce network overhead
      as much as possible.  This may involve a non-naive algorithm
      for scheduling the deletion of hit-count entries.
      ---------

   If the usage count is sent because of an arriving request that also
   carries a "count" directive, the proxy MUST combine its own (possibly
   zero) use and reuse counts with the arriving counts, and then attempt
   to forward the request.

      ---------
      Discussion point: a previous version of this design made the
      final HEAD-based report optional for the proxy, and included a
      way for the proxy to notify the server that it intended to
      provide this report.

      In this design, a proxy that offers its willingness to
      hit-meter a resource must make the final HEAD-based report, if
      the unreported count is non-zero; there is no option.

      Doing so commits a hit-metering proxy to send a fraction of one
      extra request per "cache entry and removal" cycle.  This is not
      exactly one request, because it is possible that the stored
      count is zero and does not need to be reported.  (Trace-based
      studies should be done to estimate the actual fraction.)  We
      believe that this cost is minimal except for proxies whose
      network connection is severely bandwidth-limited, and since
      origin servers may not be willing to allow proxy caching except
      when the final hit-count report is provided, even
      bandwidth-limited proxies may come out ahead by offering to
      hit-meter.

      However, it is feasible to change this protocol design to allow
      a proxy to offer to hit-meter without committing to send a
      final HEAD-based report.  This would involve the addition of
      two more Meter directives, "wont-final-report" and
      "dont-final-report".  An origin server receiving a "Meter:
      wont-final-report" may, at its option, either reply with
      "Meter: dont-final-report" and allow the proxy to cache the
      response, or with a "Cache-control: proxy-mustcheck" (if it
      wants fully accurate hit counts).  If the protocol is amended
      to include this feature, proxy administrators would need to
      choose between the small extra overhead of doing this final
      HEAD, and the possibly much larger cost of not being permitted
      to cache certain resources at all.

      We do not believe that this option is likely to result in
      improved performance, but we are willing to include it in the
      specification if strong arguments are made in its favor.
      ---------

      ---------
      Discussion point: one reviewer suggested that it would be
      useful, in some cases, for the origin server to be able to
      insist on receiving a final HEAD-based report even when the
      use-count and reuse-count are both zero.  The justification for
      this proposal was that it would resolve the ambiguity between a
      cache that has removed the entry because it has not incremented
      the counters, and one that has non-zero counters for the entry
      but which has not yet removed it (perhaps because a lot of
      storage is available and/or no usage-limit has been reached).

      We are considering adding another Meter response-directive that
      would allow the origin server to specify either a boolean flag
      (``you should send a final usage-report even if the counts are
      both zero'') or a timeout (``you should send a usage-report
      when the cache entry reaches age N'').  It is not clear,
      however, if the benefits would offset the additional
      complexity; more comment is invited.
      ---------

3.6 Subdivision of usage-limits
   When an origin server specifies a usage limit, a proxy in the
   metering subtree may subdivide this limit among its children in the
   subtree as it sees fit.

   For example, consider the situation with two proxies P1 and P2, each
   of which uses proxy P3 as a way to reach origin server S. Imagine
   that S sends P3 a response with

       Meter: max-uses=10

   The proxies use that response to satisfy the current requesting
   end-client.  The max-uses directive in this example allows the
   combination of P1, P2, and P3 together to satisfy 10 additional
   end-client uses (unconditional GETs) for the resource.

   This specification does not constrain how P3 divides up that
   allocation among itself and the other proxies.  For example, P3 could
   retain all of the max-uses allocation for itself.  In that case, it would
   forward the response to P1 and/or P2 with

       Meter: max-uses=0

   P3 might also divide the allocation equally among P1 and P2,
   retaining none for itself (which may be the right choice if P3 has
   few or no other clients).  In this case, it could send

       Meter: max-uses=5

   to the proxy (P1 or P2) that made the initial request, and then
   record in some internal data structure that it "owes" the other proxy
   the rest of the allocation.
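
   One possible subdivision policy is sketched below in Python.  The
   specification leaves the policy open, so the even split, the
   "retain" parameter, and the function name are all assumptions made
   purely for illustration.

```python
def split_max_uses(max_uses, children, retain=0):
    """Divide a max-uses allocation among child proxies.

    Returns a dict mapping each child to its share.  `retain` uses are
    kept by the dividing proxy; the rest is split as evenly as possible,
    with any remainder going to the children listed first.
    """
    if retain > max_uses:
        raise ValueError("cannot retain more than the allocation")
    if not children:
        return {}
    share, extra = divmod(max_uses - retain, len(children))
    return {
        child: share + (1 if i < extra else 0)
        for i, child in enumerate(children)
    }
```

   For the example above, split_max_uses(10, ["P1", "P2"]) gives each
   child 5 uses, and retain=10 gives each child 0.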

   Note that this freedom to choose the max-uses value applies to the
   origin server, as well.  There is no requirement that an origin
   server send the same max-uses value to all caches.  For example, it
   might make sense to send "max-uses=2" the first time one hears from a
   cache, and then double the value (up to some maximum limit) each time
   one gets a "count" report from that cache.  The idea is that the faster
   a cache is using up its max-uses quota, the more likely it will be to
   report a use-count value before removing the cache entry.  Also, high
   and frequent use-counts imply a corresponding high efficiency benefit
   from allowing caching.

   Again, the details of such heuristics would be outside the scope of
   this specification.
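
   Purely as an illustration, the doubling heuristic mentioned above
   could look like the following Python sketch.  The starting value of
   2 comes from the example in the text; the ceiling of 64 and all
   names are arbitrary assumptions.

```python
def next_max_uses(previous, got_report, initial=2, ceiling=64):
    """Pick the max-uses value an origin server sends to one cache.

    previous: the value last sent to this cache, or None on first contact.
    got_report: True if the cache's request carried a "count" directive.
    """
    if previous is None:
        return initial
    if got_report:
        # Reward a reporting cache with a larger quota, up to the ceiling.
        return min(previous * 2, ceiling)
    return previous
```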


4 Analysis

   We recognize that, for many service operators, the single most
   important aspect of the request stream is the number of distinct
   users who have retrieved a particular entity. We believe that our
   design provides adequate support for user-counting, based on the
   following analysis.

   We start with the observation that almost all Web users have client
   software that maintains local caches, and that the state of the art
   of local-caching technology is quite effective. Therefore, to a first
   approximation, each individual user who retrieves an entity does
   exactly one GET request that results in a 200 or 203 response, or a
   206 response that includes the first byte of the entity. If a proxy
   cache maintains an accurate use-count of such retrievals, then its
   use-count will approximate the number of distinct users who have
   retrieved the entity.

   There are some circumstances under which this approximation can break
   down.  For example, if an entity stays in a proxy cache for much
   longer than it persists in the typical client cache, and users often
   re-reference the entity, then this scheme will tend to over-count the
   number of users. Or, if the cache-management policy implemented in
   typical client caches is biased against retaining certain kinds of
   frequently re-referenced entities (such as very large images), the
   use-counts reported will tend to overestimate the user-counts for
   such entities.  For the most part, however, we do not believe this
   will be a source of significant error.

   We also note that the existing "cache-busting" mechanisms for
   counting distinct users will certainly overestimate the number of
   users behind a proxy, since they provide no reliable way to
   distinguish between a user's initial request and subsequent repeat
   requests caused by insufficient space in the end-client cache.

4.1 What about "Network Computers"?
   Our analysis assumes that "almost all Web users" have client caches.
   If the Network Computer (NC) model becomes popular, however, then
   this assumption may be faulty: most proposed NCs have no disk
   storage, and relatively little RAM.  Such systems may do little or no
   caching of HTTP responses.  This means that a single user might well
   generate many unconditional GETs that yield the same response from a
   proxy cache.

   We first note that the hit-metering design in this document operates
   correctly, even with such clients: the counts that a proxy would
   return to an origin server would represent exactly the number of
   requests that the proxy would forward to the server, if the server
   simply specifies "Cache-control:  proxy-mustcheck".

   However, it may be possible to improve the accuracy of these
   hit-counts by use of some heuristics at the proxy.  For example, the
   proxy might note the IP address of the client, and count only one GET
   per client address per response.  This is not perfect: for example,
   it fails to distinguish between NCs and certain other kinds of hosts.
   The proxy might also use the heuristic that only those clients that
   never send a conditional GET should be treated this way, although we
   are not at all certain that NCs will never send conditional GETs.
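
   The per-address heuristic could be sketched as follows in Python.
   This is only an illustration of the idea, with hypothetical names;
   it shares the weakness noted above, since one address may hide
   several users or kinds of host.

```python
class PerClientUseCounter:
    """Count at most one "use" per client address for one response."""

    def __init__(self):
        self.seen = set()
        self.uses = 0

    def record_get(self, client_addr):
        """Return True if this GET was counted as a new use."""
        if client_addr in self.seen:
            return False  # repeat GET from an address already counted
        self.seen.add(client_addr)
        self.uses += 1
        return True
```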

   Since the solution to this problem appears to require heuristics
   based on the actual behavior of NCs (or perhaps a new HTTP protocol
   feature that allows unambiguous detection of cacheless clients), we
   believe it is premature to specify a solution.




4.2 Why max-uses is not a Cache-control directive
   Our first proposal was that the "max-uses" directive should be
   carried by the Cache-control header, since it is superficially
   similar to the "max-age" Cache-control directive.  However, we
   believe that the HTTP community will not accept a specification that
   makes the implementation of "max-uses" mandatory for proxy caches,
   and in any event we could not force older implementations to honor
   it.

   Because the Cache-control mechanism has no means for a proxy to
   explicitly promise to honor "max-uses", it would not be possible (in
   general) for a server to depend on such a Cache-control header.  The
   "metering subtree" mechanism implemented by the Meter header,
   however, does allow a server to rely on a precise interpretation of
   "max-uses," when used as a Meter directive.


5 Specification

5.1 Specification of Meter header and directives
   The Meter general-header field is used to:

      - Negotiate the use of hit-metering and usage-limiting among
        origin servers and proxy caches.

      - Report use counts and reuse counts.

   Implementation of the Meter header is optional for both proxies and
   origin servers.  However, any proxy that transmits the Meter header
   in a request MUST implement every requirement of this specification,
   without exception or amendment.

   The Meter header MUST always be protected by a Connection header.  A
   proxy that does not implement the Meter header MUST NOT pass it
   through to another system (see section 5.5 for how a non-caching
   proxy may comply with this specification).  If a Meter header is
   received in a message whose version is less than HTTP/1.1, it MUST be
   ignored (because it has clearly flowed through a proxy that does not
   implement Meter).

   A proxy that has received a response with a version less than
   HTTP/1.1, and therefore from a server (or another proxy) that does
   not implement the Meter header, SHOULD NOT send Meter request
   directives to that server, because these would simply waste
   bandwidth.  This recommendation does not apply if the proxy is
   currently hit-metering or usage-limiting any responses from that
   server.  If the proxy receives an HTTP/1.1 or higher response from
   such a server, it should cease its suppression of the Meter
   directives.



   All proxies sending the Meter header MUST adhere to the "metering
   subtree" design described in section 3.

       Meter = "Meter" ":" 0#meter-directive

       meter-directive = meter-request-directive
                       | meter-response-directive
                       | meter-report-directive

       meter-request-directive =
                         "will-report-and-limit"
                       | "wont-report"
                       | "wont-limit"

       meter-report-directive =
                         "count" "=" 1*DIGIT "/" 1*DIGIT

       meter-response-directive =
                         "max-uses" "=" 1*DIGIT
                       | "max-reuses" "=" 1*DIGIT
                       | "do-report"
                       | "dont-report"
                       | "wont-ask"

   A meter-request-directive or meter-report-directive may only appear
   in an HTTP request message.  A meter-response-directive may only
   appear in an HTTP response message.

   A meter-request-directive applies to all subsequent requests made on
   the given transport connection.  All other Meter directives apply
   only to the specific request or response that they are attached to.

   An empty Meter header in a request means "Meter:
   will-report-and-limit" (and so applies to all subsequent requests on
   the given transport connection).  An empty Meter header in a
   response, or any other response including one or more Meter headers
   without the "dont-report" or "wont-ask" directive, implies "Meter:
   do-report".
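
   The defaulting rules in the preceding paragraph can be written down
   as in this Python sketch.  The function names are hypothetical, and
   the input is assumed to be a list of directive names that has
   already been extracted from the Meter header(s) of one message.

```python
def implied_request_directives(directives):
    """directives: directive names from a request's Meter header."""
    # An empty Meter header in a request means "will-report-and-limit".
    return list(directives) or ["will-report-and-limit"]


def implies_do_report(directives):
    """directives: directive names from a response's Meter header(s).

    Call this only for responses that carry at least one Meter header;
    a response with no Meter header at all implies nothing.
    """
    return "dont-report" not in directives and "wont-ask" not in directives
```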

   The meanings of the meter-request-directives are as follows:

   will-report-and-limit
                   indicates that the proxy is willing and able to
                   return usage reports and will obey any usage-limits.

   wont-report     indicates that the proxy will obey usage-limits but
                   will not send usage reports.

   wont-limit      indicates that the proxy will not obey usage-limits
                   but will send usage reports.

   A proxy willing neither to obey usage-limits nor to send usage
   reports MUST NOT transmit a Meter header in the request.

   The meanings of the meter-report-directives are as follows:

   count "=" 1*DIGIT "/" 1*DIGIT
                   Both digit strings encode decimal integers.  The
                   first integer indicates the count of uses of the
                   cache entry since the last report; the second integer
                   indicates the count of reuses of the entry.

   Section 5.3 specifies the counting rules.

   The meanings of the meter-response-directives are as follows:

   max-uses "=" 1*DIGIT
                   sets an upper limit on the number of "uses" of the
                   response, not counting its immediate forwarding to
                   the requesting end-client, for all proxies in the
                   following subtree taken together.

   max-reuses "=" 1*DIGIT
                   sets an upper limit on the number of "reuses" of the
                   response for all proxies in the following subtree
                   taken together.

   do-report       specifies that the proxy MUST send usage reports to
                   the server.

   dont-report     specifies that the proxy SHOULD NOT send usage
                   reports to the server.

   wont-ask        specifies that the proxy SHOULD NOT send any Meter
                   headers to the server.  The proxy should forget this
                   advice after a period of no more than 24 hours.

   Section 5.3 specifies the counting rules, and in particular specifies
   a somewhat non-obvious interpretation of the max-uses value.

5.2 Abbreviations for Meter directives
   To allow for the most efficient possible encoding of Meter headers,
   we define abbreviated forms of all Meter directives.  These are
   exactly semantically equivalent to their non-abbreviated
   counterparts.  All systems implementing the Meter header MUST
   implement both the abbreviated and non-abbreviated forms.
   Implementations SHOULD use the abbreviated forms in normal use.

   The abbreviated forms of Meter directive are shown below, with the
   corresponding non-abbreviated literals in the comments:



       Abb-Meter = "Meter" ":" 0#abb-meter-directive

       abb-meter-directive = abb-meter-request-directive
                       | abb-meter-response-directive
                       | abb-meter-report-directive

       abb-meter-request-directive =
                         "w"           ; "will-report-and-limit"
                       | "x"           ; "wont-report"
                       | "y"           ; "wont-limit"

       abb-meter-report-directive =
                         "c" "=" 1*DIGIT "/" 1*DIGIT   ; "count"

       abb-meter-response-directive =
                         "u" "=" 1*DIGIT       ; "max-uses"
                       | "r" "=" 1*DIGIT       ; "max-reuses"
                       | "d"                   ; "do-report"
                       | "e"                   ; "dont-report"
                       | "n"                   ; "wont-ask"

      ---------
      Note: although the Abb-Meter BNF rule is defined separately
      from the Meter rule, one may freely mix abbreviated and
      non-abbreviated Meter directives in the same header.
      ---------
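
   Since both forms may be mixed in one header, a parser can simply
   normalize abbreviations to their long names.  The Python sketch
   below shows one way to do this; the function name, the returned
   dictionary shape, and the error handling are assumptions, not
   requirements of this specification.

```python
import re

_DIGITS = re.compile(r"[0-9]+")

ABBREVIATIONS = {
    "w": "will-report-and-limit", "x": "wont-report", "y": "wont-limit",
    "c": "count", "u": "max-uses", "r": "max-reuses",
    "d": "do-report", "e": "dont-report", "n": "wont-ask",
}
VALUED = {"max-uses", "max-reuses"}


def parse_meter(value):
    """Parse the value of a Meter header into {directive: value}.

    Flag directives map to True, "max-uses" and "max-reuses" map to an
    int, and "count" maps to a (uses, reuses) tuple.
    """
    result = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # an empty header or empty list element
        name, sep, arg = item.partition("=")
        name = name.strip().lower()
        name = ABBREVIATIONS.get(name, name)
        arg = arg.strip()
        if name == "count":
            uses, _, reuses = arg.partition("/")
            if not (_DIGITS.fullmatch(uses) and _DIGITS.fullmatch(reuses)):
                raise ValueError("malformed count: %r" % item)
            result[name] = (int(uses), int(reuses))
        elif name in VALUED:
            if not _DIGITS.fullmatch(arg):
                raise ValueError("malformed %s: %r" % (name, item))
            result[name] = int(arg)
        elif name in ABBREVIATIONS.values() and not sep:
            result[name] = True
        else:
            raise ValueError("unrecognized or malformed directive: %r" % item)
    return result
```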

5.3 Counting rules

      ---------
      Note: please remember that hit-counts and usage-counts are
      associated with individual responses, not with resources.  A
      cache entry that, over its lifetime, holds more than one
      response is also not a "response", in this particular sense.
      ---------

   Let R be a cached response, and V be the value of the Request-URI and
   selecting request-headers (if any, see section 14.43 of the HTTP/1.1
   specification [1]) that would select R if contained in a request.  We
   define a "use" of R as occurring when the proxy returns its stored
   copy of R in a response with any of the following status codes: a 200
   (OK) status; a 203 (Non-Authoritative Information) status; or a 206
   (Partial Content) status when the response contains byte #0 of the
   entity (see section 5.4 for a discussion of Range requests).

      ---------
      Note: when a proxy forwards a client's request and receives a
      response, the response that the proxy sends immediately to the
      requesting client is not counted as a "use".  I.e., the
      reported count is the number of times the cache entry was used,
      and not the number of times that the response was used.
      ---------

   We define a "reuse" of R as occurring when the proxy responds to a
   request selecting R with a 304 (Not Modified) status, unless that
   request is a Range request that does not specify byte #0 of the
   entity.
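
   The two definitions above can be summarized by the following Python
   sketch.  The function and its parameters are hypothetical; in
   particular, the caller is assumed to know whether the proxy answered
   from its stored copy and whether byte #0 is involved.

```python
def classify(status, from_cache, includes_byte_zero=True):
    """Return "use", "reuse", or None for a response sent by the proxy.

    from_cache: True if the proxy answered from its stored copy of R
    (the response forwarded for the request that fetched R is not
    counted).
    includes_byte_zero: for 206 responses, whether byte #0 is returned;
    for 304 responses to Range requests, whether the range names byte #0.
    """
    if not from_cache:
        return None
    if status in (200, 203):
        return "use"
    if status == 206:
        return "use" if includes_byte_zero else None
    if status == 304:
        return "reuse" if includes_byte_zero else None
    return None
```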

5.3.1 Counting rules for hit-metering
   A proxy participating in hit-metering for a cached response R
   maintains two counters, CU and CR, associated with R. When a proxy
   first stores R in its cache, it sets both CU and CR to 0 (zero).
   When a subsequent client request results in a "use" of R, the proxy
   increments CU.  When a subsequent client request results in a "reuse"
   of R, the proxy increments CR.  When a subsequent client request
   selecting R (i.e., including V) includes a "count" Meter directive,
   the proxy increments CU and CR using the corresponding values in the
   directive.

   When the proxy sends a request selecting R (i.e., including V) to the
   inbound server, it includes a "count" Meter directive with the
   current CU and CR as the parameter values.  If this request was
   caused by the proxy's receipt of a request from a client, upon
   receipt of the server's response, the proxy sets CU and CR to the
   number of uses and reuses, respectively, that may have occurred while
   the request was in progress.  (These numbers are likely, but not
   certain, to be zero.)  If the proxy's request was a final HEAD-based
   report, it need no longer maintain the CU and CR values, but it may
   also set them to the number of intervening uses and reuses and retain
   them.
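
   A minimal Python sketch of the CU and CR bookkeeping follows.  It
   zeroes the counters when the report is sent, rather than when the
   response arrives; this yields the same counts after the response,
   because uses that occur in the meantime accumulate from zero.  The
   class, its method names, and the "restore" step for an undelivered
   report are assumptions made for illustration.

```python
class HitCounters:
    """Use and reuse counters (CU, CR) for one cached response R."""

    def __init__(self):
        self.cu = 0
        self.cr = 0

    def on_use(self):
        self.cu += 1

    def on_reuse(self):
        self.cr += 1

    def on_child_report(self, uses, reuses):
        # A request from a child proxy carried "count=uses/reuses".
        self.cu += uses
        self.cr += reuses

    def begin_report(self):
        """Return (CU, CR) to send upstream and start counting afresh."""
        report = (self.cu, self.cr)
        self.cu = self.cr = 0
        return report

    def restore(self, report):
        # Hypothetical policy: fold an undelivered report back in.
        self.cu += report[0]
        self.cr += report[1]
```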

5.3.2 Counting rules for usage-limiting
   A proxy participating in usage-limiting for a response R maintains
   either or both of two counters TU and TR, as appropriate, for that
   response.  TU and TR are incremented in just the same way as CU and
   CR, respectively.  However, TU is zeroed only upon receipt of a
   "max-uses" Meter directive for that response (including the initial
   receipt).  Similarly, TR is zeroed only upon receipt of a
   "max-reuses" Meter directive for that response.

   A proxy participating in usage-limiting for a response R also stores
   values MU and/or MR associated with R. When it receives a response
   including only a max-uses value, it sets MU to that value and MR to
   infinity.  When it receives a response including only a max-reuses
   value, it sets MR to that value and MU to infinity.  When it receives
   a response including both max-uses and max-reuses values, it sets
   MU and MR to those values, respectively.  When it receives a
   subsequent response including neither max-uses nor max-reuses
   values, it sets both MU and MR to infinity.

   If a proxy participating in usage-limiting for a response R receives
   a request that would cause a "use" of R, and TU >= MU, it MUST
   forward the request to the server.  If it receives a request that
   would cause a "reuse" of R, and TR >= MR, it MUST forward the request
   to the server.  If (in either case) the proxy has already forwarded a
   previous request to the server and is waiting for the response, it
   should delay further handling of the new request until the response
   arrives (or times out); it SHOULD NOT have two revalidation requests
   pending at once that select the same response, unless these are Range
   requests selecting different subranges.

   There is a special case of this rule for the "max-uses" directive: if
   the proxy receives a response with "max-uses=0" and does not forward
   it to a requesting client, the proxy should set a flag PF associated
   with R.  If PF is set, then when a request arrives while TU >= MU,
   the request need not be forwarded to the server (provided that this
   is not required by other caching rules).  However, the PF flag MUST
   be cleared on any use of the response.

      ---------
      Note: the "PF" flag is so named because this feature is useful
      only for caches that could issue a "prefetch" request before an
      actual client request for the response.  A proxy not
      implementing prefetching need not implement the PF flag.
      ---------
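
   The usage-limiting state described in this section (TU, TR, MU, MR,
   and the PF flag) might be kept as in the Python sketch below.  The
   class and method names are hypothetical, counts reported by child
   proxies are omitted for brevity, and the rule about delaying
   concurrent revalidations is not modeled.

```python
import math


class UsageLimiter:
    """Usage-limit state (TU, TR, MU, MR, PF) for one cached response R."""

    def __init__(self):
        self.tu = self.tr = 0
        self.mu = self.mr = math.inf
        self.pf = False

    def on_response(self, max_uses=None, max_reuses=None, prefetch=False):
        """Apply the Meter directives of a response from the server.

        prefetch: True if the response was not forwarded to a client.
        """
        # A missing directive sets the corresponding limit to infinity.
        self.mu = math.inf if max_uses is None else max_uses
        self.mr = math.inf if max_reuses is None else max_reuses
        if max_uses is not None:
            self.tu = 0  # TU is zeroed only when max-uses is received
        if max_reuses is not None:
            self.tr = 0  # TR is zeroed only when max-reuses is received
        self.pf = prefetch and max_uses == 0

    def must_forward(self, kind):
        """kind is "use" or "reuse"; True if the limit forces forwarding."""
        if kind == "use":
            return self.tu >= self.mu and not self.pf
        return self.tr >= self.mr

    def record(self, kind):
        """Count a use or reuse served from the cache."""
        if kind == "use":
            self.tu += 1
            self.pf = False  # PF is cleared on any use of the response
        else:
            self.tr += 1
```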

5.3.3 Equivalent algorithms are allowed
   Any other algorithm that exhibits the same external behavior (i.e.,
   generates exactly the same requests from the proxy to the server) as
   the one in this section is explicitly allowed.

      ---------
      Note: in most cases, TU will be equal to CU, and TR will be
      equal to CR.  The only two cases where they could differ are:

         1. The proxy issues a non-conditional request for the
            resource using V, while TU and/or TR are non-zero, and
            the server's response includes a new "max-uses" and/or
            "max-reuses" directive (thus zeroing TU and/or TR, but
            not CU and CR).

         2. The proxy issues a conditional request reporting the
            hit-counts (and thus zeroing CU and CR, but not TU or
            TR), but the server's response does not include a new
            "max-uses" and/or "max-reuses" directive.

      To solve the first case, the proxy has several implementation
      options:

         - Always store TU and TR separately from CU and CR.

         - Create "shadow" copies of TU and TR when this situation
           arises (analogous to "copy on write").

         - Generate a HEAD-based usage report when the
           non-conditional request is sent (or when the
           "max-uses=0" is received), causing CU and CR to be
           zeroed (analogous in some ways to a "memory barrier"
           instruction).

      In the second case, the server implicitly has removed the
      usage-limit(s) on the response (by setting MU and/or MR to
      infinity), and so the fact that, say, TU is different from CU
      is not significant.
      ---------

      ---------
      Note: It may also be possible to eliminate the PF flag by
      sending extra HEAD-based usage-report requests, but we
      recommend against this; it is better to allocate an extra bit
      per entry than to transmit extra requests.
      ---------

5.4 Counting rules: interaction with Range requests
   HTTP/1.1 allows a client to request sub-ranges of a resource.  A
   client might end up issuing several requests with the net effect of
   receiving one copy of the resource.  We need to establish a rule for
   counting these references, although it is not clear that one rule
   generates accurate results in every case.

   The rule established in this specification is that proxies count as a
   "use" or "reuse" only those Range requests that result in the return
   of byte #0 of the resource.  The rationale for this rule is that in
   almost every case, an end-client will retrieve the beginning of any
   resource that it references at all, and that it will seldom retrieve
   any portion more than once.  Therefore, this rule appears to meet our
   goal of a "best-efforts" approach to accuracy.
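
   Deciding whether a Range request covers byte #0 could be done as in
   this Python sketch.  It assumes the HTTP/1.1 "bytes" range syntax
   and that the proxy knows the entity length; the function name is
   hypothetical.

```python
def range_includes_byte_zero(range_header, entity_length):
    """Return True if a "bytes" Range header value selects byte #0.

    range_header: e.g. "bytes=0-499" or "bytes=500-999,-200".
    entity_length: total length of the entity in bytes.
    """
    unit, sep, specs = range_header.partition("=")
    if unit.strip().lower() != "bytes" or not sep:
        raise ValueError("unsupported range unit: %r" % range_header)
    if entity_length <= 0:
        return False
    for spec in specs.split(","):
        first, dash, last = spec.strip().partition("-")
        if not dash:
            continue  # ignore empty or malformed elements
        if first:
            # "first-last" or "first-": the range starts at byte `first`.
            if int(first) == 0:
                return True
        elif last and int(last) >= entity_length:
            # "-N" selects the final N bytes; that reaches byte #0
            # only when N covers the whole entity.
            return True
    return False
```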

5.5 Implementation by non-caching proxies
   A non-caching proxy may participate in the metering subtree; in fact,
   we strongly recommend this.

   A non-caching proxy (HTTP/1.1 or higher) that participates in the
   metering subtree SHOULD forward Meter headers on both requests and
   responses, with the appropriate Connection headers.

   If a non-caching proxy forwards Meter headers, it MUST comply with
   these restrictions:

      1. If the proxy forwards Meter headers in requests, it MUST
         NOT reorder the requests from a given client to a given
         server.

      2. If the proxy forwards Meter headers in requests, it must
         not do so while merging requests from multiple incoming
         connections (i.e., connections from one or more of its
         clients) onto one outgoing connection.

      3. If the proxy forwards Meter headers in requests, and its
         connection to the server is closed or aborted, then it
         should close or abort the corresponding connection to the
         client.  (Alternatively, the proxy may act in any way that
         precisely maintains the ``connection-long'' nature of the
         meter-request-directives.)

      4. If the proxy forwards Meter headers in responses, such a
         response MUST NOT be returned to any request except the
         one that elicited it.

      5. Once a non-caching proxy starts forwarding Meter headers,
         it should not arbitrarily stop forwarding them (or else
         reports may be lost).

      ---------
      Note: the intent of these restrictions is to make the
      non-caching proxy appear functionally transparent with respect
      to the Meter header.  It may be possible to relax these
      restrictions after a more careful analysis, while still meeting
      this intent.  We do not believe that these restrictions will
      add much complexity to a straightforward implementation of a
      non-caching HTTP proxy.
      ---------
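
   One way to picture restrictions 1 through 3 is a proxy that binds
   each client connection to its own server connection.  The following
   is a minimal sketch under that assumption; the class names
   PairedConnection and NonCachingProxy are hypothetical, and no real
   network I/O is performed.

```python
from collections import deque


class PairedConnection:
    """One client connection bound to exactly one server connection."""

    def __init__(self, client_id):
        self.client_id = client_id
        self.pending = deque()  # requests kept in arrival order
        self.open = True

    def forward_request(self, request):
        if not self.open:
            raise ConnectionError("pair is closed")
        self.pending.append(request)

    def next_for_server(self):
        # Restriction 1: requests leave in the order they arrived.
        return self.pending.popleft()

    def server_closed(self):
        # Restriction 3: when the server side closes, close the client side.
        self.open = False
        self.pending.clear()


class NonCachingProxy:
    def __init__(self):
        self.pairs = {}

    def pair_for(self, client_id):
        # Restriction 2: never merge several client connections onto one
        # outgoing connection; each client connection gets its own pair.
        if client_id not in self.pairs:
            self.pairs[client_id] = PairedConnection(client_id)
        return self.pairs[client_id]


proxy = NonCachingProxy()
first = proxy.pair_for("client-1")
second = proxy.pair_for("client-2")
first.forward_request("GET /a")
first.forward_request("GET /b")
```

   Restrictions 4 and 5 concern responses and long-term behavior and
   are not modeled here.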

   A proxy that caches some responses and not others, for whatever
   reason, may choose to implement the Meter header as a caching proxy
   for the responses that it caches, and as a non-caching proxy for the
   responses that it does not cache, as long as its external behavior
   with respect to any particular response is fully consistent with
   this specification.


6 Expressing or approximating the "proxy-mustcheck" directive

   As we pointed out in section 1.2, this hit-metering design depends on
   HTTP/1.1 support for a way for an origin server to say "proxies must
   revalidate this response even if fresh."  The existing HTTP/1.1
   specification does not provide exactly such a mechanism.  In this
   section, we discuss the alternatives for resolving this problem.

      ---------
      Note: much of the discussion in this section is better covered
      by the proposal made in [2], which proposes a new
      ``proxy-maxage'' Cache-control directive.  If that proposal is
      adopted, this document would be modified by removing this
      section, and replacing ``proxy-mustcheck'' everywhere else with
      ``proxy-maxage=0''.
      ---------

   One possibility would simply be to modify the HTTP/1.1 specification
   to include a Cache-control cache-response-directive with precisely
   the required semantics.  The meaning of the "proxy-mustcheck"
   directive would be identical to "proxy-revalidate", except that it
   would require revalidation whether or not the entry is fresh.

   In order for this approach to be reliable, it would have to be
   supported by all HTTP/1.1-compliant proxies.  This means that the
   specification change would have to be adopted quickly, before any
   significant operational deployment of HTTP/1.1 proxies.

   If it is not feasible to modify the HTTP/1.1 specification, there are
   several ways to approximate a "proxy-mustcheck" directive using the
   existing specification:

      - Use of "Cache-control: private": because this prevents
        shared caches from storing the response, it has the effect
        that it forces as many requests as "proxy-mustcheck" would,
        and so the origin server will receive accurate counts.
        However, because "private" prevents a shared cache from
        even storing the response, it cannot do a conditional
        request for subsequent references, and hence this approach
        would lead to unnecessary transmission of entity bodies
        (instead of 304 Not Modified responses).

      - Use of "Cache-control: proxy-revalidate, max-age=0": this
        allows proxies to store the response and forces them to
        revalidate it on every reference.  However, it also implies
        that end-user caches should revalidate on every reference
        as well, which is not necessary for most hit-metering
        applications (see section 4).

   Because of the way Cache-control is specified, it would also be
   possible to phase in the use of a new "proxy-mustcheck" directive
   without compromising counting accuracy in the interim, by using
   "Cache-control: private, proxy-mustcheck".  (This means that the
   specification of "proxy-mustcheck" would explicitly have it override
   "private".)  The risk of taking this phased approach is that, until
   most proxies support "proxy-mustcheck", a lot of unnecessary
   full-body responses would be sent.
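
   The alternatives discussed in this section can be summarized as the
   Cache-control values an origin server might emit.  This is only a
   sketch: the function name, the strategy labels, and the default
   max-age of 3600 are assumptions, and "proxy-mustcheck" remains a
   placeholder directive.

```python
def metering_cache_control(strategy, max_age=3600):
    """Return a Cache-control value for one of the approaches in section 6."""
    if strategy == "proxy-mustcheck":
        # The proposed new directive; needs support in all HTTP/1.1 proxies.
        return f"max-age={max_age}, proxy-mustcheck"
    if strategy == "private":
        # Accurate counts, but shared caches cannot store the response,
        # so full entity bodies are resent instead of 304 responses.
        return "private"
    if strategy == "revalidate":
        # Proxies may store the response, but end-user caches are also
        # forced to revalidate on every reference.
        return "proxy-revalidate, max-age=0"
    if strategy == "phased":
        # Phase-in: "proxy-mustcheck" would be specified to override
        # "private" in proxies that understand it.
        return "private, proxy-mustcheck"
    raise ValueError(f"unknown strategy: {strategy!r}")
```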


7 Examples

7.1 Example of a complete set of exchanges
   This example shows how the protocol is intended to be used most of
   the time: for hit-metering without usage-limiting.  Entity bodies are
   omitted.

   A client sends a request to a proxy:

       GET http://foo.com/bar.html HTTP/1.1

   The proxy forwards the request to the origin server:

       GET /bar.html HTTP/1.1
       Host: foo.com
       Connection: Meter

   thus offering (implicitly) "will-report-and-limit".

   The server responds to the proxy:

       HTTP/1.1 200 OK
       Cache-control: max-age=3600
       Connection: meter
       Etag: "abcde"

   thus (implicitly) requiring "do-report" (but not requiring
   usage-limiting).

   The proxy responds to the client:

       HTTP/1.1 200 OK
       Etag: "abcde"
       Cache-control: max-age=3600, proxy-mustcheck
       Age: 1

   Since the proxy does not know whether its client is an end-system or
   a proxy that doesn't do metering, it adds the "proxy-mustcheck"
   directive.

   Another client soon asks for the resource:

       GET http://foo.com/bar.html HTTP/1.1

   and the proxy sends the same response as it sent to the other client,
   except (perhaps) for the Age value.

   After an hour has passed, a third client asks for the resource:

       GET http://foo.com/bar.html HTTP/1.1

   But now the response's max-age has been exceeded, so the proxy
   revalidates the response with the origin server:

       GET /bar.html HTTP/1.1
       If-None-Match: "abcde"
       Host: foo.com
       Connection: Meter
       Meter: count=1/0

   thus simultaneously fulfilling its duties to validate the response
   and to report the one "use" that wasn't forwarded.

   The origin server responds:

       HTTP/1.1 304 Not Modified
       Cache-control: max-age=3600
       Etag: "abcde"

   so the proxy can use the original response to reply to the new
   client; the proxy also zeros the use-count it associates with that
   response.

   Another client soon asks for the resource:

       GET http://foo.com/bar.html HTTP/1.1

   and the proxy sends the appropriate response.

   After another few hours, the proxy decides to remove the cache entry.
   When it does so, it sends to the origin server:

       HEAD /bar.html HTTP/1.1
       If-None-Match: "abcde"
       Host: foo.com
       Connection: Meter
       Meter: count=1/0

   reporting that one more use of the response was satisfied from the
   cache.
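
   The proxy-side bookkeeping in this exchange can be sketched as
   follows.  The class MeteredEntry and its method names are
   hypothetical; the sketch assumes that a "use" is a full response
   served from the cache, that a "reuse" is a 304 response served from
   the cache, and that requests forwarded to the origin server are not
   counted by the proxy.

```python
class MeteredEntry:
    """Use/reuse counters a proxy might keep for one cached response."""

    def __init__(self, etag):
        self.etag = etag
        self.uses = 0    # full responses served from the cache
        self.reuses = 0  # 304 responses served from the cache

    def serve_from_cache(self, conditional=False):
        if conditional:
            self.reuses += 1
        else:
            self.uses += 1

    def report(self):
        """Return the Meter header for a usage report and zero the counts."""
        header = f"Meter: count={self.uses}/{self.reuses}"
        self.uses = self.reuses = 0
        return header


entry = MeteredEntry('"abcde"')  # first request is forwarded, not counted
entry.serve_from_cache()         # second client, served from the cache
first_report = entry.report()    # sent with the conditional GET
entry.serve_from_cache()         # fourth client, served from the cache
final_report = entry.report()    # sent with the HEAD on entry removal
print(first_report)
print(final_report)
```

   Both reports carry "count=1/0", matching the two Meter headers in
   the exchange above.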

7.2 Protecting against HTTP/1.0 proxies
   An origin server that does not want HTTP/1.0 caches to store the
   response at all, and is willing to have HTTP/1.0 end-system clients
   generate excess GETs (which will be handled by the proxy, of course)
   could send this for its reply:

       HTTP/1.1 200 OK
       Cache-control: max-age=3600
       Connection: meter
       Etag: "abcde"
       Expires: Sun, 06 Nov 1994 08:49:37 GMT

   HTTP/1.0 caches will see the ancient Expires header, but HTTP/1.1
   caches will see the max-age directive and will ignore Expires.
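
   The different treatment of this response by the two kinds of cache
   can be sketched as a freshness check.  The function is_fresh is a
   hypothetical simplification: it ignores heuristic expiration and all
   Cache-control directives other than max-age.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, age_seconds, now, http11):
    """Decide freshness roughly as an HTTP/1.1 or HTTP/1.0 cache might."""
    if http11:
        for directive in headers.get("Cache-control", "").split(","):
            name, _, value = directive.strip().partition("=")
            if name.lower() == "max-age" and value.isdigit():
                # For an HTTP/1.1 cache, max-age overrides Expires.
                return age_seconds < int(value)
    expires = headers.get("Expires")
    if expires is None:
        return False  # assumption: no heuristic freshness in this sketch
    return now < parsedate_to_datetime(expires)


headers = {
    "Cache-control": "max-age=3600",
    "Expires": "Sun, 06 Nov 1994 08:49:37 GMT",
}
now = datetime(1997, 1, 21, tzinfo=timezone.utc)
fresh_for_http11 = is_fresh(headers, 10, now, http11=True)
fresh_for_http10 = is_fresh(headers, 10, now, http11=False)
```

   Under these assumptions the HTTP/1.1 cache treats the response as
   fresh for an hour, while the HTTP/1.0 cache treats it as already
   expired.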

7.3 More elaborate examples
   Here is a request from a proxy that is willing to hit-meter but is
   not willing to usage-limit:

       GET /bar.html HTTP/1.1
       Host: foo.com
       Connection: Meter
       Meter: wont-limit

   Here is a response from an origin server that does not want hit
   counting, but does want "uses" limited to 3, and "reuses" limited to
   6:

       HTTP/1.1 200 OK
       Cache-control: max-age=3600
       Connection: meter
       Etag: "abcde"
       Expires: Sun, 06 Nov 1994 08:49:37 GMT
       Meter: max-uses=3, max-reuses=6, dont-report

   Here is the same example with abbreviated Meter directive names:

       HTTP/1.1 200 OK
       Cache-control: max-age=3600
       Connection: meter
       Etag: "abcde"
       Expires: Sun, 06 Nov 1994 08:49:37 GMT
       Meter:u=3,r=6,e
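
   A recipient has to treat the long and abbreviated directive names as
   equivalent.  The following parser is a minimal sketch; the function
   name parse_meter is hypothetical, and the abbreviation table lists
   only the three abbreviations that appear in the example above.

```python
# Only the abbreviations shown in the example above; the full list is
# given with the Meter header syntax elsewhere in this document.
ABBREVIATIONS = {"u": "max-uses", "r": "max-reuses", "e": "dont-report"}


def parse_meter(value):
    """Parse a Meter header value into a dict keyed by long directive name."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        name = name.strip().lower()
        name = ABBREVIATIONS.get(name, name)
        arg = arg.strip()
        if not sep:
            directives[name] = True      # flag directive, e.g. dont-report
        elif arg.isdigit():
            directives[name] = int(arg)  # numeric limit, e.g. max-uses=3
        else:
            directives[name] = arg       # anything else, e.g. count=1/0
    return directives


long_form = parse_meter("max-uses=3, max-reuses=6, dont-report")
short_form = parse_meter("u=3,r=6,e")
```

   Both forms yield the same dictionary, so the rest of an
   implementation need not care which spelling was received.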


8 Interactions with varying resources

   Separate counts should be kept for each combination of the headers
   named in the Vary header for the Request-URI (what [1] calls "the
   selecting request-headers"), even if they map to the same entity-tag.
   This rule has the effect of counting hits on each variant, if there
   are multiple variants of a page available.

      ---------
      Note: This interaction between Vary and the hit-counting
      directives allows the origin server a lot of flexibility in
      specifying how hits should be counted.  In essence, the origin
      server uses the Vary mechanism to divide the requests for a
      resource into arbitrary categories, based on the request-
      headers. (We will call these categories "request-patterns".)
      Since a proxy keeps its hit-counts for each request-pattern,
      rather than for each resource, the origin server can obtain
      separate statistics for many aspects of an HTTP request.
      ---------

   For example, if a page varied based on the value of the User-Agent
   header in the requests, then hit counts would be kept for each
   different flavor of browser.  But the mechanism is in fact more
   general than that; because multiple header combinations can map to
   the same variant, it also enables the origin server to count the
   number of times (e.g.) the Swahili version of a page was requested,
   even though it is only available in English.
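
   A proxy could keep such per-request-pattern counts by keying its
   counters on the selecting request-headers.  The following is a
   sketch only; the function request_pattern and the sample requests
   are assumptions made for illustration.

```python
from collections import Counter


def request_pattern(uri, vary, request_headers):
    """Build a counter key from the headers named in the Vary header."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (uri, tuple((n, lowered.get(n, "")) for n in sorted(names)))


hits = Counter()
requests = [
    {"Accept-Language": "en"},
    {"Accept-Language": "sw"},
    {"Accept-Language": "sw"},
]
for headers in requests:
    # Each request is counted under its own pattern, even if every
    # pattern maps to the same (English) variant and entity-tag.
    hits[request_pattern("/bar.html", "Accept-Language", headers)] += 1
```

   Here the two requests for the Swahili version are counted separately
   from the request for the English version.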

   If a proxy does not support the Vary mechanism, then [1] says that it
   MUST NOT cache any response that carries a Vary header, and hence
   need not implement any aspect of this hit-counting or usage-limiting
   design for varying resources.

      ---------
      Note: this also implies that if a proxy supports the Vary
      mechanism but is not willing to maintain independent hit-counts
      for each variant response in its cache, then it must follow at
      least one of these rules:

         1. It must not use the Meter header in a request to offer
            to hit-meter or usage-limit responses.

         2. If it does offer to hit-meter or usage-limit responses,
            and then receives a response that includes both a Vary
            header and a Meter header with a directive that it
            cannot satisfy, then the proxy must not cache the
            response.

      In other words, a proxy is allowed to partially implement the
      Vary mechanism with respect to hit-metering, as long as this
      has no externally visible effect on its ability to comply with
      the Meter specification.
      ---------

   This approach works for counting almost any aspect of the request
   stream, without embedding any specific list of countable aspects in
   the specification or proxy implementation.


9 A Note on Capturing Referrals

   It is alleged that some advertisers want to pay content providers,
   not by the "hit", but by the "nibble" -- the number of people who
   actually click on the ad to get more information.

   Now, HTTP already has a mechanism for doing this: the "Referer"
   header. However, perhaps it ought to be disabled for privacy reasons
   -- according to the HTTP/1.1 spec:

          "Because the source of the link may be private information
       or may reveal an otherwise private information source, it is
       strongly recommended that the user be able to select whether
       or not the Referer field is sent."

   However, in the case of ads, the source of the link actually wants to
   let the referred-to page know where the reference came from.

   This does not require the addition of any extra mechanism, but rather
   can use schemes that embed the referrer in the URI in a manner
   similar to this:

          http://www.blah.com/ad-reference?from=site1

   Such a URI should point to a resource (perhaps a CGI script) which
   returns a 302 redirect to the real page:

          http://www.blah.com/ad-reference.html

   Proxies which do not cache 302s will cause one hit on the redirection
   page per use, but the real page will get cached. Proxies which do
   cache 302s and report hits on the cached 302s will behave optimally.

   This approach has the advantage that it works whether or not the
   end-client has disabled the use of Referer.
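
   The redirecting resource could be as simple as the following sketch.
   The function handle_ad_reference, the in-memory referrals counter,
   and the rule for deriving the real page's URI (appending ".html" to
   the path) are assumptions chosen to match the example URIs above.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

referrals = Counter()  # referral counts, keyed by the "from" parameter


def handle_ad_reference(url):
    """Record the referrer embedded in the URI and redirect to the real page."""
    parts = urlsplit(url)
    source = parse_qs(parts.query).get("from", ["unknown"])[0]
    referrals[source] += 1
    location = f"{parts.scheme}://{parts.netloc}{parts.path}.html"
    return 302, {"Location": location}


status, response_headers = handle_ad_reference(
    "http://www.blah.com/ad-reference?from=site1"
)
```

   The origin server learns the referrer from the query string, so no
   Referer header is needed.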


10 Security Considerations

   Which outbound clients should a server (proxy or origin) trust to
   report hit counts?  A malicious proxy could easily report a large
   number of hits on some page, and thus perhaps cause a large payment
   to a content provider from an advertiser.  To help avoid this
   possibility, a proxy may choose to only relay usage counts received
   from its outbound proxies to its inbound servers when the proxies
   have authenticated themselves using Proxy-Authorization and/or they
   are on a list of approved proxies.

   We do not see a way to enforce usage limits if a proxy is willing to
   cheat.

   Regarding privacy:  we believe that the design in this document does
   not reveal any more information about individual users than would
   already be revealed by implementation of the existing HTTP/1.1
   support for "Cache-control: max-age=0, proxy-revalidate".  It may, in
   fact, help to conceal certain aspects of the organizational structure
   on the outbound side of a proxy.


11 Revision history

   Minor clarifications, and grammar and spelling corrections, are not
   listed here.

11.1 draft-mogul-http-hit-metering-01.txt
   Clarified goals, non-goals, and limitations (section 1.1).

   Removed the term ``sticky'' from the specification of
   meter-request-directive; added an implementation note (section 3.3).

   Clarifications and corrections concerning the use of the Connection
   header (section 3.1).

   Added support for non-caching proxies (section 5.5).

   Modified discussion of the Referer header (section 9).

   Added the "wont-ask" directive (sections 3.3 and 5.1).

   Replaced the use of "proxy-revalidate" with the (placeholder)
   directive-name "proxy-mustcheck", and added a discussion of the
   alternatives for making this real (section 6).

11.2 draft-mogul-http-hit-metering-00.txt
   Initial revision.


12 Acknowledgements

   We gratefully acknowledge the constructive comments received from
   Anselm Baird-Smith, Koen Holtman (who suggested the technique
   described in section 9), Dave Kristol, Ari Luotonen, Patrick
   R. McManus, and Ingrid Melve.


13 References

   1.  Roy T. Fielding, Jim Gettys, Jeffrey C. Mogul, Henrik Frystyk
   Nielsen, and Tim Berners-Lee.  Hypertext Transfer Protocol --
   HTTP/1.1.  RFC 2068, HTTP Working Group, January, 1997.

   2.  J. Mogul.  Forcing HTTP/1.1 proxies to revalidate responses.
   Internet Draft draft-mogul-http-revalidate-00.txt, HTTP Working
   Group, January, 1997. This is a work in progress.


14 Authors' addresses

   Jeffrey C. Mogul
   Western Research Laboratory
   Digital Equipment Corporation
   250 University Avenue
   Palo Alto, California, 94305, U.S.A.
   Email: mogul@wrl.dec.com
   Phone: 1 415 617 3304 (email preferred)

   Paul J. Leach
   Microsoft
   1 Microsoft Way
   Redmond, Washington, 98052, U.S.A.
   Email: paulle@microsoft.com

This is an older version of an Internet-Draft that was ultimately published as RFC 2227.