HTTP Working Group                                      David M. Kristol
INTERNET DRAFT                                    AT&T Bell Laboratories
<draft-kristol-http-state-info-00.txt>
August 25, 1995                                Expires February 25, 1996


                   Proposed HTTP State-Info Mechanism



                          Status of this Memo

     This document is an Internet-Draft.  Internet-Drafts are
     working documents of the Internet Engineering Task Force
     (IETF), its areas, and its working groups.  Note that other
     groups may also distribute working documents as Internet-
     Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-
     Drafts as reference material or to cite them other than as
     ``work in progress.''

     To learn the current status of any Internet-Draft, please
     check the ``1id-abstracts.txt'' listing contained in the
     Internet-Drafts Shadow Directories on ftp.is.co.za (Africa),
     nic.nordu.net (Europe), munnari.oz.au (Pacific Rim),
     ds.internic.net (US East Coast), or ftp.isi.edu (US West
     Coast).

     This is author's draft 1.12.


1.  ABSTRACT

HTTP, the protocol that underpins the World-Wide Web (WWW), is
stateless.  That is, each request stands on its own; origin servers
don't need to remember what happened with previous requests to service a
new one.  Statelessness is a mixed blessing, because there are potential
WWW applications, like ``shopping baskets'' and library browsing, for
which the history of a user's actions is useful or essential.

This proposal outlines a way to introduce state into HTTP.  A new
request/response header, State-Info, carries the state back and forth,
thus relieving the origin server from needing to keep an extensive per-
user or per-connection database.  The changes required to user agents,
origin servers, and proxy servers to support State-Info are very modest.


2.  TERMINOLOGY

The terms user agent, client, server, proxy, and origin server have the
same meaning as in the HTTP/1.0 specification.





Kristol           draft-kristol-http-state-info-00.txt          [Page 1]


INTERNET DRAFT     Proposed HTTP State-Info Mechanism    August 25, 1995



3.  STATE AND SESSIONS

This proposal outlines how to introduce state into HTTP, the protocol
that underpins the World-Wide Web (WWW).  At present, HTTP is stateless:
a WWW origin server obtains everything it needs to know about a request
from the request itself.  After it processes the request, the origin
server can ``forget'' the transaction.

What do I mean by ``state?''  ``State'' implies some relation between
one request to an origin server and previous ones made by the same user
agent to the same origin server.  If the sequence of these requests is
considered as a whole, they can be thought of as a ``session.''

Koen Holtman identified these dimensions for the ``solution space'' of
stateful dialogs:

   o simplicity of implementation

   o simplicity of use

   o time of general availability when standardized

   o downward compatibility

   o reliability

   o amount of privacy protection

   o maximum complexity of stateful dialogs supported

   o amount of cache control possible

   o risks when used with non-conforming caches

The paradigm I have in mind obtains the same effect as if a user agent
connected to an origin server, carried out many transactions at the
user's direction, then disconnected.  Two example applications I have in
mind are a ``shopping cart,'' where the state information comprises what
the user has bought, and a magazine browsing system, where the state
information comprises the set of journals and articles the user has
looked at already.  Note some of the key points in the session paradigm:

  1.  The session has a beginning and an end.

  2.  The session is relatively short-lived.

  3.  Either the user agent or the origin server may terminate a
      session.

  4.  State is a property of the connection to the origin server.  The
      user agent itself has no special state information.  (However,
      what the user agent presents to the user may reflect the origin
      server's state, because the origin server returns that information
      to the user agent.)


4.  PROPOSAL OUTLINE

The proposal I outline here defines a way for an origin server to send
state information to the user agent, and for the user agent to return
the state information to the origin server.  The goal of the proposal is
to have a minimal impact on HTTP and user agents.  Only origin servers
that need to maintain sessions would suffer any significant impact.

4.1  Origin Server Role

The origin server initiates a session, if it so desires.  (Note that
``session'' here is a logical connection, not a physical one.  Don't
confuse these logical sessions with various ``keepalive'' proposals for
physical sessions.)  To initiate a session, the origin server returns an
extra response header to the client:

        State-Info:     opaque information

The opaque information may be anything the origin server chooses to
send, encoded in printable ASCII.  ``Opaque'' implies that the content
is of interest and relevance only to the origin server.  The content
may, in fact, be readable by anyone that examines the State-Info header.

If the origin server gets a State-Info request header from the client
(see below), it may ignore it or use it to determine the current state
of the session.  It may send back to the client the same, a different,
or no State-Info response header.  The origin server effectively ends a
session by sending back a State-Info header with no value.
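
The origin server's side of this exchange can be sketched in Python as
follows.  This is only an illustration under stated assumptions: the
shopping-basket application, the names handle_request and
response_headers, the action strings, and the ``|''-separated encoding
of the basket are all hypothetical and are not part of the proposal.
The sketch uses None to mean ``send no State-Info header'' and an empty
string to mean ``send State-Info with no value.''

```python
from typing import List, Optional, Tuple


def handle_request(state_info: Optional[str],
                   action: str) -> Tuple[str, Optional[str]]:
    """Return (response body, State-Info response value).

    state_info is the value of the State-Info request header, or None
    if the client sent none.  The basket encoding (items joined by "|")
    is a hypothetical choice made only for this sketch.
    """
    items = state_info.split("|") if state_info else []
    if action.startswith("add:"):
        # Send back a different State-Info value: the updated basket.
        items.append(action[len("add:"):])
        return "added", "|".join(items)
    if action == "checkout":
        # A State-Info header with no value ends the session.
        return "bought: " + ", ".join(items), ""
    # No State-Info response header: the client keeps what it has.
    return "basket: " + ", ".join(items), None


def response_headers(new_state: Optional[str]) -> List[Tuple[str, str]]:
    """Build the extra response headers for a given State-Info decision."""
    if new_state is None:
        return []
    return [("State-Info", new_state)]
```

A first request with no State-Info header and the action ``add:apple''
would start a session whose state is ``apple''; a later ``checkout''
would return an empty State-Info value and so end the session.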

4.2  User Agent Role

The user agent keeps track of State-Info for each origin server
(distinguished by name or IP address and port).  The extent of its
bookkeeping is to note that it does or does not have State-Info for the
origin server.

The user agent goes from the ``no State-Info'' state to the ``have
State-Info'' state when it receives a non-empty State-Info response
header from the origin server.  (The user agent saves the State-Info
value.)  It returns to the ``no State-Info'' state if it receives a
State-Info response header with no value.  It stays in the ``have
State-Info'' state if it receives a non-empty State-Info response
header; the new value overwrites the old one.  If the user agent
receives no State-Info response header, it stays in the same state
(``have State-Info'' or ``no State-Info'').  The behavior described
above applies for all response codes from the origin server.

When it sends a request to an origin server, the user agent sends a
State-Info request header if it's in the ``have State-Info'' state;
otherwise it sends no State-Info request header.
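
The two-state bookkeeping described above is small enough to sketch
directly.  The following Python is a minimal illustration, not a
definitive implementation; the class name StateInfoStore, its method
names, and the (host, port) tuple used to identify an origin server are
assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

Origin = Tuple[str, int]  # (host name or IP address, port)


class StateInfoStore:
    """Tracks "have State-Info" / "no State-Info" per origin server."""

    def __init__(self) -> None:
        # An origin is in the "have State-Info" state exactly when it
        # has an entry in this dictionary.
        self._values: Dict[Origin, str] = {}

    def on_response(self, origin: Origin, header: Optional[str]) -> None:
        """Apply a response; header is None if State-Info was absent."""
        if header is None:
            return  # no header: stay in the same state
        value = header.strip()
        if value:
            self._values[origin] = value  # new value overwrites the old
        else:
            self._values.pop(origin, None)  # empty value: session ends

    def request_header(self, origin: Origin) -> Optional[str]:
        """Value to send in a request, or None to send no State-Info."""
        return self._values.get(origin)
```

Because the transitions do not depend on the response code, on_response
takes only the origin and the header value.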

A user agent usually begins execution with no remembered State-Info
information.  The user agent may be configured never to send State-Info,
in which case it can never sustain state with an origin server.  (This
would also be true of user agents that are unaware of how to handle
State-Info.)

A user agent (at the user's direction) can terminate a session with an
origin server by discarding the associated State-Info information
(moving to the ``no State-Info'' state).

When the user agent terminates execution, it discards all State-Info
information.  Alternatively, the user agent may ask the user whether
State-Info should be retained; the default should be ``no.''  Retained
State-Info would then be restored when the user agent begins execution
again.

User agent programs that can display multiple independent windows should
behave as if each window were a separate program instance with respect
to State-Info.  Thus State-Info obtained in one window would have no
effect on links followed in another.  (The user agent would have to
store State-Info tagged by window number, as well as origin server
address and port.)  When a window terminates, all associated State-Info
information gets discarded.
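
One way a multi-window user agent might tag State-Info by window is
sketched below in Python.  The class WindowedStateInfo, its method
names, and the use of an integer window number are hypothetical; the
sketch only shows that keying the store by (window, host, port) gives
each window an independent session and makes window teardown simple.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, str, int]  # (window number, host, port)


class WindowedStateInfo:
    """State-Info storage in which each window is a separate instance."""

    def __init__(self) -> None:
        self._values: Dict[Key, str] = {}

    def save(self, window: int, host: str, port: int, value: str) -> None:
        self._values[(window, host, port)] = value

    def lookup(self, window: int, host: str, port: int) -> Optional[str]:
        # State-Info obtained in one window is invisible to the others.
        return self._values.get((window, host, port))

    def discard_session(self, window: int, host: str, port: int) -> None:
        # The user terminates one session at his or her direction.
        self._values.pop((window, host, port), None)

    def close_window(self, window: int) -> None:
        # When a window terminates, all of its State-Info is discarded.
        for key in [k for k in self._values if k[0] == window]:
            del self._values[key]
```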

4.3  Caching Proxy Role

One reason for separating state information from both a URL and document
content is to facilitate the scaling that caching permits.  A caching
proxy

   o must pass along a State-Info request header from the requesting
     client to the next server, even if it has cached the requested
     resource locally.  (I originally assumed that requests from a cache
     always resulted in a conditional GET request to the next server,
     and that a State-Info header could ride along for free.  Such is
     not the case, and passing along State-Info headers, which is an
     essential part of this proposal, could be expensive.)

   o must pass back to the client any State-Info response header it
     receives.

   o may cache the received response, but must not cache the State-Info
     header as part of its cache state.  (Caching the response is
     subject to the control of the usual headers, such as Expires and
     Pragma: no-cache.)
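
The three rules for a caching proxy might be sketched in Python as
shown below.  This is a simplified illustration with several
assumptions: the CachingProxy class, the upstream callable that stands
in for ``the next server,'' and the header representation are all
hypothetical, and the handling of Expires and Pragma: no-cache is
omitted entirely.  The sketch also assumes that a request carrying
State-Info is simply forwarded and answered with the next server's
response, which is one possible reading of the first rule.

```python
from typing import Callable, Dict, List, Tuple

Headers = List[Tuple[str, str]]
Response = Tuple[Headers, bytes]
Upstream = Callable[[str, Headers], Response]


def split_state_info(headers: Headers) -> Tuple[Headers, Headers]:
    """Separate State-Info headers from all other headers."""
    state = [(n, v) for n, v in headers if n.lower() == "state-info"]
    rest = [(n, v) for n, v in headers if n.lower() != "state-info"]
    return state, rest


class CachingProxy:
    def __init__(self, upstream: Upstream) -> None:
        self.upstream = upstream
        self.cache: Dict[str, Response] = {}

    def handle(self, url: str, request_headers: Headers) -> Response:
        state_req, _ = split_state_info(request_headers)
        # Only a request without State-Info may be answered from cache.
        if not state_req and url in self.cache:
            return self.cache[url]
        # Rule 1: pass the State-Info request header to the next server.
        headers, body = self.upstream(url, request_headers)
        # Rule 3: cache the response, but never the State-Info header.
        _, cacheable = split_state_info(headers)
        self.cache[url] = (cacheable, body)
        # Rule 2: pass any State-Info response header back to the client.
        return headers, body
```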





5.  IMPLEMENTATION CONSIDERATIONS

Here I speculate on likely or desirable details for an origin server
that implements State-Info.

5.1  State-Info Content

An origin server's content should probably be divided into disjoint
application areas, some of which require the use of State-Info.  The
application areas can be distinguished by their request URLs.  The
State-Info header can carry information about several sessions that a
user agent might start, as follows.  Imagine that a single session's
state information takes the form

        URL opaque

The opaque information might be a uuencoding of application-specific
information.  The URL might be the actual URL of a resource, or it might
be the prefix for all URLs that comprise a particular application.  The
State-Info header for multiple sessions can be formed by concatenating
the session state information of all sessions, separated by commas, as
in

        State-Info: /A YXBwbGljYXRpb246MQ==, /B YXBwbGljYXRpb246Mg==
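
A header value of this shape can be split into its per-session parts
with a few lines of Python.  The sketch below is illustrative only: the
function names are hypothetical, and it assumes that neither the URL
nor the opaque part contains a comma or a space.  The sample values
above happen to decode cleanly as base64, so the usage lines treat them
that way; the proposal itself does not mandate any particular encoding.

```python
import base64
from typing import Dict


def parse_state_info(value: str) -> Dict[str, str]:
    """Map each URL (or URL prefix) to its opaque session information."""
    sessions: Dict[str, str] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty elements such as a trailing comma
        url, _, opaque = part.partition(" ")
        sessions[url] = opaque.strip()
    return sessions


def format_state_info(sessions: Dict[str, str]) -> str:
    """Concatenate per-session information, separated by commas."""
    return ", ".join(f"{url} {opaque}" for url, opaque in sessions.items())


sample = "/A YXBwbGljYXRpb246MQ==, /B YXBwbGljYXRpb246Mg=="
parsed = parse_state_info(sample)
# Decoding is an assumption about the application, not the protocol.
decoded_a = base64.b64decode(parsed["/A"])  # b"application:1"
```

Formatting the parsed dictionary again with format_state_info yields
the original header value, so the two functions round-trip.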

The session information can obviously be clear or encoded text that
describes state.  However, if it grows too large, it can become
unwieldy.  Therefore, an implementor might choose for the session
information to be a key into a server-side database.  Of course, using a
database creates some problems that the State-Info proposal was meant to
avoid, namely:

  1.  keeping real state on the server side;

  2.  how and when to garbage-collect the database entry, in case the
      user agent terminates the session by, for example, exiting.

The origin server software should probably be designed to separate the
session information for different applications and only present to a
particular application the session information that applies to it.
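
The separation suggested here could look like the following Python
sketch.  It assumes the State-Info header has already been split into a
dictionary that maps URL prefixes to opaque session information; the
function name and the longest-prefix rule are hypothetical choices for
the example, and plain string-prefix matching is a simplification.

```python
from typing import Dict, Optional


def session_for_request(sessions: Dict[str, str],
                        request_path: str) -> Optional[str]:
    """Return only the session information that applies to this request.

    sessions maps a URL prefix to opaque session information.  When
    several prefixes match, the longest one wins (an assumption of this
    sketch).  None means no application session applies.
    """
    best: Optional[str] = None
    for prefix in sessions:
        if request_path.startswith(prefix):
            if best is None or len(prefix) > len(best):
                best = prefix
    return sessions[best] if best is not None else None
```

With sessions for ``/A'' and ``/B'', a request for ``/A/page.html''
would be handed only the ``/A'' information, and a request for
``/C/page.html'' would be handed none.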

5.2  Stateless Pages

Caching is a good thing for the scalability of WWW.  Therefore it's
important to reduce the number of documents that have state embedded in
them inherently.  For example, if a shopping-basket-style application
always displayed a user's current basket contents on each page, those
pages could not be cached, because each user's basket's contents would
be different.  On the other hand, if each page contained just a link
that allowed the user to ``Look at My Shopping Basket,'' the page could
be cached.


6.  PRIVACY

An origin server can create a State-Info header to track the path of a
user through the server.  Users may object to this behavior as intrusive
accumulation of information, although their identity is not evident.
(Identity might become evident if a user fills out a form that contains
identifying information.)  The State-Info proposal therefore gives a
user some control over this possible intrusion by

   o Recommending that a user agent should be able, as a configuration
     option, never to create stateful sessions.

   o Recommending that a user agent allow a user to discard State-Info
     at any time.

   o Recommending that terminating a user agent's execution (or the
     execution of a window, for multi-window user agents) causes
     State-Info to be discarded.


7.  OTHER, SIMILAR, PROPOSALS

I'm aware of three other proposals to accomplish similar goals.  Netscape
proposes a Cookie request header and Set-Cookie response header.
Netscape cookies have expiration times and other information that
require more complicated processing by the user agent than does my
proposal.  Furthermore, there's no requirement that cookies be discarded
when the user exits a user agent program.

Brian Behlendorf proposed a Session-ID header that would be user-agent-
initiated and could be used by an origin server to track
``clickstreams.''  It would not carry any origin-server-defined state,
however.

Koen Holtman has made a proposal that is similar in flavor to, but
different in detail from, this one.


8.  SECURITY CONSIDERATIONS

The information in the State-Info headers is unprotected.  Two
consequences are:

  1.  Any sensitive information that is conveyed in a State-Info header
      is exposed to intruders.

  2.  A malicious intermediary could alter the State-Info header as it
      travels in either direction, with unpredictable results.

These facts imply that information of a personal and/or financial nature
should only be sent over a secure channel.  For less sensitive
information, or when the content of the header is a database key, an
origin server should be vigilant to prevent a bad State-Info value
from causing it to fail.
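
As an illustration of that vigilance, the Python sketch below validates
a header value before using it as a database key.  The key format
(1 to 32 ASCII letters and digits), the function name, and the use of
an in-memory dictionary as the ``database'' are all assumptions made
for the example.  The point is only that a missing, malformed, or
unknown value is treated as ``no session'' rather than as an error.

```python
import re
from typing import Dict, Optional

# Hypothetical key format: 1 to 32 ASCII letters or digits.
KEY_PATTERN = re.compile(r"[A-Za-z0-9]{1,32}")


def lookup_session(db: Dict[str, dict],
                   header_value: Optional[str]) -> Optional[dict]:
    """Return the stored session for a header value, or None.

    A missing, malformed, or unknown value yields None, so a bad value
    cannot raise an exception or reach the database layer unchecked.
    """
    if header_value is None:
        return None
    key = header_value.strip()
    if not KEY_PATTERN.fullmatch(key):
        return None
    return db.get(key)
```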


9.  ACKNOWLEDGEMENTS

My thanks go to correspondents on the http-wg and www-talk mailing lists
who contributed ideas and criticism that found their way into this
proposal.  Special thanks to Bob Wyman, Koen Holtman, and Shel Kaphan.


10.  AUTHOR'S ADDRESS

David M. Kristol
AT&T Bell Laboratories
600 Mountain Ave.  Room 2A-227
Murray Hill, NJ  07974



Phone: (908) 582-2250
FAX: (908) 582-5809
Email: dmk@research.att.com