Network Working Group                                      M. Nottingham
Internet-Draft                                         February 26, 2006
Expires: August 30, 2006


             Feed History: Enabling Incremental Syndication
                draft-nottingham-atompub-feed-history-05

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 30, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This specification describes a technique for feed publishers to give
   hints about the completeness of a feed, and a means of retrieving
   "missed" entries from a incremental feed.









Nottingham               Expires August 30, 2006                [Page 1]


Internet-Draft                Feed History                 February 2006


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Notational Conventions . . . . . . . . . . . . . . . . . . . .  4
   4.  Publishing Incremental Feeds . . . . . . . . . . . . . . . . .  4
   5.  Publishing Non-Incremental Feeds . . . . . . . . . . . . . . .  5
   6.  Reconstructing Feed State  . . . . . . . . . . . . . . . . . .  5
   7.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 10
   9.  Normative References . . . . . . . . . . . . . . . . . . . . . 11
   Appendix A.  Acknowledgements  . . . . . . . . . . . . . . . . . . 11
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 12
   Intellectual Property and Copyright Statements . . . . . . . . . . 13





































Nottingham               Expires August 30, 2006                [Page 2]


Internet-Draft                Feed History                 February 2006


1.  Introduction

   Syndication documents (e.g., those in formats such as Atom [RFC4287]
   and RSS 2.0 [1]) usually only contain the last several entries in a
   larger channel (or "feed") of information.  Often, consuming software
   keeps copies of all entries that have been previously seen,
   effectively keeping a history of the feed's contents.

   However, not all feeds benefit from this practice; in some, previous
   entries are not relevant to the current contents of the feed.  For
   example, it's not desirable to keep history in this manner with a
   "top ten" feed; showing old entries would imply that the previous
   number one is now number eleven, and so forth.

   Feeds that encourage this practice have a different problem.  If
   consuming software does not poll often enough, some entries may be
   missed, causing them to be silently omitted.  For some applications,
   this is a serious error on its own.  Even in non-critical
   applications, this phenomenon can cause publishers to make Feed
   Documents contain more entries than reasonably necessary, just to
   assure that consumers have an amply large window in which to
   reconstruct the feed's state.

   This specification describes a technique that allows feed publishers
   to give hints as to the completeness of a feed document, and a means
   of retrieving "missed" entries from a partial, or incremental, feed
   by linking archives of the feed together.

   Although it refers to Atom normatively, the mechanism described
   herein can be used with similar syndication formats, such as the
   various flavours of RSS.


2.  Terminology

   In this specification, "feed document" refers to an Atom Feed
   Document, RSS document or similar syndication format instance
   document.  It may contain any number of entries (in RSS, items), and
   may or may not be a complete representation of the logical feed.

   "Subscription document" refers to a feed document that always
   contains the most recent entries available in the feed (often, the
   feed document that should be subscribed to).

   "Archive document" refers to a feed document that is archived; i.e.,
   the set of entries inside it does not change over time.  Entries
   within an archive MAY themselves change, however.




Nottingham               Expires August 30, 2006                [Page 3]


Internet-Draft                Feed History                 February 2006


   "Incremental feed" refers to a set of associated feed documents
   (namely, one subscription document and any number of archive
   documents) that can be combined to form a single, logical feed.

   Finally, "head section" refers to the children of a feed document's
   document-wide metadata container; e.g., the child elements of the
   atom:feed element in an Atom Feed Document.


3.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, [RFC2119], as
   scoped to those conformance targets.

   This specification uses XML Namespaces [W3C.REC-xml-names-19990114]
   to uniquely identify XML element names.  It uses the following
   namespace prefix for the indicated namespace URI;

   "fh": "http://purl.org/syndication/history/1.0"

   This specification uses terms from the XML Infoset [W3C.REC-xml-
   infoset-20040204].  However, this specification uses a shorthand; the
   phrase "Information Item" is omitted when naming Element Information
   Items.  Therefore, when this specification uses the term "element,"
   it is referring to an Element Information Item in Infoset terms.

   This specification also uses Atom link relations to identify
   different types of links; see the Atom specification [RFC4287] for
   information about their syntax, and the link relation registry [2]
   for more information about specific values.


4.  Publishing Incremental Feeds

   Publishers who wish to make a feed available incrementally should:

   1.  Periodically remove the least recent entries from the
       subscription document, moving them into archive documents.
   2.  In the subscription document's head section, add a "previous"
       link relation, containing a URI reference to the most recent
       archive documents.
   3.  In archive documents' head sections, add a "previous" link
       relation, containing a URI reference to the next most recent
       archive (if any).





Nottingham               Expires August 30, 2006                [Page 4]


Internet-Draft                Feed History                 February 2006


   4.  In archive documents' head sections, add a "current" link
       relation, containing the subscription document's URI reference.

   Publishers are not required to make all archive documents available;
   they may refuse to serve (e.g., with HTTP status code 403), or be
   unable to serve (e.g., with HTTP status code 404) an archive
   document.

   Publishers may also duplicate entries between the subscription
   document and the most current archive document; i.e., they may
   archive current entries.

   Note that because consumers are not required to re-fetch archived
   feeds that they've previously stored, changes to entries in those
   documents may not be apparent to all users.  Therefore, if a
   publisher requires a change to be visible to all users (e.g.,
   correcting factual errors), they should consider publishing the
   revised entry in the subscription feed, in addition to (or instead
   of) the appropriate archive feed.  Conversely, unimportant changes
   (e.g., spelling corrections) might be only effected in archive feeds.


5.  Publishing Non-Incremental Feeds

   Some consumers may attempt to keep a history of feed entries, even
   without explicit hints that enable it.  Because this is undesirable
   for some feeds, this section defines an explicit means of indicating
   that history should not be kept.

   For example, a feed that represents a ranking that varies over time,
   such as "Top Twenty Records" or "Most Popular Items" should not have
   newer entries displayed alongside older ones.

   The fh:complete element, when present in a feed's head section,
   indicates that the subscription document is a complete representation
   of the logical feed's entries.

   For example,

     <fh:complete/>


6.  Reconstructing Feed State

   When presented with a feed document, a consumer MAY reconstruct the
   feed's state in a local store S by following these steps:





Nottingham               Expires August 30, 2006                [Page 5]


Internet-Draft                Feed History                 February 2006


   1.  If the feed document's head section contains a "current" link
       relation value C, dereference C and use the resulting feed
       document as D.
   2.  If the fh:complete element is present in D's head section:
       1.  Remove all entries present in local store S.
       2.  Add all of the document D's entries to local store S.
       3.  Stop processing.
   3.  Otherwise:
       1.  Create an empty list L.
       2.  Consider the URI of the last archive document successfully
           stored to local store S as A.
       3.  Consider the entries in document D as E.
       4.  If the document D has a "previous" link relation value P in
           its head section, and P is not A,
           1.  Append P to L.
           2.  Dereference P and use the resulting feed document as D.
       5.  Repeat the previous step until no new P is found.
       6.  Add all of document D's entries to the local store S,
           replacing any entries with the same identity.
       7.  Pop the last "previous" link relation from L, dereference its
           value and use the resulting feed document as D.
       8.  Repeat the previous two steps until L is empty.
       9.  Add the entries E to the local store S, replacing any entries
           with the same identity.

   In the instructions above, the concept of an entry's identity is
   format-specific; e.g., in Atom, it is conveyed by the atom:id
   element; in RSS 2, it is indicated by the guid element.

   Consumers SHOULD warn users when they do not have the complete feed,
   or an error is encountered in the reconstruction process (e.g., by
   alerting the user that an archive document is unavailable, or
   displaying pseudo-entries that inform the user that some entries may
   be missing).

   Consumers MAY cache archive documents and/or use a different method
   of reconstructing the feed, as long as the result is the same as that
   achieved by following these steps.

   In particular, consumers MAY use the "first" and "next" link
   relations, if available, to traverse the feed in a different manner.
   However, such consumers MUST NOT require their presense to
   reconstruct a feed's state.

   When comparing URIs to determine if an archive has been successfully
   stored, the value of the "self" link relation, if present, SHOULD be
   used as the URI of the feed document it occurs in.




Nottingham               Expires August 30, 2006                [Page 6]


Internet-Draft                Feed History                 February 2006


   Note that URI references in link relation values may be relative, and
   when they are used they must be absolutised, as described in Section
   5.1 of [RFC3986].

   This specification does not bestow any semantics on the ordering of
   entries in a feed; the only purpose of introducing ordering between
   archive documents is to allow the feed to be reconstructed.


7.  Examples

   Non-Incremental Atom Feed Document

   <?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:fh="http://purl.org/syndication/history/1.0">
    <title>Example Feed</title>
    <link href="http://example.org/"/>
    <fh:complete/>
    <link rel="self" href="http://example.org/index.atom"/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="http://example.org/2003/12/13/atom03"/>
      <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>
   </feed>


















Nottingham               Expires August 30, 2006                [Page 7]


Internet-Draft                Feed History                 February 2006


   Incremental Atom Feed: Subscription Document

   <?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <link href="http://example.org/"/>
    <link rel="self" href="http://example.org/index.atom"/>
    <link rel="previous" href="http://example.org/2003/11/index.atom"/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="http://example.org/2003/12/13/atom03"/>
      <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>
   </feed>

   Incremental Atom Feed: Archive Document

   <?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <link rel="current" href="http://example.org/index.atom"/>
    <link rel="self" href="http://example.org/2003/11/index.atom"/>
    <link rel="previous" href="http://example.org/2003/10/index.atom"/>
    <updated>2003-11-24T12:00:00Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Scheduled To Run Amok</title>
      <link href="http://example.org/2003/11/24/robots_coming"/>
      <id>urn:uuid:cdef5c6d5-gff8-4ebb-assa-80dwe44efkjo</id>
      <updated>2003-11-24T12:00:00Z</updated>
      <summary>Some text from an old, different entry.</summary>
    </entry>
   </feed>








Nottingham               Expires August 30, 2006                [Page 8]


Internet-Draft                Feed History                 February 2006


   Incremental RSS 2.0 Feed: Subscription Document

   <?xml version="1.0"?>
   <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
     <title>Liftoff News</title>
     <link>http://liftoff.nasa.gov/</link>
     <description>Liftoff to Space Exploration.</description>
     <language>en-us</language>
     <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@example.com</managingEditor>
     <webMaster>webmaster@example.com</webMaster>
     <atom:link rel="previous"
      href="http://liftoff.nasa.gov/2003/05/index.rss"/>

     <item>
      <title>Star City</title>
      <link>http://liftoff.nasa.gov/2003/06/news-starcity</link>
      <description>How do Americans get ready to work with Russians
      aboard the International Space Station? They take a crash course
      in culture, language and protocol at Russia's <a
      href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star
      City</a&gt;.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/06/03.html#item573</guid>
     </item>
    </channel>
   </rss>




















Nottingham               Expires August 30, 2006                [Page 9]


Internet-Draft                Feed History                 February 2006


   Incremental RSS 2.0 Feed: Archive Document

   <?xml version="1.0"?>
   <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
     <title>Liftoff News</title>
     <link>http://liftoff.nasa.gov/</link>
     <description>Liftoff to Space Exploration.</description>
     <language>en-us</language>
     <pubDate>Tue, 30 May 2003 08:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 30 May 2003 10:31:52 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@example.com</managingEditor>
     <webMaster>webmaster@example.com</webMaster>
     <atom:link rel="current" href="http://liftoff.nasa.gov/index.rss"/>
     <atom:link rel="previous"
      href="http://liftoff.nasa.gov/2003/04/index.rss"/>

     <item>
      <description>Sky watchers in Europe, Asia, and parts of
      Alaska and Canada will experience a partial eclipse of the Sun
      on Saturday, May 31st.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/05/30.html#item572</guid>
     </item>
     <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.nasa.gov/2003/05/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to
      design new engines that will let us fly through the Solar
      System more quickly.  The proposed VASIMR engine would do
      that.</description>
      <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/05/27.html#item571</guid>
     </item>
    </channel>
   </rss>


8.  Security Considerations

   Feeds using the mechanisms described here could be crafted in such a
   way as to cause a consumer to initiate excessive (or even an unending
   sequence of) network requests, causing denial of service (either to
   the consumer, the target server, and/or intervening networks).
   Consumers can mitigate this risk by requiring user intervention after
   a certain number of requests, or by limiting requests either



Nottingham               Expires August 30, 2006               [Page 10]


Internet-Draft                Feed History                 February 2006


   according to a hard limit, or with heuristics.

   Consumers should be mindful of resource limits when storing feed
   documents; to reiterate, they are not required to always store or
   reconstruct the feed when conforming to this specification; they only
   need inform the user when the reconstructed feed is not complete.

9.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4287]  Nottingham, M. and R. Sayre, "The Atom Syndication
              Format", RFC 4287, December 2005.

   [W3C.REC-xml-infoset-20040204]
              Cowan, J. and R. Tobin, "XML Information Set (Second
              Edition)", W3C REC REC-xml-infoset-20040204,
              February 2004.

   [W3C.REC-xml-names-19990114]
              Bray, T., Hollander, D., and A. Layman, "Namespaces in
              XML", W3C REC REC-xml-names-19990114, January 1999.

   [1]  <http://blogs.law.harvard.edu/tech/rss>

   [2]  <http://www.iana.org/assignments/link-relations>


Appendix A.  Acknowledgements

   The author would like to thank the following people for their
   contributions, comments and help: Danny Ayers, Thomas Broyer, Stefan
   Eissing, David Hall, Aristotle Pagaltzis, Dave Pawson, Garrett
   Rooney, Robert Sayre, James Snell, Henry Story.

   Any errors herein remain the author's, not theirs.










Nottingham               Expires August 30, 2006               [Page 11]


Internet-Draft                Feed History                 February 2006


Author's Address

   Mark Nottingham

   Email: mnot@pobox.com
   URI:   http://www.mnot.net/













































Nottingham               Expires August 30, 2006               [Page 12]


Internet-Draft                Feed History                 February 2006


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.




Nottingham               Expires August 30, 2006               [Page 13]