Network Working Group                                          R. Sparks
Internet-Draft                                                   Tekelec
Intended status: Informational                               Dec 9, 2011
Expires: June 11, 2012


         IETF Email List Archiving and Search Tool Requirements
                    draft-sparks-genarea-mailarch-01

Abstract

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search and browse the archives of
   these lists, and have asked for improved search capabilities.  The
   current archive mechanism could also be made more efficient.  This
   memo captures the requirements for improved email list archiving and
   searching systems.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 11, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as



Sparks                    Expires June 11, 2012                 [Page 1]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


   described in the Simplified BSD License.


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . 3
   2.  List Search and Archive Requirements  . . . . . . . . . . . . . 3
     2.1.  Search and Browsing . . . . . . . . . . . . . . . . . . . . 3
     2.2.  IMAP access . . . . . . . . . . . . . . . . . . . . . . . . 4
     2.3.  Archiving Active Lists  . . . . . . . . . . . . . . . . . . 4
     2.4.  Importing Messages from Other Archives  . . . . . . . . . . 5
     2.5.  Exporting messages from the Archives  . . . . . . . . . . . 5
     2.6.  Redundancy  . . . . . . . . . . . . . . . . . . . . . . . . 6
     2.7.  Archive Administration  . . . . . . . . . . . . . . . . . . 6
     2.8.  Transition Requirements . . . . . . . . . . . . . . . . . . 6
   3.  Future Considerations . . . . . . . . . . . . . . . . . . . . . 7
   4.  Security Considerations . . . . . . . . . . . . . . . . . . . . 7
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7
   6.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 7
   7.  Changelog . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
     7.1.  00 to 01  . . . . . . . . . . . . . . . . . . . . . . . . . 7
   8.  Informative References  . . . . . . . . . . . . . . . . . . . . 7
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . . . 8




























Sparks                    Expires June 11, 2012                 [Page 2]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


1.  Introduction

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search the archives of these lists,
   and have asked for improved search capabilities.  The current archive
   mechanism could also be made more efficient.  This memo captures the
   requirements for improved email list archiving and searching systems.

   Discussion of this memo should take place on the ietf@ietf.org
   mailing list.


2.  List Search and Archive Requirements

2.1.  Search and Browsing

   o  The system must provide a web interface for search and browsing
      archived messages.

   o  The system must allow browsing the entire archive of a given list
      by thread or by date.

   o  The system must allow browsing the results of a search by thread
      or by date.

         Both threading based on Message-Id/References/In-Reply-To and
         threading based on same subject line (modulo short prefixes
         like re: and fwd:) shoul dbe taken into account.

   o  The system must allow searching across any subset of the archived
      lists (one list, a selection of lists, or all lists).

   o  The system must allow searching of any combination (using AND and
      OR operators) of the following attributes.  Richer search
      capabilites are highly desirable.

      -  string occurring in sender name

      -  date range

      -  string occurring in Subject

      -  string occurring in message body

      -  string occuring in message header (in particular, exact match
         of Message-Id)





Sparks                    Expires June 11, 2012                 [Page 3]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


            For instance, it would be nice to search the entire archive
            for instances of a message with a given Message-ID with a
            URL like <http://datatracker.ietf.org/mlarchive/
            msg?id=4EA6E023.6010603@nostrum.com>

   o  Individual messages must be representable by a long-term stable
      URI that can be shared between users.  That is, the URI must be
      suitable for reference in an email message.

   o  Searches should be representable by a URI that can be shared
      between users

      -  Such URIs should be long-term stable.

      -  The search may be re-executed when the URI is referenced.  It
         is acceptable for the same URI to produce different results if
         accessed at different times (reflecting additional messages
         that may match the search criteria for example.)

2.2.  IMAP access

   Many participants would prefer to access the list archives using IMAP
   [RFC3501].  Providing this access while meeting the following
   requirements will likely require an IMAP server with specialized
   capabilities.

   o  The system should expose the archive using an IMAP interface, with
      each list represented as a mailbox.

   o  This interface must work with standard IMAP clients.

   o  The interface should allow users to each have their own read/
      unread marks for messages.  Allowing other annotation is
      desirable.

      -  If this requires the user to login, the system should use
         datatracker login credentials

   o  The interface must have server-side searching enabled, and should
      support multiple simultaneous extensive searches.

2.3.  Archiving Active Lists

   o  The archive system must accept messages handled by various mail
      reflector packages.






Sparks                    Expires June 11, 2012                 [Page 4]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


      -  Lists hosted on the IETF systems are served by mailman
         [mailman].

      -  Lists hosted at other organizations may use other packages.

         *  The archive system must accept messages through subscribing
            to such an external list.

         *  The archive system may support other mechanisms for
            accepting messages into the archive

2.4.  Importing Messages from Other Archives

   Lists hosted at other systems are sometimes moved to the IETF
   servers, and their archive is moved with them.  The archiving system
   must be able to import these archives.

   o  At a minimum the archive system must be able to import mbox
      formatted archives [RFC4155][mbox].

   o  The archive system should be able to import maildir and maildir-
      like (the key characteristic being on-message-per-file) formatted
      archives [maildir].

   o  It is acceptable to use a separate utility to convert between
      these formats before import as long as the conversion is lossless

2.5.  Exporting messages from the Archives

   o  The archive system must support exporting messages in the mbox
      format

   o  The archive system should support exporting messages in mailder
      format

   o  The archive system must support exporting the entire archive of a
      given list

   o  The archive system must support exporting all messages from a
      given list within a given daterange

   o  The archive system should allow exporting the results of any
      supported search query








Sparks                    Expires June 11, 2012                 [Page 5]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


2.6.  Redundancy

   o  The systems must facilitate providing archive, search, and browse
      functions through geographically distributed servers

      -  The systems must support a single active an dsingle standby
         server.  This reflects the current operating configuration and
         is expected to be the initial deployment model.

      -  The systems should support a single active and multiple standby
         servers.

      -  The systems should support multiple active servers for the
         search and browse functions.  Multiple active archive servers
         are not a requirement.

      -  The amount of data replication between servers should be on the
         order of the size of any new/changed messages in the archives.

         *  It is acceptable for replication to be part of the archival
            system itself (such as using the replication mechanisms from
            an underlying database).

         *  It is acceptable to rely on replication of the underlying
            filesystem objects (using rsync of one or more directory
            trees for example), but only if the objects in the
            underlying filesystem are formatted such that the size of
            the replication data is on the order of the size of any new/
            changed messages in the archives.

2.7.  Archive Administration

   o  The archive system must support adding and removing lists to be
      archived

   o  The system must allow the administrator to add messages to and
      delete messages from an archived list.  The system should log such
      actions.

   o  The system must allow the administrator to delete messages from an
      archived list

2.8.  Transition Requirements

   There are many existing archived messages containing embedded links
   into the existing MHonArc mail archive.  These links must continue to
   work, but should reach the message as archived in the new system.




Sparks                    Expires June 11, 2012                 [Page 6]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


3.  Future Considerations

   The archive and search functions should anticipate internationalized
   email addresses as discussed in the EAI working group.  Since the EAI
   WG has not finished their work, there is no firm requirement at this
   time.


4.  Security Considerations

   Creating a new tool for searching and archiving IETF email lists does
   not affect the security of the Internet in any significant fashion.


5.  IANA Considerations

   This document has no actions for IANA.


6.  Acknowledgements

   The Tools Development team provided input into this initial
   brainstorm.  Text suggestions from Alexey Melnikov and Pete Resnick
   have been incorporated.


7.  Changelog

7.1.  00 to 01

   1.  Requested ability to import maildir-like archives, not just
       maildir proper

   2.  Added a section requesting IMAP access to the archive.


8.  Informative References

   [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
              4rev1", RFC 3501, March 2003.

   [RFC4155]  Hall, E., "The application/mbox Media Type", RFC 4155,
              September 2005.

   [maildir]  "Maildir", <http://en.wikipedia.org/wiki/Maildir>.

   [mailman]  "Mailman", <http://www.list.org/>.




Sparks                    Expires June 11, 2012                 [Page 7]


Internet-Draft   List Archiving and Search Requirements         Dec 2011


   [mbox]     "Mbox", <http://en.wikipedia.org/wiki/Mbox>.


Author's Address

   Robert Sparks
   Tekelec
   17210 Campbell Road
   Suite 250
   Dallas, Texas  75254-4203
   USA

   Email: RjS@nostrum.com






































Sparks                    Expires June 11, 2012                 [Page 8]