Internet Draft                                              Matt Curtin
draft-ietf-usefor-message-id-00.txt           The Ohio State University
Category-to-be: Informational                            Jamie Zawinski
                                                Netscape Communications

                                                              June 1998
                                    Expires: Six Months from above date


              Recommendations for generating Message IDs

                         Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as ``work in progress.''

To view the entire list of current Internet-Drafts, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Europe),
ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim),
ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).


                               Abstract

This draft provides recommendations on how to generate globally unique
Message IDs in client software.

Table of Contents

1. Introduction
2. Message-ID formatting
3. Message-ID generation
4. Acknowledgments
5. References
6. Authors' addresses


1. Introduction

Message-ID headers are used to uniquely identify Internet messages.
Having a unique identifier for each message has many benefits,
including making threads easier to follow and allowing messages to be
scored intelligently based on the threads to which they belong.

It has been suggested that it is impossible for client software to
generate globally unique Message-IDs.  We believe this to be incorrect,
and herein offer suggestions for generating unique Message-IDs.


2. Message-ID formatting

As defined in [NEWS], a message ID consists of two parts, a local part
and a domain, separated by an at-sign and enclosed in angle brackets:

    message-id = "<" local-part "@" domain ">"

Practically, news message IDs are a restricted subset of mail message
IDs.  In particular, no existing news software copes properly with mail
quoting conventions within the local part, so software generating a
Message-ID would be well-advised to avoid this pitfall.

It is also noted that some buggy software considers message IDs
completely case-insensitive, in violation of the standards.  It is
therefore advised that one not generate IDs such that two IDs so
generated can differ only in character case.
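
The two cautions above can be checked mechanically.  The following
Python sketch is illustrative only and is not taken from [NEWS]: the
function name and the deliberately narrow pattern (upper-case letters,
digits, and "." in the local part) are assumptions made for this
example, and many valid message IDs fall outside it.

```python
import re

# Hypothetical conservative pattern (an assumption of this sketch, not a
# rule from [NEWS]): the local part uses one letter case only and no mail
# quoting, so two IDs that pass cannot differ only in local-part case.
_CONSERVATIVE_ID = re.compile(r"<[0-9A-Z.]+@[0-9A-Za-z.-]+>")


def is_conservative_message_id(message_id: str) -> bool:
    """Return True if message_id matches the narrow pattern above."""
    return _CONSERVATIVE_ID.fullmatch(message_id) is not None
```

A generator that only ever emits IDs passing such a check avoids both
the quoting pitfall and the case-sensitivity pitfall.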


3. Message-ID generation

The most popular method of generating local parts is to use the date and
time, plus some way of distinguishing between simultaneous postings on
the same host (e.g. a process number), and encode them in a suitably-
restricted alphabet.  An older but now less-popular alternative is to
use a sequence number, incremented each time the host generates a new
message ID; this is workable, but requires careful design to cope
properly with simultaneous posting attempts, and is not as robust in the
presence of crashes and other malfunctions.

On many client systems, it is not always possible to get the
fully-qualified domain name (FQDN) of the local host.  In that
situation, a reasonable fallback course of action would be to use the
domain-part of the user's return address.  Doing so makes the choice of
the "distinguishing number" more important, because many hosts may share
that domain; in particular, a process ID alone is probably not
sufficient.

An alternative for generating the distinguishing number, on systems
where the process ID isn't available, or in the case where the local
host's FQDN isn't known, is to generate a large random number from a
high-quality, well-seeded pseudorandom number generator.  (Note that the
RNGs shipped by many vendors are not high quality.)
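
As a sketch of this alternative, Python's standard "secrets" module
draws from the operating system's random source rather than from a
general-purpose seeded generator.  Whether that source is of sufficient
quality on a given platform is an assumption of this example, and the
function name is hypothetical.

```python
import secrets


def distinguishing_number(bits: int = 64) -> int:
    """Return a random non-negative integer below 2**bits.

    Assumes the operating system's random source (used by the
    'secrets' module) is of adequate quality on this platform.
    """
    return secrets.randbits(bits)
```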

In summary, one possible approach to generating a Message-ID would be:

  *  Append "<".

  *  Get the current (wall-clock) time in the highest resolution to
     which you have access (most systems can give it to you in
     milliseconds, but seconds will do).

  *  Generate 64 bits of randomness from a good, well-seeded random
     number generator.

  *  Convert these two numbers to base 36 (0-9 and A-Z) and append the
     first number, a ".", the second number, and an "@".  This keeps the
     left-hand side of the message ID to only about 21 characters.

  *  Append the FQDN of the local host, or the host name in the user's
     return address.

  *  Append ">".
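
The steps listed above can be sketched in Python as follows.  This is
one possible reading of them, not a reference implementation: the
function names are hypothetical, the caller is assumed to supply the
FQDN (or the host name from the user's return address), and the
"secrets" module is assumed to be an adequate source of randomness.

```python
import secrets
import time

_DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(n: int) -> str:
    """Encode a non-negative integer using the digits 0-9 and A-Z."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(_DIGITS[r])
    return "".join(reversed(out))


def make_message_id(domain: str) -> str:
    """Build "<time.random@domain>" following the steps in the text."""
    # Wall-clock time in milliseconds since the epoch.
    millis = time.time_ns() // 1_000_000
    # 64 random bits from the operating system's random source.
    rand = secrets.randbits(64)
    return f"<{to_base36(millis)}.{to_base36(rand)}@{domain}>"
```

Because the local part uses a single letter case and no quoting, IDs
built this way also stay within the cautions given in section 2.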

If the random number generator is good, this will reduce the odds of a
collision of message IDs to well below the odds that a cosmic ray will
cause the computer to miscompute a result.  That means that it's good
enough.
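
As a rough illustration of why 64 random bits go a long way, suppose n
message IDs are generated with the same timestamp and the same domain,
and that the random values are uniform and independent (an idealizing
assumption).  A union bound over all pairs then gives:

```latex
P(\text{collision}) \le \binom{n}{2} \cdot 2^{-64} = \frac{n(n-1)}{2^{65}}
```

For n = 10^6 such IDs, this bound is below 3 x 10^-8.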

There are many other approaches.  This is provided only as an example.


4. Acknowledgments

This document is partially derived from an earlier, unrelated draft by
Henry Spencer.


5. References

Ref.          Author, title                         IETF status (June 1998)
---           -------------                         ----------------------

[NEWS]        M.R. Horton, R. Adams: "Standard      Non-standard (but still
              for interchange of USENET             widely used as a de-facto
              messages", RFC 1036, December         standard).
              1987.



6. Authors' addresses

Matt Curtin
The Ohio State University
791 Dreese Laboratories
2015 Neil Ave
Columbus OH 43210
+1 614 292 7352
cmcurtin@cis.ohio-state.edu

Jamie Zawinski
Netscape Communications Corporation
501 East Middlefield Road
Mountain View, CA 94043
(650) 937-2620
jwz@netscape.com