Network Working Group                        Charles H. Lindsey
Internet-Draft                               University of Manchester
                                             May 2000

                   Signed Headers in Mail and Netnews

                   draft-lindsey-usefor-signed-00.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as "work
   in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   The huge growth of Netnews/Usenet in recent years has been
   accompanied by many attempts to abuse the system by various forms
   of malpractice, particularly the forging of various headers,
   causing it to appear that articles came from parties other than
   those that actually injected them or conveyed some Approval that
   the real poster was not entitled to give. Insofar as Netnews is
   regularly gatewayed to and from Email systems, these problems also
   extend to the Email domain.

   This document provides a cryptographically secure means whereby it
   can be established beyond doubt that relevant headers of a Netnews
   article or an Email message have not been tampered with in
   transit, and that they were indeed originated by the person
   purporting to have done so. It seeks to supplement, rather than to
   supplant, the existing protocols for signing the bodies of
   articles and messages.

[This proposal arises from the activities of the Usenet Format Working
Group, which is charged with updating the Netnews standards. Comments
are invited, preferably sent to the mailing list of the Group at
usenet-format@landfield.com.]


Lindsey                                                         [Page 1]


                   Signed Headers in Mail and Netnews           May 2000

1.  Introduction

[Remarks enclosed in square brackets and aligned with the left margin,
such as this one, are not part of this draft, but are editorial notes to
explain matters amongst ourselves, or to point out alternatives, or to
indicate work yet to be done.]

1.1.  Scope and Objectives

[This is a Draft of a Draft, for discussion within the USEFOR mailing
list until the best format for putting it forward has been decided on.
It also needs to be decided whether it should be aimed towards an
Experimental Protocol, the Standards track, or as an integral part of
[USEFOR]]

   "Netnews" is a set of protocols [USEFOR] that enables news "articles"
   to be broadcast to potentially-large audiences, using a flooding
   algorithm which propagates copies throughout a network of
   participating hosts. The huge growth in the use of this protocol in
   recent years has been accompanied by many attempts to abuse the
   system by causing it to appear that articles came from parties other
   than those that actually injected them, or that they had been posted
   with some Approval that the real poster was not entitled to give, or
   that they otherwise appeared to be different from what they actually
   were. The effects of such abuse are particularly acute in the case
   of "Control" articles which can cause newsgroups to be created or
   removed on hosts worldwide, or which can cause unauthorized deletion
   of articles already received and stored on such hosts. It is
   therefore considered essential to provide a cryptographically secure
   means whereby it can be established beyond doubt that the source and
   structure of articles are exactly as they purport to be.

   "Electronic Mail" is a system for routing "messages" [MESSFOR]
   between individual computer users, usually on a one-to-one basis. The
   formats of Email messages and News articles have deliberately been
   made to be similar, so that messages may be gatewayed to news systems
   and vice-versa. In order that the same protection may be provided
   end-to-end for articles passing through such gateways, the protocol
   described here has been designed so that it will also work in the
   Email environment. If it should be found to have further applications
   in the Email environment, then that would be an added bonus.

   An existing experimental protocol "pgpverify" [PGPVERIFY] is already
   in widespread use for authenticating Control messages for creating
   and removing newsgroups within Usenet, and has proven itself very
   successful in mitigating the effects of malicious attacks against the
   integrity of Usenet.  This present proposal is largely based upon
   pgpverify; however, pgpverify is unsuitable for more widespread use
   as it stands because it is unable to cope with folded headers and
   with the changes that mail messages in particular are likely to
   undergo during transport. A second similar experimental protocol
   "pgpmoose" [PGPMOOSE] is also currently in use for protecting
   moderated newsgroups against unauthorized postings.


   There also exist protocols for the cryptographic signature of bodies
   of articles, notably S/Mime and PGP/Mime [RFC 2015], and it is
   moreover common to sign such bodies using PGP alone without the use
   of Mime [RFC 2045] et seq at all. However, these protocols cannot, by
   their nature, be used to sign headers. Moreover, since the signature
   is applied after any Content-Transfer-Encoding [RFC 2045], it may be
   impossible to verify the signature if the Content-Transfer-Encoding
   should be changed as the message passes through a succession of sites
   during transport. Nevertheless, this present proposal does not
   attempt to usurp those protocols, but merely provides the means to
   sign headers, both of complete messages and of headers embedded in
   Mime messages and multiparts.

[This document has been designed to fit on top of the drafts currently
in preparation for Email [MESSFOR] and for News [USEFOR].  It is
expected that at least the Email draft will have progressed to the RFC
stage by the time the present document is complete, at which time all
references to [MESSFOR] in the present text will be replaced by
references to that RFC. If it is thought wise to issue this document
before [USEFOR] is complete, then that reference will have to be to [RFC
1036] instead.]

1.2.  Notations and Conventions

1.2.1.  Requirements notation

   Certain words, when capitalized, are used to define the significance
   of individual requirements. The key words "MUST", "SHOULD", "MAY" and
   the same words followed by "NOT" are to be interpreted as described
   in [RFC 2119].

1.2.2.  Syntactic notation

   This document uses the Augmented Backus Naur Form described in [RFC
   2234].  A discussion of this is outside the bounds of this document,
   but it is expected that implementors will be able to quickly
   understand it with reference to the defining document.

1.3.  Overview

   This proposal makes provision for Signed headers to be included in
   news articles and in Mime messages and multiparts. A Signed header
   provides a cryptographic signature over a named set of other headers,
   including lower level headers contained in Mime messages and
   multiparts below the current level. Such signatures can give
   assurance to a recipient who verifies them that those headers have
   not been changed or added to in transit, and/or that the article was
   indeed sent by its purported originator.

   The bodies of articles, Mime messages and multiparts are not directly
   included in the Signature. Rather, the intention is that each such
   body part should have a Content-MD5 (or similar) header computed for
   it, and that header should then be included in the Signature instead.


   There is also provision for Verified headers which may be added by
   agents that have checked a Signed header. Verified headers may
   themselves be included in further Signed headers; this may be
   especially useful in the case of gateways which find it necessary to
   change an article in ways that invalidate an original signature.

   Every effort has been made to ensure that signatures remain
   verifiable in spite of all reasonable (and even unreasonable) changes
   to which they may be subjected in transit. These include changes to
   the Content-Transfer-Encoding of body parts (a principal reason for
   including them only via the Content-MD5 header), changes in the order
   of headers and of their layout, and encodings and re-encodings of
   unusual character sets. This is to be achieved by converting headers
   into a canonical form before they are signed. New headers, yet to be
   invented, need present no problem, and there is no commitment to any
   particular character set (provided header-names remain in US-ASCII,
   as at present).

   Provision is made for different protocols which may be required in
   the future.  However, this proposal defines just one, recommended
   protocol, and it is not desirable that other protocols should be
   defined unless and until serious deficiencies in the existing ones
   have been revealed.

2.  Basic Structure of Authenticating Headers

   A Signed or a Verified header may appear in the headers of a news
   article or a mail message, or in the headers of a Mime multipart
   sub-part or of a Mime message/rfc822 object (or indeed of any similar
   Mime object yet to be invented). In all cases, the term "current
   level" encompasses the entire set of headers in that same object.
   Where the headers at the current level include a "Content-Type:
   multipart/*" or "Content-Type:  message/*" header, lower-level
   headers can arise within its sub-parts.

2.1.  Syntax of the Signed header

      Signed         = "Signed" ["-" DIGIT9] ":" 1*SP header-ref-list
                          1*( ";" header-parameter ) CRLF
      DIGIT9         = %x31-39                           ; 1..9
      header-ref-list= header-ref *( [CFWS] "," [CFWS] header-ref )
      header-ref     = [ "+" / "-" ] ( field-name *( "/" 1*DIGIT )
                       / "mail-standard" / "news-standard" )
      field-name     ; see [MESSFOR]
      CFWS           ; see [MESSFOR]
      FWS            ; see [MESSFOR]
      header-parameter
                     = attribute "=" value
      attribute      = signed-token / x-token
      signed-token   = "protocol" / "key" / "sig" /
                     <Any other token defined for a particular protocol>
      value          = token / quoted-string
      x-token        = [CFWS] <The two characters "X-" or "x-" followed,
                       with no intervening white space, by any token>
                       [CFWS]
      token          = [CFWS] 1*<any (US-ASCII) CHAR except SP, CTLs,
                                 or tspecials> [CFWS]
      tspecials      = "(" / ")" / "<" / ">" / "@" /
                       "," / ";" / ":" / "\" / DQUOTE /
                       "/" / "[" / "]" / "?" / "="
      quoted-string  ; see [MESSFOR]
      protocol-value = ietf-token / x-token
      ietf-token     = <An extension token defined by a standards-track
                        or experimental protocol RFC and registered
                        with IANA>
      key-id-value   = token
      signature-value= DQUOTE [FWS] 1*( btext [FWS] ) DQUOTE
      btext          = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "="
                       ; base 64 chars

   The header-parameters MUST include a "protocol" parameter and a "sig"
   parameter, of which the "sig" parameter MUST be the last parameter
   and MUST NOT be followed by CFWS (though it MAY be followed by WS).

        NOTE: The requirement for an explicit SP after the ":" is to
        ensure compatibility with the syntax of Netnews [USEFOR]; it is
        not strictly necessary for Email.

   The use of a DIGIT9 in the Signed header allows for 10 distinct such
   headers at any one level. This is more than sufficient for the
   intended usage (it would be most unusual to get beyond Signed-2)
   whilst still permitting implementations to check header-names against
   a fixed list of valid names. There MUST NOT be more than one Signed
   header with no DIGIT9, or the same DIGIT9, within one set of headers.
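
   As an illustration only, the check of header-names against a fixed
   list, and the rule that no two Signed headers may share the same
   DIGIT9 (or both lack one), might be sketched as follows. The names
   SIGNED_NAME and signed_slots are hypothetical and are not part of
   this proposal.

```python
import re

# Matches "Signed" or "Signed-1" .. "Signed-9". Header-names are treated
# as case-insensitive here; DIGIT9 excludes "0".
SIGNED_NAME = re.compile(r"^Signed(?:-([1-9]))?$", re.IGNORECASE)


def signed_slots(header_names):
    """Return the slots used by Signed headers at one level.

    None stands for the Signed header with no DIGIT9. Raises ValueError
    if two Signed headers occupy the same slot.
    """
    seen = []
    for name in header_names:
        match = SIGNED_NAME.match(name)
        if match is None:
            continue  # not a Signed header (e.g. "From", "Signed-0")
        slot = int(match.group(1)) if match.group(1) else None
        if slot in seen:
            raise ValueError("duplicate Signed header: " + name)
        seen.append(slot)
    return seen
```

   Names such as "Signed-0" or "Signed-10" fall outside the syntax and
   are simply not treated as Signed headers by this sketch.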

   The header-ref-list indicates those header-refs, at or below the
   current level, which are covered by the signature. The ordering of
   this list is significant. A header-ref prefixed by a "+", or not
   prefixed at all, indicates a header-ref to be added to the list
   defined by those preceding it, and a header-ref prefixed by "-"
   indicates a header-ref to be removed from the header-refs defined by
   the list preceding it.

   Tokens are case-insensitive.  "Foobar" is the preferred protocol
   defined by this proposal.  It is desirable to keep the number of
   recognized protocols to an absolute minimum, and it is anticipated
   that further protocols would only be needed in the event that serious
   cryptographic deficiencies were to be found in the existing ones.
[Obviously, "foobar" is just a placeholder for whatever name is finally
chosen.]

   The "key" parameter identifies the key used to generate the signature
   in a notation dependent upon the protocol (but commonly "0x" followed
   by hexadecimal digits). The CFWS following it MAY include a comment
   containing an identification of the person or entity which created
   the signature.



   The header-ref "news-standard" is a macro representing a set of
   common headers that SHOULD normally be included when signing the
   headers of a Netnews article, and is defined as the list

      Date, Newsgroups, Distribution, Message-ID, From, Reply-To,
      Followup-To, References, Subject, Keywords, Control, Content-Type,
      Content-ID

   The header-ref "mail-standard" performs the same function for mail
   messages, and is defined as the list

      Date, From, Reply-To, To, Cc, In-Reply-To, References, Subject,
      Keywords, Content-Type, Content-ID

        NOTE: Those lists have carefully excluded those headers (such as
        Sender and Content-Transfer-Encoding) which are liable to be
        added or altered by sites downstream from the one which
        generated the Signed header. If some header-ref in the list
        matches no header in the actual article, then it comprises an
        assertion that no such header was present when the article was
        signed. Headers which are routinely added to or altered as the
        article progresses through transports (such as Path, Received
        and Xref) SHOULD NOT be included in a header-ref-list, and
        neither should any header which appears twice in the set of
        headers. A header-ref prefixed by "-" may be used to exclude any
        header-ref from one of the standard lists.
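
   The handling of the "+" and "-" prefixes and of the two macros can
   be sketched as below. This is a minimal illustration under the
   assumption that header-refs are compared case-insensitively and
   that a "-" prefix removes only entries already accumulated from the
   preceding part of the list; the function name reduce_header_refs is
   hypothetical.

```python
NEWS_STANDARD = ["Date", "Newsgroups", "Distribution", "Message-ID", "From",
                 "Reply-To", "Followup-To", "References", "Subject",
                 "Keywords", "Control", "Content-Type", "Content-ID"]
MAIL_STANDARD = ["Date", "From", "Reply-To", "To", "Cc", "In-Reply-To",
                 "References", "Subject", "Keywords", "Content-Type",
                 "Content-ID"]
MACROS = {"news-standard": NEWS_STANDARD, "mail-standard": MAIL_STANDARD}


def reduce_header_refs(refs):
    """Expand macros, apply "+"/"-" prefixes in order, drop duplicates."""
    result = []
    for ref in refs:
        remove = ref.startswith("-")
        name = ref[1:] if ref[:1] in ("+", "-") else ref
        # A macro expands to its list; anything else stands for itself.
        for item in MACROS.get(name.lower(), [name]):
            key = item.lower()
            if remove:
                result = [r for r in result if r.lower() != key]
            elif key not in [r.lower() for r in result]:
                result.append(item)
    return result
```

   Because the ordering is significant, "-Subject, Subject" leaves
   Subject in the reduced list, whereas "Subject, -Subject" does not.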

2.2.  Semantics of the Signed header

   Where the headers at the current level include a "Content-Type:
   multipart/*" or "Content-Type: message/*" header, lower-level headers
   within its sub-parts may be referenced as follows:

   (i)  A header-ref not postfixed by any "/ DIGIT"s references the
        header of that name, if any, at the current level. Header-refs
        are, for this purpose, considered as case-insensitive.

   (ii) A header-ref of the form "XXXX/<m>" (or "XXXX/<m>/<n>..."),
        where <m> and <n> are numbers and the current level contains a
        "Content-Type: multipart/*" header, references the header that
        would be referenced by "XXXX" alone (or by "XXXX/<n>...") in the
        <m>th sub-part of that multipart, that sub-part now being
        regarded as the current level.

   (iii)A header-ref of the form "XXXX/1", where the current level
        contains a "Content-Type: message/rfc822" header (or any other
        message type which provides for its own set of headers),
        references the header that would be referenced by "XXXX" alone
        in that message object.

   (iv) A header-ref that does not match up with multipart or message
        Content-Type headers as indicated above MUST NOT be used.



   (v)  For example "Content-MD5/3/2" references the Content-MD5 header
        of the second part of a multipart, which is itself the third
        part of a multipart established at the current level.
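
   The rules above can be illustrated with the standard Python "email"
   package, whose parser presents both multipart/* and message/rfc822
   objects as a list of sub-parts. This is a sketch only; the function
   name resolve_header_ref is hypothetical, and no attempt is made to
   handle headers that occur more than once.

```python
from email import message_from_string


def resolve_header_ref(msg, ref):
    """Return the value of the header named by ref, or None if absent.

    "Name/m/n" descends to the m-th sub-part (counting from 1) of the
    current level and then to the n-th sub-part of that, per rules
    (ii) and (iii). A ref that does not match the MIME structure is
    rejected, per rule (iv).
    """
    name, *path = ref.split("/")
    for step in path:
        if not msg.is_multipart():
            raise ValueError("%r does not match the MIME structure" % ref)
        parts = msg.get_payload()
        index = int(step)
        if not 1 <= index <= len(parts):
            raise ValueError("%r names a sub-part that does not exist" % ref)
        msg = parts[index - 1]
    return msg.get(name)  # header lookup is case-insensitive
```

   A result of None corresponds to the case where the header-ref
   matches no header in the actual article.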

   A protocol, as established by this proposal or by any extension to
   it, comprises two parts: a "canonicalization algorithm" and a
   "cryptographic algorithm".

   The signature of a Signed header is constructed in accordance with a
   given header-ref-list as follows:

   1. A partial Signed header is constructed from that header-ref-list
      and such header-parameters (excluding "sig") as are required by
      the protocol, including at least a "protocol" parameter and, most
      likely, a "key" parameter identifying the cryptographic key used
      (possibly followed by a comment indicating the person or entity
      responsible), all followed by a CRLF.

   2. The header-ref-list is reduced by expanding the macros "mail-
      standard" and/or "news-standard", removing from the preceding part
      of the list any header-ref prefixed by a "-", and removing any
      duplicates.

   3. The partial Signed header followed by all the headers referenced
      by the reduced header-ref-list (being headers at the current level
      or encapsulated within multiparts at any lower level and taken in
      their order within the header-ref-list) are concatenated to
      produce a list of headers to be signed.

   4. The list of headers to be signed is subjected to the
      canonicalization algorithm of the protocol to produce a
      canonicalized list.

   5. The canonicalized list is subjected to the cryptographic algorithm
      of the protocol to produce an octet stream representing the
      signature.

   6. If the octet stream as produced by the cryptographic algorithm is
      not already in the form of base64 characters, it is now encoded in
      base64 [RFC 2045].  A "sig" parameter is appended to the partial
      Signed header, its value consisting of a quoted-string containing
      the base64-encoded octet stream, split into convenient lines by
      the insertion of FWS.

   7. The Signed header thus constructed is then incorporated into the
      set of headers at the current level.
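
   Steps 1 and 3 to 6 above might be sketched as follows. The
   canonicalization and cryptographic algorithms are passed in as
   caller-supplied functions, since they belong to the protocol and
   not to this general procedure. All names here are hypothetical,
   the reduced list of step 2 is assumed to have been computed
   already, and the exact placement of whitespace around the
   parameters is an assumption of the sketch.

```python
import base64


def build_signed_header(ref_list, reduced_refs, headers, protocol, key_id,
                        canonicalize, sign):
    """Return a complete Signed header (hypothetical helper).

    headers      maps lower-cased header-names to complete header lines
    canonicalize takes a list of header lines and returns bytes
    sign         takes bytes and returns the signature as bytes
    """
    # Step 1: the partial Signed header, without "sig", ending in CRLF.
    partial = "Signed: %s; protocol=%s; key=%s\r\n" % (
        ", ".join(ref_list), protocol, key_id)
    # Step 3: the partial header, then the referenced headers in list
    # order. A ref that matches no header contributes nothing.
    to_sign = [partial] + [headers[r.lower()] for r in reduced_refs
                           if r.lower() in headers]
    # Steps 4 and 5: canonicalize, then sign.
    signature = sign(canonicalize(to_sign))
    # Step 6: base64-encode and split into lines joined by FWS.
    b64 = base64.b64encode(signature).decode("ascii")
    chunks = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return partial[:-2] + ';\r\n sig="' + "\r\n ".join(chunks) + '"\r\n'
```

   Step 7, the incorporation of the result into the set of headers at
   the current level, is left to the caller.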

   The signature of a Signed header is verified as follows:

   1. The "sig" parameter is removed from the Signed header to give a
      partial Signed header.




   2-4.The corresponding steps of the process that constructed the
      header are taken, producing a canonicalized list.

   5. The public key identified according to the "protocol" parameter is
      now used by the cryptographic algorithm of that protocol to verify
      the signature. This may result in a simple pass-fail, or it may
      return some indication of the privileges (such as the authority to
      issue certain news control messages or to manage some mailing
      list) enjoyed by the owner of that key.
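
   Step 1 of the verification, the separation of the "sig" parameter
   from the rest of the header, might be sketched as below. The
   regular expression relies on "sig" being the last parameter and on
   its value being a quoted-string of base64 characters and FWS. The
   names SIG_PARAM and split_signed_header are hypothetical, and the
   cryptographic check itself is left to the protocol.

```python
import base64
import re

# The "sig" parameter is the last one and is followed by at most WS
# and the terminating CRLF.
SIG_PARAM = re.compile(r';\s*sig\s*=\s*"([^"]*)"\s*$', re.IGNORECASE)


def split_signed_header(header):
    """Return (partial_header, signature_bytes) for a Signed header."""
    match = SIG_PARAM.search(header)
    if match is None:
        raise ValueError("no trailing sig parameter")
    # Restore the CRLF that terminated the partial header when signed.
    partial = header[:match.start()] + "\r\n"
    # Discard the FWS that was inserted to split the value into lines.
    b64 = re.sub(r"\s+", "", match.group(1))
    return partial, base64.b64decode(b64, validate=True)
```

   The partial header so obtained is then put through steps 2 to 4
   exactly as during construction, and the resulting canonicalized
   list is checked against the decoded signature.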

   The purpose of a Signed header is solely to establish that the
   headers referenced in it were present in an article when that article
   passed through the hands of the person or entity that generated the
   signature (and hence that it did indeed pass through those hands). It
   SHOULD NOT be taken as an endorsement of whatever is contained in the
   body of the article. If the contents of the body require such
   endorsement, then the body SHOULD be signed separately, for example
   in accordance with PGP/Mime [RFC 2015].

   Signatures will typically be generated by the originators of articles
   (to prove the origin), by moderators of moderated newsgroups (to
   testify to their Approved header), by managers of mailing lists, and
   by gateways. They SHOULD NOT be generated by intermediate transports
   and relayers through which the article might pass. This is intended
   to be an end-to-end protocol, and signatures SHOULD ONLY be added
   when new, hitherto unsigned, information is added. Moreover, the set
   of headers included within the signature SHOULD be no more than is
   necessary to achieve the security desired.

        NOTE: It will be observed that no provision has been made to
        include the bodies of an article or of its sub-parts in the
        signature. If (as will indeed often be the case) it is required
        to attest that the body (or sub-part) dispatched along with the
        set of headers is the same as the body that was delivered at the
        far end, then the proper procedure is to construct a Content-MD5
        header [RFC 1864] for that body (or sub-part) and to include
        that Content-MD5 amongst the headers that are signed. Doing it
        this way confers three advantages:
        a) The Content-MD5 header is constructed in such a way that it
        is immune to changes of Content-Transfer-Encoding to which an
        article, or its sub-parts, may be subjected during transport.
        b) Given that many user agents already routinely construct a
        Content-MD5 header, and verify it on receipt (a practice much to
        be commended), it should be possible to generate a Signed header
        without an extra pass through the entire body (especially in the
        common case where there are no sub-parts).  This applies
        particularly in the case of additional signatures by moderators
        or mailing list managers, who may not need to examine the body
        at all.
        c) If a Content-MD5 header should fail to verify (perhaps
        because of some transmission error) the verification of a Signed
        header might still succeed, giving the recipient at least some
        partial information as to where any problem might lie.


        NOTE: If, at some future time, a Content-SHA1 header (or any
        similar header based upon a different hashing algorithm) should
        be invented, it could equally well be used for this purpose.
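
   For illustration, the value of a Content-MD5 header as defined in
   [RFC 1864] is the base64 encoding of the 128-bit MD5 digest of the
   body in its canonical form, computed before any Content-Transfer-
   Encoding is applied. A minimal sketch, in which the function name
   content_md5 is hypothetical and the caller is assumed to supply the
   body already in canonical form:

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the value for a Content-MD5 header per RFC 1864.

    body is the canonical content (for text, with CRLF line endings)
    as it stands before any Content-Transfer-Encoding is applied.
    """
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
```

   The resulting header is then named in the header-ref-list (for
   example as "Content-MD5", or "Content-MD5/2" for a sub-part) so
   that the body is covered indirectly by the signature.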

2.3.  Syntax of the Verified header

      Verified       = "Verified" ["-" DIGIT9] ":" 1*SP name-addr
                          *( ";" header-parameter ) CRLF
      name-addr      ; <see [MESSFOR]>
      attribute      =/ verified-token
      verified-token = "signature" / "hashcheck"
      signature-value= "good" / "FAILED"
      hashcheck-value= DQUOTE ( "good" / "FAILED" )
                           FWS header-ref-list DQUOTE

   The use of a DIGIT9 in the Verified header allows for 10 distinct
   such headers in one article.  Each Verified header MUST match some
   Signed header with the same DIGIT9 in that same set of headers. There
   MAY be more than one Verified header with the same DIGIT9 within one
   set of headers (but observe that it would not then be possible to
   include those headers in a further Signed header).

   Tokens used for attributes are case-insensitive. The only parameters
   defined by this proposal are the "signature" and "hashcheck"
   parameters. Other parameters permitted by the syntax are for the
   purpose of future extensions to this proposal, and should be ignored
   except as defined in such extensions.  The absence of a "signature"
   parameter should be taken as indicating that the verification had
   succeeded. The "hashcheck" parameter is to indicate that a Content-
   MD5 (or similar) header identified in the header-ref-list had been
   verified, or not as the case may be.
[Do we also want a "confidence" parameter for the verifier to express
his certainty of the identity of the original Signer, and if so, what
notation to use?]
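
   By way of example, a Verified header carrying both of the defined
   parameters might be generated as sketched below. The function name
   and its arguments are hypothetical; only the "signature" and
   "hashcheck" parameters and their "good" and "FAILED" values are
   taken from the syntax above.

```python
from email.utils import formataddr


def build_verified_header(name, addr, slot=None, signature_ok=True,
                          hashcheck=None):
    """Return a Verified header line (hypothetical helper).

    slot      is None or 1..9, matching the DIGIT9 of the Signed header
    hashcheck is None or a pair (ok, header_refs) reporting the check
              of the Content-MD5 (or similar) headers named
    """
    if slot is not None and not 1 <= slot <= 9:
        raise ValueError("DIGIT9 must be in the range 1..9")
    field = "Verified" if slot is None else "Verified-%d" % slot
    params = ["signature=" + ("good" if signature_ok else "FAILED")]
    if hashcheck is not None:
        ok, refs = hashcheck
        params.append('hashcheck="%s %s"' % ("good" if ok else "FAILED",
                                             ", ".join(refs)))
    return "%s: %s; %s\r\n" % (field, formataddr((name, addr)),
                               "; ".join(params))
```

   For instance, a verifier identified as "News Admin
   <news@example.com>" that has checked Signed-1 and the top-level
   Content-MD5 header would produce a "Verified-1" header with
   signature=good and hashcheck="good Content-MD5".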

2.4.  Semantics of the Verified header

   The Verified header is intended to be added to an article by an agent
   through which the article passes, and serves as an assertion that the
   corresponding Signed header has been cryptographically verified by
   the person or entity identified in the name-addr (or otherwise if the
   "FAILED" value is present).  The addr-spec contained in that name-
   addr MUST be a valid email address by which that person or entity may
   be contacted. The original Signed header MUST NOT be removed from the
   article. The Verified header (supposing it is the only one present
   with that particular DIGIT9, if any) MAY itself be included in a
   further Signed header added at the same time.

        NOTE: The purpose of a Verified header is to save the ultimate
        recipient the trouble of verifying the cryptographic signature
        himself (which can be time consuming, and may require knowledge
        of public keys not in his possession).  Such a verification, if
        performed close to the ultimate recipient (such as by the news
        or mail server to which he connects) could normally be regarded
        as adequate evidence of authenticity, even if not signed itself.
        It would be hard (certainly in the case of Netnews) for a
        malicious interloper to cause such a verification to appear
        bearing the identity of the local server of each ultimate
        recipient.

        NOTE: The Verified header is also useful in the case that a
        gateway (or a moderator) makes some change to an article that
        renders an original Signed header invalid. Such a gateway can
        therefore certify that the original form of the Signed header
        had been verified, and can then re-sign the article (including
        his added Verified header). Likewise, a site (such as the
        originator's own server) with a well known public key can verify
        and re-sign an article whose originator's public key may be less
        well known. However, Verified headers SHOULD NOT be added as
        routine by other intermediate sites.

   It is normally the business of the reading agent of the ultimate
   recipient to check the correctness of a Content-MD5 or similar
   header. Nevertheless, an earlier agent that has added a Verified
   header and also checked such a Content-MD5 header MAY so indicate by
   including a "hashcheck" parameter.

3.  Protocol definition

3.1.  Requirements for canonicalization algorithms

   It is a sad fact of life that those implementing agents for handling
   Netnews and Email cannot resist the temptation to "improve" articles
   passed through them by rewriting headers that are thought not to
   conform to some real or supposed standard. Experience shows that, in
   the majority of cases, such tinkering makes matters worse rather than
   better, and for that reason [USEFOR] and, to a lesser extent,
   [MESSFOR] and [SMTP] try to forbid it, especially when perpetrated by
   relaying and transport agents (there are arguments in favour of
   allowing injecting agents and other agents close to the originator to
   do some limited cleanups, especially where it is impractical to
   return the article to the originator for correction).

   Furthermore, in the case of Email it is often required for the
   transport protocols to modify articles en route, most notably when
   articles containing octets with the 8th bit set have to be passed
   through a channel that permits only 7bit.

   It is a further sad fact of life that agents which make such changes
   are not going to go away just because some standard says so.
   Therefore, the canonicalization algorithm SHOULD endeavour to enable
   the headers of articles to be signed and verified in accordance with
   this proposal in spite of such tinkerings, insofar as they can be
   anticipated. The following list indicates some common practices which
   are worth detecting and protecting against.

     o Headers may be re-folded to fit within some preferred overall
       line length.  This may result in the creation of whitespace where
       none existed before.
     o Trailing whitespace may be removed, and line endings changed
       to/from CRLF.
     o Header-names may be converted into some usual canonical form
       (e.g.  "Mime-Version" into "MIME-Version").
     o Phrases, or parts thereof, may be converted to or from quoted-
       strings.
     o Date-times may be rewritten in some preferred format, or into
       some preferred timezone.
     o Headers with non-ASCII characters may be converted to or from the
       notation defined in [RFC 2047].
   Observe that there is no canonical way to do this conversion and it
   is, moreover, frequently performed in contexts where it is not
   strictly allowed.
[Other contributions to this list welcomed.]
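
   To illustrate why a canonical form helps, the following sketch
   maps several of the variations listed above (re-folding, trailing
   whitespace, line endings, and the case of header-names) onto a
   single octet stream. It is emphatically NOT the "foo" algorithm of
   this proposal; the function name and the particular normalizations
   chosen are assumptions made purely for the sake of the example.

```python
import re


def normalize_header(raw: str) -> bytes:
    """Illustrative normalization only; not the "foo" algorithm."""
    name, _, content = raw.partition(":")
    # Undo folding: a line ending followed by whitespace becomes one SP.
    content = re.sub(r"\r?\n[ \t]+", " ", content)
    # Collapse runs of whitespace, then trim both ends (which also
    # removes the terminating CRLF or LF).
    content = re.sub(r"[ \t]+", " ", content).strip()
    return (name.strip().lower() + ":" + content).encode("utf-8")
```

   Under this sketch "Mime-Version: 1.0" and "MIME-Version:  1.0 "
   yield the same octets. It does not address whitespace created
   where none existed before, requoted phrases, rewritten date-times
   or [RFC 2047] encodings, all of which call for further measures.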

   Since the slightest change to a canonicalization algorithm will
   render it inoperable with previous versions, such an algorithm MUST
   NOT be changed once it has been defined by this proposal, or any
   extension thereof. In the event of some inadequacy being found, it
   would be necessary to devise and standardize a new algorithm, a task
   not to be undertaken lightly. For this reason, canonicalization
   algorithms SHOULD be designed to cope with the widest possible range
   of headers, including those not yet invented. Therefore, they SHOULD
   NOT, so far as possible, rely on the ability to parse any particular
   header.

        NOTE: A canonicalization algorithm is required simply to produce
        an octet stream for submission to the cryptographic algorithm.
        That stream does not have to be human readable, nor does it have
        to be a syntactically-correct header, nor does it have to be
        convertible back into the original header, or into any correct
        header at all. Insofar as many original headers can, in
        principle, be mapped into the same octet stream, this in no way
        reduces the utility of the algorithm, even though it might
        enable conspiracy theorists to imagine, and even implement,
        various sorts of covert channels for use by malicious
        interlopers.

3.2.  The Foobar protocol

[Suggestions for a proper name on a postcard, please, to /dev/null for
now.]

   The "foobar" protocol comprises a canonicalization algorithm
   "foo" and a cryptographic algorithm "bar".

3.2.1.  The Foo canonicalization algorithm

   For the purposes of this algorithm, the headers Subject, Comments,
   Organization and Summary, and all headers starting with "X-", are to
   be considered "unstructured" and all other headers "structured"
   (whether or not they were so described in any other standard).
   Headers are considered to be constrained to the following syntax:

      structured-header
                    = header-name ":"
                         1*SP structured-header-content CRLF
      unstructured-header
                    = header-name ":"
                         1*SP unstructured-header-content CRLF
      header-name   = 1*name-character *( "-" 1*name-character )
      name-character= ALPHA / DIGIT
      structured-header-content
                    = *structured-header-zone
      unstructured-header-content
                    = unstructured-header-zone
      structured-header-zone
                    = neutral-zone / quoted-zone / sharp-zone /
                      square-zone / comment-zone
      unstructured-header-zone
                    = 1*( FWS / encoded-word / <any visible character> )
      neutral-zone  = 1*( FWS / encoded-word /
                          <any character except DQUOTE, "<", "[", "("> )
      quoted-zone   = DQUOTE *( FWS /
                                <any character except unquoted DQUOTE> )
                          DQUOTE
      sharp-zone    = "<" *( FWS /
                             <any character except unquoted ">"> ) ">"
      square-zone   = "[" *( FWS /
                             <any character except unquoted "]"> ) "]"
      comment-zone  = "(" *( FWS / encoded-word / comment-zone /
                             <any character except unquoted ")"> ) ")"
      encoded-word  = "=?" pure-token "?" pure-token "?"
                         1*<any printable US-ASCII character other than
                            "?" or SP> "?="
      pure-token    = 1*<any (US-ASCII) CHAR except SP, CTLs,
                         or tspecials>

     o where '<any visible character>' means any octet other than those
       representing the US-ASCII characters NUL, CR, LF, TAB and SP,
     o where 'except unquoted "x"' means except any "x" not immediately
       preceded by a "\" and thus constituting a quoted-pair, and
     o where an encoded-word does not include "(" or ")" when in a
       comment-zone, and does not include DQUOTE, "<", "[", or "(" when
       in a neutral-zone.
   Observe that certain header-names containing non-alphanumeric
   characters, and permitted by [MESSFOR] (though never used in
   practice) are excluded from this protocol. Moreover, it is not
   assumed that this protocol will work on any of the obsolete syntax
   defined by [MESSFOR].
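
   The header-name rule and the structured/unstructured split can be
   sketched in Python as follows. This is an illustrative sketch only;
   the names HEADER_NAME, UNSTRUCTURED and classify are assumptions of
   this example and appear nowhere else in this proposal.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
# name-character = ALPHA / DIGIT
HEADER_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

# Headers the Foo algorithm treats as unstructured, in addition to
# every header whose name starts with "X-".
UNSTRUCTURED = {"subject", "comments", "organization", "summary"}


def classify(name):
    """Return 'unstructured' or 'structured' for a header-name.

    Raises ValueError when the name lies outside the header-name
    syntax and is therefore excluded from the protocol.
    """
    if not HEADER_NAME.fullmatch(name):
        raise ValueError("header-name not covered: %r" % name)
    lowered = name.lower()
    if lowered in UNSTRUCTURED or lowered.startswith("x-"):
        return "unstructured"
    return "structured"
```

   The comparison is made on the lowercased name, as the
   canonicalization itself lowercases every header-name.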

        NOTE: All known Email and Netnews headers (and a lot more
        besides) are encompassed within this syntax. Observe that the
        various zones cannot possibly overlap, and that any encoded-word
        must be fully contained within its zone.  All encoded-words
        permitted by [RFC 2047] (and more besides) are covered. The
        structure is easily parsed by a straightforward state machine
        (though the nesting of comment-zones is a nuisance, as is the
        impossibility of detecting whether a sequence beginning "=?" is
        really an encoded-word until the matching "?=" is reached).
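
   A minimal Python sketch of such a state machine is given below. It
   is an illustration under stated assumptions, not the model
   implementation: it splits structured-header-content into zones,
   honours quoted-pairs inside delimited zones, and deliberately
   ignores encoded-words. The function name split_zones is
   hypothetical.

```python
def split_zones(content):
    """Split structured-header-content into (zone, text) pairs.

    Zone names: 'neutral', 'quoted', 'sharp', 'square', 'comment'.
    Raises ValueError when a zone has no closing delimiter.
    Encoded-words are not recognised in this sketch.
    """
    closers = {'"': ('quoted', '"'), '<': ('sharp', '>'),
               '[': ('square', ']')}
    zones = []
    i, n = 0, len(content)
    while i < n:
        start = i
        ch = content[i]
        if ch in closers:
            kind, close = closers[ch]
            i += 1
            while i < n and content[i] != close:
                # a backslash starts a quoted-pair: skip both octets
                i += 2 if content[i] == '\\' else 1
            if i >= n:
                raise ValueError("unclosed %s-zone" % kind)
            i += 1                      # consume the closing delimiter
        elif ch == '(':
            kind, depth = 'comment', 0
            while i < n:
                c = content[i]
                if c == '\\':           # quoted-pair inside a comment
                    i += 2
                    continue
                if c == '(':
                    depth += 1          # comment-zones may nest
                elif c == ')':
                    depth -= 1
                i += 1
                if depth == 0:
                    break
            if depth != 0:
                raise ValueError("unclosed comment-zone")
        else:
            kind = 'neutral'
            while i < n and content[i] not in '"<[(':
                i += 1
        zones.append((kind, content[start:i]))
    return zones
```

   Because each zone is closed before the next is opened, the zones in
   the returned list never overlap, and joining their texts reproduces
   the input.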

   Each header to be included in the algorithm, which will in general
   consist of several lines (those after the first commencing with
   whitespace), is processed as follows:

   1. The header-name at the start of the header is converted to
      lowercase and the whitespace following the colon (if any) is
      replaced by a single SP.

   2. Within each unstructured-header-zone and each comment-zone, all
      instances of FWS are replaced by a single SP; within each
      neutral-, quoted-, sharp- or square-zone, all instances of FWS are
      omitted (thus the header has now been unfolded into a single
      line). Any whitespace at the end of the header is removed, and it
      is ensured that the header ends with a single CRLF.

   3. The DQUOTEs (ASCII '"') enclosing each quoted-zone are removed
      (but not any quoted DQUOTE or any DQUOTE within other zones so
      that, in particular, they are not removed within msg-ids).

   4. Any date-time occurring in a Date, Resent-Date or Expires header
      (but not in any other header) is converted into the number of
      seconds since the start of January 1st 1970 UTC, expressed as a
      decimal number without leading zeroes, and as more precisely
      defined by the POSIX mktime routine.
[Can someone give me a reference to the proper POSIX document?]

   5. Any encoded-word (where allowed by the above syntax, and whether
      or not its length is more than 75 characters) is replaced by the
      sequence of octets obtained by decoding it. Moreover, where two
      adjacent encoded-words are separated by whitespace, that
      whitespace is removed (see [RFC 2047]).

        NOTE: The decoding of encoded-words must take place last,
        because it could produce arbitrary sequences of octets (when
        decoding into UTF-16, for example) which might then be confused
        with US-ASCII characters such as DQUOTE, etc. Whitespace needs
        to be removed entirely from structured headers because it is
        possible it may have been introduced by folding in unexpected
        places en route, subsequent to the original signing.
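
   The simplest cases of the steps above can be sketched in Python as
   follows. The sketch covers only steps 1, 2 and 4 for unstructured
   and date headers; it performs no encoded-word decoding and does not
   handle structured headers. The function names are assumptions of
   this example, and the standard library's RFC 2822 date parser
   stands in for the POSIX routine.

```python
import re
from datetime import timezone
from email.utils import parsedate_to_datetime

# Folding whitespace, including the line breaks of a folded header.
FWS = re.compile(r"[ \t\r\n]+")


def canon_unstructured(name, content):
    """Steps 1 and 2 for an unstructured header (no step 5)."""
    text = FWS.sub(" ", content).strip(" ")
    return (name.lower() + ": " + text + "\r\n").encode("latin-1")


def canon_date(name, content):
    """Steps 1 and 4 for a Date, Resent-Date or Expires header."""
    moment = parsedate_to_datetime(FWS.sub(" ", content).strip(" "))
    if moment.tzinfo is None:
        # A "-0000" zone parses as a naive datetime; treat it as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    seconds = int(moment.timestamp())
    return (name.lower() + ": " + str(seconds) + "\r\n").encode("ascii")
```

   Applied to the Date header of the example in Section 5.2, "Sat, 13
   Feb 1999 22:59:46 +0000", canon_date yields the octets "date:
   918946786" followed by CRLF.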

   If, during signing, a header is found not to conform to the given
   syntax (in particular, if the closing delimiter of some zone is not
   found), then the signing MUST be aborted (and it MAY be aborted if
   the header is malformed for some other reason). When verifying a
   signature, however, an implementation MAY attempt to continue even
   when the final zone of a header has no closing delimiter.

        NOTE: If an internet mail message in the format defined by
        [MESSFOR] is converted into X.400 mail by a gateway conforming
        to [RFC 1327] and then back into internet mail, then it is
        likely that any signature made in accordance with this proposal
        will fail to verify. For example, comments in headers containing
        addresses (such as in From, Reply-To, etc.) may be converted
        into phrases and moved in front of the addr-spec, or even
        removed entirely, and thus the canonicalized form of the message
        will have been changed.  This old convention, for storing the
        Real Name of the person associated with the address in a
        following comment, is now deprecated by both [MESSFOR] and
        [USEFOR], but even where phrases are used for this purpose it is
        possible that other changes to the message will still render the
        signature unverifiable. Note that there is in any case no
        expectation that an internet mail message signed according to
        this proposal will ever be able to be verified once it has been
        passed permanently into an X.400 system, nor vice versa.

3.2.2.  The Bar cryptographic algorithm

[Open PGP is the obvious choice for this, since it is widely available
and is blessed by the IETF. My only reservation is that it comes with a
rather poor certification system as compared with, say, SPKI. So this
choice might yet have to be reviewed.]

   The stream of octets resulting from the canonicalization algorithm is
   signed, in binary mode (signature type 0x00), in accordance with
   OpenPGP [RFC 2440].

        NOTE: The signature is made in binary mode just in case any [RFC
        2047] decoding into UTF-16 has produced octets which might be
        mistaken for isolated CR, LF or trailing SP characters, which
        are treated specially in PGP text mode.

   The output of the algorithm MUST be ASCII-armored [RFC 2440], but the
   Armor Header Line ("BEGIN PGP SIGNATURE"), the Armor Headers (e.g.
   "Version:"), the blank line following the Armor Headers, and the
   Armor Tail ("END PGP SIGNATURE") are to be omitted (thus yielding a
   sequence of base64 characters). Observe that these characters will
   include a CRC checksum, which SHOULD be on a separate line from the
   rest of the signature.
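
   The removal of the armor, and its restoration prior to
   verification, might be sketched in Python as follows. This is a
   hedged illustration: strip_armor and restore_armor are hypothetical
   names, and the sketch assumes the armored block contains exactly
   one signature with a blank line after the Armor Headers.

```python
BEGIN = "-----BEGIN PGP SIGNATURE-----"
END = "-----END PGP SIGNATURE-----"


def strip_armor(armored):
    """Return only the base64 lines, the last being the checksum."""
    lines = [ln.strip() for ln in armored.strip().splitlines()]
    if lines[0] != BEGIN or lines[-1] != END:
        raise ValueError("not an ASCII-armored OpenPGP signature")
    inner = lines[1:-1]
    try:
        blank = inner.index("")   # blank line ending the Armor Headers
    except ValueError:
        raise ValueError("no blank line after Armor Headers") from None
    return "\n".join(inner[blank + 1:])


def restore_armor(stripped):
    """Rebuild a block that an OpenPGP implementation can read."""
    return BEGIN + "\n\n" + stripped.strip() + "\n" + END + "\n"
```

   The checksum line (beginning with "=") is kept as the final line of
   the stripped text, so that it remains on a line of its own.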

   The signature included within the ASCII armor MAY include
   certificates as evidence that the signing key has the necessary
   authorization to sign articles of that nature, but such usage is in
   general deprecated except between parties that have agreed otherwise
   or where, for some reason, an unusual signatory is signing and
   attaches a certificate from the usual signatory.

   The signature SHOULD use the DSA public-key algorithm and the SHA-1
   hashing algorithm, and be incorporated in a Version 4 Signature
   Packet in the new format. It MAY alternatively use the combination
   RSA/MD5 with Version 3 in the old format (for compatibility with PGP
   2.6.x) and it MAY use the combination RSA/SHA-1 with Version 4 in the
   new format. Verifiers MUST be able to verify all of these forms.
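
   To see which of these combinations a given signature uses, the
   fixed fields at the start of the Signature Packet can be read
   directly. The Python sketch below is an illustration only: it
   handles nothing but old-format Version 3 packets, takes the base64
   data lines without the checksum line, and describe_v3_signature is
   a hypothetical name. The algorithm numbers are those assigned in
   [RFC 2440].

```python
import base64

PUBKEY = {1: "RSA", 17: "DSA"}     # public-key algorithm IDs
HASH = {1: "MD5", 2: "SHA-1"}      # hash algorithm IDs


def describe_v3_signature(b64_data):
    """Read the fixed fields of an old-format V3 Signature Packet."""
    raw = base64.b64decode("".join(b64_data.split()))
    ctb = raw[0]
    # Old format: bit 7 set, bit 6 clear; bits 5-2 hold the tag (2).
    if ctb & 0xC0 != 0x80 or (ctb >> 2) & 0x0F != 2:
        raise ValueError("not an old-format Signature Packet")
    header_len = {0: 2, 1: 3, 2: 5}.get(ctb & 0x03)
    if header_len is None:
        raise ValueError("indeterminate packet length not handled")
    body = raw[header_len:]
    if body[0] != 3 or body[1] != 5:
        raise ValueError("not a Version 3 signature")
    return {
        "version": body[0],
        "sig_type": body[2],                        # 0x00 = binary
        "created": int.from_bytes(body[3:7], "big"),
        "key_id": body[7:15].hex().upper(),
        "pubkey": PUBKEY.get(body[15], body[15]),
        "hash": HASH.get(body[16], body[16]),
    }
```

   Only the first base64 line of a signature is needed, since the
   fixed fields occupy the first twenty octets of the packet.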

4.  Applications

   It is anticipated that protocols for specific applications of the
   signature mechanisms described in this proposal will be devised,
   whether under the auspices of the IETF or otherwise. For example, the
   need to be able to verify the origin of Control messages for creating
   and removing newsgroups and for cancelling articles was a prime
   motivation for creating this proposal.

   It is up to each such application to specify appropriate mechanisms
   for establishing a Public Key Infrastructure suited to its purpose.
   Such an infrastructure would provide for the storing, distribution
   and authorization of the necessary public keys (and for revocations
   thereof). This proposal establishes no preferred mechanisms in this
   regard, except to draw attention to the possible usefulness of the
   Content-Type application/pgp-keys as defined in [RFC 2015].

5.  Examples

[The MD5 hashes in the following are bogus, but I would expect to
include genuine ones in the final version. The signatures are genuine,
by my own key]
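
   For reference, a genuine Content-MD5 value in the sense of [RFC
   1864] is the base64 encoding of the 128-bit MD5 digest of the body
   in its canonical form. A minimal Python sketch, assuming a textual
   body whose canonical form uses CRLF line endings (content_md5 is a
   hypothetical name):

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 value for a textual body (bytes)."""
    # Assumption: canonical form of text means CRLF line endings.
    canonical = body.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    digest = hashlib.md5(canonical).digest()
    return base64.b64encode(digest).decode("ascii")
```

   The result is always 24 characters long and ends in "==", as do
   the placeholder values in the examples below.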

5.1.  Newgroup Control message

   A 'newgroup' control message in the format given in [USEFOR].

   Newsgroups: comp.foo
   From: "Charles Lindsey" <group-admin@isc.example>
   Subject: cmsg newgroup comp.foo moderated
   Control: newgroup comp.foo moderated
   Approved: newgroups-request@isc.example
   Message-ID: <919190727.4918@isc.example>
   Date: Tue, 16 Feb 1999 18:45:27 -0000
   MIME-Version: 1.0
   Content-Type: multipart/mixed; boundary=88888888
   Signed: news-standard,+content-md5/1,+content-type/1,+content-md5/3,
       +content-type/3; protocol=foobar; key="0x2C15F1A9"
       (Charles Lindsey);
      sig="
      iQB8AwUAOLVOAK1e6k0sFfGpAQH5swMzBpEVYf0mhFg1r3ErtGSC1RS7iwHPalsJ
      3miSKIfK7GdBnNfVGg9feiTkYMv3aMpUGYRaxn6W1K5QxIQInU+KNbCWiPLrGPdS
      jW7gYe7vB3tBeXiOe7+6wPHmzUAlKiuRuNcfQrOYGg==
      =GGsm"

   This is a multipart message in MIME format.

   --88888888
   Content-Type: application/news-groupinfo
   Content-MD5: T7NtIdVqde62kheQuAHOaw==

   For your newsgroups file:
   comp.foo                For Foo discussions (Moderated)

   --88888888
   Content-Type: text/plain

   comp.foo is a moderated newsgroup which passed its vote for creation
   by 424:8 as reported in news.announce.newgroups on 10 Feb 99.

   --88888888
   Content-Type: application/news-transmission
   Content-MD5: +piSsoeNmdin5ukFQuFTlw==

   Newsgroups: comp.foo
   Path: not-for-relaying
   Distribution: local
   From: "Charles Lindsey" <group-admin@isc.example>
   Message-ID: <919190727.4918/part2@isc.example>
   Date: Tue, 16 Feb 1999 18:45:27 -0000
   Subject: Charter for newsgroup comp.foo
   Approved: newgroups-request@isc.example

   The charter, culled from the call for votes:

      Comp.foo is a moderated newsgroup for discussing all manner of
      Foos.

   Moderation submission address:
      comp-foo@bar.example

   --88888888--

5.2.  Mail message re-signed by mailing list owner

   received: from house.example by bar.example (8.8.8/AL/MJK-2.0)
           id XAA10880; Sat, 13 Feb 1999 23:00:14 GMT
   Resent-From: "Example Mail Server" <majordomo@com.example>
   Precedence: list
   Received: (from list@localhost)
           by house.example (8.9.2/8.9.2) id OAA28279;
           Sat, 13 Feb 1999 14:59:56 -0800 (PST)
   From: <"[john]"@
       temple.example> (John Smith)
   Organization: http://www.temple.example/john
   Subject: Submission to mailing list
         in connection with foo.
   Message-ID: <19990213145946.20115@main.temple.example>
   Date: Sat, 13 Feb 1999 22:59:46 +0000
   Mime-Version: 1.0
   Content-Type: text/plain; charset=us-ascii
   Content-MD5: +piSsoeNmdin5ukFQuFTlw==
   Signed: mail-standard,content-md5;
       protocol=Foobar; key="0x2376C8BD" (John Smith);
      sig="
      iQBVAwUAOLVRmGR/OLEjdsi9AQEIfQH+I9fB4+4cItsNX0fHq8KlT6ETKQUwnmZB
      TBB3ygoa0n6fiSxMijoMR3SRfQqzGY5fMbOMlv1mMyxVcs74jpk8OQ==
      =qRiE"
   Verified: majordomo-request@com.example; signature=good;
       hashcheck=content-md5
   Signed-1: message-id,date,resent-from,
           verified,signed; protocol=FOOBAR; key="0x2C15F1A9";
      sig="
      iQB8AwUAOLVs2a1e6k0sFfGpAQFGGwMxAeCoV6JIuruJky7j2TOhvILDgf6ZUZA5
      B7okwUTK0omlWdBmc3jLb/8oVHhZCD1aEoejqLWsU1KbQYdn2MZuwA/yAaTDEpdM
      DMXM1ui+G569BoyxKmUce9Je4hY6tq47e1ajQO8HRw==
      =JXiU"

   Text of John's message.

   --
   John's signature.

   Passing the original form of this message through the foo
   canonicalization algorithm produces the following octet stream for
   the "Signed:" header (lines are folded here for the convenience of
   this document; the true line endings are indicated by "CRLF"):

   signed: mail-standard,content-md5;protocol=Foobar;key=0x2376C8BD(
   John Smith)CRLF
   date: 918946786CRLF
   from: <"[john]"@temple.example>(John Smith)CRLF
   subject: Submission to mailing list in connection with foo.CRLF
   content-type: text/plain;charset=us-asciiCRLF
   content-md5: +piSsoeNmdin5ukFQuFTlw==CRLF

   And here is the result of canonicalizing to produce the "Signed-1:"
   header:

   signed-1: message-id,date,resent-from,verified,signed;protocol=FO
   OBAR;key=0x2C15F1A9CRLF
   message-id: <19990213145946.20115@main.temple.example>CRLF
   date: 918946786CRLF
   resent-from: ExampleMailServer<majordomo@com.example>CRLF
   verified: majordomo-request@com.example;signature=good;hashcheck=
   content-md5CRLF
   signed: mail-standard,content-md5;protocol=Foobar;key=0x2376C8BD(
   John Smith);sig=iQBVAwUAOLVRmGR/OLEjdsi9AQEIfQH+I9fB4+4cItsNX0fHq
   8KlT6ETKQUwnmZBTBB3ygoa0n6fiSxMijoMR3SRfQqzGY5fMbOMlv1mMyxVcs74jp
   k8OQ===qRiECRLF

        NOTE: the second signature signed only that which it had added
        itself, plus sufficient of the original headers to identify the
        original message. It did not need to scan the body to recompute
        the MD5 hash, but effectively included it by signing the
        original "Signed:" header.
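
   The order in which headers appear in the canonicalized streams
   above follows from expanding the header-ref-list of the signing
   header. A minimal Python sketch of that expansion is given below.
   It assumes the "+" and "-" semantics and the contents of the two
   standard lists as they appear in the model implementation of
   Appendix A.2; expand_header_refs is a hypothetical name.

```python
NEWS_STANDARD = ["date", "newsgroups", "distribution", "message-id",
                 "from", "reply-to", "followup-to", "references",
                 "subject", "keywords", "control", "content-type",
                 "content-id"]
MAIL_STANDARD = ["date", "from", "reply-to", "to", "cc", "in-reply-to",
                 "references", "subject", "keywords", "content-type",
                 "content-id"]


def expand_header_refs(ref_list):
    """Expand a comma-separated header-ref-list into an ordered list."""
    refs = []
    for item in (part.strip() for part in ref_list.split(",")):
        if item == "news-standard":
            additions = NEWS_STANDARD
        elif item == "mail-standard":
            additions = MAIL_STANDARD
        else:
            additions = [item]
        for ref in additions:
            name = ref.lstrip("+-").lower()
            if ref.startswith("-"):
                if name in refs:
                    refs.remove(name)     # "-" removes an earlier entry
            elif name not in refs:
                refs.append(name)         # add only if not yet present
    return refs
```

   Headers named in the expanded list but absent from the message are
   simply skipped, which is why the first stream above contains only
   date, from, subject, content-type and content-md5.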

6.  Security

   TBD

[What is there to say here?]

7.  References


   [MESSFOR] P. Resnick, "Internet Message Format Standard", draft-
        ietf-drums-msg-fmt-07.txt, March 1998.

   [PGPMOOSE] Greg Rose, [I need a URL for this], October 1995.

   [PGPVERIFY] David Lawrence,
        ftp://ftp.isc.org/pub/pgpcontrol/README.html.

   [RFC 1036] M. Horton and R. Adams, "Standard for Interchange of
        USENET Messages", RFC 1036, December 1987.

   [RFC 1327] S. Hardcastle-Kille, "Mapping between X.400(1988) / ISO
        10021 and RFC 822", RFC 1327, May 1992.

   [RFC 1864] J. Myers and M. Rose, "The Content-MD5 Header Field", RFC
        1864, October 1995.

   [RFC 2015] M. Elkins, "MIME Security with Pretty Good Privacy (PGP)",
        RFC 2015, October 1996.

   [RFC 2045] N. Freed and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [RFC 2047] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
        Part Three: Message Header Extensions for Non-ASCII Text", RFC
        2047, November 1996.

   [RFC 2119] S. Bradner, "Key words for use in RFCs to Indicate
        Requirement Levels", RFC 2119, March 1997.

   [RFC 2234] D. Crocker and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

   [RFC 2440] J. Callas, L. Donnerhacke, H. Finney, and R. Thayer,
        "OpenPGP Message Format", RFC 2440, November 1998.

   [SMTP] John C. Klensin and Dawn P. Mann, "Simple Mail Transfer
        Protocol", draft-ietf-drums-smtpupd-*.txt.

   [USEFOR] Charles H. Lindsey, "News Article Format", draft-ietf-
        usefor-article-format-03.txt.


8.  Acknowledgements

   The author acknowledges the work of David Lawrence, as original
   author of "pgpverify", for many of the ideas contained herein, and
   also many contributions from members of the usenet-format mailing
   list.

9.  Contact Address

        Charles H. Lindsey
        5 Clerewood Avenue
        Heald Green
        Cheadle
        Cheshire SK8 3JU
        United Kingdom
        Phone: +44 161 437 4506
        Email: chl@clw.cs.man.ac.uk

   Comments on this draft should preferably be sent to the mailing list
   of the Usenet Format Working Group at

        usenet-format@landfield.com.

   This draft expires six months after the date of publication (see Page
   1) (i.e. in November 2000).

10.  Intellectual Property Rights

[The usual texts from RFC 2026 to be inserted here.]

Appendix A. Model implementation

   The following is written in Perl, making full use of modules
   available from CPAN.

Appendix A.1. The foo canonicalization

   package Canon;

   use MIME::Words qw(decode_mimewords);
   use Date::Parse;
   use Exporter ();
   @ISA = qw(Exporter);
   @EXPORT = qw(canonicalize);

   %unstructureds = ('subject', 1, 'comments', 1, 'organization', 1,
                     'summary', 1);
   %dates = ('date', 1, 'resent-date', 1, 'expires', 1);

   sub canonicalize {
     my $tag = lc shift;
     my $line = shift;
     my $signing = shift;  # for more stringent checks when signing

     $is_structured = (not $unstructureds{$tag}) && $tag !~ m/^x-/o;
     $is_date = $dates{$tag};
     @outlist = ($tag, ': ');
     $outptr = \@outlist;  # will point to @encodelist during encoding
     $state = 0;           # for the state machine
     $encoding = 0;        # part of the state machine
     $pending = 0;         # to remember the FWS between encoded-words

     do {
       # lexical split of $line into plain ($x) and next delimiter ($y)
       $line =~ m/(.*?)   # anything except the following:
                  ( \\\S      # quoted-pair
                  | [][)><("] # various bracket delimiters
                  | =\?(?!=) | \?=\s+=\? | \?=  # for encoded-words
                  | \s*$      # trailing whitespace
                  ) /sogx;
       $x = $1; $y = $2;

       # convert $x into canonical form
       if      ($is_date && $state == 0) {
         $x =~ s/(\S*)\s+/$1 /sog; # reduce FWS to SP
         if ($x !~ m/^\s*$/) {  # zone not empty
           if ($signing && $x !~ m/^\s?
                    ((mon|tue|wed|thu|fri|sat|sun)\s?,\s?)?
                    [0-9]{1,2}\s
                    (jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s
                    [0-9]{4}\s
                    [0-9]{2}:[0-9]{2}:[0-9]{2}\s
                    [-+][0-9]{4}\s?
                    /oix) {die "Bad Date '", $x, "'\n"}
           my $t = str2time($x);  # keep $x intact for the error message
           if (not defined $t) {die "Bad Date '", $x, "'\n"}
           $x = $t;
         }
       } elsif ($is_structured && $state <= 0) {
         $x =~ s/(\S*)\s+/$1/sog;  # eliminate FWS
       } else { # unstructured, or in a comment-zone
         $x =~ s/(\S*)\s+/$1 /sog; # reduce FWS to SP
       }
       push @$outptr, $x;

       # state machine to process $y
       if ($is_structured) {
         if      ($state == 0)  { # neutral-zone
           if    ($y eq '"')
                 {$state = -1; _end_encoding()}
           elsif ($y eq '<')
                 {$state = -2; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '[')
                 {$state = -3; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '(')
                 {$state =  1; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '=?')
                 {_start_encoding(); push @$outptr, $y}
           elsif ($y =~ m/\?=/o)
                 {push @$outptr, $y; _end_encoding()}
           elsif ($y =~ m/^[])>]$/o) {
                 if ($signing)   {die "Unbalanced '", $y, "'\n"}
                 else            {push @$outptr, $y}
           }
           else  {$y =~ s/^\s*$/\r\n/o; push @$outptr, $y}
                 # eliminate trailing WS; insert CRLF

         } else {
           if    ($y =~ s/^\s*$/\r\n/o && $signing)
                             {die "Unbalanced header ", $line}

           if    ($state == -1) { # in quoted-zone
                 if    ($y eq '"') {$state = 0}
                 else              {push @$outptr, $y}
           }
           elsif ($state == -2) { # in sharp-zone
                 if    ($y eq '>') {$state = 0}
                 push @$outptr, $y;
           }
           elsif ($state == -3) { # in square-zone
                 if    ($y eq ']') {$state = 0}
                 push @$outptr, $y;
           }
           elsif ($state > 0)   { # in comment-zone
                 if    ($y eq '(')
                       {$state ++; push @$outptr, $y; _end_encoding()}
                 elsif ($y eq ')')
                       {$state --; push @$outptr, $y; _end_encoding()}
                 elsif ($y eq '=?')
                       {_start_encoding(); push @$outptr, $y}
                 elsif ($y =~ m/\?=/o)
                       {push @$outptr, $y; _end_encoding()}
                 else  {push @$outptr, $y}
           }
         }
       } else { # unstructured
         $y =~ s/^\s*$/\r\n/o;     # eliminate trailing WS; insert CRLF
         if    ($y eq '=?')
               {_start_encoding(); push @$outptr, $y}
         elsif ($y =~ m/\?=/o)
               {push @$outptr, $y; _end_encoding()}
         else  {push @$outptr, $y}
       }

     } until $y eq "\r\n";
     if ($encoding) {_end_encoding()}
     $line = join('', @outlist);
     return $line;
   }

   sub _start_encoding {     # entered at every '=?'
     @encodelist = ();
     $outptr = \@encodelist; # divert output during encoding
     $encoding = 1;
   }

   sub _end_encoding   { # entered at every '?=' or unexpected delimiter
     my $token = "[^][()<>@,;:\"\?.=\x00-\x20\x7f-\xff]+";
     my $encoded_text = "[^\?\x00-\x20\x7f-\xff]+";

     if ($encoding) {
       $outptr = \@outlist;  # cease output diversion
       if ($y =~ m/^\?=/o) { # '?=' as expected
         $encodelist[$#encodelist] = '?='; # in case it was '?=\s=?'
         $x = join('', @encodelist);
         if ($genuine = $x =~ m/^=\?$token\?$token\?$encoded_text\?=$/o)
           {$x = decode_mimewords($x)}     # dies if it fails
         if ($is_structured && $state <= 0) {
           if ($genuine) {$x =~ s/\s//go}  # eliminate FWS
         } else {
           if ($pending && not $genuine) {push @$outptr, ' '}
         }
         push @$outptr, $x;
       } else {  # unexpected delimiter during encoding
         if ($pending && ((not $is_structured) || $state > 0)) {
           push @$outptr, ' ';
         }
         push @$outptr, @encodelist;
       }
       $encoding = 0;
       if ($pending = $y =~ m/^\?=\s+=\?/o) {
         _start_encoding();
         push @$outptr, ('=?');
       }
     }
   }


Appendix A.2. Parsing of the Signed header

   # This module must be stored in Mail/Field/Signed.pm
   # relative to the other programs in the suite
   package Mail::Field::Signed;

   use strict;
   use vars qw(@ISA);
   use MIME::Field::ParamVal;
   use Carp;

   @ISA = qw(MIME::Field::ParamVal);

   INIT: {
       my $x = bless([]);

       $x->register('Signed');
       $x->register('Signed_1');
       $x->register('Signed_2');
       $x->register('Signed_3');
       $x->register('Signed_4');
       $x->register('Signed_5');
       $x->register('Signed_6');
       $x->register('Signed_7');
       $x->register('Signed_8');
       $x->register('Signed_9');
   }

   my @news_standard = qw(date newsgroups distribution message-id from
                          reply-to followup-to references subject
                          keywords control content-type content-id);
   my @mail_standard = qw(date from reply-to to cc in-reply-to
                          references subject keywords content-type
                          content-id);

   sub parse {
       my ($self, $string) = @_;
       my $clean_string = _skip_CFWS($string);
       $self->set($self->parse_params($clean_string));
       $self->{string} = $string;
       $self->{header_refs} = [];
       do {
           if ($self->{_} =~ m/([-+]?[-\w]+(\/\d+)*)/og) {
               if    ($1 eq "news-standard")
                   {$self->_incorporate_header(@news_standard)}
               elsif ($1 eq "mail-standard")
                   {$self->_incorporate_header(@mail_standard)}
               else
                   {$self->_incorporate_header(($1))}
           } else { die "Bad header-ref-list ", $string, "\n" }
       } while ($self->{_} =~ m/,/og);
       return $self;
   }

   sub stringify {
       my $self = shift;
       return $self->{string};
   }

   sub header_refs {
       my $self = shift;
       @{$self->{header_refs}};
   }

   sub _incorporate_header {
       my ($self, @additions) = @_;
       my $refs = \@{$self->{header_refs}};
       foreach (@additions) {
           if      (m/^-([-\w]+(\/\d+)*)/o) {
               # item to be removed from list
               for (my $i = 0; $i < @$refs; $i++)
                   {if ($$refs[$i] eq $1) {splice(@$refs, $i, 1); last} }
           } elsif (m/^\+?([-\w]+(\/\d+)*)/o) {
               # item to be added to list
            I: {
                   for (my $i = 0; $i < @$refs; $i++)
                       {if (@$refs[$i] eq $1) {last I} }
                   push (@$refs, $1);  # only if not already present
               }
           }
       }
   }

   sub _skip_CFWS {
       my $line = shift;
       my $count = 0;
       my @buf = ();
       while ($line =~ m/\G([^\s\("]*)\s*|\G(\()|\G(")/sog) {
           if    ($1) {push @buf, ($1)}
           elsif ($2) {  # comment
               $count += 1;
               do {
                   $line =~ m/\G[^()]*([()])/sog
                      or die "Unclosed comment\n";
                   $count += ($1 eq '(') ? +1 : -1;
               } until ($count == 0);
           } elsif ($3) {  # quoted-string
               push @buf, ('"');
               while (1) {  # loop until the closing DQUOTE is found
                   $line =~ m/\G([^\"\s]+)|\G(\s+)|\G(")/sog
                      or die "Unclosed quoted-string\n";
                   if    (defined $1) {push @buf, ($1)}
                   elsif (defined $2) {push @buf, (' ')}
                   else               {push @buf, ('"'); last}
               }
           }
       }
       return join('', @buf);
   }

   1;

Appendix A.3. The Signing program

   use English;
   use Mail::Header;
   use Mail::Field;
   use Mail::Field::Signed;
   use MIME::Parser;
   use Canon;

   $signing = 1;  # This is a program to sign headers

   # Read partial Signed header from file
   open(SIGNED, "<".$ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
   $signed = new Mail::Header \*SIGNED;
   @names = $signed->tags;
   $tag = $names[0];
   if ($tag !~ m/^signed(-[1-9])?$/oi || $#names != 0)
      {die "Invalid SIGNED file ", $ARGV[0], "\n"}
   $line = Mail::Field->extract($tag, $signed);

   unless (lc($line->param('protocol')) eq 'foobar')
      {die "Unknown protocol ", $line->param('protocol'), "\n"}
   if ($line->param('sig'))
      {die "'sig' already present\n"}
   unless ($line->param('key'))
      {die "'key' missing\n"}

   $parser = new MIME::Parser output_to_core=>'ALL';
   $article = $parser->read(\*STDIN) or die "Malformed article\n";

   if ($article->head->count($tag))
      {die "Message already signed\n"}

   $tmp = "/tmp/sign-$$";
   open(FH, "> $tmp") or die "Cannot open $tmp: $!\n";

   print FH canonicalize($tag, $line->stringify, $signing);
   foreach $ref ($line->header_refs) {
      _extract_header($article, $ref);
   }
   close(FH);

   sub _extract_header {
      my ($article, $ref) = @_;
      $ref =~ m/([-\w]+(\/\d+)*?)((\/(\d+))?)/o;
      if ($3)  # $ref of the form "header/1"; call ourselves recursively
         {_extract_header($article->parts($5-1), $1)}
      else {   # $ref is a header at the current level
         if    ($article->head->count($1) > 1)
            {die "Cannot sign duplicated header ", $1, "\n"}
         elsif ($article->head->count($1) == 1) {
            print FH canonicalize($1, $article->head->get($1), $signing)
         }
      }
   }

   # The remainder of this code is dependent upon the particular
   # implementation of OpenPGP.

   $key = $line->param('key');
   $pgp =
     "pgps -fab +verbose=0 +textmode=off -u $key < $tmp 2>/dev/null |";
   open(FH, $pgp) or die "Cannot open pipe from pgp: $!\n";

   undef $INPUT_RECORD_SEPARATOR;
   $_ = <FH>;  # The OpenPGP signature record
   unlink $tmp;
   s/^.*[^\w+\/=\n].*\n|^\n//mog;     # remove non-base64 lines
   s/^/   /mog;                       # indent by 3 spaces
   s/\A/;\n   sig="\n/mo; s/\Z/"/mo;  # enclose in '; sig="..."'

   $article->head->add($tag, $line->stringify . $_);
   $article->print;

Appendix A.4. The Verification program

   use English;
   use Mail::Header;
   use Mail::Field;
   use Mail::Field::Signed;
   use MIME::Parser;
   use Canon;

   $signing = 0;  # This is a program to verify signed headers
   $parser = new MIME::Parser output_to_core=>'ALL';
   $article = $parser->read(\*STDIN) or die "Malformed article\n";

   $tag = $ARGV[0];
   unless ($tag =~ m/^Signed(-[1-9])?$/io)
      {die "Bad parameter ", $tag, "\n"}

   $line = Mail::Field->extract($tag, $article);
   unless ($line)
      {die $tag, " header not found\n"}
   unless (lc($line->param('protocol')) eq 'foobar')
      {die "Unknown protocol ", $line->param('protocol'), "\n"}
   unless ($line->param('key') and $line->param('sig'))
      {die "Malformed Signed header\n"}

   $tmp = "/tmp/sign-$$";
   open(FH, "> $tmp") or die "Cannot open $tmp: $!\n";

   $signed = $line->stringify;
   $signed =~ s/\s*;[^;]*\bsig\b[^;]*$//io;  # remove "; sig=..."
   print FH canonicalize($tag, $signed, $signing);
   foreach $ref ($line->header_refs) {
      _extract_header($article, $ref);
   }
   close(FH);

   sub _extract_header {
      my ($article, $ref) = @_;
      $ref =~ m/([-\w]+(\/\d+)*?)((\/(\d+))?)/o;
      if ($3)  # $ref of the form "header/1"; call ourselves recursively
         {_extract_header($article->parts($5-1), $1)}
      else {   # $ref is a header at the current level
         if    ($article->head->count($1) > 1)
            {die "Duplicated header ", $1, " signed\n"}
         elsif ($article->head->count($1) == 1) {
            print FH canonicalize($1, $article->head->get($1), $signing)
         }
      }
   }

   # The remainder of this code is dependent upon the particular
   # implementation of OpenPGP.

   use IPC::Open2;
   $pgp = "pgpv -f --batchmode -o $tmp 2>&1";
   open2(\*PIPEOUT, \*PIPEIN, $pgp);

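   $armour = $line->param('sig');
   $armour =~ s/\s//sog;
   # Detach the "=XXXX" checksum before wrapping, so that the
   # 64-character line breaks can never fall inside it.
   $crc = ($armour =~ s/(=[\w+\/]{4})\Z//so) ? $1 : '';
   $armour =~ s/([\w+\/=]{64})(?=.)/$1\n/sog;
   $armour .= "\n" . $crc if $crc;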
   print PIPEIN "-----BEGIN PGP SIGNATURE-----\n",
                "Charset: noconv\n\n",
                $armour, "\n",
                "-----END PGP SIGNATURE-----\n";
   close(PIPEIN);
   undef $INPUT_RECORD_SEPARATOR;
   $result = <PIPEOUT>;
   unlink $tmp;

   $result =~ s/^This signature applies to another message\n//mo;
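   # Take the key ID from this match only; after a failed match $1
   # would still hold a capture from an earlier substitution.
   $keyid = ($result =~ m/Key ID +([0-9a-fA-F]+)/iom) ? $1 : '';
   unless ("0x" . $keyid eq $line->param('key')) {
      print "Signature was for key ", $line->param('key'),
            ", not for 0x", $keyid, "\n";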
      $badsig = 1;
   }
   $badsig |= ($result !~ m/Good signature/iom);
   print $result;
   exit $badsig;
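
   The reconstruction of the armoured block from the folded sig
   parameter can likewise be exercised on its own.  The following is a
   sketch under stated assumptions: rebuild_armour is a hypothetical
   helper name, the input is placeholder data rather than a real
   signature, and the checksum is detached before the data is wrapped
   (an ordering chosen here so that a final data line of 60 characters
   cannot cause the checksum to be split).

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild an armoured block from the folded
# value of a sig parameter.
sub rebuild_armour {
    my ($sig) = @_;
    $sig =~ s/\s//g;                                    # drop folding whitespace
    my $crc = ($sig =~ s/(=[\w+\/]{4})\z//) ? $1 : '';  # detach "=XXXX" checksum
    $sig =~ s/([\w+\/=]{64})(?=.)/$1\n/sg;              # wrap data at 64 characters
    $sig .= "\n$crc" if length $crc;                    # checksum on its own line
    return "-----BEGIN PGP SIGNATURE-----\n"
         . "Charset: noconv\n\n"
         . "$sig\n"
         . "-----END PGP SIGNATURE-----\n";
}

# Placeholder data: 124 base64 characters (60 more than one full
# line) followed by a dummy checksum.
my $folded = "\n   " . ('A' x 64) . "\n   " . ('B' x 60) . "\n   =AbCd";
print rebuild_armour($folded);
```

   With this input the data comes out as one line of 64 characters and
   one of 60, followed by the checksum alone on the last line before
   the END marker.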

Appendix B. Test cases

   The following, believe it or not, is a valid email message.  Note
   that there are various TABs and much trailing whitespace in it
   (assuming these come through to the published form of this document).

   Subject: Unstructured headers can contain unmatched (s and unescaped
                 "s; (comments like this) and "quoted strings" are not
           treated specially.
   SUMMARY: Multiple         spaces, tabs and foldings
    in unstructured headers are reduced to a single SP, and trailing
    whitespace (of which there is much in these examples)) is ignored.
   X-Header: All X headers are "treated "as unstructured")
   from: "Scooby Doo" <foo@bar.example>   (all FWS in
      structured headers is removed, except in comments)
   tO:           "John (the Boss) Smith"  <bar@foo.example> ,
      "Bill \"fingers\"
    Sykes" <"#*\"~"@twist.example> (Observe unescaped \( and escaped "
    within quoted strings, and (properly matched) parentheses within
    comments)
   rEPLY-tO:"#*\"~"@twist.example
    (Observe "s elided, since not in <...>)
   Message-ID: <"*\"~and-other-grunge)(]["@[127.0.0.1"Ugh!]>
    (Yes that is a legal msg-id, including the " in the domain-literal)
   Sender: foo@[127.0.0.1"Ugh!] (another " in a domain-literal)
   Cc: foo@[127.0.0.1(this is not], bar@[a comment)127.0.0.1],
    "=?utf-8?Q?not_an_encoded_word?="
    <=?utf-8?Q?not_an_encoded_word?=@bar.example>,
    =?us-ascii?Q?Joe_D._Bloggs_=5Bwho=20else=5d?= <foo@bar.example>,
    =?us-ascii?Q?C&A?=@bar.example (treated as an encoded-word even
    though, syntactically, it isn't)
    (in comment but =?is0-8859-1?Q?not(an_encoded-word?=))
    (=?us-ascii?Q?encoded-word_split_into-?=
    =?us-ascii?b?cGFydHM=?=)
   Comments: An unstructured encoded word can have
    =?us-ascii?Q?any_characters_in_it_<>()[]"?= =?bogus_e.w?=
   Date: (pre comment) sAt, 13 fEb
                  1999 14:59:56 -0800 (PST)
   Keywords: (various illegal constructs which nevertheless get through)
    \(Not a comment\), \" (naked quoted-pair), \ (not a quoted-SP)

   Comments: Various mismatches, which should be rejected.
   Foo: ) (naked \))
   Bar: ((mismatched parens)
   Baz: <"mismatch"
   Fred: ["mismatch"
   Date: Sat, 13 Feb 1999 23:00:14 GMT
   Date: 29 Feb 2001 23:00:14 +0000

   The following is the result of applying the foo canonicalization to
   it (lines folded for convenience, as before, and blank lines inserted
   between headers for readability).

   subject: Unstructured headers can contain unmatched (s and unesca
   ped "s; (comments like this) and "quoted strings" are not treated
    specially.CRLF

   summary: Multiple spaces, tabs and foldings in unstructured heade
   rs are reduced to a single SP, and trailing whitespace (of which
   there is much in these examples)) is ignored.CRLF

   x-header: All X headers are "treated "as unstructured")CRLF

   from: ScoobyDoo<foo@bar.example>(all FWS in structured headers is
    removed, except in comments)CRLF

   to: John(theBoss)Smith<bar@foo.example>,Bill\"fingers\"Sykes<"#*\
   "~"@twist.example>(Observe unescaped \( and escaped " within quot
   ed strings, and (properly matched) parentheses within comments)CRLF

   reply-to: #*\"~@twist.example(Observe "s elided, since not in <..
   .>)CRLF

   message-id: <"*\"~and-other-grunge)(]["@[127.0.0.1"Ugh!]>(Yes tha
   t is a legal msg-id, including the " in the domain-literal)CRLF

   sender: foo@[127.0.0.1"Ugh!](another " in a domain-literal)CRLF

   cc: foo@[127.0.0.1(thisisnot],bar@[acomment)127.0.0.1],=?utf-8?Q?
   not_an_encoded_word?=<=?utf-8?Q?not_an_encoded_word?=@bar.example
   >,JoeD.Bloggs[whoelse]<foo@bar.example>,C&A@bar.example(treated a
   s an encoded-word even though, syntactically, it isn't)(in commen
   t but =?is0-8859-1?Q?not(an_encoded-word?=))(encoded-word split i
   nto-parts)CRLF

   comments: An unstructured encoded word can have any characters in
    it <>()[]" =?bogus_e.w?=CRLF

   date: (pre comment)918946796(PST)CRLF

   keywords: (various illegal constructs which nevertheless get thro
   ugh)\(Notacomment\),\"(naked quoted-pair),\(not a quoted-SP)CRLF
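
   The number in the date line above can be checked by hand.  The
   following sketch assumes that the canonical form of a date is the
   count of seconds since 1970-01-01 00:00:00 UTC, which is what the
   figure shown is consistent with; it uses only the standard
   Time::Local module and is not taken from the Canon module.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Date: sAt, 13 fEb 1999 14:59:56 -0800
# timegm() takes (sec, min, hour, mday, month counted from 0, year)
# and treats the fields as UTC.
my $as_if_utc = timegm(56, 59, 14, 13, 1, 1999);

# The zone is 8 hours behind UTC, so the UTC instant is 8 hours later
# than the local clock reading.
my $epoch = $as_if_utc + 8 * 3600;

print "$epoch\n";   # prints 918946796
```

   The day name and the case of the month name play no part in the
   calculation, which is why "sAt" and "fEb" leave the result
   unchanged.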
