INTERNET-DRAFT                               Charles H. Lindsey
Usenet Format Working Group                  University of Manchester
                                             May 2002

                          News Article Format
                   <draft-ietf-usefor-article-07.txt>

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This Draft is intended as a standards track document, obsoleting
   RFC 1036, which itself dates from 1987.

   This Standard defines the format of Netnews articles and specifies
   the requirements to be met by software which originates, distributes,
   stores and displays them.

   Since the 1980s, Usenet has grown explosively, and many Internet and
   non-Internet sites now participate. In addition, the Netnews
   technology is now in widespread use for other purposes.

   Backward compatibility has been a major goal of this endeavour, but
   where this standard and earlier documents or practices conflict, this
   standard should be followed. In most such cases, current practice is
   already compatible with these changes.

[The use of the words "this standard" within this document when
referring to itself does not imply that this draft yet has pretensions
to be a standard, but rather indicates what will become the case if and
when it is accepted as an RFC with the status of a proposed or draft
standard.]



C. H. Lindsey                                                   [Page 1]


                          News Article Format                   May 2002

[Remarks enclosed in square brackets and aligned with the left margin,
such as this one, are not part of this draft, but are editorial notes to
explain matters amongst ourselves, or to point out alternatives, or to
assist the RFC Editor.]

[In this draft, references to [NNTP] are to be replaced by [RFC 977], or
else by references to the RFC arising from the series of drafts draft-
ietf-nntpext-base-*.txt, in the event that such RFC has been accepted at
the time this document is published.]

[Please note that this Draft is now close to Last Call, and the material
included here is unlikely to change in any major way.]


                           Table of Contents

1.  Introduction ..................................................    6
  1.1.  Basic Concepts ............................................    6
  1.2.  Objectives ................................................    7
  1.3.  Historical Outline ........................................    7
  1.4.  Transport .................................................    7
2.  Definitions, Notations and Conventions ........................    8
  2.1.  Definitions ...............................................    8
  2.2.  Textual Notations .........................................    9
  2.3.  Relation To Email and MIME ................................   10
  2.4.  Syntax ....................................................   11
    2.4.1.  Syntax Notation .......................................   11
    2.4.2.  Syntax adapted from Email and MIME ....................   11
    2.4.3.  Syntax copied from other standards ....................   13
  2.5.  Language ..................................................   14
3.  Changes to the existing protocols .............................   15
  3.1.  Principal Changes .........................................   15
  3.2.  Transitional Arrangements .................................   15
4.  Basic Format ..................................................   17
  4.1.  Syntax of News Articles ...................................   17
  4.2.  Headers ...................................................   18
    4.2.1.  Naming of Headers .....................................   18
    4.2.2.  MIME-style Parameters .................................   19
    4.2.3.  White Space and Continuations .........................   20
    4.2.4.  Comments ..............................................   21
    4.2.5.  Header Properties .....................................   22
      4.2.5.1.  Experimental Headers ..............................   22
      4.2.5.2.  Inheritable Headers ...............................   22
      4.2.5.3.  Variant Headers ...................................   23
    4.2.6.  Undesirable Headers ...................................   23
  4.3.  Body ......................................................   23
    4.3.1.  Body Format Issues ....................................   23
    4.3.2.  Body Conventions ......................................   24
  4.4.  Characters and Character Sets .............................   26
    4.4.1.  Character Sets within Article Headers .................   26
    4.4.2.  Character Sets within Article Bodies ..................   27
  4.5.  Size Limits ...............................................   28
  4.6.  Example ...................................................   29
5.  Mandatory Headers .............................................   29
  5.1.  Date ......................................................   30
    5.1.1.  Examples ..............................................   30
  5.2.  From ......................................................   30
    5.2.1.  Examples:  ............................................   31
  5.3.  Message-ID ................................................   32
  5.4.  Subject ...................................................   33
    5.4.1.  Examples ..............................................   34
  5.5.  Newsgroups ................................................   34
    5.5.1.  Forbidden newsgroup names .............................   39
  5.6.  Path ......................................................   39
    5.6.1.  Format ................................................   39
    5.6.2.  Adding a path-identity to the Path-header .............   40
    5.6.3.  The tail-entry ........................................   41
    5.6.4.  Path-Delimiter Summary ................................   42
    5.6.5.  Suggested Verification Methods ........................   43
    5.6.6.  Example ...............................................   43
6.  Optional Headers ..............................................   44
  6.1.  Reply-To ..................................................   44
    6.1.1.  Examples ..............................................   44
  6.2.  Sender ....................................................   45
  6.3.  Organization ..............................................   45
  6.4.  Keywords ..................................................   45
  6.5.  Summary ...................................................   45
  6.6.  Distribution ..............................................   46
  6.7.  Followup-To ...............................................   47
  6.8.  Mail-Copies-To ............................................   47
  6.9.  Posted-And-Mailed .........................................   49
  6.10.  References ...............................................   49
    6.10.1.  Examples .............................................   50
  6.11.  Expires ..................................................   50
  6.12.  Archive ..................................................   50
  6.13.  Control ..................................................   51
  6.14.  Approved .................................................   52
  6.15.  Supersedes ...............................................   52
  6.16.  Xref .....................................................   53
  6.17.  Lines ....................................................   54
  6.18.  User-Agent ...............................................   54
    6.18.1.  Examples .............................................   55
  6.19.  Injector-Info ............................................   55
    6.19.1.  Usage of Injector-Info-parameters ....................   57
      6.19.1.1.  The posting-host-parameter .......................   58
      6.19.1.2.  The posting-account-parameter ....................   58
      6.19.1.3.  The posting-sender-parameter .....................   58
      6.19.1.4.  The posting-logging-parameter ....................   58
      6.19.1.5.  The posting-date-parameter .......................   58
    6.19.2.  Example ..............................................   59
  6.20.  Complaints-To ............................................   59
  6.21.  MIME headers .............................................   59
    6.21.1.  Syntax ...............................................   59
    6.21.2.  Content-Type .........................................   60
      6.21.2.1.  Message/partial ..................................   60
      6.21.2.2.  Message/rfc822 ...................................   61
      6.21.2.3.  Message/external-body ............................   62
      6.21.2.4.  Multipart types ..................................   62
    6.21.3.  Content-Transfer-Encoding ............................   62
    6.21.4.  Character Sets .......................................   64
    6.21.5.  Content Disposition ..................................   64
    6.21.6.  Definition of some new Content-Types .................   64
      6.21.6.1.  Application/news-transmission ....................   64
      6.21.6.2.  Message/news obsoleted ...........................   66
  6.22.  Obsolete Headers .........................................   66
7.  Control Messages ..............................................   66
  7.1.  Digital Signature of Headers ..............................   67
  7.2.  Group Control Messages ....................................   67
    7.2.1.  The 'newgroup' Control Message ........................   67
      7.2.1.1.  The Body of the 'newgroup' Control Message ........   68
      7.2.1.2.  Application/news-groupinfo ........................   68
      7.2.1.3.  Initial Articles ..................................   70
      7.2.1.4.  Example ...........................................   71
    7.2.2.  The 'rmgroup' Control Message .........................   71
      7.2.2.1.  Example ...........................................   72
    7.2.3.  The 'mvgroup' Control Message .........................   72
      7.2.3.1.  Example ...........................................   73
    7.2.4.  The 'checkgroups' Control Message .....................   74
      7.2.4.1.  Application/news-checkgroups ......................   75
  7.3.  Cancel ....................................................   76
  7.4.  Ihave, sendme .............................................   77
  7.5.  Obsolete control messages.  ...............................   78
8.  Duties of Various Agents ......................................   78
  8.1.  General principles to be followed .........................   78
  8.2.  Duties of an Injecting Agent ..............................   79
    8.2.1.  Proto-articles ........................................   79
    8.2.2.  Procedure to be followed by Injecting Agents ..........   79
  8.3.  Duties of a Relaying Agent ................................   81
  8.4.  Duties of a Serving Agent .................................   82
  8.5.  Duties of a Posting Agent .................................   83
  8.6.  Duties of a Followup Agent ................................   83
  8.7.  Duties of a Moderator .....................................   84
  8.8.  Duties of a Gateway .......................................   85
    8.8.1.  Duties of an Outgoing Gateway .........................   86
    8.8.2.  Duties of an Incoming Gateway .........................   88
    8.8.3.  Example ...............................................   90
9.  Security and Related Considerations ...........................   90
  9.1.  Leakage ...................................................   91
  9.2.  Attacks ...................................................   91
    9.2.1.  Denial of Service .....................................   91
    9.2.2.  Compromise of System Integrity ........................   92
  9.3.  Liability .................................................   93
10.  IANA Considerations ..........................................   94
11.  References ...................................................   94
12.  Acknowledgements .............................................   97
13.  Contact Address ..............................................   97
Appendix A.1 - A-News Article Format ..............................   98
Appendix A.2 - Early B-News Article Format ........................   99
Appendix A.3 - Obsolete Headers ...................................   99
Appendix A.4 - Obsolete Control Messages ..........................  100
Appendix B - Collected Syntax .....................................  101
Appendix B.1 - Characters, Atoms and Folding ......................  101
Appendix B.2 - Basic Forms ........................................  103
Appendix B.3 - Headers ............................................  104
Appendix B.3.1 - Header outlines ..................................  104
Appendix B.3.2 - Control-message outlines .........................  106
Appendix B.3.3 - Other header rules ...............................  107
Appendix C - Notices ..............................................  109

1.  Introduction

1.1.  Basic Concepts

   "Netnews" is a set of protocols for generating, storing and
   retrieving news "articles" (which resemble email messages) and for
   exchanging them amongst a readership which is potentially widely
   distributed. It is organized around "newsgroups", with the
   expectation that each reader will be able to see all articles posted
   to each newsgroup in which he participates. These protocols most
   commonly use a flooding algorithm which propagates copies throughout
   a network of participating servers.  Typically, only one copy is
   stored per server, and each server makes it available on demand to
   readers able to access that server.

   An important characteristic of Netnews is the lack of any requirement
   for a central administration or for the establishment of any
   controlling host to manage the network. A network which limits
   participation to some restricted set of hosts (within some company,
   for example) is a "closed" network; otherwise it is an "open"
   network. A set of hosts within a network which, by mutual
   arrangement, operates some variant (whether more or less restrictive)
   of the Netnews protocols is a "cooperating subnet".

   "Usenet" is a particular worldwide open network based upon the
   Netnews protocols, with the newsgroups being organized into
   recognized "hierarchies".  Anybody can join (it is simply necessary
   to negotiate an exchange of articles with one or more other
   participating hosts). Usenet "belongs" to those who administer the
   hosts of which it is comprised. There is no Cabal with overall
   authority to direct what is to be allowed. Nevertheless, there do
   exist agencies within Usenet that have authority to establish
   policies and to perform administrative functions, but such authority
   derives solely from the consent of those sites which choose to
   recognize it (and who can decline to exchange articles with sites
   which choose not to recognize it). Usually, the authority of such an
   agency is restricted to a particular hierarchy, or group of
   hierarchies.

   A "policy" is a rule intended to facilitate the smooth operation of a
   network by establishing parameters which restrict behaviour that,
   whilst technically unexceptionable, would nevertheless contravene
   some accepted standard of "Good Netkeeping". Since the ultimate
   beneficiaries of a network are its human readers, who will be less
   tolerant of poorly designed interfaces than mere computers, articles
   in breach of established policy can cause considerable annoyance to
   their recipients.

   Policies may well vary from network to network, from hierarchy to
   hierarchy within one network, and even between individual newsgroups
   within one hierarchy. It is assumed, for the purposes of this
   standard, that agencies with varying degrees of authority to
   establish such policies will exist, and that where they do not,
   policy will be established by mutual agreement.  For the benefit of
   networks and hierarchies without such established agencies, and to
   provide a basis upon which all agencies can build, this present
   standard often provides default policy parameters, usually
   introducing them by a phrase such as "As a matter of policy ...".

1.2.  Objectives

   The purpose of this present standard is to define the format of
   articles and the protocols to be used for Netnews in general, and for
   Usenet in particular, and to set standards to be followed by software
   that implements those protocols.

   It is NOT the purpose of this standard to define how the authority of
   various agencies to exercise control or oversight of the various
   parts of Usenet is established (that is itself a matter of policy).
   Nevertheless, it is assumed that such authorities will exist, and
   tools are provided within the protocols for their use.

1.3.  Historical Outline

   Network news originated as the medium of communication for Usenet,
   circa 1980.  Since then, Usenet has grown explosively, and many
   Internet and non-Internet sites participate in it.  In addition, the
   news technology is now in widespread use for other purposes, on the
   Internet and elsewhere.

   The earliest news interchange used the so-called "A News" article
   format.  Shortly thereafter, an article format vaguely resembling
   Internet Mail was devised and used briefly.  Both of those formats
   are completely obsolete; they are documented in Appendix A.1 and
   Appendix A.2 for historical reasons only.  With publication of [RFC
   850] in 1983, news articles came to closely resemble Internet Mail
   messages, with some restrictions and some additional headers. [RFC
   1036] in 1987 updated [RFC 850] without making major changes.

   A Draft popularly referred to as "Son of 1036" [Son-of-1036] was
   written in 1994 by Henry Spencer. That document formed the original
   basis for this standard. Much is taken directly from Son of 1036, and
   it is hoped that we have followed its spirit and intentions.

1.4.  Transport

   As in this standard's predecessors, the exact means used to transmit
   articles from one host to another is not specified. NNTP [NNTP] is
   the most common transmission method on the Internet, but much
   transmission takes place entirely independently of the Internet.
   Other methods in use include the UUCP protocol [RFC 976] (used
   extensively in the early days of Usenet), FTP, downloading via
   satellite, tape archives, and physically delivered magnetic and
   optical media.

2.  Definitions, Notations and Conventions

2.1.  Definitions

   An "article" is the unit of news, analogous to an [RFC 2822]
   "message". A "proto-article" is one that has not yet been injected
   into the news system.

   A "message identifier" (5.3) is a unique identifier for an article,
   usually supplied by the "posting agent" which posted it or, failing
   that, by the "injecting agent". It distinguishes the article from
   every other article ever posted anywhere. Articles with the same
   message identifier are treated as if they are the same article
   regardless of any differences in the body or headers.
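
        NOTE: As an illustration only (it defines no part of this
        standard), the identity rule above might be sketched in Python
        as follows. The class SeenIds and its method is_new are
        hypothetical names, and comparing identifiers as exact strings
        is an assumption of this sketch.

```python
class SeenIds:
    """Hypothetical record of the message identifiers an agent has seen."""

    def __init__(self):
        self._seen = set()

    def is_new(self, message_id: str) -> bool:
        # Identity is decided by the message identifier alone; the
        # headers and body of the two articles are never compared.
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

        An agent using such a record would treat a second article
        carrying an already recorded identifier as a duplicate, even if
        its body differed.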

   A "newsgroup" is a single news forum, a logical bulletin board,
   having a name and nominally intended for articles on a specific
   topic. An article is "posted to" a single newsgroup or several
   newsgroups. When an article is posted to more than one newsgroup, it
   is said to be "crossposted"; note that this differs from posting the
   same text as part of each of several articles, one per newsgroup.

   A newsgroup may be "moderated", in which case submissions are not
   posted directly, but mailed to a "moderator" for consideration and
   possible posting.  Moderators are typically human but may be
   implemented partially or entirely in software.

   A "hierarchy" is the set of all newsgroups whose names share a first
   component (as defined in 5.5).  The term "sub-hierarchy" is also used
   where several initial components are shared.
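
        NOTE: A minimal Python sketch of these two terms follows. It
        assumes that components are separated by "." (the actual syntax
        is that of 5.5); the function names are hypothetical.

```python
def hierarchy_of(newsgroup: str) -> str:
    """Return the first component of a newsgroup name."""
    # Assumption: components are separated by "." as in "example.test".
    return newsgroup.split(".", 1)[0]


def in_sub_hierarchy(newsgroup: str, prefix: str) -> bool:
    """True if the initial components of newsgroup equal those of prefix."""
    parts = newsgroup.split(".")
    wanted = prefix.split(".")
    # Compare whole components, so "example.testing" is not treated as
    # being inside the sub-hierarchy "example.test".
    return parts[:len(wanted)] == wanted
```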

   A "poster" is the person or software that composes and submits a
   possibly compliant article to a "posting agent". The poster is
   analogous to [RFC 2822]'s author(s).

   A "posting agent" is the software that assists posters to prepare
   proto-articles, in compliance with this standard. The proto-article
   is then passed on to an "injecting agent" for final checking and
   injection into the news stream. If the article is not compliant, or
   is rejected by the injecting agent, then the posting agent informs
   the poster with an explanation of the error.

   A "reader" is the person or software reading news articles.

   A "reading agent" is software which presents articles to a reader.

   A "followup" is an article containing a response to the contents of
   an earlier article (the followup's "precursor").

   A "followup agent" is a combination of reading agent and posting
   agent that aids in the preparation and posting of a followup.

   An article's "reply address" is the address to which mailed replies
   should be sent. This is the address specified in the article's From-
   header (5.2), unless it also has a Reply-To-header (6.1).
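
        NOTE: The choice of reply address might be sketched in Python
        as below, using the standard library's email parser. This is an
        illustration under stated assumptions: the function name
        reply_address is hypothetical, the article is assumed to be
        available as a single string, and the raw header content is
        returned without any address parsing.

```python
from email import message_from_string


def reply_address(article_text: str):
    """Return the Reply-To content if present, otherwise the From content."""
    headers = message_from_string(article_text)
    # A Reply-To-header, when present, takes precedence over the
    # From-header; header names are matched case-insensitively.
    return headers.get("Reply-To") or headers.get("From")
```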

   A "sender" is the person or software (usually, but not always, the
   same as the poster) responsible for the operation of the posting
   agent or, which amounts to the same thing, for passing the article to
   the injecting agent. The sender is analogous to [RFC 2822]'s sender.

   An "injecting agent" takes the finished article from the posting
   agent (often via the NNTP "post" command), performs some final
   checks, and passes it on to a relaying agent for general
   distribution.

   A "relaying agent" is software which receives allegedly compliant
   articles from injecting agents and/or other relaying agents, and
   possibly passes copies on to other relaying agents and serving
   agents.

   A "news database" is the set of articles and related structural
   information stored by a serving agent and made available for access
   by reading agents.

   A "serving agent" receives an article from a relaying agent and files
   it in a news database. It also provides an interface for reading
   agents to access the news database.

   A "control message" is an article which is marked as containing
   control information; a relaying or serving agent receiving such an
   article may (subject to the policies observed at that site) take
   actions beyond just filing and passing on the article.

   A "gateway" is software which receives news articles and converts
   them to messages of some other kind (e.g. mail to a mailing list), or
   vice versa; in essence it is a translating relaying agent that
   straddles boundaries between different methods of message exchange.
   The most common type of gateway connects newsgroup(s) to mailing
   list(s), either unidirectionally or bidirectionally, but there are
   also gateways between news networks using this standard's news format
   and those using other formats.

2.2.  Textual Notations

   This standard contains explanatory NOTEs using the following format.
   These may be skipped by persons interested solely in the content of
   the specification. The purpose of the notes is to explain why choices
   were made, to place them in context, or to suggest possible
   implementation techniques.

        NOTE: While such explanatory notes may seem superfluous in
        principle, they often help the less-than-omniscient reader grasp
        the purpose of the specification and the constraints involved.
        Given the limitations of natural language for descriptive
        purposes, this improves the probability that implementors and
        users will understand the true intent of the specification in
        cases where the wording is not entirely clear.

   "US-ASCII" is short for "the ANSI X3.4 character set" [ANSI X3.4].
   While "ASCII" is often misused to refer to various character sets
   somewhat similar to X3.4, in this standard "US-ASCII" is used to mean
   X3.4 and only X3.4. US-ASCII is a 7-bit character set. Please note
   that this standard requires that all agents be 8-bit clean; that is,
   they must accept and transmit data without changing or omitting the
   8th bit.

   Certain words, when capitalized, are used to define the significance
   of individual requirements. The key words "MUST", "REQUIRED",
   "SHOULD", "RECOMMENDED", "MAY" and "OPTIONAL", and any of those words
   associated with the word "NOT", are to be interpreted as described in
   [RFC 2119].

   In addition, the word "Ought", when applied to a poster, or to
   actions of posting and similar agents which a poster may easily
   override, indicates a recommendation whose violation would do no more
   than breach established policy, or accepted best practice.

        NOTE: The use of "MUST" or "SHOULD" implies a requirement that
        would or could lead to interoperability problems if not
        followed. Although not following an "Ought" recommendation might
        do no worse than cause extreme irritation to other readers,
        particularly in the case of the publicly distributed Usenet,
        that is no reason not to take it seriously. The essential
        distinction is that enforcement of a "MUST" or "SHOULD" is a
        matter of ensuring correct implementation, whereas enforcement
        of an "Ought" is more a matter of sensible design or of social
        pressure (whose effectiveness should not be underestimated, even
        though it cannot be prescribed by this standard).

        NOTE: A requirement imposed on a relaying or serving agent
        regarding some particular article should be understood as
        applying only if that article is actually accepted for
        processing (since any agent may always reject any article
        entirely, for reasons of site policy).

   Throughout this standard we will give examples of various
   definitions, headers and other specifications. It needs to be
   remembered that these samples are for the aid of the reader only and
   do NOT define any specification themselves.  In order to prevent
   possible conflict with "Real World" entities and people the top level
   domain ".example" is used in all sample domains and addresses. The
   hierarchy "example.*" is also used as a sample hierarchy.
   Information on the ".example" top level domain is in [RFC 2606].

2.3.  Relation To Email and MIME

   The primary intent of this standard is to describe the news article
   format.  Insofar as news articles are a subset of the email message
   format augmented by some new headers, this standard incorporates many
   (though not all) of the provisions of [RFC 2822], with the aim of
   enabling news articles to pass through email systems and vice versa,
   provided only that they contain the minimum headers required for the
   mode of transport being used. Unfortunately, the match is not
   perfect, but it is the intention of this standard that gateways
   between Email and Netnews should be able to operate with the minimum
   of tinkering.

   Likewise, this standard incorporates many (though not all) of the
   provisions of the MIME standards [RFC 2045] et seq which, though
   designed with Email in mind, are mostly applicable to Netnews.

2.4.  Syntax

   The complete syntax defined in this standard is repeated, for
   convenience, in Appendix B.

2.4.1.  Syntax Notation

   This standard uses the Augmented Backus Naur Form described in [RFC
   2234].

   In particular, it makes significant use of the "incremental
   alternative" feature of that notation. For example, the two rules
      header              = other-header
      header              =/ Date-header
   are equivalent to the single rule
      header              = other-header / Date-header
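
        NOTE: The effect of the incremental alternative can be sketched
        in Python as a merge over rule lines. The function merge_rules
        is hypothetical and handles only one-line rules of the two
        forms shown above; it is not a general ABNF parser.

```python
def merge_rules(lines):
    """Fold 'name =/ alt' lines into the rule already defined for name."""
    rules = {}
    for line in lines:
        if "=/" in line:
            # Incremental alternative: append to the existing rule.
            name, rhs = line.split("=/", 1)
            name = name.strip()
            rules[name] = rules[name] + " / " + rhs.strip()
        else:
            # Ordinary definition: start a new rule.
            name, rhs = line.split("=", 1)
            rules[name.strip()] = rhs.strip()
    return rules
```

        Applied to the two rules in the example, the sketch yields the
        single rule shown last.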

2.4.2.  Syntax adapted from Email and MIME

   Much of the syntax of Netnews Articles is based on the corresponding
   syntax defined in [RFC 2822] or in the MIME specifications [RFC 2045]
   et seq, which are deemed to have been incorporated into this standard
   as required. However, there are some important differences arising
   from the fact that [RFC 2822] does not recognize anything other than
   US-ASCII characters, that it does not recognize the MIME headers [RFC
   2045], and that it includes much syntax described as "obsolete"
   (which is excluded from this standard, as detailed below).

        NOTE: Netnews parsers historically have been much less
        permissive than Email parsers, and this is reflected in the
        modifications referred to, and in some further specific rules.

   The following syntactic rules therefore supersede the corresponding
   rules given in [RFC 2822] and [RFC 2045], thus allowing UTF-8
   characters [RFC 2279] to appear in certain contexts (the five rules
   beginning with "strict-" reflect the corresponding original rules from
   [RFC 2822]).

      UTF8-xtra-2-head= %xC2-DF
      UTF8-xtra-3-head= %xE0 %xA0-BF / %xE1-EC %x80-BF /
                        %xED %x80-9F / %xEE-EF %x80-BF
      UTF8-xtra-4-head= %xF0 %x90-BF / %xF1-F7 %x80-BF
      UTF8-xtra-5-head= %xF8 %x88-BF / %xF9-FB %x80-BF
      UTF8-xtra-6-head= %xFC %x84-BF / %xFD    %x80-BF
      UTF8-xtra-tail  = %x80-BF
      UTF8-xtra-char  = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
                        UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
                        UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
                        UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
                        UTF8-xtra-6-head 4( UTF8-xtra-tail )
      text            = %d1-9 /            ; all UTF-8 characters except
                        %d11-12 /          ; US-ASCII NUL, CR and LF
                        %d14-127 /
                        UTF8-xtra-char
      ctext           = NO-WS-CTL /        ; all of <text> except
                        %d33-39 /          ; SP, HTAB, "(", ")"
                        %d42-91 /          ; and "\"
                        %d93-126 /
                        UTF8-xtra-char
      qtext           = NO-WS-CTL /        ; all of <text> except
                        %d33 /             ; SP, HTAB, "\"
                        %d35-91 /          ; and DQUOTE
                        %d93-126 /
                        UTF8-xtra-char
      utext           = NO-WS-CTL /        ; Non white space controls
                        %d33-126 /         ; The rest of UTF-8
                        UTF8-xtra-char
      strict-text     = %d1-9 /            ; text restricted to
                        %d11-12 /          ; US-ASCII
                        %d14-127
      strict-qtext    = NO-WS-CTL /        ; qtext restricted to
                        %d33 /             ; US-ASCII
                        %d35-91 /
                        %d93-126
      strict-quoted-pair
                      = "\" strict-text
      strict-qcontent = strict-qtext / strict-quoted-pair
      strict-quoted-string
                      = [CFWS] DQUOTE
                           *( [FWS] strict-qcontent ) [FWS]
                           DQUOTE [CFWS]
      unstructured    = 1*( [FWS] utext ) [FWS]

   The syntax for UTF8-xtra-char excludes those redundant sequences of
   octets which cannot occur in UTF-8, as defined by [RFC 2279], either
   because they would not be the shortest possible encodings of some UCS
   character [ISO/IEC 10646], or they would represent one of the
   characters D800 through DFFF, disallowed in UCS because of their
   surrogate use in the UTF-16 encoding.  These sequences MUST NOT be
   generated by posting agents. Where they occur inadvertently, they
   MAY be passed on untouched by other agents, but they MUST NOT ever be
   interpreted as valid characters.
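
   As an illustration only, the UTF8-xtra-char rule can be transcribed
   into a byte-level regular expression. The following Python sketch is
   not part of this standard; the names UTF8_XTRA_CHAR and
   is_utf8_xtra_char are hypothetical. It matches octets directly
   because a general-purpose UTF-8 decoder may follow a different
   definition of UTF-8 than [RFC 2279] (Python's, for example, rejects
   the 5- and 6-octet forms that the rules above permit).

```python
import re

# UTF8-xtra-tail = %x80-BF
TAIL = rb"[\x80-\xBF]"

# Each entry pairs a UTF8-xtra-N-head pattern with the number of
# UTF8-xtra-tail octets that must follow it in UTF8-xtra-char.
HEADS = [
    (rb"[\xC2-\xDF]", 1),
    (rb"(?:\xE0[\xA0-\xBF]|[\xE1-\xEC][\x80-\xBF]"
     rb"|\xED[\x80-\x9F]|[\xEE-\xEF][\x80-\xBF])", 1),
    (rb"(?:\xF0[\x90-\xBF]|[\xF1-\xF7][\x80-\xBF])", 2),
    (rb"(?:\xF8[\x88-\xBF]|[\xF9-\xFB][\x80-\xBF])", 3),
    (rb"(?:\xFC[\x84-\xBF]|\xFD[\x80-\xBF])", 4),
]

UTF8_XTRA_CHAR = re.compile(
    b"|".join(head + TAIL + b"{%d}" % n for head, n in HEADS)
)


def is_utf8_xtra_char(octets: bytes) -> bool:
    """True if octets form exactly one UTF8-xtra-char."""
    return UTF8_XTRA_CHAR.fullmatch(octets) is not None
```

   Under this sketch the overlong sequence C0 80 and the surrogate
   encoding ED A0 80 are both rejected, while ordinary multi-octet
   characters such as C3 A9 are accepted.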

   Observe, in contradistinction to [RFC 2822], that an unstructured
   MUST contain at least one non-whitespace character (see also remarks
   about empty headers in 4.2.6).

   Wherever in this standard the syntax is stated to be taken from [RFC
   2822], it is to be understood as the syntax defined by [RFC 2822]
   after making the above changes, but NOT including any syntax defined
   in section 4 ("Obsolete syntax") of [RFC 2822].  Software compliant
   with this standard MUST NOT generate any of the syntactic forms
   defined in that Obsolete Syntax, although it MAY accept such
   syntactic forms. Certain syntax from the MIME specifications [RFC
   2045] et seq is also considered a part of this standard (see 6.21).

2.4.3.  Syntax copied from other standards

   The following syntactic forms, taken from [RFC 2234] or from [RFC
   2822], are repeated here for convenience only:

      ALPHA         = %x41-5A /          ; A-Z
                      %x61-7A            ; a-z
      CR            = %x0D               ; carriage return
      CRLF          = CR LF
      DIGIT         = %x30-39            ; 0-9
      HTAB          = %x09               ; horizontal tab
      LF            = %x0A               ; line feed
      SP            = %x20               ; space
      NO-WS-CTL     = %d1-8 /            ; US-ASCII control characters
                      %d11 /             ; which do not include the
                      %d12 /             ; carriage return, line feed,
                      %d14-31 /          ; and whitespace characters
                      %d127
      specials      = "(" / ")" /        ; Special characters used in
                      "<" / ">" /        ; other parts of the syntax
                      "[" / "]" /
                      ":" / ";" /
                      "@" / "\" /
                      "," / "." /
                      DQUOTE
      WSP           = SP / HTAB          ; Whitespace characters
      FWS           = ([*WSP CRLF] 1*WSP); Folding whitespace
      ccontent      = ctext / quoted-pair / comment
      comment       = "(" *([FWS] ccontent) [FWS] ")"
      CFWS          = *([FWS] comment) (([FWS] comment) / FWS )
      DQUOTE        = %d34               ; quote mark
      quoted-pair   = "\" text
      atext         = ALPHA / DIGIT /
                      "!" / "#" /        ; Any US-ASCII character except
                      "$" / "%" /        ; controls, SP, and specials.
                      "&" / "'" /        ; Used for atoms
                      "*" / "+" /
                      "-" / "/" /
                      "=" / "?" /
                      "^" / "_" /
                      "`" / "{" /
                      "|" / "}" /
                      "~"
      atom          = [CFWS] 1*atext [CFWS]
      dot-atom      = [CFWS] dot-atom-text [CFWS]
      dot-atom-text = 1*atext *( "." 1*atext )
      qcontent      = qtext / quoted-pair
      quoted-string = [CFWS] DQUOTE
                         *( [FWS] qcontent ) [FWS]
                         DQUOTE [CFWS]
      word          = atom / quoted-string
      phrase        = 1*word

        NOTE: Following [RFC 2234], literal text included in the syntax
        is to be regarded as case-insensitive.  However, in
        contradistinction to [RFC 2822], the Netnews protocols are
        sensitive to case in some instances (as in newsgroup names, some
        header parameters, etc.). Care has been taken to indicate this
        explicitly where required.

   As in [RFC 2822], where any quoted-pair appears it is to be
   interpreted as its text character alone. That is to say, the "\"
   character that appears as part of a quoted-pair is semantically
   "invisible".

   Again, as in [RFC 2822], strings of characters that include
   characters not syntactically allowed in some particular context may
   be incorporated into a quoted-string by "encapsulating" them between
   quote (DQUOTE, US-ASCII 34) characters, prefixing every quote and
   backslash character (and possibly other characters too) with a "\" so
   as to form a quoted-pair, and possibly introducing folding by
   prefixing some WSP with CRLF.

   The semantic value of a quoted-string (i.e. the result of reversing
   the encapsulation) is a string of characters which includes neither
   the optional CFWS outside of the quote characters, nor the quote
   characters themselves, nor any CRLF contained within any FWS between
   the two quote characters, nor the "\" which introduces any quoted-
   pair.
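
   The reversal of the encapsulation described above might be sketched
   in Python as follows. This is an illustration under stated
   assumptions, not a definitive implementation: the function name
   quoted_string_value is hypothetical, and any comments in the CFWS
   outside the quote characters are assumed to have been removed
   already, so that only plain whitespace may surround the quotes.

```python
def quoted_string_value(raw: str) -> str:
    """Return the semantic value of a quoted-string."""
    s = raw.strip(" \t\r\n")          # drop whitespace outside the quotes
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i, end = 1, len(s) - 1            # scan between the two quotes
    while i < end:
        ch = s[i]
        if ch == "\\":                # quoted-pair: keep only the text char
            if i + 1 >= end:
                raise ValueError("dangling backslash")
            out.append(s[i + 1])
            i += 2
        elif s.startswith("\r\n", i): # CRLF inside FWS is not part of value
            i += 2
        elif ch == '"':
            raise ValueError("unescaped quote inside quoted-string")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

   Note that the WSP following a folding CRLF is retained in the
   result; only the CRLF itself and the "\" of each quoted-pair are
   dropped.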

2.5.  Language

   Various constant strings in this standard, such as header names and
   month names, are derived from English words. Despite their
   derivation, these words do NOT change when the poster or reader
   employing them is interacting in a language other than English.
   Posting and reading agents MAY translate as appropriate in their
   interaction with the poster or reader, but the forms that actually
   appear in articles as transmitted MUST be the English-derived ones
   defined in this standard.

3.  Changes to the existing protocols

   This standard prescribes many changes, clarifications and new
   features since the protocols described in [RFC 1036] and [Son-of-
   1036].  It is the intention that they can be assimilated into Usenet
   as it presently operates without major interruption to the service,
   though some of the new features may not begin to show benefit until
   they become widely implemented. This section summarizes the main
   changes, and comments on some features of the transition.

3.1.  Principal Changes

     o The [RFC 2822] conventions for parenthesis-enclosed comments in
       headers are supported.
     o Whitespace is permitted in Newsgroups-headers, permitting folding
       of such headers. Indeed, all headers can now be folded.
     o An enhanced syntax for the Path-header enables both the injection
       point of an article and the route it has taken to be determined
       with certainty.
     o Netnews is firmly established as an 8bit medium and all headers
       are deemed to be in the UTF-8 character set (thus permitting, in
       particular, the use of non-ASCII newsgroup-names).
     o Large parts of MIME are recognised as an integral part of
       Netnews.
     o There is a new Control message 'mvgroup' to facilitate moving a
       group to a different place (name) in a hierarchy.
     o There are several new headers defined, notably Archive,
       Complaints-To, Injector-Info, Mail-Copies-To, Posted-And-Mailed
       and User-Agent, leading to increased functionality.
     o Provision has been made for almost all headers to have MIME-style
       parameters (to be ignored if not recognized), thus facilitating
       extension of those headers in future standards.
     o Certain headers and Control messages (Appendix A.3 and Appendix
       A.4) have been made obsolete.
     o Distributions are expected to be checked at the receiving end, as
       well as the sending end, of a relaying link.
     o There are numerous other small changes, clarifications and
       enhancements.

3.2.  Transitional Arrangements

   An important distinction must be made between serving and relaying
   agents, which are responsible for the distribution and storage of
   news articles, and user agents, which are responsible for
   interactions with users. It is important that the former should be
   upgraded to conform to this standard as soon as possible to provide
   the benefit of the enhanced facilities.  Fortunately, the number of
   distinct implementations of such agents is rather small, at least so
   far as the main "backbone" of Usenet is concerned, and many of the
   new features are already supported. Contrariwise, there are a great
   number of implementations of user agents, installed on a vastly
   greater number of small sites. Therefore, the new functionality has
   been designed so that existing agents may continue to be used,
   although the full benefits may not be realised until a substantial
   proportion of them have been upgraded.

   In the list which follows, care has been taken to distinguish the
   implications for both kinds of agent.

     o [RFC 2822] style comments in headers do not affect serving and
       relaying agents (note that the Newsgroups-, Distribution- and
       Path-headers do not contain them). Such comments are unlikely to
       hinder the proper display of headers in existing reading agents
       except in the case of the References-header in agents which
       thread articles. Therefore, it is provided that they SHOULD NOT
       be generated except where permitted by the previous standards.
     o Because of its importance to all serving agents, the extension
       permitting whitespace and folding in Newsgroups-headers SHOULD
       NOT be used until it has been widely deployed amongst relaying
       agents. User agents are unaffected.
     o The new style of Path-header is already consistent with the
       previous standards. However, the intention is that relaying
       agents should eventually reject articles in the old style, and so
       this possibility should be offered as a configurable option in
       relaying agents. User agents are unaffected.
     o The vast majority of serving, relaying and transport agents are
       believed to be already 8bit clean (in the slightly restricted
       sense in which that term is used in the MIME standards). User
       agents that do not implement MIME may be disadvantaged, but no
       more so than at present when faced with 8bit characters (which
       currently abound in spite of the previous standards).
     o The introduction of MIME reflects a practice that is already
       widespread.  Articles in strict compliance with the previous
       standards (using strict US-ASCII) will be unaffected. Many user
       agents already support it, at least to the extent of widely used
       charsets such as ISO-8859-1. Users expecting to read articles
       using other charsets will need to acquire suitable reading
       agents. It is not intended, in general, that any single user
       agent will be able to display every charset known to IANA, but
       all such agents MUST support US-ASCII. Serving and relaying
       agents are not affected.
     o The use of the UTF-8 charset for headers will not affect any
       existing usage that complies with the previous standards, since
       US-ASCII is a strict subset of UTF-8. Insofar as newsgroup names
       containing non-ASCII characters can now be expected to arise,
       some support from serving and relaying agents will be desirable,
       although it has been established that most current serving agents
       can already cope with such names without modification (although
       perhaps not in an ideal manner). Note that it is not necessary
       for serving and relaying agents to understand all the characters
       available in UTF-8, though it is desirable for them to be
       displayable for diagnostic purposes via some escape mechanism
       using, for example, the visible subset of US-ASCII. For users
       expecting to use the more exotic possibilities available under
       UTF-8, the remarks already made in connection with MIME will
       apply.
     o The new Control: mvgroup command will need to be implemented in
       serving agents. For the benefit of older serving agents it is
       therefore RECOMMENDED that it be followed shortly by a
       corresponding newgroup command and it MUST always be followed by
       a rmgroup command for the old group after a reasonable overlap
       period. An implementation of the mvgroup command as an alias for
       the newgroup command would thus be minimally conforming. User
       agents are unaffected.
     o All the headers newly introduced by this standard can safely be
       ignored by existing software, albeit with loss of the new
       functionality.

4.  Basic Format

4.1.  Syntax of News Articles

   The overall syntax of a news article is:

      article           = 1*( header CRLF ) separator body
      header            = other-header
      other-header      = header-name ":" 1*SP other-content
      header-name       = 1*name-character *( "-" 1*name-character )
      name-character    = ALPHA / DIGIT
      other-content     = <the content of a header defined by some
                           other standard>
      separator         = CRLF
      body              = *( *998text CRLF )

   However, the rule given above for header is incomplete. Further
   alternatives will be added incrementally as the various Netnews
   headers are introduced in this standard (or in future extensions),
   using the "=/" notation defined in [RFC 2234].  For example, a
   typical USENET-header would be defined as follows:

      header            =/ USENET-header
      USENET-header     = "USENET" ":" SP USENET-content
                             *( ";" ( USENET-parameter /
                                      other-parameter ) )
      USENET-content    = <syntax specific to that USENET-header>
      USENET-parameter  = <a parameter specific to that USENET-header>

   where the USENET-parameter, which MUST always be of the same
   syntactic form as an other-parameter (see below), is not provided in
   all headers, and even the other-parameter is omitted in some cases
   (see 4.2.2).  Observe that "USENET" is (and MUST be) of the syntactic
   form of a header-name.

      other-parameter   = <a parameter not defined by this standard>
      parameter         = attribute "=" value
      attribute         = [CFWS] token [CFWS]
      x-token           = "x-" token
      token             = 1*<any (US-ASCII) CHAR except SP, CTLs,
                             or tspecials>
      tspecials         = "(" / ")" / "<" / ">" / "@" /
                          "," / ";" / ":" / "\" / DQUOTE /
                          "/" / "[" / "]" / "?" / "="
      value             = [CFWS] token [CFWS] / quoted-string

   An article consists of some headers followed by a body. An empty line
   separates the two. The headers contain structured information about
   the article and its transmission. A header begins with a header-name
   identifying it, and can be continued onto subsequent lines as
   described in section 4.2.3.  The body is largely unstructured text
   significant only to the poster and the readers.

        NOTE: Terminology here follows the current custom in the news
        community, rather than the [RFC 2822] convention of referring to
        what is here called a "header" as a "header-field" or "field".

   Note that the separator line MUST be truly empty, not just a line
   containing white space. Further empty lines following it are part of
   the body, as are empty lines at the end of the article.

        NOTE: The syntax above defines the canonical form of a news
        article as a sequence of lines each terminated by CRLF. This
        does not prevent serving agents or transport agents from storing
        or handling the article in other formats (e.g. using a single LF
        in place of CRLF) so long as the overall effects achieved are as
        defined by this standard when operating on the canonical form.
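
   To make the role of the separator concrete, the following Python
   sketch splits an article in canonical form into its headers and its
   body. It is illustrative only; the function name split_article is
   hypothetical, and the input is assumed to use CRLF line endings
   throughout.

```python
def split_article(article: bytes):
    """Split a canonical (CRLF) article into (headers, body)."""
    # The last header ends with CRLF and the separator is a bare CRLF,
    # so the first CRLF CRLF marks the end of the headers.
    head, sep, body = article.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no empty separator line found")
    # Restore the CRLF that terminates the final header line.
    return head + b"\r\n", body
```

   Because only the first truly empty line is consumed, any further
   empty lines remain at the start of the body, as required above.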

4.2.  Headers

   The order of headers in an article is not significant. However,
   posting agents are encouraged to put mandatory headers (section 5)
   first, followed by optional headers (section 6), followed by
   experimental headers and headers not defined in this standard or its
   extensions. Relaying agents MUST NOT change the order of the headers
   in an article.

4.2.1.  Naming of Headers

   Despite the restrictions on header-name syntax imposed by the
   grammar, relayers and reading agents SHOULD tolerate header-names
   containing any US-ASCII printable character other than colon (":",
   US-ASCII 58).
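
   The difference between the header-name grammar and the more
   tolerant behaviour asked of relayers and reading agents might be
   sketched in Python as below. The names STRICT_NAME and TOLERANT_NAME
   are hypothetical, and treating "printable" as the range 33 to 126
   (thus excluding SP) is an assumption of this sketch.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

# Any printable US-ASCII character (assumed here to be 33-126)
# other than colon (58, hex 3A).
TOLERANT_NAME = re.compile(r"[\x21-\x39\x3B-\x7E]+")


def is_strict_header_name(name: str) -> bool:
    return STRICT_NAME.fullmatch(name) is not None


def is_tolerable_header_name(name: str) -> bool:
    return TOLERANT_NAME.fullmatch(name) is not None
```

   A generating agent would apply the strict check, whereas an agent
   receiving articles would apply only the tolerant one.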

   Whilst relaying agents MUST accept, and pass on unaltered, any non-
   variant header whose header-name is syntactically correct, and
   reading agents MUST enable them to be displayed, at least optionally,
   posting and injecting agents SHOULD NOT generate headers other than
     o headers established by this standard or any extension to it;
     o those recognized by other IETF-established standards, notably the
       Email standard [RFC 2822] and its extensions, excluding any
       explicitly deprecated for Netnews (e.g. see section 9.2.1 for the
        deprecated Disposition-Notification-To-header); or,
       alternatively, those listed in some future IANA registry of
       recognized headers;
     o experimental headers beginning with "X-" (as defined in 4.2.5.1);
     o on a provisional basis only, headers related to new protocols
       under development which are the subject of (or intended to be the
       subject of) some IETF-approved RFC (whether Informational,
       Experimental or Standards-Track).
   However, software SHOULD NOT attempt to interpret headers not
   specifically intended to be meaningful in the Netnews environment.

   Header-names are case-insensitive. There is a preferred case
   convention, which posters and posting agents Ought to use: each
   hyphen-separated "word" has its initial letter (if any) in uppercase
   and the rest in lowercase, except that some abbreviations have all
   letters uppercase (e.g. "Message-ID" and "MIME-Version"). The forms
   given in the various rules defining headers in this standard are the
   preferred forms for them. Relaying and reading agents MUST, however,
   tolerate articles not obeying this convention.
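
   A posting agent might apply the preferred case convention along the
   following lines. This Python sketch is illustrative only: the names
   preferred_case and same_header_name are hypothetical, and the table
   of all-uppercase abbreviations contains just the two cases mentioned
   above, so it is an assumption that a real agent would extend it.

```python
# Abbreviations written entirely in uppercase (deliberately incomplete).
ALL_CAPS = {"id": "ID", "mime": "MIME"}


def preferred_case(header_name: str) -> str:
    """Rewrite a header-name in the preferred case convention."""
    words = header_name.split("-")
    return "-".join(
        ALL_CAPS.get(word.lower(), word.capitalize()) for word in words
    )


def same_header_name(a: str, b: str) -> bool:
    """Header-names are compared without regard to case."""
    return a.lower() == b.lower()
```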

4.2.2.  MIME-style Parameters

   The possibility of allowing MIME-style parameters (whether header-
   specific ones or generic other-parameters) to appear in virtually all
   headers is provided mainly for the purpose of allowing future
   extensions to existing headers, since only a very few specific
   parameters are defined in this standard. Observe that such parameters
   do not, in general, occur in headers defined in other standards,
   except for the MIME standards [RFC 2045] et seq. and their
   extensions.

   Other-parameters (whether those defined elsewhere or experimental
   parameters whose attribute is an x-token) MAY be used, where the
   syntax so allows, in any of the headers defined in this standard or
   its extensions except that, at present, they SHOULD NOT be used in
   headers in widespread use prior to the introduction of this standard
   (this restriction is likely to be removed in a future version of this
   standard). Nevertheless, compliant software MUST accept such
   parameters where required by this standard (ignoring them if their
   meaning is unknown) and SHOULD accept (and ignore) them in all
   structured headers wherever defined.

        NOTE: The syntax does not permit other-parameters in
        unstructured headers (where they are unnecessary) or in certain
        headers (notably the From-, Reply-To-, Mail-Copies-To- and
        Complaints-To-headers) containing address-lists or mailbox-lists
        (so that agents can simply replace the header-name by "To" or
        "Cc" to obtain a header immediately suitable for sending Email,
        and also so as to avoid some minor parsing problems with
        addresses).

   Each header-specific parameter introduced in this standard is
   described by specifying
     (a) the token to be used in its attribute, and
     (b) the syntax rule(s) defining the object(s) permitted in its
         value.
   If a value object is not of the syntactic form of a token, it MUST
   (and otherwise MAY) be encapsulated in a quoted-string (see 2.4.3).
   Observe that the syntax of a parameter also allows additional WSP,
   folding and comments.

   The semantics of a parameter is always to associate the token in its
   attribute with the object represented by the token, or the semantic
   value (2.4.3) of the quoted-string, contained in its value.
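
   The rule that a value which is not a token must be encapsulated in
   a quoted-string might be applied when generating a parameter as in
   the following Python sketch. The names is_token and format_parameter
   are hypothetical, the sketch never adds optional WSP, folding or
   comments, and it assumes that the value contains no CR or LF.

```python
TSPECIALS = '()<>@,;:\\"/[]?='


def is_token(s: str) -> bool:
    """token = 1*<any US-ASCII CHAR except SP, CTLs, or tspecials>"""
    return bool(s) and all(
        33 <= ord(c) <= 126 and c not in TSPECIALS for c in s
    )


def format_parameter(attribute: str, value: str) -> str:
    """Render attribute "=" value, quoting the value only if needed."""
    if not is_token(attribute):
        raise ValueError("attribute must be a token")
    if is_token(value):
        return f"{attribute}={value}"
    # Prefix every backslash and quote with "\" to form quoted-pairs.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{attribute}="{escaped}"'
```

   A value such as "verified" is emitted as a bare token, whereas a
   mailbox containing spaces, quotes or angle brackets is emitted as a
   quoted-string.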

   For example, the posting-sender-parameter (6.19) is defined to be
      <a parameter with attribute "sender" and value some sender-value>
   where
      sender-value      = mailbox / "verified"
   A valid posting-sender-parameter would be
      sender = "\"Joe D. Bloggs\" <jdbloggs@example.com>" (authinfo)
   The comment (syntactically part of the quoted-string) is irrelevant.
   The actual mailbox (to be used, for example, if email is to be sent
   to the sender) is
      "Joe D. Bloggs" <jdbloggs@example.com>

4.2.3.  White Space and Continuations

   Each header is logically a single line of characters comprising the
   header-name, the colon with its following SP, the content, and any
   parameters. For convenience, however, the content and parameters can
   be "folded" into a multiple line representation by inserting a CRLF
   before any WSP contained within any FWS or CFWS (but not any other SP
   or HTAB) allowed by this standard. For example, the header:
      Approved: modname@modsite.example (Moderator of example.foo.bar)
   can be represented as:
      Approved: modname@modsite.example
         (Moderator of example.foo.bar)
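
   Recovering the logical single line from a folded header amounts to
   removing each CRLF that is immediately followed by WSP. A minimal
   Python sketch, in which the function name unfold is hypothetical:

```python
import re


def unfold(header: str) -> str:
    """Remove every CRLF that is immediately followed by SP or HTAB."""
    return re.sub(r"\r\n(?=[ \t])", "", header)
```

   The WSP that followed the CRLF is left in place, so the result may
   contain more whitespace than the unfolded form shown above; that
   additional WSP carries no semantic meaning.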

   FWS occurs at many places in the syntax (usually within a CFWS) in
   order to allow the inclusion of comments, whitespace and folding. The
   syntax is in fact ambiguous insofar as it sometimes allows two
   consecutive instantiations of FWS (at least one of which is always
   optional), or of FWS followed by an explicit CRLF. However, all such
   cases MUST be treated as if the optional instantiation (or one of
   them) had not been present. It is thus precluded that any line of a
   header should be made up of whitespace characters and nothing else
   (for such a line might otherwise have been interpreted by a non-
   compliant agent as the separator between the headers and the body of
   the article).

        NOTE: This does not lead to semantic ambiguity because, unless
        specifically stated otherwise, the presence or absence of
        folding, a comment or additional WSP has no semantic meaning
        and, in particular, it is a matter of indifference whether it
        forms a part of the syntactic construct preceding it or the one
        following it.

        NOTE: It may be observed that the content part of every header
        begins and ends with an optional CFWS (or FWS in the case of
        certain headers). Moreover, every parameter also begins and ends
        with an optional CFWS.

        NOTE: Though contents are defined in such a way that folding can
        take place between many of the lexical tokens (and even within
        some of them), folding should be limited to placing the CRLF at
        higher-level syntactic breaks, and should also avoid leaving
        trailing WSP on the preceding line. For instance, if a header-
        content is defined as comma-separated values, it is recommended
        that folding occur after the comma separating the structured
        items, even if it is allowed elsewhere.

   In accordance with the syntax, the header-name on the first line MUST
   be followed by a SP (even if the rest of the header is empty, but see
   4.2.6).  Even though the syntax allows otherwise, at least some of
   the content MUST appear on that first line (to avoid the possibility
   of harm by any non-compliant agent that might eliminate a trailing
   WSP). Although posting agents are REQUIRED to enforce these
   restrictions, relaying and serving agents SHOULD accept articles that
   violate them.

        NOTE: This standard differs from [RFC 2822] in requiring that SP
        following the colon (it was also an [RFC 1036] requirement).

   Posters and posting agents SHOULD use SP, not HTAB, where white space
   is desired in headers (some existing software expects this). Relaying
   and serving agents SHOULD accept HTAB in all such cases, however.

4.2.4.  Comments

   Strings of characters which are treated as comments may be included
   in headers wherever the syntactic element CFWS occurs.  They consist
   of characters enclosed in parentheses.  Comments may be nested.

        NOTE: Although CFWS occurs wherever whitespace is allowed in
        almost all headers, there are exceptions where only FWS is
        permitted (hence folding but no comments). Notably, this happens
        in the case of the Newsgroups-, Distribution-, Path- and
        Followup-To-headers, and within the Date-header except right at
        the end.

   A comment is normally used to provide some human readable
   informational text, except at the end of an address which contains no
   phrase, as in
      fred@foo.bar.example (Fred Bloggs)
   as opposed to
      "Fred Bloggs" <fred@foo.bar.example> .

   The former is a deprecated, but commonly encountered, usage and
   reading agents SHOULD take special note of such comments as
   indicating the name of the person whose address it is. In all other
   situations a comment is semantically interpreted as a single SP.

   Since a comment is allowed to contain FWS, folding is permitted
   within it as well as immediately preceding and immediately following
   it. Also note that, since quoted-pair is allowed in a comment, the
   parenthesis and backslash characters may appear in a comment so long
   as they appear as a quoted-pair. Semantically, the enclosing
   parentheses are not part of the content of the comment; the content
   is what is contained between the two parentheses.
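
   The handling of nested comments and of quoted-pairs within them
   might be sketched in Python as follows. This is an illustration
   only: the function name strip_comments is hypothetical, the input is
   assumed to be an already unfolded header content, and each top-level
   comment is replaced by a single SP (the special treatment of a
   comment following an address is not attempted).

```python
def strip_comments(content: str) -> str:
    """Replace each top-level comment in a header content by one SP."""
    out = []
    depth = 0            # current nesting depth of comments
    in_quotes = False    # inside a quoted-string (outside any comment)
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and i + 1 < len(content):
            # A quoted-pair never opens or closes anything.
            if depth == 0:
                out.append(content[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            if depth == 0:
                out.append(" ")
            depth += 1
        elif depth > 0 and ch == ")":
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    if depth or in_quotes:
        raise ValueError("unbalanced comment or quoted-string")
    return "".join(out)
```

   Parentheses inside a quoted-string are left untouched, since they
   do not delimit a comment there.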

   Since comments have not hitherto been permitted in news articles,
   except in a few specified places, posters and posting-agents SHOULD
   NOT insert them except in those places, namely following addresses in
   From and similar headers, and to indicate the name of the timezone in
   Date-headers.  However, compliant software MUST accept them in all
   places where they are syntactically allowed.

4.2.5.  Header Properties

   There are three special properties that may apply to particular
   headers, namely: "experimental", "inheritable", and "variant". When a
   header is defined, in this (or any future) standard, as having one
   (or possibly more) of these properties, it is subject to special
   treatment, as indicated below.

4.2.5.1.  Experimental Headers

   Experimental headers are those whose header-names begin with "X-".
   They are to be used for experimental Netnews features, or for
   enabling additional material to be propagated with an article. They
   are not (and will not be) defined by this, or any, standard.

        NOTE: Experimental headers are suitable for situations where
        they need only to be human readable. They are not intended to be
        recognized by widely deployed Netnews software and, should such
        a requirement be envisaged, it is preferable to use a normal
        header on the provisional basis set out in section 4.2.1.

4.2.5.2.  Inheritable Headers

   Subject only to the overriding ability of the poster to determine the
   contents of the headers in a proto-article, headers with the
   inheritable property MUST be copied by followup agents (perhaps with
   some modification) into the followup article, and headers without
   that property MUST NOT be so copied.  Examples include:
     o Newsgroups (5.5) - copied from the precursor, subject to any
       Followup-To-header.
     o Subject (5.4) - modified by prefixing with "Re: ", but otherwise
       copied from the precursor.
     o References (6.10) - copied from the precursor, with the addition
       of the precursor's Message-ID.
     o Distribution (6.6) - copied from the precursor.

        NOTE: The Keywords-header is not inheritable, though some older
        newsreaders treated it as such.
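
   The four examples above might be realised by a followup agent along
   the lines of the following Python sketch. It is illustrative only
   and rests on several assumptions: the function name followup_headers
   is hypothetical, headers are held in a dictionary keyed by
   header-names in the preferred case, special Followup-To values are
   not handled, and the avoidance of a doubled "Re: " prefix is a
   choice made here rather than something stated in this section.

```python
def followup_headers(precursor: dict) -> dict:
    """Derive the inherited headers of a followup from its precursor."""
    out = {}
    # Newsgroups: copied, subject to any Followup-To-header.
    out["Newsgroups"] = precursor.get("Followup-To",
                                      precursor["Newsgroups"])
    # Subject: prefixed with "Re: " (assumption: not added twice).
    subject = precursor["Subject"]
    out["Subject"] = (subject if subject.startswith("Re: ")
                      else "Re: " + subject)
    # References: copied, with the precursor's Message-ID appended.
    refs = precursor.get("References", "")
    out["References"] = (refs + " " + precursor["Message-ID"]).strip()
    # Distribution: copied if present.
    if "Distribution" in precursor:
        out["Distribution"] = precursor["Distribution"]
    # Headers without the inheritable property are not copied.
    return out
```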


4.2.5.3.  Variant Headers

   Headers with the variant property may differ between (or even be
   completely absent from) copies of the same article as stored or
   relayed throughout a Netnews system. The manner of the difference (or
   absence) MUST be as specified in this (or any future) standard.
   Typically, these headers are modified as articles are propagated, or
   they reflect the status of the article on a particular serving agent,
   or cooperating group of such agents. The variant header MAY be placed
   anywhere within the headers (though placing it first is recommended).
   The principal examples are:
     o Path (5.6) - augmented at each relaying agent that an article
       passes through.
     o Xref (6.16) - used to keep track of the article locators of
       crossposted articles so that newsreaders serviced by a particular
       serving agent can mark such articles as read.

4.2.6.  Undesirable Headers

   A header whose content is empty is said to be an empty header (in
   fact, no such headers are defined by this standard).  Relaying and
   reading agents SHOULD NOT consider presence or absence of an empty
   header to alter the semantics of an article (although syntactic
   rules, such as requirements that certain header-names appear at most
   once, MUST still be satisfied). Posting and injecting agents SHOULD
   delete empty headers from articles before posting them; relaying
   agents MUST pass them untouched.

   Headers that merely state defaults explicitly (e.g., a Followup-To-
   header with the same content as the Newsgroups-header, or a MIME
   Content-Type-header with contents "text/plain; charset=us-ascii") or
   state information that reading agents can typically determine easily
   themselves (e.g.  the length of the body in octets) are redundant and
   posters and posting agents Ought Not to include them.

4.3.  Body

4.3.1.  Body Format Issues

   The body of an article SHOULD NOT be empty. A posting or injecting
   agent which does not reject such an article entirely SHOULD at least
   issue a warning message to the poster and supply a non-empty body.
   Note that the separator line MUST be present even if the body is
   empty.

        NOTE: Some existing news software is known to react badly to
        body-less articles, hence the request for posting and injecting
        agents to insert a body in such cases. The sentence "This
        article was probably generated by a buggy news reader" has
        traditionally been used in this situation.

   Note that an article body is a sequence of lines terminated by CRLFs,
   not arbitrary binary data, and in particular it MUST end with a CRLF.
   However, relaying and serving agents SHOULD treat the body of an
   article as an uninterpreted sequence of octets (except as mandated by
   changes of CRLF representation and by control message processing, as
   in 7.2.4) and SHOULD avoid imposing constraints on it. See also
   section 4.5.

   Posters SHOULD avoid using control characters and escape sequences
   except for tab (US-ASCII 9), formfeed (US-ASCII 12) and, possibly,
   backspace (US-ASCII 8).  Tab signifies sufficient horizontal white
   space to reach the next of a set of fixed positions; posters are
   warned that there is no standard set of positions, so tabs should be
   avoided if precise spacing is essential. Formfeed (which is sometimes
   referred to as the "spoiler character") signifies a point at which a
   reading agent Ought to pause and await reader interaction before
   displaying further text.

        NOTE: Passing other control characters or escape sequences
        unaltered to a display or printing device is likely to have
        unpredictable results, except in the case of a device adapted to
        the special needs of some particular character set.

        NOTE: Backspace was historically used for underlining, done by
        an underscore (US-ASCII 95), a backspace, and a character,
        repeated for each character that should be underlined. Posters
        are warned that underlining is not available on all output
        devices or supported by all reading agents and is best not
        relied on for essential meaning.

4.3.2.  Body Conventions

   A body is by default an uninterpreted sequence of octets for most of
   the purposes of this standard. However, a MIME Content-Type-header
   may impose some structure or intended interpretation upon it, and may
   also specify the character set in accordance with which the octets
   are to be interpreted.

   The following conventions for quotations, attributions and
   signatures, although not mandated by this standard, describe widely
   used practices. They are documented here in order to establish their
   correct usage, and the use of the words "MUST", "SHOULD", etc. is to
   be understood accordingly.

   It is conventional for followup agents to enable the incorporation of
   the followed-up article (the "precursor") as a quotation. This SHOULD
   be done by prefacing each line of the quoted text (even if it is
   empty) with the character ">" (or perhaps with "> " in the case of a
   previously unquoted line). This will result in multiple levels of ">"
   when quoted content itself contains quoted content, and it will also
   facilitate the automatic analysis of articles.
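
   A minimal sketch of this quoting convention follows. The function
   name quote is hypothetical, and the body is assumed to be available
   as a decoded string whose lines are terminated by CRLF.

```python
def quote(body_text):
    """Prefix every line of the precursor's text, empty lines included."""
    lines = body_text.split("\r\n")
    if lines and lines[-1] == "":
        lines.pop()  # the body ends with CRLF, so drop the empty tail
    quoted = []
    for line in lines:
        if line == "" or line.startswith(">"):
            # Empty and already-quoted lines gain a bare ">".
            quoted.append(">" + line)
        else:
            # Previously unquoted lines get "> ".
            quoted.append("> " + line)
    return "".join(q + "\r\n" for q in quoted)
```

   Applied repeatedly, this yields ">>" for text quoted twice, which is
   the nesting described above.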

        NOTE: Posters should edit quoted context to trim it down to the
        minimum necessary. However, followup agents Ought Not to attempt
        to enforce this beyond issuing a warning (past attempts to do so
        have been found to be notably counter-productive).

   The followup agent SHOULD also precede the quoted content by an
   "attribution line" (however, readers are warned not to assume that
   such lines are accurate, especially within multiply nested
   quotations). The following convention for such lines is intended to
   facilitate their automatic recognition and processing by
   sophisticated reading agents.
   The attribution SHOULD contain the name or the email address of the
   precursor's poster, as in
      Joe D. Bloggs <jdbloggs@foo.example> wrote:
   or
      Helmut Schmidt <helmut@bar.example> schrieb:

   The attribution MAY also contain a single newsgroup-name (the one
   from which the followup is being made), the precursor's Message-ID
   and/or the precursor's Date and Time. Any of these that are present
   SHOULD precede the name and/or email address. However, the inclusion
   or not of such fields Ought always to be under the control of the
   poster.

   To enable this line, and the Message-ID and the email address within
   it, to be recognized (for example to enable suitable reading agents
   to retrieve the precursor or email its poster by clicking on them),
   the following conventions SHOULD be observed:
     o The precursor's Message-ID SHOULD be enclosed within <...> or
       <news:...>
     o The precursor's poster's email address SHOULD be enclosed within
       <...>
     o The various fields may be separated by arbitrary text and they
       may be folded in the same way as headers, but attributions SHOULD
       always be terminated by a ":" followed by CRLF.

   Further examples:

      On comp.foo in <1234@bar.example> on 24 Dec 2001 16:40:20 +0000,
         Joe D. Bloggs <jdbloggs@bar.example> wrote:

      Am 24. Dez 2001 schrieb Helmut Schmidt <helmut@bar.example>:
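
   One way a followup agent might assemble such a line is sketched
   below. The function name attribution, its parameters and the
   connecting words "On", "in" and "on" are assumptions chosen to
   reproduce the first of the further examples; the conventions above
   allow arbitrary text between the fields.

```python
def attribution(name, email, verb="wrote", newsgroup=None, msg_id=None, date=None):
    """Build an attribution line; optional fields precede the name."""
    parts = []
    if newsgroup:
        parts.append("On " + newsgroup)
    if msg_id:
        parts.append("in " + msg_id)  # msg_id already includes "<" and ">"
    if date:
        parts.append("on " + date)
    # The email address is enclosed within <...> and the line ends in ":".
    who = f"{name} <{email}> {verb}:"
    prefix = " ".join(parts)
    return f"{prefix}, {who}" if prefix else who
```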

   A "personal signature" is a short closing text automatically added to
   the end of articles by posting agents, identifying the poster and
   giving his network addresses, etc. Whenever a poster or posting agent
   appends such a signature to an article, it MUST be preceded with a
   delimiter line containing (only) two hyphens (US-ASCII 45) followed
   by one SP (US-ASCII 32). The signature is considered to extend from
   the last occurrence of that delimiter up to the end of the article
   (or up to the end of the part in the case of a multipart MIME body).
   Followup agents, when incorporating quoted text from a precursor,
   Ought Not to include the signature in the quotation. Posting agents
   Ought to discourage (at least with a warning) signatures of excessive
   length (4 lines is a commonly accepted limit).
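
   The delimiter rule can be sketched as follows, for a single-part
   body held as a decoded string with CRLF line endings. The names
   SIG_DELIMITER and split_signature are hypothetical.

```python
SIG_DELIMITER = "-- "  # two hyphens followed by one SP, and nothing else


def split_signature(body_text):
    """Return (text, signature), splitting at the LAST delimiter line."""
    lines = body_text.split("\r\n")
    # Search backwards so that the last occurrence of the delimiter wins.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == SIG_DELIMITER:
            text = "\r\n".join(lines[:i] + [""])
            signature = "\r\n".join(lines[i + 1:])
            return text, signature
    return body_text, ""  # no delimiter, hence no signature
```

   A followup agent could quote only the first element of the returned
   pair, and a posting agent could count the lines of the second in
   order to warn about an over-long signature.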

4.4.  Characters and Character Sets

   Transmission paths for news articles MUST treat news articles as
   uninterpreted sequences of octets, excluding the values 0 (US-ASCII
   NUL) and 13 and 10 (US-ASCII CR and LF, which MUST ONLY appear in the
   combination CRLF which denotes a line separator).

        NOTE: This corresponds to the range of octets permitted for MIME
        "8bit data" [RFC 2045].  Thus raw binary data cannot be
        transmitted in an article body except by the use of a Content-
        Transfer-Encoding such as base64.
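
   The octet restriction can be checked with a short sketch such as the
   following. The name is_transmissible is hypothetical, and the check
   covers only the three excluded octet patterns named above.

```python
import re

# NUL, a CR not followed by LF, or an LF not preceded by CR.
_FORBIDDEN = re.compile(rb"\x00|\r(?!\n)|(?<!\r)\n")


def is_transmissible(article: bytes) -> bool:
    """True if CR and LF occur only as CRLF and no NUL octet is present."""
    return _FORBIDDEN.search(article) is None
```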

   Character data is represented by octets in accordance with some
   encoding scheme (UTF-8 for headers, and determined by the Content-
   Type- and Content-Transfer-Encoding-headers for bodies).

   If it comes to a relaying agent's attention that it is being asked to
   pass an article using the Content-Transfer-Encoding "8bit" to a
   relaying agent that does not support it, it SHOULD report this error
   to its administrator. It MUST refuse to pass the article and MUST NOT
   re-encode it with different MIME encodings.

        NOTE: This strategy will do little harm. The target relaying
        agent is unlikely to be able to make use of the article on its
        own servers, and the usual flooding algorithm will likely find
        some alternative route to get the article to destinations where
        it is needed.

4.4.1.  Character Sets within Article Headers

   Within article headers, characters are represented as octets
   according to the UTF-8 encoding scheme [RFC 2279] or [ISO/IEC 10646],
   and hence all the characters in Unicode [UNICODE 3.1] or in the
   Universal Multiple-Octet Coded Character Set (UCS) [ISO/IEC 10646]
   (which is essentially a superset of Unicode and expected to remain
   so) are potentially available. However, processing all octets in the
   same manner as US-ASCII characters should ensure correct behaviour in
   most situations.

        NOTE: UTF-8 is an encoding for 16bit (and even 32bit) character
        sets with the property that any octet less than 128 immediately
        represents the corresponding US-ASCII character, thus ensuring
        upwards compatibility with previous practice.  Non-ASCII
        characters from Unicode are represented by sequences of octets
        satisfying the syntax of a UTF8-xtra-char (2.4.2), which
        excludes certain octet sequences not explicitly permitted by
        [RFC 2279].  Unicode includes all characters from the ISO-8859
        series of character sets [ISO 8859] (which includes all
        Cyrillic, Greek and Arabic characters) together with the more
        elaborate characters used in Asian countries. See the following
        section for the appropriate treatment of Unicode characters by
        reading agents.

   Notwithstanding the great flexibility permitted by UTF-8, there is
   need for restraint in its use in order that the essential components
   of headers may be discerned using reading agents that cannot present
   the full Unicode range. In particular, header-names and tokens MUST
   be in US-ASCII, and certain other components of headers, as defined
   elsewhere in this standard - notably msg-ids, date-times, dot-atoms,
   domains and path-identities - MUST be in US-ASCII.  Comments, phrases
   (as in addresses) and unstructured headers (such as the Subject-,
   Organization- and Summary-headers) MAY use the full range of UTF-8
   characters, but SHOULD nevertheless be invariant under Unicode
   normalization NFC [UNICODE 3.1].

        NOTE: Unicode allows for composite characters made up of a
        starter character - which can be a letter, number, punctuation
        mark, or symbol - plus zero or more combining marks (such as
        accents, diacritics, and similar). The requirement that a
        composite be invariant under normalization NFC means that, where
        it could be written in more than one way, only one particular
        one is allowed (for example, the single character E-acute is
        preferred over E followed by a non-spacing acute accent, and A-
        ring is preferred over the Angstrom symbol). At least for the
        main European languages, for which all the needed composites are
        already available as single characters, it is unlikely that
        posting agents will need to take any special steps to ensure
        normalization.
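
   The two cases mentioned in the NOTE can be checked with Python's
   standard unicodedata module. The helper name is_nfc is hypothetical;
   this is an illustration of the invariance test, not a required
   implementation.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """True if `text` is invariant under Unicode normalization NFC."""
    return unicodedata.normalize("NFC", text) == text


# "E" followed by COMBINING ACUTE ACCENT composes to the single E-acute.
composed = unicodedata.normalize("NFC", "E\u0301")
# The ANGSTROM SIGN (U+212B) normalizes to A-ring (U+00C5).
a_ring = unicodedata.normalize("NFC", "\u212b")
```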

   In the particular case of newsgroup-names (see 5.5) there are more
   stringent requirements regarding the use of UTF-8 and Unicode.

   Where the use of non-ASCII characters, encoded in UTF-8, is permitted
   as above, they MAY also be encoded using the MIME mechanism defined
   in [RFC 2047], but this usage is deprecated within news articles
   (even though it is required in email messages) since it is less
   legible in older reading agents which support neither it nor UTF-8.
   Nevertheless, reading agents SHOULD support this usage, but only in
   those contexts explicitly mentioned in [RFC 2047].
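
   A reading agent written in Python might decode such encoded-words
   with the standard email.header module, as in the sketch below. The
   function name decode_unstructured is hypothetical, and no attempt is
   made here to restrict decoding to the contexts that [RFC 2047]
   permits.

```python
from email.header import decode_header, make_header


def decode_unstructured(content: str) -> str:
    """Decode any RFC 2047 encoded-words found in a header content."""
    return str(make_header(decode_header(content)))
```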

   Similar considerations apply to non-ASCII characters within the
   values of parameters (which, according to the syntax, MUST be in the
   form of quoted-strings in order for UTF8-xtra-chars to be
   accommodated). Such values MAY be encoded using the MIME mechanism
   defined in [RFC 2231], but this usage is deprecated within news
   articles (even though it is required in email messages) since it is
   less legible in older reading agents which support neither it nor
   UTF-8. Nevertheless, reading agents SHOULD support this usage.

4.4.2.  Character Sets within Article Bodies

   Within article bodies, characters are represented as octets according
   to the encoding scheme implied by any Content-Transfer-Encoding- and
   Content-Type-headers [RFC 2045].  In the absence of such headers,
   reading agents cannot be relied upon to display correctly more than
   the US-ASCII characters, though they MUST display at least those.

        NOTE: Observe that reading agents are not forbidden to "guess",
        or to interpret as UTF-8 regardless, which would be the simplest
        course for them to take.

        NOTE: It is not expected that reading agents will necessarily be
        able to present characters in all possible character sets. For
        example, a reading agent might be able to present only the ISO-
        8859-1 (Latin 1) characters [ISO 8859], in which case it Ought
        to present undisplayable characters using some distinctive
        glyph, or by exhibiting a suitable warning.

   Followup agents MUST be careful to apply appropriate encodings to the
   outbound followup. A followup to an article containing non-ASCII
   material is very likely to contain non-ASCII material itself.

4.5.  Size Limits

   Posting agents SHOULD endeavour to keep all header lines, so far as
   is possible, within 79 characters by folding them at suitable places
   (see 4.2.3).  However, posting agents MUST permit the poster to
   include longer headers if he so insists, and compliant software MUST
   support headers of at least 998 octets. Likewise, injecting agents
   SHOULD fold any headers generated automatically by themselves.
   Relaying agents MUST NOT fold headers (i.e. they must pass on the
   folding as received).

        NOTE: There is NO restriction on the number of lines into which
        a header may be split, and hence there is NO restriction on the
        total length of a header (in particular it may, by suitable
        folding, be made to exceed the 998 octets restriction pertaining
        to a single header line).

   The syntax provides for the lines of a body to be up to 998 octets in
   length, not including the CRLF. All software compliant with this
   standard MUST support lines of at least that length, both in headers
   and in bodies, and all such software SHOULD support lines of
   arbitrary length. In particular, relaying agents MUST transmit lines
   of arbitrary length without truncation or any other modification.

        NOTE: The limit of 998 octets is consistent with the
        corresponding limit in [RFC 2822].

   In plain-text messages (those with no MIME headers, or those with a
   MIME Content-Type of text/plain) posting agents Ought to endeavour to
   keep the length of body lines within some reasonable limit. The size
   of this limit is a matter of policy, the default being to keep within
   79 characters at most, and preferably within 72 characters (to allow
   room for quoting in followups).  Exceptionally, posting agents Ought
   Not to adjust the length of quoted lines in followups unless they are
   able to reformat them in a consistent manner.  Moreover, posting
   agents MUST permit the poster to include longer lines if he so
   insists.

        NOTE: Plain-text messages are intended to be displayed "as-is"
        without any special action (such as automatic line splitting) on
        the part of the recipient. The policy limit (e.g. 72 or 79)
        should be expressed as a number of characters (as they will be
        displayed by a reading agent) rather than as the number of
        octets used to encode them.
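
   The distinction between the two kinds of limit can be sketched as
   follows. The function names are hypothetical, the body is assumed to
   be encoded in UTF-8, and len() on a Python string counts code
   points, which only approximates the number of displayed characters
   when combining marks are present.

```python
SYNTAX_LIMIT_OCTETS = 998  # per line, not including the CRLF


def exceeds_policy(line: str, limit: int = 79) -> bool:
    """Policy limit: measured in characters (approximated by code points)."""
    return len(line) > limit


def exceeds_syntax(line: str) -> bool:
    """Syntax limit: measured in octets of the encoded line."""
    return len(line.encode("utf-8")) > SYNTAX_LIMIT_OCTETS
```

   A posting agent might warn the poster when exceeds_policy is true,
   while still allowing the line to be sent if the poster insists.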

        NOTE: This standard provides no upper bound on the overall size
        of a single article, but neither does it forbid relaying agents
        from dropping articles of excessive length. It is, however,
        suggested that any limits thought appropriate by particular
        agents would be more appropriately expressed in megabytes than
        in kilobytes.

4.6.  Example

   Here is a sample article:

      Path: server.example/unknown.site2.example@site2.example/
        relay.site.example/site.example/injector.site.example%jsmith
      Newsgroups: example.announce,example.chat
      Message-ID: <9urrt98y53@site1.example>
      From: Ann Example <a.example@site1.example>
      Subject: Announcing a new sample article.
      Date: Wed, 27 Mar 2002 12:12:50 +0300
      Approved: example.announce moderator <jsmith@site.example>
      Followup-To: example.chat
      Reply-To: Ann Example <a.example+replies@site1.example>
      Expires: Mon, 22 Apr 2002 12:12:50 +0300
      Organization: Site1, The Number one site for examples.
      User-Agent: ExampleNews/3.14 (Unix)
      Keywords: example, announcement, standards, RFC 1036, Usefor
      Summary: The URL for the next standard.
      Injector-Info: injector.site.example; posting-host=du003.site.example
      Complaints-To: abuse@site.example


      Just a quick announcement that a new standard example article has
      been released; it is in the new USEFOR standard obtainable from
      ftp.ietf.org.
      Ann.

      --
      Ann Example <a.example@site1.example>   Sample Poster to the Stars
      "The opinions in this article are bloody good ones" - J. Clarke.
[The RFC Editor is invited to change the above Date and Expires headers
to match the actual publication dates and to insert its correct URL.]

5.  Mandatory Headers

   An article MUST have one, and only one, of each of the following
   headers: Date, From, Message-ID, Subject, Newsgroups, Path.

   Note also that there are situations, discussed in the relevant parts
   of section 6, where References-, Sender-, or Approved-headers are
   mandatory. In control messages, specific values are required for
   certain headers.

   A proto-article (see 8.2.1) may lack some of these mandatory headers,
   but they MUST then be supplied by the injecting agent.

5.1.  Date

   The Date-header contains the date and time that the article was
   prepared by the poster ready for transmission and SHOULD express the
   poster's local time. The content syntax makes use of syntax defined
   in [RFC 2822], subject to the following revised definition of zone.

      header              =/ Date-header
      Date-header         = "Date" ":" SP Date-content
                               *( ";" other-parameter )
      Date-content        = date-time
      zone                = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"

   The forms "UT" and "GMT" (indicating universal time) are to be
   regarded as obsolete synonyms for "+0000". They MUST be accepted,
   and passed on unchanged, by all agents, but they MUST NOT be
   generated as part of new articles by posting and injecting agents.
   The date-time MUST be semantically valid as required by [RFC 2822].
   Although folding white space is permitted throughout the date-time
   syntax, it is RECOMMENDED that a single space be used in each place
   that FWS appears (whether it is required or optional).

        NOTE: A convention that is sometimes followed is to add a
        comment, after the date-time, containing the time zone in
        human-readable form, but many of the abbreviations commonly used
        for this purpose are ambiguous. The value given by the <zone> is
        the only definitive form.

   In order to prevent the reinjection of expired articles into the news
   stream, relaying and serving agents MUST refuse "stale" articles
   whose Date-header predates the earliest articles of which they
   normally keep record, and likewise articles whose Date-header is more
   than 24 hours into the future (though they MAY use a margin of less
   than 24 hours). Relaying
   agents MUST NOT modify the Date-header in transit.
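
   The acceptance test on the Date-header might be sketched as below,
   using Python's standard email.utils parser. The function name
   is_acceptable and its parameters are hypothetical; oldest_kept
   stands for the date of the earliest articles of which the agent
   keeps record, and both it and now are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_acceptable(date_content, now, oldest_kept,
                  future_margin=timedelta(hours=24)):
    """True if the date is neither stale nor too far into the future."""
    when = parsedate_to_datetime(date_content)
    if when.tzinfo is None:
        # The parser yields a naive value for a "-0000" zone; treat as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return oldest_kept <= when <= now + future_margin
```

   The obsolete zones "UT" and "GMT" are accepted by this parser, and a
   trailing comment such as "(EST)" is ignored, so that only the zone
   itself determines the result.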

5.1.1.  Examples

      Date: Sat, 26 May 2001 11:13:00 -0500 (EST)
      Date: 26 May 2001 16:13 +0000
      Date: 26 May 2001 16:13 GMT (Obsolete)

5.2.  From

   The From-header contains the electronic address(es), and possibly the
   full name, of the article's poster(s). The content syntax makes use
   of syntax defined in [RFC 2822], subject to the following revised
   definition of local-part.

      header              =/ From-header
      From-header         = "From" ":" SP From-content
      From-content        = mailbox-list
      addr-spec           = local-part "@" domain
      local-part          = dot-atom / strict-quoted-string

        NOTE: This syntax ensures that the local-part of an addr-spec is
        restricted to pure US-ASCII (and is thus in strict compliance
        with [RFC 2822]), whilst allowing any UTF-8 character to be used
        in a preceding quoted-string containing the poster's full name.
        If some future extension to the Email protocols should relax
        this restriction, one would expect the Netnews protocols to
        follow.

   Each mailbox in the From-content SHOULD be a valid address, belonging
   to the poster(s) of the article, or person or agent on whose behalf
   the post is being sent (see the Sender-header, 6.2).  When, for
   whatever reason, the poster does not wish to include such an address,
   the From-content SHOULD then be an address which ends in the top
   level domain of ".invalid" [RFC 2606].

        NOTE: Since such addresses ending in ".invalid" are
        undeliverable, user agents Ought to warn any user attempting to
        reply to them and Ought Not, in any case, to attempt to deliver
        to them (since that would be pointless anyway).  Whether or not
        a valid address can subsequently be extracted from such an
        address falls outside the scope of this standard (though it
        would be pointless to use a disguise so easily penetrable).

        Be warned, however, that some injecting agents which are unable
        to detect that the address belongs to the poster may choose to
        insert a Sender-header (6.2) or some entry in an Injector-Info-
        header (6.19) which discloses some valid address for the poster.
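
   A user agent wishing to warn before replying to an undeliverable
   address might detect the ".invalid" top level domain as sketched
   below. The function name undeliverable_mailboxes is hypothetical;
   parsing is delegated to Python's standard email.utils module, which
   follows the email syntax rather than the exact syntax defined here.

```python
from email.utils import getaddresses


def undeliverable_mailboxes(from_content):
    """Return the addresses in a From-content that end in ".invalid"."""
    result = []
    for _name, addr in getaddresses([from_content]):
        domain = addr.rpartition("@")[2]
        # Domain names are case-insensitive, so compare in lower case.
        if domain.lower().endswith(".invalid"):
            result.append(addr)
    return result
```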

5.2.1.  Examples

      From: John Smith <jsmith@site.example>
      From: "John Smith" <jsmith@site.example>, dave@isp.example
      From: "John D. Smith" <jsmith@site.example>, andrew@isp.example,
         fred@site2.example
      From: Jan Jones <jan@please_setup_your_system_correctly.invalid>
      From: Jan Jones <joe@guess-where.invalid>
      From: dave@isp.example (Dave Smith)

        NOTE: The last example shows a now deprecated convention of
        putting a poster's full name in a comment following the mailbox,
        rather than in a phrase at the start of it. Observe also the use
        of the quoted-string "John D. Smith" which is required on
        account of the presence of the '.' character, and which would
        also have been required had any UTF8-xtra-char been present.

5.3.  Message-ID

   The Message-ID-header contains the article's message identifier, a
   unique identifier distinguishing the article from every other
   article. The content syntax makes use of syntax defined in [RFC
   2822], subject to the following revised definitions of no-fold-quote
   and no-fold-literal.

      header             =/ Message-ID-header
      Message-ID-header  = "Message-ID" ":" SP Message-ID-content
                              *( ";" other-parameter )
      Message-ID-content = msg-id
      id-left            = dot-atom-text / no-fold-quote
      id-right           = dot-atom-text / no-fold-literal
      no-fold-quote      = DQUOTE
                              *( strict-qtext / "\\" / "\" DQUOTE )
                              qspecial
                              *( strict-qtext / "\\" / "\" DQUOTE )
                           DQUOTE
      qspecial           = "(" / ")" /        ; same as specials except
                           "<" / ">" /        ; "\" and DQUOTE quoted
                           "[" / "]" /
                           ":" / ";" /
                           "@" / "\\" /
                           "," / "." /
                           "\" DQUOTE
      no-fold-literal    = "[" *( dtext / "\[" / "\]" / "\\" ) "]"

   The msg-id MUST NOT be more than 250 octets in length.

        NOTE: The restriction to strict-qtext ensures that no UTF8-
        xtra-char can appear. Msg-ids as defined here are a "normalized"
        subset of those defined by [RFC 2822], ensuring that no string
        of characters is quoted unless strictly necessary (it must
        contain at least one qspecial) and no single character is
        prefixed by a "\" in the form of a quoted-pair unless strictly
        necessary, and moreover there is no possibility for WSP to
        occur, whether quoted or not. The length restriction ensures
        that systems which accept message identifiers as a parameter
        when retrieving an article (e.g. [NNTP]) can rely on a bounded
        length. Observe that msg-id includes the '<' and '>'.

   An agent generating an article's message identifier MUST ensure that
   it is unique (as also required in [RFC 2822]) and that it is NEVER
   reused (either in Netnews or Email). Moreover, even though commonly
   derived from the domain name of the originating site (and domain
   names are case-insensitive), a message identifier MUST NOT be altered
   in any way during transport, or when copied (as into a References-
   header), and thus a simple (case-sensitive) comparison of octets will
   always suffice to recognize that same message identifier wherever it
   subsequently reappears.

        NOTE: These requirements are to be contrasted with those of the
        un-normalized msg-ids defined by [RFC 2822], which may perfectly
        legitimately become normalized (or vice versa) during transport
        or copying in email systems.

        NOTE: Some old software may treat message identifiers that
        differ only in case within their id-right part as equivalent,
        and implementors of agents that generate message identifiers
        should be aware of this.
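
   One possible way to generate and compare message identifiers is
   sketched below. The names new_msg_id and same_msg_id are
   hypothetical, and the use of a random UUID for the id-left is merely
   one assumption about how uniqueness might be achieved; this standard
   does not prescribe a generation method.

```python
import uuid

MAX_MSG_ID_OCTETS = 250


def new_msg_id(domain):
    """Generate a msg-id (including "<" and ">") within the length limit."""
    # 32 hexadecimal digits form a valid dot-atom-text for the id-left.
    msg_id = f"<{uuid.uuid4().hex}@{domain}>"
    if len(msg_id.encode("ascii")) > MAX_MSG_ID_OCTETS:
        raise ValueError("msg-id exceeds 250 octets")
    return msg_id


def same_msg_id(a: bytes, b: bytes) -> bool:
    """Simple case-sensitive comparison of octets, with no normalization."""
    return a == b
```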

5.4.  Subject

   The Subject-header contains a short string identifying the topic of
   the message. This is an inheritable header (4.2.5.2) to be copied
   into the Subject-header of any followup, in which case the new
   Subject-content SHOULD then default to the string "Re: " (a "back
   reference") followed by the contents of the pure-subject of the
   precursor. Any leading "Re: " in the pure-subject MUST be stripped.

      header              =/ Subject-header
      Subject-header      = "Subject" ":" SP Subject-content
      Subject-content     = [ [FWS] back-reference ] pure-subject
      pure-subject        = unstructured
      back-reference      = %x52.65.3A.20
                                    ; which is a case-sensitive "Re: "

   The pure-subject MUST NOT begin with "Re: ".

        NOTE: The given syntax differs from that prescribed in [RFC
        2822] insofar as it does not permit a header content to be
        completely empty, or to consist of WSP only (see remarks in
        4.2.6 concerning undesirable headers).

   Followup agents MAY remove strings that are known to be used
   erroneously as back-reference (such as "Re(2): ", "Re:", "RE: ", or
   "Sv: ") from the Subject-content when composing the subject of a
   followup and add a correct back-reference in front of the result.

        NOTE: That would be "SHOULD remove instances" except that we
        cannot find a sufficiently robust and simple algorithm to do the
        necessary natural language processing.

   Followup agents MUST NOT use any other string except "Re: " as a back
   reference. Specifically, a translation of "Re: " into a local
   language or usage MUST NOT be used.

        NOTE: "Re" is an abbreviation for the Latin "In re", meaning "in
        the matter of", and not an abbreviation of "Reference" as is
        sometimes erroneously supposed.

   Agents SHOULD NOT depend on nor enforce the use of back references by
   followup agents. For compatibility with legacy news software the
   Subject-content of a control message (i.e. an article that also
   contains a Control-header) MAY start with the string "cmsg ", and
   non-control messages MUST NOT start with the string "cmsg ". See also
   section 6.13.
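
   The rules for composing the subject of a followup might be sketched
   as follows. The function name followup_subject is hypothetical, and
   the pattern covers only the four erroneous back references cited
   above; as the earlier NOTE observes, no simple algorithm is robust
   for every language, so a real agent might choose to strip fewer (or
   none) of them.

```python
import re

# Strings known to be used erroneously as back references.
_BAD_BACK_REFERENCE = re.compile(r"^(?:Re\(\d+\): |RE: |Sv: |Re:(?! ))")


def followup_subject(precursor_subject):
    """Return the default Subject-content for a followup."""
    subject = precursor_subject.lstrip(" \t")
    while True:
        if subject.startswith("Re: "):
            subject = subject[4:]  # strip the correct back reference
            continue
        stripped = _BAD_BACK_REFERENCE.sub("", subject, count=1)
        if stripped == subject:
            break  # nothing more to remove
        subject = stripped
    # Exactly one case-sensitive "Re: " precedes the pure-subject.
    return "Re: " + subject
```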

5.4.1.  Examples

   In the following examples, please note that only "Re: " is mandated
   by this standard. "was: " is a convention used by many English-
   speaking posters to signal a change in subject matter.  Software
   should be able to deduce this information from the References-header.

      Subject: Film at 11
      Subject: Re: Film at 11
      Subject: Godwin's law considered harmful (was: Film at 11)
      Subject: Godwin's law (was: Film at 11)
      Subject: Re: Godwin's law (was: Film at 11)

5.5.  Newsgroups

   The Newsgroups-header's content specifies the newsgroup(s) in which
   the article is intended to appear. It is an inheritable header
   (4.2.5.2) which then becomes the default Newsgroups-header of any
   followup, unless a Followup-To-header is present to prescribe
   otherwise.  Articles MUST NOT be passed between relaying agents or to
   serving agents unless the sending agent has been configured to supply
   and the receiving agent to receive at least one of the newsgroup-
   names in the Newsgroups-header.

   References to "Unicode" or "the latest version of the Unicode
   Standard" mean [UNICODE 3.1] or any standard that supersedes it. That
   document contains guarantees of strict future upwards compatibility
   (e.g. no character will be removed or change classification).
   Implementors should be aware that currently unassigned code points
   (Unicode category Cn) may become valid characters in future versions
   of Unicode. Since the poster of an article might have access to a
   newer version of that standard, relaying and serving agents MUST
   accept such characters, but posting agents (and indeed all agents)
   MUST NOT generate them (though they might well follow up to
   newsgroup-names containing them).

      header              =/ Newsgroups-header
      Newsgroups-header   = "Newsgroups"  ":" SP Newsgroups-content
                                    *( ";" other-parameter )
      Newsgroups-content  = [FWS] newsgroup-name
                               *( [FWS] ng-delim [FWS] newsgroup-name )
                               [FWS]
      newsgroup-name      = component *( "." component )
      component           = 1*component-glyph
      ng-delim            = ","
      component-glyph     = combiner-base *combiner-mark
      combiner-base       = combiner-ASCII / combiner-extended
      combiner-ASCII      = DIGIT / ALPHA / "+" / "-" / "_"
      combiner-extended   = <any character with a Unicode code value of
                             0080 or greater and a combining class of 0,
                             but excluding any character in Unicode
                             categories Cc, Cf, Cs, Zs, Zl, and Zp>
      combiner-mark       = <any character with a Unicode code value of
                             0080 or greater and a combining class other
                             than 0>

        NOTE: the excluded characters are control characters (Cc),
        format control characters (Cf), surrogates (Cs), and separators
        (Zs, Zl, Zp). In particular, this excludes all whitespace
        characters.  To all intents and purposes, a component-glyph is
        what a user might regard as a single "character" as displayed on
        his screen, though it might be transmitted as several actual
        characters (e.g. q-circumflex is two characters). Note also
        that, in some writing schemes, several component-glyphs will
        merge into one visible object of variable size.
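
        NOTE: The grammar above can be checked mechanically. The
        following Python sketch is a non-normative illustration; the
        function names are hypothetical, and the standard unicodedata
        module reflects whichever Unicode version ships with the
        interpreter, which is assumed here to be an acceptable
        successor to [UNICODE 3.1].

```python
import unicodedata

_ASCII_BASE = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-_"
)
_EXCLUDED = {"Cc", "Cf", "Cs", "Zs", "Zl", "Zp"}


def is_combiner_base(ch: str) -> bool:
    """combiner-ASCII, or combiner-extended (>= U+0080, combining class 0)."""
    if ord(ch) < 0x80:
        return ch in _ASCII_BASE
    return (unicodedata.combining(ch) == 0
            and unicodedata.category(ch) not in _EXCLUDED)


def is_combiner_mark(ch: str) -> bool:
    """combiner-mark: >= U+0080 with a non-zero combining class."""
    return ord(ch) >= 0x80 and unicodedata.combining(ch) != 0


def split_glyphs(component: str):
    """Return the component-glyphs of a component, or None if it is invalid."""
    glyphs = []
    for ch in component:
        if is_combiner_base(ch):
            glyphs.append(ch)          # starts a new component-glyph
        elif is_combiner_mark(ch) and glyphs:
            glyphs[-1] += ch           # attaches to the preceding base
        else:
            return None
    return glyphs or None              # an empty component is invalid


def is_newsgroup_name(name: str) -> bool:
    """Check newsgroup-name = component *( "." component )."""
    return all(split_glyphs(c) is not None for c in name.split("."))
```

        This checks syntax only; the NFKC requirement and the policy
        restrictions described later in this section are separate
        tests.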

   Each component MUST be invariant under Unicode normalization NFKC
   (cf. the weaker normalization requirement for other headers in
   section 4.4.1 which specifies no more than normalization NFC, and see
   also the explanatory NOTE in that section).

        NOTE: As a result of this restriction, a name has only one
        valid form. Implementations can assume that a straight
        comparison of characters or octets is sufficient to compare two
        newsgroup-names.

        The requirement that names be invariant under NFKC, rather than
        NFC, means that all characters with a "compatibility
        decomposition" are forbidden (Unicode provides the property
        "NFKC_NO" to make this test easier).  The effect is to exclude
        variant forms of characters, such as superscripts and
        subscripts, wide and narrow forms, font variants, encircled
        forms, ligatures, and so on, as their use could cause confusion.

        There is insufficient experience in this area to determine
        whether this is the right long-term solution. Implementors
        should therefore be aware that a future version of this standard
        might reduce the requirement in the direction of NFC as opposed
        to NFKC.

        NOTE: An implementation is not required to apply NFKC, or any
        other normalization, to newsgroup names. Only agencies that
        create new groups need to be careful to obey this restriction
        (7.2.1).  However, if a posting agent neglects to normalize a
        newsgroup-name entered manually, this may lead to the user
        posting to a non-existent group without understanding why.
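
        NOTE: As a non-normative illustration, invariance under NFKC
        can be tested by normalizing a component and comparing the
        result with the original. The function name below is
        hypothetical, and the result depends on the Unicode version
        bundled with the Python interpreter.

```python
import unicodedata


def is_nfkc_invariant(component: str) -> bool:
    """True if applying NFKC leaves the component unchanged."""
    return unicodedata.normalize("NFKC", component) == component
```

        For instance, a component containing SUPERSCRIPT TWO (U+00B2)
        or the "fi" ligature (U+FB01) fails this test, as does "e"
        followed by COMBINING ACUTE ACCENT (U+0301), whose composed
        form is U+00E9.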

   Newsgroup-names containing non-ASCII characters MUST be encoded in
   UTF-8 and not according to [RFC 2047].




   Components beginning with underline ("_") are reserved for use by
   future versions of this standard and MUST NOT occur in newsgroup
   names (whether in Newsgroups-headers or in newgroup control messages
   (7.2.1)).  However, such names MUST be accepted.

   Components beginning with "+" or "-" are reserved for use by
   implementations and MUST NOT occur in newsgroup names (whether in
   Newsgroups-headers or in newgroup control messages). Implementors may
   assume that this rule will not change in any future version of this
   standard.

        NOTE: For example, implementors may safely use leading "+" and
        "-" to "escape" other entities within something that looks like
        a newsgroup-name.

   Agencies responsible for the administration of particular hierarchies
   Ought to place additional restrictions on the characters they allow
   in newsgroup-names within those hierarchies (such as to accord with
   the languages commonly used within those hierarchies, or to avoid
   perceived ambiguities pertinent to those languages). Where there is
   no such specific policy, the following restrictions SHOULD be applied
   to newsgroup names.

        NOTE: These restrictions are intended to reflect existing
        practice, with some additions to accommodate foreseeable
        enhancements, and are intended both to avoid certain technical
        difficulties and to avoid unnecessary confusion. It may well be
        that experience will allow future extensions to this standard to
        relax some or all of these restrictions.

   The specific restrictions (to be applied in the absence of
   established policies to the contrary) are:

   1. The following characters are forbidden, subject to the comments
      and notes at the end of the list:

      characters in category Cn (Other, Not assigned)         [1]
      characters in category Co (Other, Private Use)          [2]
      characters in category Lt (Letter, Titlecase)           [3]
      characters in category Lu (Letter, Uppercase)           [3]
      characters in category Me (Mark, Enclosing)             [4]
      characters in category Pd (Punctuation, Dash)           [4][5]
      characters in category Pe (Punctuation, Close)          [4]
      characters in category Pf (Punctuation, Final quote)    [4]
      characters in category Pi (Punctuation, Initial quote)  [4]
      characters in category Po (Punctuation, Other)          [4]
      characters in category Ps (Punctuation, Open)           [4]
      characters in category Sc (Symbol, Currency)            [4]
      characters in category Sk (Symbol, Modifier)            [4]
      characters in category Sm (Symbol, Math)                [4][5]
      characters in category So (Symbol, Other)               [4]




      [1] As new characters are added to Unicode, their code points move
          from category Cn to some other category. As stated above,
          implementors should be prepared for this.

      [2] Specific private use characters can be used within a hierarchy
          or co-operating subnet that has agreed meanings for them.

      [3] Traditionally, newsgroup-names have been written in lowercase.
          Posting agents Ought Not to convert uppercase or titlecase
          characters to the corresponding lowercase forms except under
          the explicit instructions of the poster.

      [4] Traditionally newsgroup names have only used letters, digits,
          and the three special characters "+", "-" and "_". These
          categories correspond to characters outside that set.

      [5] Although the characters "-" and "+" are within categories Pd
          and Sm respectively, they are not forbidden.

   2. A component name is forbidden to consist entirely of digits.

        NOTE: This requirement was in [RFC 1036] but nevertheless
        several such groups have appeared in practice and implementors
        should be prepared for them. A common implementation technique
        uses each component as the name of a directory and uses numeric
        filenames for each article within a group. Such an
        implementation needs to be careful when this could cause a clash
        (e.g. between article 123 of group xxx.yyy and the directory for
        group xxx.yyy.123).

   3. A component is limited to 30 component-glyphs and a newsgroup-name
      to 71 component-glyphs. Whilst there is no longer any technical
      reason to limit the length of a component (formerly, it was
      limited to 14 octets) nor of a newsgroup-name, it should be noted
      that these names are also used in the newsgroups line (7.2.1.2)
      where an overall policy limit applies and, moreover, excessively
      long names can be exceedingly inconvenient in practical use.
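
        NOTE: The three default restrictions above might be checked as
        in the following non-normative Python sketch. The names
        FORBIDDEN_CATEGORIES, glyph_count and policy_violations are
        hypothetical. The sketch assumes the name has already passed
        the syntax check, and it assumes that the "." separators do not
        count towards the limit of 71 component-glyphs, a point the
        text does not settle.

```python
import unicodedata

FORBIDDEN_CATEGORIES = {
    "Cn", "Co", "Lt", "Lu", "Me", "Pd", "Pe", "Pf",
    "Pi", "Po", "Ps", "Sc", "Sk", "Sm", "So",
}


def glyph_count(component: str) -> int:
    """Count component-glyphs: one per character of combining class 0."""
    return sum(1 for ch in component if unicodedata.combining(ch) == 0)


def policy_violations(name: str) -> list:
    """List the default policy restrictions that a newsgroup-name breaks."""
    problems = []
    components = name.split(".")
    for comp in components:
        for ch in comp:
            cat = unicodedata.category(ch)
            # "+" (Sm) and "-" (Pd) are explicitly exempted by note [5].
            if cat in FORBIDDEN_CATEGORIES and ch not in "+-":
                problems.append(f"{ch!r} is in forbidden category {cat}")
        if comp and all(ch in "0123456789" for ch in comp):
            problems.append(f"component {comp!r} is all digits")
        if glyph_count(comp) > 30:
            problems.append(f"component {comp!r} exceeds 30 glyphs")
    if sum(glyph_count(c) for c in components) > 71:
        problems.append("name exceeds 71 glyphs")
    return problems
```

        An empty list means that no default restriction was violated; a
        hierarchy with its own policy would substitute its own tests.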

   Serving and relaying agents MUST accept any newsgroup-name that meets
   the above requirements, even if it violates one or more of the
   policy restrictions. Posting and injecting agents MAY reject articles
   containing newsgroup-names that do not meet these restrictions, and
   posting agents MAY attempt to correct them (but only with the
   explicit agreement of the poster for anything more than NFC or NFKC
   normalization). However, because of the large and changing tables
   required to do these checks and corrections throughout the whole of
   Unicode, this standard does not require them to do so. Rather, the
   onus is placed on those who create new newsgroups (7.2.1) to check
   the mandatory requirements, to consider the effects of relaxing the
   other restrictions, and to consider how all this may affect
   propagation of the group.




   Since future extensions to this standard and the Unicode standard,
   including a possible relaxation of the NFKC normalization, plus any
   relaxations of the default restrictions introduced by specific
   hierarchies might invalidate some such checks, warnings, and
   adjustments, implementations MUST incorporate means to disable them.

      NOTE: The newsgroup-name as encoded in UTF-8 should be regarded as
      the canonical form. Reading agents may convert it to whatever
      character set they are able to display and serving agents may
      possibly need to convert it to some form more suitable as a
      filename. Simple algorithms for both kinds of conversion are
      readily available.  Observe that the syntax does not allow
      comments within the Newsgroups-header; this is to simplify
      processing by relaying and serving agents which have a requirement
      to process this header extremely rapidly.

   The inclusion of folding white space within a Newsgroups-content is a
   newly introduced feature in this standard. It MUST be accepted by all
   conforming implementations (relaying agents, serving agents and
   reading agents).  Posting agents should be aware that such postings
   may be rejected by overly-critical old-style relaying agents. When a
   sufficient number of relaying agents are in conformance, posting
   agents SHOULD generate such whitespace in the form of <CRLF WSP> so
   as to keep the length of lines in the relevant headers (notably
   Newsgroups and Followup-To) to no more than 79 characters (or
   other agreed policy limit - see 4.5).  Before such critical mass
   occurs, injecting agents MAY reformat such headers by removing
   whitespace inserted by the posting agent, but relaying agents MUST
   NOT do so.
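
        NOTE: As a non-normative illustration, a posting agent that
        chooses to fold might build the header as in the following
        Python sketch. The function name fold_newsgroups and its limit
        parameter are hypothetical; folds are placed only after an ng-
        delim, and a single name longer than the limit is left
        unfolded.

```python
def fold_newsgroups(names, limit=79):
    """Build a Newsgroups-header, folding with CRLF SP after a comma."""
    lines = []
    current = "Newsgroups: "
    for i, name in enumerate(names):
        piece = name + ("," if i < len(names) - 1 else "")
        # Never fold before the first name on a line: that would leave a
        # line holding only the header name or only whitespace.
        line_has_name = current.strip() not in ("Newsgroups:", "")
        if line_has_name and len(current) + len(piece) > limit:
            lines.append(current)
            current = " "
        current += piece
    lines.append(current)
    return "\r\n".join(lines)
```

        Removing each "CRLF SP" sequence from the result recovers the
        unfolded header, which is what an injecting agent might do
        before critical mass is reached.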

   Posters SHOULD use only the names of existing newsgroups in the
   Newsgroups-header. However, it is legitimate to cross-post to
   newsgroup(s) which do not exist on the posting agent's host, provided
   that at least one of the newsgroups DOES exist there, and followup
   agents SHOULD accept this (posting agents MAY accept it, but Ought at
   least to alert the poster to the situation and request confirmation).
   Relaying agents MUST NOT rewrite Newsgroups-headers in any way, even
   if some or all of the newsgroups do not exist on the relaying agent's
   host. Serving agents MUST NOT create new newsgroups simply because an
   unrecognized newsgroup-name occurs in a Newsgroups-header (see 7.2.1
   for the correct method of newsgroup creation).

   The Newsgroups-header is intended for use in Netnews articles rather
   than in email messages. It MAY be used in an email message to
   indicate that it is a copy also posted to the listed newsgroups, in
   which case the inclusion of a Posted-And-Mailed header (6.9) would
   also be appropriate. However, it SHOULD NOT be used in an email-only
   reply to a Netnews article (thus the "inheritable" property of this
   header applies only to followups to a newsgroup, and not to followups
   to the poster). Moreover, if a newsgroup-name contains any non-ASCII
   character, it MAY be encoded using the mechanism defined in [RFC
   2047] when sent by email (for which purpose the newsgroup-name SHOULD
   be treated as an encoded-word) but, if it is subsequently returned to
   the Netnews environment, it MUST then be re-encoded into UTF-8. See
   also the further discussion in section 8.8.1.

5.5.1.  Forbidden newsgroup names

   The following forms of newsgroup-name MUST NOT be used except for the
   specific purposes indicated:

     o Newsgroup-names having only one component. These are reserved for
       newsgroups whose propagation is restricted to a single host or
       local network, and for pseudo-newsgroups such as "poster" (which
       has special meaning in the Followup-To-header - see section 6.7),
       "junk" (often used by serving agents), and "control" (likewise);
     o Any newsgroup-name beginning with "control." (used as pseudo-
       newsgroups by many serving agents);
     o Any newsgroup-name containing the component "ctl" (likewise);
     o "to" or any newsgroup-name beginning with "to." (reserved for the
       ihave/sendme protocol described in section 7.4, and for test
       messages sent on an essentially point-to-point basis);

     o Any newsgroup-name beginning with "example." (reserved for
       examples in this and other standards);
     o Any newsgroup-name containing the component "all" (because this
       is used as a wildcard in some implementations).
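
        NOTE: The list above might be applied as in the following non-
        normative Python sketch. The function name reserved_reason and
        the wording of the returned strings are hypothetical; the
        function returns None for a name that none of the rules
        reserves.

```python
def reserved_reason(name: str):
    """Return why a newsgroup-name is reserved, or None if it is not."""
    components = name.split(".")
    if len(components) == 1:
        return "single-component names are local or pseudo-newsgroups"
    if components[0] == "control":
        return "names beginning with 'control.' are pseudo-newsgroups"
    if "ctl" in components:
        return "the component 'ctl' is reserved"
    if components[0] == "to":
        return "names beginning with 'to.' are point-to-point"
    if components[0] == "example":
        return "names beginning with 'example.' are for examples"
    if "all" in components:
        return "the component 'all' is used as a wildcard"
    return None
```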

   A newsgroup-name SHOULD NOT appear more than once in the Newsgroups-
   header.  The order of newsgroup names in the Newsgroups-header is not
   significant, except for determining which moderator to send the
   article to if more than one of the groups is moderated (see 8.2).

5.6.  Path

   The Path-header shows the route taken by a message since its entry
   into the Netnews system. It is a variant header (4.2.5.3), each agent
   that processes an article being required to add one (or more) entries
   to it. This is primarily to enable relaying agents to avoid sending
   articles to sites already known to have them, in particular the site
   they came from, and additionally to permit tracing the route articles
   take in moving over the network, and for gathering Usenet statistics.
   Finally the presence of a '%' path-delimiter in the Path-header can
   be used to identify an article injected in conformance with this
   standard.

5.6.1.  Format

      header          =/ Path-header
      Path-header     = "Path" ":" SP Path-content
                           *( ";" other-parameter )
      Path-content    = [FWS]
                           *( path-identity [FWS] path-delimiter [FWS] )
                           tail-entry [FWS]
      path-identity   = ( ALPHA / DIGIT )
                           *( ALPHA / DIGIT / "-" / "." / ":" / "_" )
      path-delimiter  = "/" / "?" / "%" / "," / "!"
      tail-entry      = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )

        NOTE: A Path-content will inevitably contain at least one path-
        identity, except possibly in the case of a proto-article that
        has not yet been injected onto the network.

        NOTE: Observe that the syntax does not allow comments within the
        Path-header; this is to simplify processing by relaying and
        injecting agents which have a requirement to process this header
        extremely rapidly.
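
        NOTE: As a non-normative illustration, the following Python
        sketch parses a Path-content according to the syntax above. The
        function name parse_path and its return shape (a list of
        identity and delimiter pairs, plus the tail-entry) are
        assumptions made for this example.

```python
import re

_ENTRY = re.compile(
    r"([A-Za-z0-9][A-Za-z0-9\-.:_]*)[ \t]*([/?%,!])[ \t]*"
)
_TAIL = re.compile(r"[A-Za-z0-9\-.:_]+")


def parse_path(content: str):
    """Return ([(path-identity, path-delimiter), ...], tail-entry)."""
    # Unfold: drop each line break that is followed by white space.
    text = re.sub(r"\r?\n(?=[ \t])", "", content).strip(" \t")
    entries = []
    pos = 0
    while True:
        match = _ENTRY.match(text, pos)
        if match is None:
            break
        entries.append((match.group(1), match.group(2)))
        pos = match.end()
    tail = text[pos:]
    if _TAIL.fullmatch(tail) is None:
        raise ValueError("malformed Path-content at: %r" % tail)
    return entries, tail
```

        Applied to "foo.isp.example/bar.isp.example!x", this yields two
        entries, with delimiters '/' and '!', and the tail-entry "x".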

   A relaying agent SHOULD NOT pass an article to another relaying agent
   whose path-identity (or some known alias thereof) already appears in
   the Path-content. Since the comparison may be either case sensitive
   or case insensitive, relaying agents SHOULD NOT generate a name which
   differs from that of another site only in terms of case.

   A relaying agent MAY decline to accept an article if its own path-
   identity is already present in the Path-content or if the Path-
   content contains some path-identity whose articles the relaying agent
   does not want, as a matter of local policy.

        NOTE: This last facility is sometimes used to detect and decline
        control messages (notably cancel messages) which have been
        deliberately seeded with a path-identity to be "aliased out" by
        sites not wishing to act upon them.
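
        NOTE: The check described above might be sketched in Python as
        follows. This is a non-normative illustration; the function
        names are hypothetical, and case-insensitive comparison is one
        of the two choices the text allows.

```python
import re

_DELIMS = re.compile(r"[/?%,!]")


def path_identities(path_content: str) -> list:
    """Return the path-identities, leaving out the tail-entry."""
    unfolded = "".join(path_content.split())   # FWS carries no meaning
    return _DELIMS.split(unfolded)[:-1]


def should_offer(path_content: str, peer_names) -> bool:
    """False if the peer, or a known alias of it, is already in the Path."""
    seen = {identity.lower() for identity in path_identities(path_content)}
    return not any(name.lower() in seen for name in peer_names)
```

        The tail-entry is deliberately ignored, since it does not name
        a site (see 5.6.3).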

5.6.2.  Adding a path-identity to the Path-header

   When an injecting, relaying or serving agent receives an article, it
   MUST prepend its own path-identity followed by a path-delimiter to
   the beginning of the Path-content. In addition, it SHOULD then add
   CRLF and WSP if it would otherwise result in a line longer than 79
   characters.

   The path-identity added MUST be unique to that agent. To this end it
   SHOULD be one of:

   1. A fully qualified domain name (FQDN) associated (by the Internet
      DNS service [RFC 1034]) with an A record, which SHOULD identify
      the actual machine prepending this path-identity. Ideally, this
      FQDN should also be "mailable" (see below).

   2. A fully qualified domain name (FQDN) associated (by the Internet
      DNS service) with an MX record, which MUST be "mailable".

   3. An arbitrary name believed to be unique and registered at least
      with all sites immediately downstream from the given site.

   4. An encoding of an IP address - <IPv4address> or <IPv6address> [RFC
      2373] (the requirement to be able to use an <IPv6address> is the
      reason for including ':' as an allowed character within a path-
      identity).




   The FQDN of an agent is "mailable" if the administrators of that
   agent can be reached by email using both of the forms "usenet@<FQDN>"
   and "news@<FQDN>", in conformity with [RFC 2142].

   Of the above options, nos. 1 to 3 are much to be preferred, unless
   there are strong technical reasons dictating otherwise. In
   particular, the injecting agent's path-identity MUST, as a special
   case, be an FQDN as in option 1 or option 2, and MUST be mailable.
   Additionally, in the case of an injecting agent offering its services
   to the general public, its administrators MUST also be reachable
   using the form "abuse@<FQDN>" UNLESS a more specific complaints
   address has been specified in a Complaints-To-header (6.20).

   The injecting agent's path-identity MUST be followed by the special
   path-delimiter '%' which serves to separate the pre-injection and
   post-injection regions of the Path-content (see 5.6.3).

   In the case of a relaying or serving agent, the path-delimiter is
   chosen as follows.  When such an agent receives an article, it MUST
   establish the identity of the source and compare it with the leftmost
   path-identity of the Path-content. If it matches, a '/' should be
   used as the path-delimiter when prepending the agent's own path-
   identity.  If it does not match then the agent should prepend two
   entries to the Path-content; firstly the true established path-
   identity of the source followed by a '?'  path-delimiter, and then,
   to the left of that, the agent's own path-identity followed by a '/'
   path-delimiter as usual.  This prepending of two entries SHOULD NOT
   be done if the provided and established identities match.

   Any method of establishing the identity of the source may be used
   (but see 5.6.5 below), with the consideration that, in the event of
   problems, the agent concerned may be called upon to justify it.

        NOTE: The use of the '%' path-delimiter marks the position of
        the injecting agent in the chain. In normal circumstances there
        should therefore be only one '%' path-delimiter present, and
        injecting agents MAY choose to reject proto-articles with a '%'
        already in them. If, for whatever reason, more than one '%' is
        found, then the path-identity in front of the leftmost '%' is to
        be regarded as the true injecting agent.
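
        NOTE: As a non-normative illustration, the choice of path-
        delimiter by a relaying or serving agent might be sketched in
        Python as follows. The function name prepend_path and its
        parameters are hypothetical; source_identity stands for
        whatever identity the agent has established for the source, and
        the comparison is assumed to be case-insensitive. An injecting
        agent would instead prepend its FQDN followed by '%'.

```python
import re


def prepend_path(path_content, own_identity, source_identity, limit=79):
    """Return the new Path-content after a relaying agent adds itself."""
    content = path_content.lstrip(" \t")
    leftmost = re.split(r"[/?%,!\s]", content, maxsplit=1)[0]
    if leftmost.lower() == source_identity.lower():
        # The claimed and established identities match: one entry.
        prefix = f"{own_identity}/"
    else:
        # Mismatch: record the established source with '?', then
        # our own identity with '/' to the left of it.
        prefix = f"{own_identity}/{source_identity}?"
    first_line = content.split("\r\n", 1)[0]
    if len("Path: ") + len(prefix) + len(first_line) > limit:
        return prefix + "\r\n " + content    # fold with CRLF and WSP
    return prefix + content
```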

5.6.3.  The tail-entry

   For historical reasons, the tail-entry (i.e. the rightmost entry in
   the Path-content) is regarded as a "user name", and therefore MUST
   NOT be interpreted as a site through which the article has already
   passed. Moreover, the Path-content as a whole is not an email address
   and MUST NOT be used to contact the poster. Posting and/or injecting
   agents MAY place any string here.  When it is not an actual user
   name, the string "not-for-mail" is often used, but in fact a simple
   "x" would be sufficient.




   Often this field will be the only entry in the region (known as the
   pre-injection region) after the '%', although there may be entries
   corresponding to machines traversed between the posting agent and the
   injecting agent proper. In particular, injecting agents that receive
   articles from many sources MAY include information to establish the
   circumstances of the injection such as the identity of the source
   machine (especially if the Injector-Info-header (6.19) is absent).
   Any such inclusion SHOULD NOT conflict with any genuine site
   identifier. The '!'  path-delimiter may be used freely within the
   pre-injection region, although '/' and '?' are also appropriate if
   used correctly.

5.6.4.  Path-Delimiter Summary

   The various path-delimiters are summarized below. The name
   immediately to the left of the path-delimiter is always that of the
   machine which added the path-delimiter.

   '/' The name immediately to the right is known to be the identity of
       the machine from which the article was received (either because
       the entry was made by that machine and we have verified it, or
       because we have added it ourselves).

   '?' The name immediately to the right is the claimed identity of the
       machine from which the article was received, but we were unable
       to verify it (and have prepended our own view of where it came
       from, and then a '/').

   '%' Everything to the right is the pre-injection region followed by
       the tail-entry.  The name on the left is the FQDN of the
       injecting agent. The presence of two '%'s in a path indicates a
       double-injection (see 8.2.2).

   '!' The name immediately to the right is unverified. The presence of
       a '!' to the left of the '%' indicates that the identity to the
       left is that of an old-style system not conformant with this
       standard.

   ',' Reserved for future use; treat as '/'.

   Other
       Old software may possibly use other path-delimiters, which should
       be treated as '!'.  But note in particular that ':', '-' and '_'
       are components of names, not path-delimiters, and FWS on its own
       MUST NOT be used as the sole path-delimiter.

        NOTE: Old Netnews relaying and injecting agents almost all
        delimit Path entries with a '!', and these entries are not
        verified.  The presence of '%' indicates that the article was
        injected by software conforming to this standard, and the
        presence of '!' to the left of a '%' indicates that the message
        passed through systems developed prior to this standard. It is
        anticipated that relaying agents will reject articles in the old
        style once this new standard has been widely adopted.
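
        NOTE: The summary above might be applied as in the following
        non-normative Python sketch, which locates the injecting agent
        and reports whether the article later passed through an old-
        style system. The function name injection_info and its return
        shape are hypothetical.

```python
import re


def injection_info(path_content: str):
    """Return (injector, passed_old_style) or None if there is no '%'."""
    text = "".join(path_content.split())        # discard folding
    tokens = re.split(r"([/?%,!])", text)
    names, delims = tokens[0::2], tokens[1::2]
    if "%" not in delims:
        return None                             # not injected conformantly
    i = delims.index("%")                       # the leftmost '%' wins
    # A '!' to the left of the '%' marks an old-style relaying system.
    return names[i], "!" in delims[:i]
```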

5.6.5.  Suggested Verification Methods

   It is preferable to verify the claimed path-identity against the
   source than to make routine use of the '?' path-delimiter, with
   consequential wasteful double-entry Path additions.

   If the incoming article arrives through some TCP/IP protocol such as
   NNTP, the IP address of the source will be known, and will likely
   already have been checked against a list of known FQDNs, IP
   addresses, or other registered aliases that the receiving site has
   agreed to peer with.

   Since the source host may have several IP addresses, checking the
   claimed FQDN or IP address against the source IP, or finding a
   suitable FQDN to report with a '?' path-delimiter, may involve
   several DNS lookups, following CNAME chains as required. Note that
   any reverse DNS lookup that is involved needs to be confirmed by a
   forward one.
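
        NOTE: As a non-normative illustration, a reverse lookup
        confirmed by a forward one might be sketched in Python as
        follows. The function name established_identity, its return
        shape, and the injectable reverse and forward parameters are
        hypothetical; the defaults use the standard socket module, and
        a real agent would also consult its table of registered peers
        and aliases.

```python
import socket


def _reverse(ip):
    """Reverse DNS: the primary host name for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(name):
    """Forward DNS: the set of addresses for a host name."""
    return {info[4][0] for info in socket.getaddrinfo(name, None)}


def established_identity(source_ip, claimed,
                         reverse=_reverse, forward=_forward):
    """Return (identity, verified) for the source of an article."""
    try:
        if source_ip == claimed or source_ip in forward(claimed):
            return claimed, True          # claim confirmed: use '/'
    except OSError:
        pass
    try:
        name = reverse(source_ip)
        if source_ip in forward(name):    # reverse confirmed by forward
            return name, False            # report this name with '?'
    except OSError:
        pass
    return source_ip, False               # fall back to the bare address
```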

   If the incoming article arrives through some other protocol, such as
   UUCP, that protocol MUST include a means of verifying the source
   site. In UUCP implementations, commonly each incoming connection has
   a unique login name and password, and that login name (or some alias
   registered for it) would be expected as the path-identity.

5.6.6.  Example

      Path: foo.isp.example/
         foo-server/bar.isp.example?10.123.12.2/old.site.example!
         barbaz/baz.isp.example%dialup123.baz.isp.example!x

        NOTE: That article was injected into the news stream by
        baz.isp.example (complaints may be addressed to
        abuse@baz.isp.example). The injector has taken care to record
        that it got it from dialup123.baz.isp.example. "x" is a dummy
        tail-entry, though sometimes a real userid is put there.

        The article was relayed, perhaps by UUCP, to the machine known,
        at least to its downstream, as "barbaz".

        Barbaz relayed it to old.site.example, which does not yet
        conform to this standard (hence the '!' path-delimiter). So one
        cannot be sure that it really came from barbaz.

        Old.site.example relayed it to a site claiming to have the IP
        address [10.123.12.2], and claiming (by using the '/' path-
        delimiter) to have verified that it came from old.site.example.

        [10.123.12.2] relayed it to "foo-server" which, not being
        convinced that it truly came from [10.123.12.2], did a reverse
        lookup on the actual source and concluded it was known as
        bar.isp.example (that is not to say that [10.123.12.2] was not a
        correct IP address for bar.isp.example, but simply that that
        connection could not be substantiated by foo-server).  Observe
        that foo-server has now added two entries to the Path.

        "foo-server" is a locally significant name within the complex
        site of many machines run by foo.isp.example, so the latter
        should have no problem recognizing foo-server and using a '/'
        path-delimiter.  Presumably foo.isp.example then delivered the
        article to its direct clients.

        It appears that foo.isp.example and old.site.example decided to
        fold the line, on the grounds that it seemed to be getting a
        little too long.

6.  Optional Headers

   None of the headers appearing in this section is required to appear
   in every article but some of them are required in certain types of
   article, such as followups. Any header defined in this (or any other)
   standard MUST NOT appear more than once in an article unless
   specifically stated otherwise.  Experimental headers (4.2.5.1) and
   headers defined by cooperating subnets are exempt from this
   requirement.  See section 8 "Duties of Various Agents" for the full
   picture.

6.1.  Reply-To

   The Reply-To-header specifies the reply address(es) to be used for
   personal replies to the poster(s) of the article when this is
   different from the poster's address(es) given in the From-header. The
   content syntax makes use of syntax defined in [RFC 2822], but subject
   to the revised definition of local-part given in section 5.2.

      header              =/ Reply-To-header
      Reply-To-header     = "Reply-To" ":" SP Reply-To-content
      Reply-To-content    = address-list

   In the absence of Reply-To, the reply address(es) is the address(es)
   in the From-header. For this reason a Reply-To SHOULD NOT be included
   if it just duplicates the From-header.

        NOTE: Use of a Reply-To-header is preferable to including a
        similar request in the article body, because replying agents can
        take account of Reply-To automatically.

6.1.1.  Examples

      Reply-To: John Smith <jsmith@site.example>
      Reply-To: John Smith <jsmith@site.example>, dave@isp.example
      Reply-To: John Smith <jsmith@site.example>,andrew@isp.example,
         fred@site2.example
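
        NOTE: As a non-normative illustration, a replying agent might
        select its reply address(es) as in the following Python sketch.
        The function name reply_addresses and the representation of the
        headers as a dictionary are hypothetical, and the standard
        email.utils parser is only an approximation, since it does not
        implement the revised definition of local-part given in section
        5.2.

```python
from email.utils import getaddresses


def reply_addresses(headers: dict) -> list:
    """Use Reply-To if present, otherwise fall back to From."""
    value = headers.get("Reply-To") or headers.get("From", "")
    return [addr for _name, addr in getaddresses([value]) if addr]
```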






6.2.  Sender

   The Sender-header specifies the mailbox of the entity which caused
   this article to be posted (and hence injected), if that entity is
   different from that given in the From-header or if more than one
   address appears in the From-header. This header SHOULD NOT appear in
   an article unless the sender is different from the poster. This
   header is appropriate for use by automatic article posters. The
   content syntax makes use of syntax defined in [RFC 2822].

      header              =/ Sender-header
      Sender-header       = "Sender" ":" SP Sender-content
                               *( ";" other-parameter )
      Sender-content      = mailbox

6.3.  Organization

   The Organization-header is a short phrase identifying the poster's
   organization.

      header              =/ Organization-header
      Organization-header = "Organization" ":" SP Organization-content
      Organization-content= unstructured

        NOTE: Posting and injecting agents are discouraged from
        providing a default value for this header unless it is
        acceptable to all posters using those agents. Unless this header
        contains useful information (including some indication of the
        poster's physical location) posters are discouraged from
        including it.

6.4.  Keywords

   The Keywords-header contains a comma-separated list of important
   words and phrases intended to describe some aspect of the content of
   the article. The content syntax makes use of syntax defined in [RFC
   2822].

      header              =/ Keywords-header
      Keywords-header     = "Keywords" ":" SP Keywords-content
                               *( ";" other-parameter )
      Keywords-content    = phrase *( "," phrase )

        NOTE: The list is comma separated, NOT space separated.

        NOTE: Contrary to the usage defined in [RFC 2822], this standard
        does not permit multiple occurrences of this header.

6.5.  Summary

   The Summary-header is a short phrase summarizing the article's
   content.

      header              =/ Summary-header
      Summary-header      = "Summary" ":" SP Summary-content
      Summary-content     = unstructured

   The summary should be terse. Authors Ought to avoid trying to cram
   their entire article into the headers; even the simplest query
   usually benefits from a sentence or two of elaboration and context,
   and not all reading agents display all headers. On the other hand the
   summary should give more detail than the Subject.

6.6.  Distribution

   The Distribution-header is an inheritable header (see 4.2.5.2) which
   specifies geographical or organizational limits to an article's
   propagation.

      header              =/ Distribution-header
      Distribution-header = "Distribution" ":" SP Distribution-content
                               *( ";" other-parameter )
      Distribution-content= distribution *( dist-delim distribution )
      dist-delim          = ","
      distribution        = [FWS] distribution-name [FWS]
      distribution-name   = ALPHA 1*distribution-rest
      distribution-rest   = ALPHA / "+" / "-" / "_"

        NOTE: The use of ALPHA in the syntax ensures that distribution
        names are always in US-ASCII.

   Articles MUST NOT be passed between relaying agents or to serving
   agents unless the sending agent has been configured to supply and the
   receiving agent to receive at least one of the distributions in the
   Distribution-header.  Additionally, reading agents MAY also be
   configured so that unwanted distributions do not get displayed.

        NOTE: Although it would seem redundant to filter out unwanted
        distributions at both ends of a relaying link (and it is clearly
        more efficient to do so at the sending end), many sending sites
        have been reluctant, historically speaking, to apply such
        filters (except to ensure that distributions local to their own
        site or cooperating subnet did not escape); moreover they tended
        to configure their filters on an "all but those listed" basis,
        so that new and hitherto unheard of distributions would not be
        caught. Indeed many "hub" sites actually wanted to receive all
        possible distributions so that they could feed on to their
        clients in all possible geographical (or organizational)
        regions.

        Therefore, it is desirable to provide facilities for rejecting
        unwanted distributions at the receiving end. Indeed, it may be
        simpler to do so locally than to inform each sending site of
        what is required, especially in the case of specialized
        distributions (for example for control messages, such as cancels
        from certain issuers) which might need to be added at short
        notice.  The possibility for reading agents to filter
        distributions has been provided for the same reason.

   Exceptionally, ALL relaying agents are deemed willing to supply or
   accept the distribution "world", and NO relaying agent should supply
   or accept the distribution "local".  However, "world" SHOULD NOT be
   mentioned explicitly since it is the default when the Distribution-
   header is absent entirely.  "All" MUST NOT be used as a
   distribution-name.  Distribution-names SHOULD contain at least three
   characters, except when they are two-letter country codes as in [ISO
   3166].  Distribution-names are case-insensitive (i.e. "US", "Us" and
   "us" all specify the same distribution).

   Posting agents Ought Not to provide a default Distribution-header
   without giving the poster an opportunity to override it. Followup
   agents SHOULD initially supply the same Distribution-header as found
   in the precursor.
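
        NOTE: As an informal illustration only, the distribution rules
        of this section might be sketched in Python as follows. The
        names parse_distributions and may_pass, and the use of sets of
        lower-cased names to represent each agent's configuration, are
        assumptions of this sketch. It also assumes that any
        other-parameters have already been removed from the content,
        and it reads the relaying rule literally, checking the sending
        and the receiving agent independently.

```python
import re

# distribution-name = ALPHA 1*distribution-rest
_DIST_NAME = re.compile(r"[A-Za-z][A-Za-z+\-_]+\Z")


def parse_distributions(content):
    """Split a Distribution-content into lower-cased distribution names."""
    names = []
    for item in content.split(","):
        name = item.strip()  # drop the optional FWS around each name
        if not _DIST_NAME.match(name) or name.lower() == "all":
            raise ValueError("invalid distribution-name: %r" % name)
        names.append(name.lower())  # names are case-insensitive
    return names


def _willing(names, configured):
    # "world" is always supplied/accepted; "local" never is.
    return any(n == "world" or (n != "local" and n in configured)
               for n in names)


def may_pass(content, sender_supplies, receiver_accepts):
    """Decide whether an article may be passed over one relaying link."""
    # An absent Distribution-header is equivalent to "world".
    names = ["world"] if content is None else parse_distributions(content)
    return (_willing(names, sender_supplies)
            and _willing(names, receiver_accepts))


if __name__ == "__main__":
    print(may_pass("Us, uk", {"us"}, {"uk"}))
```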

6.7.  Followup-To

   The Followup-To-header specifies which newsgroup(s) followups should
   be posted to.

      header              =/ Followup-To-header
      Followup-To-header  = "Followup-To" ":" SP Followup-To-content
                               *( ";" other-parameter )
      Followup-To-content = Newsgroups-content / [FWS] "poster" [FWS]

   The syntax is the same as that of the Newsgroups-content, with the
   addition that the keyword "poster" is allowed. In the absence of a
   Followup-To-header, the default newsgroup(s) for a followup are those
   in the Newsgroups-header, and for this reason the Followup-To-header
   SHOULD NOT be included if it just duplicates the Newsgroups-header.

   A Followup-To-header consisting of the keyword "poster" indicates
   that the poster requests no followups to be sent in response to this
   article, only personal replies to the article's reply address.

        NOTE: A poster who wishes both a personal reply and a followup
        post should include a Mail-Copies-To-header (6.8).
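
        NOTE: As an informal illustration only, a followup agent's
        choice of destination might be sketched in Python as follows.
        The function name followup_target, the returned tuples, and the
        dictionary of headers are assumptions of this sketch. It also
        assumes that newsgroup names are separated by commas and that
        any other-parameters have already been removed.

```python
def followup_target(headers):
    """Return ("mail", None) for "poster", else ("post", [newsgroups])."""
    content = headers.get("Followup-To")
    if content is None:
        # Default: follow up to the groups in the Newsgroups-header.
        content = headers["Newsgroups"]
    elif content.strip().lower() == "poster":
        # Only a personal reply to the reply address is requested.
        return ("mail", None)
    groups = [g.strip() for g in content.split(",") if g.strip()]
    return ("post", groups)


if __name__ == "__main__":
    print(followup_target({"Newsgroups": "misc.test,alt.test",
                           "Followup-To": "alt.test"}))
```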

6.8.  Mail-Copies-To

   The Mail-Copies-To-header indicates whether or not the poster wishes
   to have followups to an article emailed in addition to being posted
   to Netnews and, if so, establishes the address to which they should
   be sent.

   The content syntax makes use of syntax defined in [RFC 2822], but
   subject to the revised definition of local-part given in section 5.2.

      header      =/ Mail-Copies-To-header
      Mail-Copies-To-header
                  = "Mail-Copies-To" ":" SP Mail-Copies-To-content
      Mail-Copies-To-content
                  = copy-addr / [CFWS] ( "nobody" / "poster" ) [CFWS]
      copy-addr   = address-list

   The keyword "nobody" indicates that the poster does not wish copies
   of any followup postings to be emailed. This indication is widely
   seen as a very strong wish, and is to be taken as the default when
   this header is absent.

   The keyword "poster" indicates that the poster wishes a copy of any
   followup postings to be emailed to him.

   Otherwise, this header contains a copy-addr to which the poster
   wishes a copy of any followup postings to be sent.

        NOTE: Some existing practice uses the keyword "never" in place
        of "nobody" and "always" in place of "poster". These usages are
        deprecated, but followup agents MAY observe them.

   The automatic actions of a followup agent in the various cases
   (subject to manual override by the user) are as follows:

   nobody (or when the header is absent)
      The followup agent SHOULD NOT, by default, email such a copy and
      Ought, especially when there is an explicit "nobody", to issue a
      warning and ask for confirmation if the user attempts to do so.

   poster
      The followup agent Ought, by default, to email a copy, which MUST
      then be sent to the address(es) in the Reply-To-header, and in the
      absence of that to the address(es) in the From-header.

   copy-addr
      The followup agent Ought, by default, to email a copy, which MUST
      then be sent to the copy-addr.
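
        NOTE: As an informal illustration only, the default actions
        listed above might be sketched in Python as follows. The name
        mail_copy_recipients and the dictionary of headers are
        assumptions of this sketch, which observes the deprecated
        keywords "never" and "always" and ignores any comments around
        the keyword. Manual override by the user, and the warning
        suggested for "nobody", are left to the caller.

```python
from email.utils import getaddresses


def _addresses(value):
    return [addr for _name, addr in getaddresses([value]) if addr]


def mail_copy_recipients(headers):
    """Address(es) to which a copy of a followup is emailed by default."""
    value = headers.get("Mail-Copies-To", "nobody").strip()
    keyword = value.lower()
    # Deprecated usages which a followup agent MAY observe.
    keyword = {"never": "nobody", "always": "poster"}.get(keyword, keyword)
    if keyword == "nobody":
        return []
    if keyword == "poster":
        # Reply-To if present, otherwise From.
        return _addresses(headers.get("Reply-To") or headers.get("From", ""))
    # Otherwise the content is a copy-addr.
    return _addresses(value)


if __name__ == "__main__":
    print(mail_copy_recipients({"From": "p@site.example",
                                "Mail-Copies-To": "poster"}))
```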

        NOTE: This header is only relevant when posting followups to
        Netnews articles, and is to be ignored when sending pure email
        replies to the poster, which are handled as prescribed under the
        Reply-To-header (6.1).  Whether or not this header will also
        find similar usage for replies to messages sent to mailing lists
        falls outside the scope of this standard.

   When emailing a copy, the followup agent SHOULD also include a
   "Posted-And-Mailed: yes" header (6.9).

        NOTE: In addition to the Posted-And-Mailed-header, some followup
        agents also include within the body a mention that the article
        is both posted and mailed, for the benefit of reading agents
        that do not normally show that header.

6.9.  Posted-And-Mailed

      header      =/ Posted-And-Mailed-header
      Posted-And-Mailed-header
                  = "Posted-And-Mailed" ":" SP Posted-And-Mailed-content
                       *( ";" other-parameter )
      Posted-And-Mailed-content
                  = [CFWS] ( "yes" / "no" ) [CFWS]

   This header, when used with the "yes" keyword, indicates that the
   article has been both posted to the specified newsgroups and emailed.
   It SHOULD be used when replying to the poster of an article to which
   this one is a followup (see the Mail-Copies-To-header in section 6.8)
   and it MAY be used when any article is also mailed to recipient(s)
   identified in a To- and/or Cc-header that is also present. The "no"
   keyword is included for the sake of completeness; it MAY be used to
   indicate the opposite state, but is redundant insofar as it only
   describes the default state when this header is absent.

   This header, if present, MUST be included in both the posted and
   emailed versions of the article. The Newsgroups-header of the posted
   article SHOULD be included in the email version as recommended in
   section 5.5.  All other headers defined in this standard (excluding
   variant headers, but including specifically the Message-ID-header)
   MUST be identical in both the posted and mailed versions of the
   article, and so MUST the body.

        NOTE: This leaves open the question of whether a To- or a Cc-
        header should appear in the posted version. Naturally, a Bcc-
        header should not appear, except in a form which indicates that
        there are additional unspecified recipients.

6.10.  References

   The References-header lists CFWS-separated message identifiers of
   precursors. The content syntax makes use of syntax defined in [RFC
   2822].

      header              =/ References-header
      References-header   = "References" ":" SP References-content
                               *( ";" other-parameter )
      References-content  = msg-id *( CFWS msg-id )

        NOTE: This differs from the syntax of [RFC 2822] by requiring at
        least one CFWS between the msg-ids (a SP at this point was an
        [RFC 1036] requirement).

   A followup MUST have a References-header, and an article that is not
   a followup MUST NOT have a References-header. In a followup, if the
   precursor did not have a References-header, the followup's
   References-content MUST be formed by the message identifier of the
   precursor. A followup to an article which had a References-header
   MUST have a References-header containing the precursor's References-
   content (subject to trimming as described below) plus the precursor's
   message identifier appended to the end of the list (separated from it
   by CFWS).

   Followup agents SHOULD NOT trim message identifiers out of a
   References-header unless the number of message identifiers exceeds
   21, at which time trimming SHOULD be done by removing sufficient
   identifiers starting with the second so as to bring the total down to
   21 (but the first message identifier MUST NOT be trimmed). However,
   it would be wrong to assume that References-headers containing more
   than 21 message identifiers will not occur.
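
        NOTE: As an informal illustration only, the construction and
        trimming of a followup's References-content might be sketched
        in Python as follows. The name followup_references is an
        assumption of this sketch, which further assumes that the
        precursor's message identifiers are separated by white space
        only (no comments) and which does not fold the result.

```python
MAX_REFS = 21


def followup_references(precursor_refs, precursor_msgid):
    """Build the References-content for a followup.

    precursor_refs is the precursor's References-content, or None if
    the precursor had no References-header.
    """
    ids = precursor_refs.split() if precursor_refs else []
    ids.append(precursor_msgid)  # precursor's own id goes at the end
    excess = len(ids) - MAX_REFS
    if excess > 0:
        # Remove identifiers starting with the second; keep the first.
        del ids[1:1 + excess]
    return " ".join(ids)


if __name__ == "__main__":
    print(followup_references("<i4g587y@site1.example>",
                              "<kgb2231+ee@site2.example>"))
```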

6.10.1.  Examples

      References: <i4g587y@site1.example>
      References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
      References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
         <222@site1.example> <87tfbyv@site7.example>
         <67jimf@site666.example>
      References: <i4g587y@site1.example> <kgb2231+ee@site2.example>
         <tisjits@smeghead.example>

6.11.  Expires

   The Expires-header specifies a date and time when the article is
   deemed to be no longer relevant and could usefully be removed
   ("expired"). The content syntax makes use of syntax defined in [RFC
   2822].

      header              =/ Expires-header
      Expires-header      = "Expires" ":" SP Expires-content
                               *( ";" other-parameter )
      Expires-content     = date-time

   An Expires-header should only be used in an article if the requested
   expiry time is earlier or later than the time typically to be
   expected for such articles. Local policy for each serving agent will
   dictate whether and when this header is obeyed and posters SHOULD NOT
   depend on it being completely followed.

6.12.  Archive

   This optional header provides an indication of the poster's intent
   regarding preservation of the article in publicly accessible long-
   term or permanent storage.

      header              =/ Archive-header
      Archive-header      = "Archive" ":" SP Archive-content
                               *( ";" ( Archive-parameter /
                                        other-parameter ) )
      Archive-content     = [CFWS] ("no" / "yes" ) [CFWS]
      Archive-parameter   = <a parameter with attribute "filename"
                             and any value>

   The presence of an "Archive: no" header in an article indicates that
   the poster does not permit redistribution from publicly accessible
   long-term or permanent archives. The absence of this header, or an
   explicit "Archive: yes", indicates that the poster is willing for
   such redistribution to take place.  The optional "filename" parameter
   can then be used to suggest a filename under which the article should
   be stored. Further extensions to this standard may provide additional
   parameters for administration of the archiving process.

        NOTE: This standard does not attempt to define the length of
        "long-term", since it is dependent on many factors, including
        the retention policies of individual sites, and the customs or
        policies established for particular newsgroups or hierarchies.

        NOTE: Posters are cautioned that some sites may not implement
        the "no" option of the Archive-header correctly. In some
        jurisdictions non-compliance with this header may constitute a
        breach of copyright or of other legal provisions.  Moreover,
        even if this header prevents the poster's words from being
        archived publicly, it does nothing to prevent the archiving of a
        followup in which those words are quoted.

6.13.  Control

   The Control-header marks the article as a control message, and
   specifies the desired actions (additional to the usual ones of
   storing and/or relaying the article).

      header              =/ Control-header
      Control-header      = "Control" ":" SP Control-content
                               *( ";" other-parameter )
      Control-content     = [CFWS] control-message [CFWS]
      control-message     = <empty>

   However, the rule given above for control-message is incomplete.
   Further alternatives will be added incrementally as the various
   control-messages are introduced in section 7, or in extensions to
   this standard, using the "=/" notation defined in [RFC 2234].  For
   example, a typical CONTROL-message would be defined as follows:

      control-message     =/ CONTROL-message
      CONTROL-message     = "CONTROL" CONTROL-arguments
      CONTROL-arguments   = <the argument(s) specific to that
                             CONTROL-message>

   where "CONTROL" is a "verb" which is (and MUST be) of the syntactic
   form of a token and CONTROL-arguments MUST be of the syntactic form
   of a CFWS-separated list of values (which may require the use of
   quoted-strings if any tspecials or non-ASCII characters are
   involved).

   The verb indicates what action should be taken, and the argument(s)
   (if any) supply details. In some cases, the body of the article may
   also contain details.

   An article with a Control-header MUST NOT also have a Supersedes-
   header.

        NOTE: The presence of a Subject-header starting with the string
        "cmsg " and followed by a control-message MUST NOT be construed,
        in the absence of a proper Control-header, as a request to
        perform that control action (as may have occurred in some legacy
        software). See also section 5.4.
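
        NOTE: As an informal illustration only, the recognition of a
        control message might be sketched in Python as follows. The
        name control_message and the dictionary of headers are
        assumptions of this sketch, which splits the arguments on
        white space and therefore handles neither comments,
        quoted-strings nor other-parameters. The "cancel" verb used in
        the example is the one described in section 7.3.

```python
def control_message(headers):
    """Return (verb, arguments) for a control message, else None."""
    content = headers.get("Control")
    if content is None:
        # A Subject-header starting with "cmsg " is deliberately ignored.
        return None
    if "Supersedes" in headers:
        raise ValueError("Control and Supersedes must not both be present")
    words = content.split()
    if not words:
        raise ValueError("empty Control-content")
    return words[0], words[1:]


if __name__ == "__main__":
    print(control_message({"Control": "cancel <i4g587y@site1.example>"}))
```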

6.14.  Approved

   The Approved-header indicates the mailing addresses (and possibly the
   full names) of the persons or entities approving the article for
   posting.

      header              =/ Approved-header
      Approved-header     = "Approved" ":" SP Approved-content
                               *( ";" other-parameter )
      Approved-content    = From-content  ; see 5.2

   Each mailbox contained in the Approved-content MUST be that of one of
   the person(s) or entity(ies) in question, and one of those mailboxes
   MUST be that of the actual injector of the article.

   An Approved-header is required in all postings to moderated
   newsgroups. If this header is not present in such postings, then
   relaying and serving agents MUST reject the article. Please see
   section 8.2.2 for how injecting agents should treat postings to
   moderated groups that do not contain this header.

   An Approved-header is also required in certain control messages, to
   reduce the risks of accidental or unauthorized posting of same.

        NOTE: The presence of an Approved-header indicates that the
        person or entity identified claims to have the necessary
        authority to post the article in question, thus enabling sites
        that dispute that authority to refuse to accept or to act upon
        it. However, the mere presence of the header is insufficient to
        provide assurance that it indeed originated from that person or
        entity, and it is therefore desirable that it be included within
        some digital signature scheme (see 7.1), especially in the case
        of control messages (section 7).

6.15.  Supersedes

   The Supersedes-header contains a message identifier specifying an
   article to be superseded upon the arrival of this one. The specified
   article MUST be treated as though a "cancel" control message had
   arrived for the article (but observe that a site MAY choose not to
   honour a "cancel" message, especially if its authenticity is in
   doubt). The content syntax makes use of syntax defined in [RFC 2822].

      header              =/ Supersedes-header
      Supersedes-header   = "Supersedes" ":" SP Supersedes-content
                               *( ";" other-parameter )
      Supersedes-content  = msg-id

        NOTE: There is no "c" in "Supersedes".

        NOTE: The Supersedes-header defined here has no connection with
        the Supersedes-header that sometimes appears in Email messages
        converted from X.400 according to [RFC 2156]; in particular, the
        syntax here permits only one msg-id in contrast to the multiple
        msg-ids in that Email version.

   If an article contains a Supersedes-header, then the old article
   mentioned SHOULD be withdrawn from circulation or access, as in a
   cancel message (7.3), and the new article inserted into the system as
   any other new article would have been.

   Whatever security or authentication checks are normally applied to a
   Control cancel message (or may be prescribed for such messages by
   some extension to this standard - see the remarks in 7.1 and 7.3)
   MUST also be applied to an article with a Supersedes-header. In the
   event of the failure of such checks, the article SHOULD be discarded,
   or at most stored as an ordinary article.

6.16.  Xref

   The Xref-header is a variant header (4.2.5.3) which indicates where
   an article was filed by the last server to process it.

      header            =/ Xref-header
      Xref-header       = "Xref" ":" SP Xref-content
                             *( ";" other-parameter )
      Xref-content      = [CFWS] server-name 1*( CFWS location ) [CFWS]
      server-name       = path-identity  ; see 5.6.1
      location          = newsgroup-name ":" article-locator
      article-locator   = 1*( %x21-27 / %x29-3A / %x3C-7E )
                             ; US-ASCII printable characters
                             ; except '(' and ';'

   The server-name is included so that software can determine which
   serving agent generated the header. The locations specify what
   newsgroups the article was filed under (which may differ from those
   in the Newsgroups-header) and where it was filed under them. The
   exact form of an article-locator is implementation-specific.
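
        NOTE: As an informal illustration only, an Xref-content might
        be taken apart in Python as follows. The name parse_xref is an
        assumption of this sketch, which assumes that comments and
        other-parameters are absent. Since an article-locator may
        itself contain ":", each location is split at its first ":"
        only, and the locator is kept as a string because its form is
        implementation-specific.

```python
def parse_xref(content):
    """Return (server_name, [(newsgroup_name, article_locator), ...])."""
    server, *locations = content.split()
    if not locations:
        raise ValueError("Xref-content needs at least one location")
    parsed = []
    for loc in locations:
        group, sep, locator = loc.partition(":")
        if not (group and sep and locator):
            raise ValueError("malformed location: %r" % loc)
        parsed.append((group, locator))
    return server, parsed


if __name__ == "__main__":
    print(parse_xref("news.site.example misc.test:1234 alt.test:56"))
```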

        NOTE: The traditional form of an article-locator is a decimal
        number, with articles in each newsgroup numbered consecutively
        starting from 1. NNTP demands that such a model be provided, and
        much other software expects it, but it seems desirable to permit
        flexibility for unorthodox implementations.

   An agent inserting an Xref-header into an article MUST delete any
   previous Xref-header(s). A relaying agent MAY delete it before
   relaying, but otherwise it SHOULD be ignored by any relaying or
   serving agent receiving it.

   An agent MUST use the same server-name in Xref-headers as the path-
   identity it uses in Path-headers.

6.17.  Lines

   The Lines-header indicates the number of lines in the body of the
   article.

      header              =/ Lines-header
      Lines-header        = "Lines" ":" SP Lines-content
                               *( ";" other-parameter )
      Lines-content       = [CFWS] 1*DIGIT [CFWS]

   The line count includes all body lines, including the signature if
   any, including empty lines (if any) at the beginning or end of the
   body, and including the whole of all MIME message and multipart parts
   contained in the body (the single empty separator line between the
   headers and the body is not part of the body). The "body" here is the
   body as found in the posted article as transmitted by the posting
   agent.

   This header is to be regarded as obsolete, and it will likely be
   removed entirely in a future version of this standard. In the
   meantime, its use is deprecated.
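
        NOTE: For agents that still need to check a received
        Lines-header, the counting rule above might be sketched in
        Python as follows, as an informal illustration only. The name
        count_body_lines is an assumption of this sketch, as is the
        representation of the whole article as a single string whose
        lines end in CRLF or LF.

```python
def count_body_lines(article):
    """Count the lines of the body, excluding the separator line."""
    text = article.replace("\r\n", "\n")
    # The first empty line separates the headers from the body.
    _headers, sep, body = text.partition("\n\n")
    if not sep or not body:
        return 0
    # Every line counts, including empty ones and the signature.
    return body.count("\n") + (0 if body.endswith("\n") else 1)


if __name__ == "__main__":
    print(count_body_lines("Subject: test\r\n\r\nHello\r\n\r\n-- \r\nJohn\r\n"))
```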

6.18.  User-Agent

   The User-Agent-header contains information about the user agent
   (typically a newsreader) generating the article, for statistical
   purposes and tracing of standards violations to specific software
   needing correction. Although it is not one of the mandatory headers,
   posting agents SHOULD normally include it.

      header              =/ User-Agent-header
      User-Agent-header   = "User-Agent" ":" SP User-Agent-content
                               *( ";" other-parameter )
      User-Agent-content  = product-token *( CFWS product-token )
      product-token       = value [ "/" product-version ]  ; see 4.1
      product-version     = value

   This header MAY contain multiple product-tokens identifying the agent
   and any subproducts which form a significant part of the posting
   agent, listed in order of their significance for identifying the
   application. Product-tokens should be short and to the point - they
   MUST NOT be used for information beyond the canonical name of the
   product and its version.  Injecting agents MAY include product
   information for themselves (such as "INN/1.7.2"), but relaying and
   serving agents MUST NOT generate or modify this header to list
   themselves.

        NOTE: Variations from [RFC 2616] which describes a similar
        facility for the HTTP protocol:

        1.    use of arbitrary text or octets from character sets other
              than US-ASCII in a product-token may require the use of a
              quoted-string,

        2.    "{" and "}" are allowed in a value (product-token and
              product-version) in Netnews,

        3.    UTF-8 replaces ISO-8859-1 as charset assumption.

        NOTE: Comments should be restricted to information regarding the
        product named to their left such as platform information and
        should be concise. Use as an advertising medium (in the mundane
        sense) is discouraged.

6.18.1.  Examples

      User-Agent: tin/1.2-PL2
      User-Agent: tin/1.3-950621beta-PL0 (Unix)
      User-Agent: tin/unoff-1.3-BETA-970813 (UNIX) (Linux/2.0.30 (i486))
      User-Agent: tin/pre-1.4-971106 (UNIX) (Linux/2.0.30 (i486))
      User-Agent: Mozilla/4.02b7 (X11; I; en; HP-UX B.10.20 9000/712)
      User-Agent: Microsoft-Internet-News/4.70.1161
      User-Agent: Gnus/5.4.64 XEmacs/20.3beta17 ("Bucharest")
      User-Agent: Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30
      User-Agent: inn/1.7.2
      User-Agent: telnet

        NOTE: This header supersedes the role performed redundantly by
        experimental headers such as X-Newsreader, X-Mailer, X-Posting-
        Agent, X-Http-User-Agent, and other headers previously used on
        Usenet for this purpose. Use of these experimental headers
        SHOULD be discontinued in favor of the single, standard User-
        Agent-header which can be used freely both in Netnews and Email
        (except that non-ASCII characters would be inappropriate in
        email).
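
        NOTE: As an informal illustration only, the product-tokens of
        a User-Agent-content might be extracted in Python as follows.
        The names strip_comments and product_tokens are assumptions of
        this sketch, which removes (possibly nested) comments and then
        splits on white space. It does not handle quoted-strings used
        as values, backslash escapes inside comments, or
        other-parameters.

```python
def strip_comments(text):
    """Remove parenthesised comments, which may be nested."""
    out, depth = [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def product_tokens(content):
    """Return [(product, version or None), ...] in order of appearance."""
    tokens = []
    for item in strip_comments(content).split():
        name, sep, version = item.partition("/")
        tokens.append((name, version if sep else None))
    return tokens


if __name__ == "__main__":
    print(product_tokens("Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30"))
```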

6.19.  Injector-Info

   The Injector-Info-header SHOULD be added to each article by the
   injecting agent in order to provide information as to how that
   article entered the Netnews system and to assist in tracing its true
   origin.

      header          =/ Injector-Info-header
      Injector-Info-header
                      = "Injector-Info" ":" SP Injector-Info-content
                           *( ";" ( Injector-Info-parameter /
                                    other-parameter ) )
      Injector-Info-content
                      = [CFWS] path-identity [CFWS]
      Injector-Info-parameter
                      = posting-host-parameter /
                        posting-account-parameter /
                        posting-sender-parameter /
                        posting-logging-parameter /
                        posting-date-parameter
                        ; for {USENET}-parameters see 4.1
      posting-host-parameter
                      = <a parameter with attribute "posting-host"
                         and value some host-value>
      host-value      = dot-atom /
                        [ dot-atom ":" ]
                          ( IPv4address / IPv6address ); see [RFC 2373]
      posting-account-parameter
                      = <a parameter with attribute "posting-account"
                         and any value>
      posting-sender-parameter
                      = <a parameter with attribute "sender"
                         and value some sender-value>
      sender-value    = mailbox / "verified"
      posting-logging-parameter
                      = <a parameter with attribute "logging-data"
                         and any value>
      posting-date-parameter
                      = <a parameter with attribute "posting-date"
                         and value some date-time>

   An Injector-Info-header MUST NOT be added to an article by any agent
   other than an injecting agent. Any Injector-Info-header present when
   an article arrives at an injecting agent MUST be removed. In
   particular if, for some exceptional reason (8.2.2), an article gets
   injected twice, the Injector-Info-header will always relate to the
   second injection.

   The path-identity MUST be the same as the path-identity prepended to
   the Path-header by that same injecting agent which, following section
   5.6.2, MUST therefore be a fully qualified domain name (FQDN)
   mailable address.

   Although comments and folding of white space are permitted throughout
   the Injector-Info-content specification, it is RECOMMENDED that
   folding is not used within any parameter (but only before or after
   the ";" separating those parameters), and that comments are only used
   following the last parameter. It is also RECOMMENDED that such
   parameters as are present are included in the order in which they
   have been defined in the syntax above.  An injecting agent SHOULD use
   a consistent form of this header for all articles emanating from the
   same or similar origins.

        NOTE: The effect of those recommendations is to facilitate the
        recognition of articles arising from certain designated origins
        (as in the so-called "killfiles" which are available in some
        reading agents). Observe that the order within the syntax has
        been chosen to place last those parameters which are most likely
        to change between successive articles posted from the same
        origin.

        NOTE:  To comply with the overall "attribute = value" syntax of
        parameters, any value containing an IPv6address, a date-time, a
        mailbox or any CFWS MUST be quoted using <DQUOTE>s (the quoting
        is optional in other cases).

        NOTE: This header is intended to replace various currently-used
        but nowhere-documented headers such as "NNTP-Posting-Host",
        "NNTP-Posting-Date" and "X-Trace". These headers are now
        deprecated, and any of them present when an article arrives at
        an injecting agent SHOULD also be removed as above.
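
        NOTE: As an informal illustration only, an injecting agent
        might assemble this header in Python as follows. The names
        injector_info and quote_if_needed are assumptions of this
        sketch, as are the "attribute=value" layout without
        surrounding spaces and the "; " separator (the parameter
        syntax itself is given in section 4.1). The sketch quotes any
        value containing characters other than letters, digits, ".",
        "_" and "-", which is more than is required but always
        covers IPv6 addresses, date-times, mailboxes and white space.

```python
import re

# Recommended order: that in which the parameters are defined above.
PARAM_ORDER = ["posting-host", "posting-account", "sender",
               "logging-data", "posting-date"]
_SAFE = re.compile(r"[A-Za-z0-9._-]+\Z")


def quote_if_needed(value):
    """Wrap value in DQUOTEs unless it is made of plainly safe characters."""
    if _SAFE.match(value):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def injector_info(path_identity, params):
    """Format an Injector-Info-header from a dict of parameter values."""
    parts = [path_identity]
    for attr in PARAM_ORDER:
        if attr in params:
            parts.append(attr + "=" + quote_if_needed(params[attr]))
    return "Injector-Info: " + "; ".join(parts)


if __name__ == "__main__":
    print(injector_info("news.site.example", {
        "posting-date": "Wed, 01 May 2002 12:00:00 +0000",
        "posting-host": "host.site.example",
    }))
```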

6.19.1.  Usage of Injector-Info-parameters

   The purpose of these parameters is to enable the injecting agent to
   make assertions about the origin of the article, in fulfilment of its
   responsibilities towards the rest of the network as set out in
   section 8.2.  These assertions can then be utilized as follows:

   1. To enable the administrator of the injecting agent to respond to
      complaints and queries concerning the article. For this purpose,
      the parameters included SHOULD be sufficient to enable the
      administrator to identify its true origin (which parameters are
      best suited to this purpose will vary with the nature of the
      injecting site and of its relationship to the posters who use it -
      there is no benefit in including parameters which contribute
      nothing to this aim).  An administrator MAY, with those parameters
      where the syntax so allows, use cryptic notations interpretable
      only by himself if he considers it appropriate to protect the
      privacy of that origin.

   2. To enable relaying, serving and reading agents to recognize
      articles from origins which they might wish to reject, divert, or
      otherwise handle specially, for reasons of site policy.

   3. To enable the timely identification of spews of articles arising
      from a common origin.

   An injecting agent MUST NOT include any Injector-Info-parameter
   unless it has positive evidence of its correctness. An injecting
   agent MAY include other-parameters with x-token attributes which will
   assist in identifying the origin of the article.

      NOTE: Administrators of injecting agents can choose which
      selection of the following parameters best enables them to fulfil
      their responsibilities.  Some of these parameters identify the
      source of the article explicitly whereas others do so indirectly,
      thus affording more privacy to posters who value their anonymity,
      but also making harder the tracking of malicious disruption of the
      network, especially so if the administrators choose not to
      cooperate.  There is thus a balance to be struck between the needs
      of privacy on the one hand and the good order of Usenet on the
      other, and administrators need to be aware of this when
      formulating their policies.

6.19.1.1.  The posting-host-parameter

   If a dot-atom is present, it MUST be a FQDN identifying the specific
   host from which the injecting agent received the article.
   Alternatively, an IP address (IPv4address or IPv6address) identifies
   that host. If both forms are present, then they MUST identify the
   same host, or at least have done so at the time the article was
   injected.

        NOTE: It is commonly the case that this parameter identifies a
        dial-up point-of-presence, in which case a posting-account or
        logging-data may need to be consulted to find the true origin of
        the article.

6.19.1.2.  The posting-account-parameter

   This parameter identifies the source from which the injecting agent
   received the article. It SHOULD be in a cryptic notation
   understandable only by the administrator of the injecting agent, but
   it MUST be such that a given source gives rise to the same posting-
   account, at least in the short term. If the injecting agent is unable
   to meet that obligation, then it should use a posting-logging-
   parameter instead.

6.19.1.3.  The posting-sender-parameter

   This parameter identifies the mailbox of the verified sender of the
   article (alternatively, it uses the token "verified" to indicate that
   at least any addr-spec in the Sender-header of the article, or in the
   From-header if the Sender-header is absent, is correct).

        NOTE: An injecting agent is unlikely to be able to make use of
        this parameter except in cases where it is running on a machine
        which is aware of the user-space in which the posting agent is
        operating. This parameter should be used in preference to a
        posting-account-parameter in such situations.

6.19.1.4.  The posting-logging-parameter

   This parameter contains information (typically a session number or
   other non-persistent means of identifying a posting account) which
   will enable the true origin of the article to be determined by
   reference to logging information kept by the injecting agent.

6.19.1.5.  The posting-date-parameter

   This parameter identifies the time at which the article was injected
   (as distinct from the Date-header, which indicates when it was
   written).



6.19.2.  Example

      Injector-Info: news2.isp.net; posting-host=modem-15.pop.isp.net;
         posting-account=client0002623; logging-data=2427;
         posting-date="Wed, 2 Aug 2000 20:05:33 +0100 (BST)"
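
   A header shaped like the example above can be taken apart
   mechanically. The following Python sketch is illustrative only and
   forms no part of this standard. It assumes that comments have
   already been removed and that quoted values contain no escaped
   <DQUOTE>s; the name parse_injector_info and the sample date value
   are hypothetical.

```python
import re

def parse_injector_info(value):
    """Split an Injector-Info value into (path_identity, params)."""
    # Unfold: any run of white space, including CRLF, becomes one space.
    unfolded = re.sub(r"\s+", " ", value).strip()
    # Split on ";" only where it lies outside a quoted-string
    # (assumption: no backslash-escaped DQUOTE inside quoted values).
    parts = re.findall(r'(?:"[^"]*"|[^;"])+', unfolded)
    path_identity = parts[0].strip()
    params = {}
    for part in parts[1:]:
        name, _, val = part.strip().partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return path_identity, params

sample = (
    "news2.isp.net; posting-host=modem-15.pop.isp.net;\r\n"
    "   posting-account=client0002623; logging-data=2427;\r\n"
    '   posting-date="Wed, 2 Aug 2000 19:05:33 +0000"'
)
identity, params = parse_injector_info(sample)
```

   Because folding is confined to the positions around each ";" and
   the parameters appear in a fixed order, even simpler string
   matching (as in a killfile) would usually suffice.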


6.20.  Complaints-To

   The Complaints-To-header is added to an article by an injecting agent
   in order to indicate the mailbox to which complaints concerning the
   poster of the article may be sent.

      header            =/ Complaints-To-header
      Complaints-To-header
                        = "Complaints-To" ":" SP Complaints-To-content
      Complaints-To-content
                        = address-list

   A Complaints-To-header MUST NOT be added to an article by any agent
   other than an injecting agent. Any Complaints-To-header present when
   an article arrives at an injecting agent MUST be removed. In
   particular if, for some exceptional reason (8.2.2), an article gets
   injected twice, the Complaints-To-header will always relate to the
   second injection.

   The specified mailbox is for sending complaints concerning the
   behaviour of the poster of the article; it SHOULD NOT be used for
   matters concerning propagation, protocol problems, etc. which should
   be addressed to "usenet@" or "news@" the path-identity which was
   prepended to the Path-header by the injecting agent, in accordance
   with section 5.6.2.  In the absence of this header, complaints
   concerning a poster's behaviour MAY be addressed to "abuse@" that
   path-identity (although section 5.6.2 provides no obligation for that
   address to be mailable at an injecting agent that is not provided for
   the use of the general public).
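
   The choice of address described above might be sketched as follows
   in Python. This is an illustration only: the function names are
   hypothetical, the headers are assumed to be available as a simple
   mapping, and the caller is assumed to have already extracted the
   path-identity prepended by the injecting agent.

```python
def complaint_address(headers, path_identity):
    """Address for complaints about the behaviour of the poster."""
    complaints_to = headers.get("Complaints-To")
    if complaints_to:
        return complaints_to.strip()
    # Fallback permitted by the text above; this address is not
    # guaranteed to be mailable at every injecting agent.
    return "abuse@" + path_identity

def propagation_addresses(path_identity):
    """Addresses for propagation and protocol problems."""
    return ["usenet@" + path_identity, "news@" + path_identity]
```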

6.21.  MIME headers

6.21.1.  Syntax

   The following headers may be used within articles conforming to this
   standard.

        MIME-Version:                [RFC 2045]
        Content-Type:                [RFC 2045],[RFC 2046]
        Content-Transfer-Encoding:   [RFC 2045]
        Content-ID:                  [RFC 2045]
        Content-Description:         [RFC 2045]
        Content-Disposition:         [RFC 2183]
        Content-Location:            [RFC 2557]
        Content-MD5:                 [RFC 1864]



   The RFCs listed are deemed to be incorporated into this standard to
   the extent necessary to facilitate their usage within Netnews,
   subject to the revised syntax of parameter given in this standard
   (which permits UTF8-xtra-chars to appear within quoted-strings used as
   values), and subject to curtailment of that usage as described in the
   following sections. Moreover, extensions to those standards
   registered in accordance with [RFC 2048] are also available for use
   within Netnews, as indeed is any other header in the Content-* series
   which has a sensible interpretation within Netnews.

   Insofar as the syntax for these headers, as given in those RFCs, does
   not specify precisely where whitespace and comments may occur
   (whether in the form of WSP, FWS or CFWS), the usage defined in this
   standard, and failing that in [RFC 2822], and failing that in [RFC
   822] MUST be followed. In particular, there MUST NOT be any WSP
   between a header-name and the following colon and there MUST be a SP
   following that colon.
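
   The last two requirements can be checked with a single pattern. The
   Python sketch below is illustrative only; it assumes that a
   header-name consists of printable US-ASCII characters other than
   the colon, and the function name is hypothetical.

```python
import re

# One or more printable US-ASCII characters other than ":", then the
# colon immediately, then a single SP.
_HEADER_START = re.compile(r"^[!-9;-~]+: ")

def header_start_is_valid(line):
    """True if there is no WSP before the colon and a SP after it."""
    return _HEADER_START.match(line) is not None
```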

6.21.2.  Content-Type

   The Content-Type: "text/plain" is the default type for any news
   article, but the recommendations and limits on line lengths set out
   in section 4.5 Ought to be observed.

   The acceptability of other subtypes of Content-Type: "text" (such as
   "text/html") is a matter of policy (see 1.1), and posters Ought Not
   to use them unless established policy or custom in the particular
   hierarchies or groups involved so allows.  Moreover, even in those
   cases, for the benefit of readers who see it only in its transmitted
   form, the material SHOULD be "pretty-printed" (for example by
   restricting its line length as above and by keeping sequences which
   control its layout or style separate from the meaningful text).

   In the same way, Content-Types requiring special processing for their
   display, such as "application", "image", "audio", "video" and
   "multipart/related" are discouraged except in groups specifically
   intended (by policy or custom) to include them. Exceptionally, those
   application types defined in [RFC 1847] and [RFC 3156] for use within
   "multipart/signed" articles, and the type "application/pgp-keys" (or
   other similar types containing digital certificates) may be used
   freely.

   Reading agents SHOULD NOT, unless explicitly configured otherwise,
   act automatically on Application types which could change the state
   of that agent (e.g. by writing or modifying files), except in the
   case of those prescribed for use in control messages (7.2.1.2 and
   7.2.4.1).

6.21.2.1.  Message/partial

   The Content-Type "message/partial" MAY be used to split a long news
   article into several smaller ones.



        NOTE: This Content-Type is not recommended for textual articles
        because the Content-Type, and in particular the charset, of the
        complete article cannot be determined by examination of the
        second and subsequent parts, and hence it is not possible to
        read them as separate articles (except when they are written in
        pure US-ASCII). Moreover, for full compliance with [RFC 2046] it
        would be necessary to use the "quoted-printable" encoding to
        ensure the material was 7bit-safe.  In any case, breaking such
        long texts into several parts is usually unnecessary, since
        modern transport agents should have no difficulty in handling
        articles of arbitrary length.

        On the other hand, "message/partial" may be useful for binaries
        of excessive length, since reading of the individual parts on
        their own is not required and they would in any case be encoded
        in a manner that was 7bit-safe.

   If this Content-Type is used, then the "id" parameter SHOULD be in
   the form of a unique message identifier (but different from that in
   the Message-ID-header of any of the parts). The second and subsequent
   parts SHOULD contain References-headers referring to all the previous
   parts, thus enabling reading agents with threading capabilities to
   present them in the correct order.  Reading agents MAY then provide a
   facility to recombine the parts into a single article (but this
   standard does not require them to do so).
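
   As an illustration only, the headers of the successive parts might
   be generated as in the following Python sketch. The function name is
   hypothetical, the Message-IDs of the parts are assumed to have been
   allocated already, and the "number" and "total" parameters are
   those defined for "message/partial" in [RFC 2046].

```python
def partial_part_headers(message_ids, partial_id):
    """Return one header dict per part of a split article.

    message_ids -- Message-IDs already allocated to the parts, in order
    partial_id  -- the "id" parameter shared by all the parts
    """
    if partial_id in message_ids:
        raise ValueError("id must differ from every part's Message-ID")
    total = len(message_ids)
    parts = []
    for number, message_id in enumerate(message_ids, start=1):
        headers = {
            "Message-ID": message_id,
            "Content-Type": 'message/partial; id="%s"; number=%d; total=%d'
                            % (partial_id, number, total),
            "Content-Transfer-Encoding": "7bit",
        }
        if number > 1:
            # Refer to all previous parts so that threading agents can
            # present the parts in the correct order.
            headers["References"] = " ".join(message_ids[:number - 1])
        parts.append(headers)
    return parts
```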

6.21.2.2.  Message/rfc822

   The Content-Type "message/rfc822" should be used for the
   encapsulation (whether as part of another news article or, more
   usually, as part of an email message) of complete news articles which
   have already been posted to Netnews and which are for the information
   of the recipient, and do not constitute a request to repost them.

   In the case where the encapsulated article has Content-Transfer-
   Encoding "8bit", it will be necessary to change that encoding if it
   is to be forwarded over some email transport that only supports
   "7bit". However, this should not be necessary for any email transport
   that supports the 8BITMIME feature [RFC 2821].  Moreover, where the
   headers of the encapsulated article contain any UTF8-xtra-chars
   (2.4.2), it may not be possible to transport them over email
   transports even where 8BITMIME is supported. In such cases, it will
   be necessary to encode those headers as provided in [RFC 2047]
   (notwithstanding that such usage is deprecated for news headers by
   this standard, and actually forbidden in the case of the Newsgroups-
   header).

   In the event that the encapsulated article has to be encoded for
   either of these reasons, it may be necessary to reverse that encoding
   if certain forms of digital signatures have been employed, or if the
   article is to be reintroduced into some Netnews system (however, in
   the latter case, the Content-Type "application/news-transmission"
   should have been used instead).


        NOTE: It is likely, though not guaranteed, that headers
        containing UTF8-xtra-chars will pass safely through email
        transports supporting 8BITMIME if the "message/rfc822" object is
        sent as an attachment (i.e.  as a part of a multipart) rather
        than as the top-level body of the email message. Moreover, it is
        anticipated that future extensions to the Email standards will
        permit headers containing UTF8-xtra-chars to be carried without
        further ado over conforming transports.

6.21.2.3.  Message/external-body

   The Content-Type "message/external-body" could be appropriate for
   texts which it would be uneconomic (in view of the likely readership)
   to distribute to the entire network.

6.21.2.4.  Multipart types

   The Content-Types "multipart/mixed", "multipart/parallel" and
   "multipart/signed" may be used freely in news articles.  However,
   except where policy or custom so allows, the Content-Type:
   "multipart/alternative" SHOULD NOT be used, on account of the extra
   bandwidth consumed and the difficulty of quoting in followups, but
   reading agents MUST accept it.

   The Content-Type: "multipart/digest" is commended for any article
   composed of multiple messages more conveniently viewed as separate
   entities, thus enabling reading agents to move rapidly between them.
   The "boundary" should be composed of 28 hyphens (US-ASCII 45) (which
   makes each boundary delimiter 30 hyphens, or 32 for the final one) so
   as to enable reading agents which currently support the digest usage
   described in [RFC 1153] to continue to operate correctly.
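
   The arithmetic of the boundary can be seen in the following Python
   sketch, which is illustrative only. The function name is
   hypothetical, and each message is assumed to be supplied already
   formatted, with CRLF line endings and no trailing CRLF. The parts
   carry no headers of their own, so that each defaults to
   "message/rfc822" as provided for digests in [RFC 2046].

```python
BOUNDARY = "-" * 28  # the value of the "boundary" parameter

def digest_body(messages):
    """Join already-formatted messages into a multipart/digest body."""
    lines = []
    for message in messages:
        lines.append("--" + BOUNDARY)     # delimiter: 30 hyphens
        lines.append("")                  # no part headers
        lines.append(message)
    lines.append("--" + BOUNDARY + "--")  # final delimiter: 32 hyphens
    return "\r\n".join(lines) + "\r\n"
```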

        NOTE: The various recommendations given above regarding the
        usage of particular Content-Types apply also to the individual
        parts of these multiparts.

6.21.3.  Content-Transfer-Encoding

   "Content-Transfer-Encoding: 7bit" is sufficient for article bodies
   (or parts of multiparts) written in pure US-ASCII (or most other
   material representable in 7 bits).  Posting agents SHOULD specify
   "Content-Transfer-Encoding: 8bit" for all other cases unless there
   are pressing reasons to do otherwise. They MAY use "8bit" encoding
   even when "7bit" encoding would have sufficed. Examples of such
   pressing reasons are the following:

   1. The content type implies that the content is (or may be) "8bit-
      unsafe"; i.e.  it may contain octets equivalent to the US-ASCII
      characters CR or LF (other than in the combination CRLF) or NUL.
      In that case one of the Content-Transfer-Encodings "base64" or
      "quoted-printable" MUST be used, and reading agents MUST be able
      to handle both of them. Encoding "binary" MUST NOT be used (except
      in cooperating subnets with alternative transport arrangements)
      because this standard does not mandate a transport mechanism that
      could support it.

        NOTE: If a future extension to the MIME standards were to
        provide a more compact encoding of binary suited to transport
        over an 8bit channel, it could be considered as an alternative
        to base64 once it had gained widespread acceptance.

   2. It is often the case that "application" Content-Types are textual
      in nature, and intelligible to humans as well as to machines, and
      where this state can be recognized by the posting agent (either
      through knowledge of the particular application type or by
      testing) the material SHOULD NOT be treated as 8bit-unsafe; this
      has the added benefit, where the posting agent uses other than
      CRLF for line endings internally, of automatically ensuring that
      line endings are processed correctly during transport.

      If, on the other hand, the posting agent recognizes that the
      material is not textual, or cannot reasonably determine it to be
      so, then the material MUST be encoded as for 8bit-unsafe (however,
      in that case, it is the responsibility of the agent generating the
      material to ensure that line endings, if any, are represented
      correctly).

        NOTE: All the application types defined by this standard, namely
        "application/news-transmission", "application/news-groupinfo"
        and "application/news-checkgroups" are textual, and indeed
        designed for human reading.

   3. Although the "text" Content-Types should normally be encoded as
      8bit (or 7bit), if the character set specified by the "charset="
      parameter can include the 3 disallowed octets, then the material
      MUST be encoded as for 8bit-unsafe.  This is most likely to arise
      in the case of 16-bit character sets such as UTF-16 ([UNICODE3.1]
      or [ISO/IEC 10646]).  In addition, where it is known that the
      material is subsequently to be gatewayed from Netnews to Email
      (8.8), the encoding "quoted-printable" MAY be used (otherwise the
      gateway might have to re-encode it itself).

   4. Some protocols REQUIRE the use of a particular Content-Transfer-
      Encoding. In particular, the authentication protocol based on
      [Open]PGP defined in [RFC 3156] mandates the use of one of the
      encodings "quoted-printable" or "base64".  Whilst posters might be
      tempted to risk the use of "8bit" or "7bit" encodings (and indeed
      the referenced standard recommends that signed messages using
      those encodings be accepted and interpreted), they should be
      warned that differences in the treatment of trailing whitespace
      between OpenPGP [RFC 2440] and earlier versions of PGP may render
      signatures written with the one unverifiable by the other; and,
      moreover, Usenet articles are very likely to include trailing
      whitespace in the form of a personal signature (4.3.2).

   5. The Content-Type message/partial [RFC 2046] is required to use
      encoding "7bit" (the encapsulated complete message may itself use
      encoding "quoted-printable" or "base64", but that information is
      only conveyed along with the first of the partial parts).

        NOTE: Although there would actually be no problem using encoding
        "8bit" in a pure Netnews (as opposed to Email) environment, this
        standard discourages (see 6.21.2.1) the use of "message/partial"
        except for binary material, which will be encoded to pass
        through "7bit" in any case.

   Injecting and relaying agents MUST NOT change the encoding of
   articles passed to them. Gateways SHOULD NOT change the encoding
   unless absolutely necessary.
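
   The basic choice made by a posting agent, covering the default
   rule and case 1 above, might be sketched in Python as follows. This
   is an illustration under stated assumptions and not a complete
   implementation: the body is assumed to be in its transmitted form
   with CRLF line endings, "base64" is chosen arbitrarily where
   "quoted-printable" would serve equally, and cases 2 to 5 (textual
   application types, 16-bit character sets, protocols mandating an
   encoding, and "message/partial") are not handled. The function
   names are hypothetical.

```python
def is_8bit_unsafe(body: bytes) -> bool:
    """True if body has NUL, or CR or LF other than as part of CRLF."""
    # After every CRLF pair is removed, any CR or LF left is a bare one.
    stripped = body.replace(b"\r\n", b"")
    return any(octet in stripped for octet in (b"\r", b"\n", b"\x00"))

def choose_encoding(body: bytes) -> str:
    """Pick a Content-Transfer-Encoding for an article body."""
    if is_8bit_unsafe(body):
        return "base64"
    if all(octet < 128 for octet in body):
        return "7bit"  # "8bit" would also be permitted here
    return "8bit"
```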

6.21.4.  Character Sets

   In principle, any character set may be specified in the "charset="
   parameter of a content type. However, only those character sets (and
   the corresponding parts of UTF-8) should be used which are
   appropriate for the customary language(s) of the hierarchy or
   newsgroup concerned (whose readers could be expected to possess
   agents capable of displaying them).

6.21.5.  Content Disposition

   Reading agents Ought to honour any Content-Disposition-header that is
   provided (in particular, they Ought to display any part of a
   multipart for which the disposition is "inline", possibly
   distinguished from adjacent parts by some suitable separator). In the
   absence of such a header, the body of an article or any part of a
   multipart with Content-Type "text" Ought to be displayed inline.
   Followup agents which quote parts of a precursor (see 4.3.2) Ought
   initially to include all parts of the precursor that were displayed
   inline, as if they were a single part.

6.21.6.  Definition of some new Content-Types

   This standard defines (or redefines) several new Content-Types, which
   require to be registered with IANA as provided for in [RFC 2048].
   For "application/news-groupinfo" see 7.2.1.2, for "application/news-
   checkgroups" see 7.2.4.1, and for "application/news-transmission" see
   the following section.

6.21.6.1.  Application/news-transmission

   The Content-Type "application/news-transmission" is intended for the
   encapsulation of complete news articles where the intention is that
   the recipient should then inject them into Netnews. This Application
   type SHOULD be used when mailing articles to moderators and to
   email-to-news gateways (see 8.2.2).

        NOTE: The benefit of such encapsulation is that it removes
        possible conflict between news and email headers and it provides
        a convenient way of "tunnelling" a news article through a
        transport medium that does not support 8bit characters.


   The MIME content type definition of "application/news-transmission"
   is:

   MIME type name:           application
   MIME subtype name:        news-transmission
   Required parameters:      none
   Optional parameters:      usage=moderate
                             usage=inject
                             usage=relay
   Encoding considerations:  A transfer-encoding (such as Quoted-
                             Printable or Base64) different from that of
                             the article transmitted MAY be supplied
                             (perhaps en route) to ensure correct
                             transmission over some 7bit transport
                             medium.
   Security considerations:  A news article may be a "control message",
                             which could have effects on the recipient
                             host's system beyond just storage of the
                             article. However, such control messages
                             also occur in normal news flow, so most
                             hosts will already be suitably defended
                             against undesired effects.
   Published specification:  [USEFOR]
   Body part:                A complete article or proto-article, ready
                             for injection into Netnews, or a batch of
                             such articles.

        NOTE: It is likely that the recipient of an "application/news-
        transmission" will be a specialized gateway (e.g. a moderator's
        submission address) able to accept articles with only one of the
        three usage parameters "moderate", "inject" and "relay", hence
        the reason why they are optional, being redundant in most
        situations. Nevertheless, they MAY be used to signify the
        originator's intention with regard to the transmission, so
        removing any possible doubt.

   When the parameter "relay" is used, or implied, the body part MAY be
   a batch of articles to be transmitted together, in which case the
   following syntax MUST be used.

      batch             = 1*( batch-header article )
      batch-header      = "#!" SP rnews SP article-size CRLF
      rnews             = %x72.6E.65.77.73 ; case sensitive "rnews"
      article-size      = 1*DIGIT

   Thus a batch is a sequence of articles, each prefixed by a header
   line that includes its size. The article-size is a decimal count of
   the octets in the article, counting each CRLF as one octet regardless
   of how it is actually represented.
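
   The counting rule can be illustrated by the following Python
   sketch, which builds a batch and takes it apart again. It is an
   illustration only: the function names are hypothetical, and both
   the articles and the batch are assumed to be held as octets with
   CRLF line endings.

```python
def make_batch(articles):
    """Build a batch from articles given as bytes with CRLF line ends."""
    out = []
    for article in articles:
        # Each CRLF occupies two stored octets but counts as one.
        size = len(article) - article.count(b"\r\n")
        out.append(b"#! rnews " + str(size).encode("ascii") + b"\r\n")
        out.append(article)
    return b"".join(out)

def split_batch(batch):
    """Inverse of make_batch; raises ValueError on a malformed batch."""
    articles, pos = [], 0
    while pos < len(batch):
        end = batch.index(b"\r\n", pos)
        tag, verb, size = batch[pos:end].split(b" ")
        if tag != b"#!" or verb != b"rnews" or not size.isdigit():
            raise ValueError("malformed batch-header")
        remaining, start = int(size), end + 2
        pos = start
        while remaining > 0:
            pos += 2 if batch[pos:pos + 2] == b"\r\n" else 1
            remaining -= 1
        if pos > len(batch):
            raise ValueError("truncated batch")
        articles.append(batch[start:pos])
    return articles
```

   Note that the batch is only ever parsed as data here; nothing in it
   is executed.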

        NOTE: Despite the similarity of this format to an executable
        UNIX script, it is EXTREMELY unwise to feed such a batch into a
        command interpreter in anticipation of it running a command
        named "rnews"; the security implications of so doing would be
        disastrous.

6.21.6.2.  Message/news obsoleted

   The Content-Type "message/news", as previously registered with IANA,
   is hereby declared obsolete. It was never widely implemented, and its
   default treatment as "application/octet-stream" by agents that did
   not recognize it was counterproductive. The Content-Type
   "message/rfc822" SHOULD be used in its place, as already described
   above.

6.22.  Obsolete Headers

   Persons writing new agents SHOULD ignore any former meanings of the
   following headers:

        Also-Control
        See-Also
        Article-Names
        Article-Updates

7.  Control Messages

   The following sections document the control messages.  "Message" is
   used herein as a synonym for "article" unless context indicates
   otherwise.

   The Newsgroups-header of each control message SHOULD include the
   newsgroup-name(s) for the group(s) affected (i.e. groups to be
   created, modified or removed, or containing articles to be canceled).
   This is to ensure that the message propagates to all sites which
   receive (or would receive) that group(s). It MAY include other
   newsgroup-names so as to improve propagation (but this practice may
   cause the control message to propagate also to places where it is
   unwanted, or even cause it not to propagate where it should, so it
   should not be used without good reason).

   The descriptions below set out REQUIREMENTS to be followed by sites
   that receive control messages and choose to honour them. However,
   nothing in these descriptions should be taken as overriding the right
   of any such site, in accordance with its local policy, to deny any
   particular control message, or to refer it to an administrator for
   approval (either as a class or on a case-by-case basis). In
   particular, sites Ought to deny messages not issued by the
   appropriate administrative agencies, and therefore SHOULD take such
   steps as are reasonably practicable to validate their authenticity
   (see, for example, section 7.1 below).

   Relaying Agents MUST propagate even control messages that they do not
   recognize.

   In the following sections, each type of control message is defined
   syntactically by defining its verb, its arguments, and possibly its
   body.

7.1.  Digital Signature of Headers

   It is most desirable that group control messages (7.2) in particular
   be authenticated by incorporating them within some digital signature
   scheme that encompasses other headers closely associated with them
   (including at least the Approved-, Message-ID- and Date-headers). At
   the time of writing, this is usually done by means of a protocol
   known as "PGPverify" ([PGPVERIFY]), and continued usage of this is
   encouraged at least as an interim measure.

   However, PGPverify is not considered suitable for standardization in
   its present form, for various technical reasons. It is therefore
   expected that an early extension to this standard will provide a
   robust and general purpose digital authentication mechanism with
   applicability to all situations requiring protection against
   malicious use of, or interference with, headers.  That extension
   would also address other Netnews security issues.

7.2.  Group Control Messages

   "Group control messages" are the sub-class of control messages that
   request some update to the configuration of the groups known to a
   serving agent, namely "newgroup", "rmgroup", "mvgroup" and
   "checkgroups", plus any others created by extensions to this
   standard.

   All of the group control messages MUST have an Approved-header
   (6.14).  Moreover, in those hierarchies where appropriate
   administrative agencies exist (see 1.1), group control messages Ought
   Not to be issued except as authorized by those agencies.

7.2.1.  The 'newgroup' Control Message

      control-message     =/ Newgroup-message
      Newgroup-message    = "newgroup" Newgroup-arguments
      Newgroup-arguments  = CFWS newsgroup-name [ CFWS newgroup-flag ]
      newgroup-flag       = "moderated"

   The "newgroup" control message requests that the specified group be
   created or changed. If the request is honoured, or if the group
   already exists on the serving agent, and if the newgroup-flag
   "moderated" is present, then the group MUST be marked as moderated,
   and vice versa. "Moderated" is the only such flag defined by this
   standard; other flags MAY be defined for use in cooperating subnets,
   but newgroup messages containing them MUST NOT be acted on outside of
   those subnets.

        NOTE: Specifically, some alternative flags such as "y" and "m",
        which are sent and recognized by some current software, are NOT
        part of this standard.  Moreover, some existing implementations
        treat any flag other than "moderated" as indicating an
        unmoderated newsgroup. Both of these usages are contrary to this
        standard and control messages with such non-standard flags
        should be ignored.
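
   The treatment of the newgroup-flag might be sketched in Python as
   follows. This is an illustration only: the function name is
   hypothetical, the content of the Control-header is assumed to have
   had any comments removed already (so that CFWS reduces to white
   space), and the newsgroup-name itself is not validated. The verb
   and the flag are compared without regard to case, since they are
   given as quoted literals in the syntax.

```python
def parse_newgroup(control):
    """Return (newsgroup_name, moderated), or None if to be ignored."""
    words = control.split()
    if len(words) not in (2, 3) or words[0].lower() != "newgroup":
        return None
    if len(words) == 3 and words[2].lower() != "moderated":
        # Non-standard flag such as "y" or "m": ignore the message
        # rather than treating the group as unmoderated.
        return None
    return words[1], len(words) == 3
```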

   The message body comprises or includes an "application/news-
   groupinfo" (7.2.1.2) part containing machine- and human-readable
   information about the group.

   It is REQUIRED that the newsgroup-name conforms to all requirements
   set out in section 5.5.  This includes the restrictions as to the
   permitted characters, and the requirement that they be invariant
   under NFKC normalization. It is essential that those who issue
   newgroup messages are aware of their responsibility to enforce this
   requirement, since some of those conditions are hard to enforce
   mechanically.

   Additionally, the newsgroup-name Ought to conform to whatever
   policies have been established by the administrative agency, if any,
   for that hierarchy. Serving agents SHOULD, insofar as they are
   conveniently able to detect them, reject all newgroup messages not
   meeting those requirements.

   The newgroup command is also used to update the newsgroups-line or
   the moderation status of a group.

7.2.1.1.  The Body of the 'newgroup' Control Message

   The body of the newgroup message contains the following subparts,
   preferably in the order shown:

   1. An "application/news-groupinfo" part (7.2.1.2) containing the name
      and newsgroups-line of the group(s). This part MUST be present.

   2. Other parts containing useful information about the background of
      the newgroup message (typically of type "text/plain").

   3. Parts containing initial articles for the newsgroup. See section
      7.2.1.3 for details.

   In the event that there is only the single (i.e. application/news-
   groupinfo) subpart present, it will suffice to include a "Content-
   Type:  application/news-groupinfo" amongst the headers of the control
   message.  Otherwise, a "Content-Type: multipart/mixed" header will be
   needed, and each separate part will then need its own Content-Type-
   header.

7.2.1.2.  Application/news-groupinfo

   The "application/news-groupinfo" body part contains brief information
   about a newsgroup, i.e. the group's name, its newsgroup-description
   and its moderation-flag.

        NOTE: The presence of the newsgroups-tag "For your newsgroups
        file:" is intended to make the whole newgroup message compatible
        with current practice as described in [Son-of-1036].

   The MIME content type definition of "application/news-groupinfo" is:

   MIME type name:           application
   MIME subtype name:        news-groupinfo
   Required parameters:      none
   Disposition:              by default, inline
   Encoding considerations:  "7bit" or "8bit" is sufficient and MUST be
                             used to maintain compatibility.
   Security considerations:  this type MUST NOT be used except as part
                             of a control message for the creation or
                             modification of a Netnews newsgroup
   Published specification:  [USEFOR]

   The content of the "application/news-groupinfo" body part is defined
   as:

      groupinfo-body      = [ newsgroups-tag CRLF ]
                               newsgroups-line CRLF
      newsgroups-tag      = %x46.6F.72 SP %x79.6F.75.72 SP
                               %x6E.65.77.73.67.72.6F.75.70.73 SP
                               %x66.69.6C.65.3A
                               ; case sensitive
                               ; "For your newsgroups file:"
      newsgroups-line     = newsgroup-name
                               [ 1*HTAB newsgroup-description ]
                               [ 1*WSP moderation-flag ]
      newsgroup-description
                          = utext *( *WSP utext )
      moderation-flag     = %x28.4D.6F.64.65.72.61.74.65.64.29
                               ; case sensitive "(Moderated)"

   The newsgroup-description MUST NOT contain any occurrence of the
   string "(Moderated)" within it.  The whole groupinfo-body is intended
   to be interpreted as a text written in the UTF-8 character set.

   The "application/news-groupinfo" is used in conjunction with the
   "newgroup" (7.2.1) and "mvgroup" (7.2.3) control messages.  The
   newsgroup-name in the newsgroups-line MUST agree with the newsgroup-
   name in the "newgroup" or "mvgroup" control message.  The Content-
   Type "application/news-groupinfo" MUST NOT be used except as a part
   of such control messages.  Although optional, the newsgroups-tag
   SHOULD be included until such time as this standard has been widely
   adopted, to ensure compatibility with present practice.

   Moderated newsgroups MUST be marked by appending the case sensitive
   text " (Moderated)" to the newsgroups-line. It is NOT recommended that the
   moderator's email address be included in the newsgroup-description as
   has sometimes been done.
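
   The grammar above can be illustrated with a short parser. What
   follows is only a sketch in Python, not part of this standard: the
   function names are hypothetical, it assumes the body has already been
   decoded from UTF-8, and it does not check the newsgroup-name against
   the rules of section 5.5.

```python
import re

NEWSGROUPS_TAG = "For your newsgroups file:"
MODERATION_FLAG = "(Moderated)"


def parse_newsgroups_line(line):
    """Split a newsgroups-line into (name, description, moderated)."""
    moderated = False
    # A trailing moderation-flag is preceded by at least one WSP.
    match = re.search(r"[ \t]+\(Moderated\)$", line)
    if match:
        moderated = True
        line = line[:match.start()]
    # One or more HTABs separate the name from the description.
    name, _, description = line.partition("\t")
    description = description.lstrip("\t")
    if not name or " " in name:
        raise ValueError("malformed newsgroups-line")
    if MODERATION_FLAG in description:
        raise ValueError("description contains the moderation flag")
    return name, description, moderated


def parse_groupinfo_body(body):
    """Parse a groupinfo-body, skipping the optional newsgroups-tag."""
    lines = body.replace("\r\n", "\n").rstrip("\n").split("\n")
    if lines[0] == NEWSGROUPS_TAG:
        lines = lines[1:]
    if len(lines) != 1:
        raise ValueError("expected exactly one newsgroups-line")
    return parse_newsgroups_line(lines[0])
```

   A caller would still need to compare the returned name with the
   newsgroup-name given in the Control-header, as required above.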

   Although, in accordance with [RFC 2822] and section 4.5 of this
   standard, a newsgroups-line could have a maximum length of 998
   octets, as a matter of policy a far lower limit, expressed in
   characters, Ought to be set. The current convention is to limit its
   length so that the newsgroup-name, the HTAB(s) (interpreted as
   8-character tabs that take one at least to column 24) and the
   newsgroup-description (excluding any moderation-flag) fit into 79
   characters.  However, this standard does not seek to enforce any such
   rule, and reading agents SHOULD therefore enable a newsgroups-line of
   any length to be displayed, e.g. by wrapping it as required.

        NOTE: The newsgroups-line is intended to provide a brief
        description of the newsgroup, written in the UTF-8 character
        set.  Since newsgroup-names are required to be expressed in
        UTF-8 when they appear in headers, and since [NNTP] requires the
        use of UTF-8 when such a description is transmitted by the LIST
        NEWSGROUPS command, it would also be convenient for servers that
        keep a "newsgroups" file to store them in that form, so as to
        avoid unnecessary conversions.
[If, at the time of publication of this standard, [NNTP] is still [RFC
977], that NOTE will need to be changed to indicate that "it is expected
that a future extension of [RFC 977] will require ...".]

7.2.1.3.  Initial Articles

   Some subparts of a "newgroup" or "mvgroup" control message MAY
   contain an initial set of articles to be posted to the affected
   newsgroup(s) as soon as it has been created or modified. These parts
   are identified by having the Content-Type "application/news-
   transmission", possibly with the parameter "usage=inject".  The body
   of each such part should be a complete proto-article, ready for
   posting. This feature is intended for the posting of charters,
   initial FAQs and the like to the newly formed group(s).

   The Newsgroups-header of the proto-article MUST include the
   newsgroup-name of the newly created or modified group. It MAY include
   other newsgroup-names. If the proto-article includes a Message-ID-
   header, the message identifier in it MUST be different from that of
   any existing article and from that of the control message as a whole.
   Alternatively such a message identifier MAY be derived by the
   injecting agent when the proto-article is posted. The proto-article
   SHOULD include the header "Distribution: local".

   The proto-article SHOULD be injected at the serving agent that
   processes the control message AFTER the newsgroup in question has
   been created or modified. It MUST NOT be injected if the newsgroup is
   not, in fact, created (for whatever reason). It MUST NOT be submitted
   to any relaying agent for transmission beyond the server(s) upon
   which the newsgroup creation has just been effected (in other words,
   it is to be treated as having a "Distribution:  local" header,
   whether such a header is actually present or not).

        NOTE: It is not precluded that the proto-article is itself a
        control message or other type of special article, to be
        activated only upon creation of the new newsgroup. However,
        except as might arise from that possibility, any
        "application/news-transmission" within some nested "multipart/*"
        structure within the proto-article is not to be activated.
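
   The conditions of this section might be checked along the following
   lines. This is a sketch only: the function, its parameters, and the
   use of a dict for the headers and a set for the locally known message
   identifiers are all assumptions made for illustration. An article
   that passes is still to be kept on the local server(s), as if it
   carried "Distribution: local".

```python
def may_inject_initial_article(headers, group, group_created,
                               control_id, known_ids):
    """Decide whether a proto-article from a newgroup body may be injected.

    headers is a dict of the proto-article's headers and known_ids a set
    of message identifiers already held locally (both hypothetical).
    """
    # Nothing is injected unless the group was actually created or modified.
    if not group_created:
        return False
    # The Newsgroups-header has to name the affected group.
    listed = [g.strip() for g in headers.get("Newsgroups", "").split(",")]
    if group not in listed:
        return False
    # A supplied message identifier has to be new and must differ from
    # that of the control message itself.
    msg_id = headers.get("Message-ID")
    if msg_id is not None and (msg_id == control_id or msg_id in known_ids):
        return False
    return True
```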


7.2.1.4.  Example

   A "newgroup" with its charter:

      From: "example.all Administrator" <admin@noc.example>
      Newsgroups: example.admin.info,example.admin.announce
      Date: 27 Feb 2002 12:50:22 +0200
      Subject: cmsg newgroup example.admin.info moderated
      Approved: admin@noc.example
      Control: newgroup example.admin.info moderated
      Message-ID: <ng-example.admin.info-20020227@noc.example>
      MIME-Version: 1.0
      Content-Type: multipart/mixed; boundary="nxtprt"
      Content-Transfer-Encoding: 8bit

      This is a MIME control message.
      --nxtprt
      Content-Type: application/news-groupinfo

      For your newsgroups file:
      example.admin.info      About the example.* groups (Moderated)

      --nxtprt
      Content-Type: application/news-transmission

      Newsgroups: example.admin.info
      From: "example.all Administrator" <admin@noc.example>
      Subject: Charter for example.admin.info
      Message-ID: <charter-example.admin.info-20020227@noc.example>
      Distribution: local
      Content-Type: text/plain; charset=us-ascii
      Content-Transfer-Encoding: 7bit

      The group example.admin.info contains regularly posted
      information on the example.* hierarchy.

      --nxtprt--

7.2.2.  The 'rmgroup' Control Message

      control-message     =/ Rmgroup-message
      Rmgroup-message     = "rmgroup" Rmgroup-arguments
      Rmgroup-arguments   = CFWS newsgroup-name

   The "rmgroup" control message requests that the specified group be
   removed from the list of valid groups. The Content-Type of the body
   is unspecified; it MAY contain anything, usually an explanatory text.

        NOTE: It is entirely proper for a serving agent to retain the
        group until all the articles in it have expired, provided that
        it ceases to accept new articles.

7.2.2.1.  Example

   Plain "rmgroup":

      From: "example.all Administrator" <admin@noc.example>
      Newsgroups: example.admin.obsolete, example.admin.announce
      Date: 4 Apr 2002 22:04 -0800 (PST)
      Subject: cmsg rmgroup example.admin.obsolete
      Message-ID: <rm-example.admin.obsolete-20020404@noc.example>
      Approved: admin@noc.example
      Control: rmgroup example.admin.obsolete

      The group example.admin.obsolete is obsolete. Please remove it
      from your system.

7.2.3.  The 'mvgroup' Control Message

      control-message   =/ Mvgroup-message
      Mvgroup-message   = "mvgroup" Mvgroup-arguments
      Mvgroup-arguments = CFWS newsgroup-name CFWS newsgroup-name
                             [ CFWS newgroup-flag ]

   The "mvgroup" control message requests that the group specified by
   the first (old-)newsgroup-name be moved to that specified by the
   second (new-)newsgroup-name. Thus it is broadly equivalent to a
   "newgroup" control message for the second group followed by a
   "rmgroup" control message for the first group.

   The second (new-)newsgroup-name MUST conform to all requirements
   prescribed for the newsgroup-name of a "newgroup" control message
   (7.2.1) and Ought, similarly, to conform to any established policies
   of the hierarchy.  The message body contains an "application/news-
   groupinfo" part (7.2.1.2) containing machine- and human-readable
   information about the new group, and possibly other subparts as for a
   "newgroup" control message. The information conveyed in the
   "application/news-groupinfo" body part, notably its newsgroups-line
   (7.2.1.2), is applied to the new group.

   When this message is received, the new group is created (if it does
   not exist already) as for a "newgroup" control message, and MUST in
   any case be made moderated if a newgroup-flag "moderated" is present,
   and vice versa. At the same time, arrangements SHOULD be made to
   remove the old group (as with a "rmgroup" control message), but only
   after a suitable overlap period to allow the network to adjust to the
   new arrangement.

   At the same time as a serving agent acts upon this message, all
   injecting agents associated with that serving agent SHOULD inhibit
   the posting of new articles to the old group (preferably with some
   indication to the poster that the new group should have been used).
   Relaying agents, however, MUST continue to propagate such articles
   during the overlap period.

        NOTE: It is to be expected that different serving agents will
        act on this message at different points of time, users of the
        old group will have to become accustomed to the new arrangement,
        and followups to already established threads will likely
        continue under the old group. Therefore, there needs to be an
        overlap period during which articles may continue to be accepted
        by relaying and serving agents in either group. This standard
        does not specify any standard period of overlap (though it would
        be expected to be expressed in days rather than in months). The
        inhibition of injection of new articles to the old group may
        seem draconian, but it is the surest way to prevent the
        changeover from dragging on indefinitely.

   Since the "mvgroup" control message is newly introduced in this
   standard and may not be widely implemented initially, it SHOULD be
   followed shortly afterwards by a corresponding "newgroup" control
   message; and again, after a reasonable overlap period, it MUST be
   followed by a "rmgroup" control message for the old group.

   In order to facilitate a smooth changeover, serving agents MAY
   arrange to service requests for access to the old group by providing
   access to the new group, which would then contain, or appear to
   contain, all articles posted to either group (including, ideally, the
   pre-changeover articles from the old one). Nevertheless, if this
   feature is implemented, the articles themselves, as supplied to
   reading agents, MUST NOT be altered in any way (and, in particular,
   their Newsgroups-headers MUST contain exactly those newsgroups
   present when they were injected). On the other hand, the Xref-header
   MAY contain entries for either group (or even both).

        NOTE: Some serving agents that use an "active" file permit an
        entry of the form "oldgroup xxx yyy =newgroup", which enables
        any articles arriving for oldgroup to be diverted to newgroup,
        thus providing a simple implementation of this feature. However,
        it is known that not all current serving agents will find
        implementation so easy (especially in the short term) which is
        why it is not mandated by this standard. Nevertheless, its
        eventual implementation in all serving agents is to be
        considered highly desirable.

        On the other hand, it is recognized that this feature would
        likely not be implementable if the new group was already in
        existence with existing articles in it. This situation should
        not normally arise except when there is already some confusion
        as to which groups are, or are not, supposed to exist in that
        hierarchy. Note that the "mvgroup" control message is not really
        intended to be used for merging two existing groups.

7.2.3.1.  Example

      From: "example.all Administrator" <admin@noc.example>
      Newsgroups: example.oldgroup,example.newgroup,example.admin.announce
      Date: 30 Apr 2002 22:04 -0500 (EST)
      Subject: cmsg mvgroup example.oldgroup example.newgroup moderated
      Message-ID: <mvgroup-example.oldgroup-20020430@noc.example>
      Approved: admin@noc.example
      Control: mvgroup example.oldgroup example.newgroup moderated
      MIME-Version: 1.0
      Content-Type: multipart/mixed; boundary=nxt

      --nxt
      Content-Type: application/news-groupinfo

      For your newsgroups file:
      example.newgroup        The new replacement group (Moderated)

      --nxt

      The moderated group example.oldgroup is replaced by
      example.newgroup. Please update your configuration, and please,
      if possible, arrange to file articles arriving for
      example.oldgroup as if they were in example.newgroup.

      --nxt--

7.2.4.  The 'checkgroups' Control Message

   The "checkgroups" control message contains a list of all the valid
   groups in a complete hierarchy.

      control-message     =/ Checkgroup-message
      Checkgroup-message  = "checkgroups" Checkgroup-arguments
      Checkgroup-arguments= [ chkscope ] [ chksernr ]
      chkscope            = 1*( CFWS ["!"] newsgroup-name )
      chksernr            = CFWS "#" 1*DIGIT

   A "checkgroups" message applies to any (sub-)hierarchy with a prefix
   listed in the chkscope parameter, provided that the rightmost
   matching newsgroup-name in the list is not immediately preceded by a
   "!".  If no chkscope parameter is given, it applies to all
   hierarchies for which group statements appear in the message.

        NOTE: Some existing software does not support the "chkscope"
        parameter.  Thus a "checkgroups" message SHOULD also contain the
        groups of other subhierarchies the sender is not responsible
        for. "New" software MUST ignore groups which do not fall within
        the chkscope parameter of the "checkgroups" message.

   The chksernr parameter is a serial number, which can be any positive
   integer (e.g. just numbered or the date in YYYYMMDD).  It SHOULD
   increase by an arbitrary value with every change to the group list
   and MUST NOT ever decrease.

        NOTE: This was added to circumvent security problems in
        situations where the Date-header cannot be authenticated.

   Example:

      Control: checkgroups de !de.alt #248

   which includes the whole of the 'de.*' hierarchy, with the exception
   of its 'de.alt.*' sub-hierarchy.
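
   The rightmost-match rule for chkscope can be sketched as follows.
   This Python fragment is illustrative only: the function names are
   hypothetical, CFWS is simplified to plain whitespace (comments are
   not handled), and the case where no chkscope is given is left to the
   caller.

```python
import re


def parse_checkgroups_args(control):
    """Return (scope, serial) from a checkgroups Control header value.

    scope is a list of (newsgroup-name, negated) pairs in header order;
    serial is an int, or None when no chksernr is given.
    """
    words = control.split()
    if not words or words[0] != "checkgroups":
        raise ValueError("not a checkgroups control message")
    scope, serial = [], None
    args = words[1:]
    for position, word in enumerate(args):
        if word.startswith("#"):
            # chksernr is only valid as the final argument.
            is_last = position == len(args) - 1
            if not is_last or not re.fullmatch(r"#[0-9]+", word):
                raise ValueError("malformed chksernr")
            serial = int(word[1:])
        elif word.startswith("!"):
            scope.append((word[1:], True))
        else:
            scope.append((word, False))
    return scope, serial


def in_scope(group, scope):
    """Apply the rightmost-match rule to one newsgroup-name."""
    verdict = False
    for name, negated in scope:
        if group == name or group.startswith(name + "."):
            verdict = not negated  # a later match overrides an earlier one
    return verdict
```

   With the Control-header of the example above, "de.comp.os.unix"
   falls within the scope whereas "de.alt.test" does not.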

   The body of the message has the Content-Type "application/news-
   checkgroups".  It asserts that the newsgroups it lists are the only
   newsgroups in the specified hierarchies.

        NOTE: The checkgroups message is intended to synchronize the
        list of newsgroups stored by a serving agent, and their
        newsgroup-descriptions, with the lists stored by other serving
        agents throughout the network. However, it might be inadvisable
        for the serving agent actually to create or delete any
        newsgroups without first obtaining the approval of its
        administrators for such proposed actions.

7.2.4.1.  Application/news-checkgroups

   The "application/news-checkgroups" body part contains a complete list
   of all the newsgroups in a hierarchy, their newsgroup-descriptions
   and their moderation status.

   The MIME content type definition of "application/news-checkgroups"
   is:

   MIME type name:           application
   MIME subtype name:        news-checkgroups
   Required parameters:      none
   Disposition:              by default, inline
   Encoding considerations:  "7bit" or "8bit" is sufficient and MUST be
                             used to maintain compatibility.
   Security considerations:  this type MUST NOT be used except as part
                             of a checkgroups control message

   The content of the "application/news-checkgroups" body part is
   defined as:

      checkgroups-body    = *( valid-group CRLF )
      valid-group         = newsgroups-line ; see 7.2.1.2

   The whole checkgroups-body is intended to be interpreted as a text
   written in the UTF-8 character set.

   The "application/news-checkgroups" content type is used in
   conjunction with the "checkgroups" control message (7.2.4).
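
   A serving agent might compare a checkgroups-body with its own list
   of groups as sketched below. The Python names used here are
   hypothetical, "local" is assumed to map each local newsgroup-name to
   its moderation status, and "in_scope" is assumed to be a callable
   implementing the chkscope test of section 7.2.4. The result is a set
   of proposals only; nothing is created or deleted.

```python
FLAG = "(Moderated)"


def listed_groups(body):
    """Map each listed newsgroup-name to its moderation status."""
    groups = {}
    for line in body.replace("\r\n", "\n").split("\n"):
        if not line:
            continue
        moderated = line.endswith((" " + FLAG, "\t" + FLAG))
        # The name ends at the first HTAB (or at the WSP before the flag).
        name = line.split("\t", 1)[0].split(" ", 1)[0]
        groups[name] = moderated
    return groups


def plan_changes(local, listed, in_scope):
    """Return (to_create, to_remove, to_remoderate) as sorted lists."""
    to_create = sorted(g for g in listed
                       if g not in local and in_scope(g))
    to_remove = sorted(g for g in local
                       if g not in listed and in_scope(g))
    to_remoderate = sorted(g for g in listed
                           if g in local and local[g] != listed[g]
                           and in_scope(g))
    return to_create, to_remove, to_remoderate
```

   Groups outside the scope are ignored in both directions, and the
   three lists can then be put to the administrators for approval.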

        NOTE: The possibility of removing a complete hierarchy by means
        of an "invalidation" line beginning with a '!' is no longer
        provided by this standard. The intent of the feature was widely
        misunderstood and it was misused more often than it was used
        correctly. The same effect, if required, can now be obtained by
        the use of an appropriate chkscope argument in conjunction with
        an empty checkgroups-body.

7.3.  Cancel

   The cancel message requests that a target article be "canceled", i.e.
   be withdrawn from circulation or access. A cancel message may be
   issued in the following circumstances.

   1. The poster of an article (or, more specifically, any entity
      mentioned in the From-header or the Sender-header, whether or not
      that entity was the actual poster) is always entitled to issue a
      cancel message for that article, and serving agents SHOULD honour
      such requests. Posting agents SHOULD facilitate the issuing of
      cancel messages by posters fulfilling these criteria.

   2. The agent which injected the article onto the network (more
      specifically, the entity identified by the path-identity in front
      of the leftmost '%' delimiter in the Path-header (5.6) or in the
      Injector-Info-header (6.19)) and, where appropriate, the moderator
      (more specifically, any entity mentioned in the Approved-header)
      is always entitled to issue a cancel message for that article, and
      serving agents SHOULD honour such requests.

   3. Other entities MAY be entitled to issue a cancel message for that
      article, in circumstances where established policy for any
      hierarchy or group in the Newsgroups-header, or established custom
      within Usenet, so allows (such policies and customs are not
      defined by this standard). Such cancel messages MUST include an
      Approved-header identifying the responsible entity. Serving agents
      MAY honour such requests, but SHOULD first take steps to verify
      their appropriateness.


      control-message     =/ Cancel-message
      Cancel-message      = "cancel" Cancel-arguments
      Cancel-arguments    = CFWS msg-id

   The argument identifies the article to be cancelled by its message
   identifier.  The body SHOULD contain an indication of why the
   cancellation was requested. The cancel message SHOULD be posted to
   the same newsgroup, with the same distribution, as the article it is
   attempting to cancel.

   A serving agent that elects to honour a cancel message SHOULD make
   the article unavailable for relaying or serving (perhaps by deleting
   it completely). If the target article is unavailable, and the
   acceptability of the cancel message cannot be established without it,
   activation of the cancel message SHOULD be delayed until the target
   article has been seen.  See also sections 8.3 and 8.4.
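
   The three circumstances listed at the start of this section can be
   expressed as a simple classification. The sketch below is not a
   definitive procedure: the function and its parameters are
   hypothetical, the issuer is assumed to have been authenticated by
   some means outside this standard, and the sets of entities are
   assumed to have been extracted from the target article beforehand.

```python
def cancel_category(issuer, posters, injectors, approvers, has_approved):
    """Classify a cancel message against the three cases of this section.

    posters:   entities in the target's From- or Sender-header
    injectors: the injecting agent of the target article
    approvers: entities in the target's Approved-header
    has_approved: whether the cancel itself carries an Approved-header
    """
    if issuer in posters:
        return 1      # poster: serving agents SHOULD honour
    if issuer in injectors or issuer in approvers:
        return 2      # injecting agent or moderator: SHOULD honour
    if has_approved:
        return 3      # other entity: MAY honour, after verification
    return None       # third-party cancel lacking an Approved-header
```

   A result of 3 indicates only that the message is well formed for a
   third-party cancel; whether hierarchy policy or Usenet custom allows
   it remains to be verified.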

        NOTE: It is expected that the security extension envisaged in
        section 7.1 will make more detailed provisions for establishing
        whether honouring a particular cancel message is in order. In
        particular, it is likely that there will be provision for the
        digital signature of 3rd party cancels.

        NOTE: The former requirement [RFC 1036] that the From and/or
        Sender-headers of the cancel message should match those of the
        original article has been removed from this standard, since it
        only encouraged cancel issuers to conceal their true identity,
        and it was not usually checked or enforced by canceling
        software.  Therefore, both the From and/or Sender-headers and
        any Approved-header should now relate to the entity responsible
        for issuing the cancel message.

7.4.  Ihave, sendme

   The "ihave" and "sendme" control messages implement a crude batched
   predecessor of the NNTP [NNTP] protocol. They are largely obsolete on
   the Internet, but still see use in conjunction with some transport
   protocols such as UUCP, especially for backup feeds that normally are
   active only when a primary feed path has failed. There is no
   requirement for relaying agents that do not support such transport
   protocols to implement them.

        NOTE: The ihave and sendme messages defined here have ABSOLUTELY
        NOTHING TO DO WITH NNTP, despite similarities of terminology.

   The two messages share the same syntax:

      control-message     =/ Ihave-message
      Ihave-message       = "ihave" Ihave-arguments
      Ihave-arguments     = relayer-name
      control-message     =/ Sendme-message
      Sendme-message      = "sendme" Sendme-arguments
      Sendme-arguments    = Ihave-arguments
      relayer-name        = path-identity  ; see 5.6.1
      ihave-body          = *( msg-id CRLF )
      sendme-body         = ihave-body

   The body of the message consists of a list of msg-ids, one per line.
   [RFC 1036] also permitted the list of msg-ids to appear in the Ihave-
   or Sendme-arguments with the syntax
      Ihave-arguments     = *( msg-id FWS ) [relayer-name]
   but this form SHOULD NOT now be used, though relaying agents MAY
   recognize and process it for backward compatibility.

   The ihave message states that the named relaying agent has received
   articles with the specified message identifiers, which may be of
   interest to the relaying agents receiving the ihave message.  The
   sendme message requests that the agent receiving it send the articles
   having the specified message identifiers to the named relaying agent.
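
   Both the current and the legacy [RFC 1036] forms can be handled by
   one routine, as in the following sketch. The function name is
   hypothetical, whitespace handling is simplified, and the msg-id
   pattern is a deliberately loose stand-in for the full syntax defined
   elsewhere in this standard.

```python
import re

# Simplified msg-id pattern; the full syntax is defined elsewhere.
MSG_ID = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def parse_ihave_sendme(control, body):
    """Return (verb, relayer_name, msg_ids) for an ihave or sendme."""
    words = control.split()
    if not words or words[0] not in ("ihave", "sendme"):
        raise ValueError("not an ihave or sendme control message")
    verb, args = words[0], words[1:]
    relayer = None
    # The relayer-name, when present, is the final argument.
    if args and not MSG_ID.fullmatch(args[-1]):
        relayer = args.pop()
    # Anything left in the arguments is the legacy RFC 1036 form.
    msg_ids = list(args)
    msg_ids += [ln for ln in body.replace("\r\n", "\n").split("\n") if ln]
    for msg_id in msg_ids:
        if not MSG_ID.fullmatch(msg_id):
            raise ValueError("malformed msg-id: " + msg_id)
    return verb, relayer, msg_ids
```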

   These control messages are normally sent essentially as point-to-
   point messages, by using newsgroup-names in the Newsgroups-header of
   the form "to."  followed by one (or possibly more) components in the
   form of a relayer-name (see section 5.5.1 which forbids "to" as the
   first component of a newsgroup-name). The control message SHOULD then
   be delivered ONLY to the relaying agent(s) identified by that
   relayer-name, and any relaying agent receiving such a message which
   includes its own relayer-name MUST NOT propagate it further. Each
   pair of relaying agents sending and receiving these messages MUST
   be immediate neighbors, exchanging news directly with each other.
   Each relaying agent advertises its new arrivals to the other using
   ihave messages, and each uses sendme messages to request the articles
   it lacks.

   To reduce overhead, ihave and sendme messages SHOULD be sent
   relatively infrequently and SHOULD contain reasonable numbers of
   message IDs. If ihave and sendme are being used to implement a backup
   feed, it may be desirable to insert a delay between reception of an
   ihave and generation of a sendme, so that a slightly slow primary
   feed will not cause large numbers of articles to be requested
   unnecessarily via sendme.

7.5.  Obsolete control messages

   The following control message verbs are declared obsolete by this
   standard:

        sendsys
        version
        whogets
        senduuname

8.  Duties of Various Agents

   The following section sets out the duties of various agents involved
   in the creation, relaying and serving of Usenet articles.

   In this section, the word "trusted", as applied to the source of some
   article, means that an agent processing that article has verified, by
   some means, the identity of that source (which may be another agent
   or a poster).

        NOTE: In many implementations, a single agent may perform
        various combinations of the injecting, relaying and serving
        functions. Its duties are then the union of the various duties
        concerned.

8.1.  General principles to be followed

   There are two important principles that news implementors (and
   administrators) need to keep in mind. The first is the well-known
   Internet Robustness Principle:

        Be liberal in what you accept, and conservative in what you
        send.

   However, in the case of news there is an even more important
   principle, derived from a much older code of practice, the
   Hippocratic Oath (we may thus call this the Hippocratic Principle):

        First, do no harm.

   It is VITAL to realize that decisions which might be merely
   suboptimal in a smaller context can become devastating mistakes when
   amplified by the actions of thousands of hosts within a few minutes.

   In the case of gateways, the primary corollary to this is:

        Cause no loops.

8.2.  Duties of an Injecting Agent

   An Injecting Agent is responsible for taking a proto-article from a
   posting agent and either forwarding it to a moderator or injecting it
   into the relaying system for access by readers.

   As such, an injecting agent is considered responsible for ensuring
   that any article it injects conforms with the rules of this standard
   and the policies of any newsgroups or hierarchies that the article is
   posted to. It is also expected to bear some responsibility towards
   the rest of the network for the behaviour of its posters (and
   provision is therefore made for it to be easily contactable by
   email).

   To this end injecting agents MAY cancel articles which they have
   previously injected (see 7.3).

8.2.1.  Proto-articles

   A proto-article is one that has been created by a posting agent and
   has not yet been injected into the news system by an injecting agent.
   It SHOULD NOT be propagated in that form to other than injecting
   agents. A proto-article has the same format as a normal article
   except that some of the following mandatory headers MAY be omitted:
   Message-ID-header, Date-header, Path-header (and even From-header if
   the particular injecting agent can derive that information from other
   sources). These headers MUST NOT contain invalid values; they MUST
   either be correct or not present at all.

   A proto-article SHOULD NOT contain the '%' path-delimiter in any
   Path-header, except in the rare cases where an article gets injected
   twice. It MAY contain path-identities with other path-delimiters in
   the pre-injection portion of the Path-header (5.6.3).
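
   A check for the headers a proto-article has to carry might look like
   the sketch below. The list of mandatory headers is an assumption
   based on section 5 (it is not restated here), the function names are
   hypothetical, and the headers are assumed to be held in a dict.

```python
# Assumed list of mandatory article headers (see section 5).
MANDATORY = ("From", "Date", "Newsgroups", "Subject", "Message-ID", "Path")
# Headers a proto-article is allowed to leave to the injecting agent.
OMITTABLE = {"Message-ID", "Date", "Path"}


def missing_proto_headers(headers, can_derive_from=False):
    """List the mandatory headers a proto-article lacks but must carry."""
    present = {name.lower() for name in headers}
    optional = OMITTABLE | ({"From"} if can_derive_from else set())
    return [h for h in MANDATORY
            if h not in optional and h.lower() not in present]


def looks_injected(headers):
    """True if the Path-header already carries a '%' path-delimiter."""
    path = next((v for k, v in headers.items() if k.lower() == "path"), "")
    return "%" in path
```

   Validating the contents of those headers that are present is a
   separate matter and is not attempted here.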

8.2.2.  Procedure to be followed by Injecting Agents

   An injecting agent receives proto-articles from posting and followup
   agents. It verifies them, adds headers where required and then either
   forwards them to a moderator or injects them by passing them to
   serving or relaying agents.

   If an injecting agent receives an otherwise valid article that has
   already been injected it SHOULD either act as if it is a relaying
   agent or else pass the article on to a relaying agent completely
   unaltered. Exceptionally, it MAY reinject the article, perhaps as a
   part of some complex gatewaying process (in which case it will add a
   second '%' path-delimiter to the Path-header).  It MUST NOT forward
   an already injected article to a moderator.

   An injecting agent processes articles as follows:

   1. It MUST remove any Injector-Info- or Complaints-To-header already
      present (though it might be useful to copy them to suitable X-
      headers). It SHOULD likewise remove any NNTP-Posting-Host or other
      undocumented tracing header.

   2. It SHOULD verify that the article is from a trusted source.
      However, it MAY allow articles in which headers contain "forged"
      email addresses, that is, addresses which are not valid for the
      known and trusted source, especially if they end in ".invalid".

   3. It MUST reject any article whose Date-header is more than 24 hours
      into the past or into the future (cf. 5.1).

   4. It MUST reject any article that does not have the correct
      mandatory headers for a proto-article (5 and 8.2.1) present, or
      which contains any header that does not have legal contents, and
      it SHOULD reject any article which contains any header deprecated
      for Netnews (4.2.1).

   5. If the article is rejected, or is otherwise incorrectly formatted
      or unacceptable due to site policy, the posting agent MUST be
      informed (such as via an NNTP 44x response code) that posting has
      failed and the article MUST NOT then be processed further.

   6. The Message-ID and Date-headers (and their content) MUST be added
      when not already present.

   7. A Path-header with a tail-entry (5.6.3) MUST be correctly added if
      not already present (except that it SHOULD NOT be added if the
      article is to be forwarded to a moderator).

   8. The path-identity of the injecting agent with a '%' path-delimiter
      (5.6.2) MUST be prepended to the Path-header; moreover, that
      path-identity MUST be an FQDN mailable address (5.6.2).

   9. An Injector-Info-header (6.19) SHOULD be added, identifying the
      trusted source of the article, and a suitable Complaints-To-header
      (6.20) MAY be added (except that these two headers SHOULD NOT be
      added if the article is to be forwarded to a moderator).

   10.The injecting agent MUST NOT alter the body of the article in any
      way. It MAY add other headers not already provided by the poster,
      but SHOULD NOT alter, delete, or reorder any existing header, with
      the specific exception of "tracing" headers such as Injector-Info
      and Complaints-To, which are to be removed as already mentioned.



        NOTE: The addition of non-mandatory headers by the injecting
        agent may alter the posting agent's preferred presentation of
        information. In particular, adding a Sender-header that exposes
        a sender's mailbox has privacy implications; where the main or
        only purpose for doing so is as tracing information, it is
        preferable to use instead one of the options provided for the
        Injector-Info header (6.19.1).

   11.If the Newsgroups line contains one or more moderated groups and
      the article does NOT contain an Approved-header, then the
      injecting agent MUST forward it to the moderator of the first
      (leftmost) moderated group listed in the Newsgroups line via
      email. The complete article SHOULD be encapsulated (headers and
      all) within the email, preferably using the Content-Type
      "application/news-transmission" (6.21.6.1).

        NOTE: This standard does not prescribe how the email address of
        the moderator is to be determined, that being a matter of policy
        to be arranged by the agency responsible for the oversight of
        each hierarchy. Nevertheless, there do exist various agents
        worldwide which provide the service of forwarding to moderators,
        and the address to use with them is obtained by replacing each
        '.' in the newsgroups-name with a '-'. For example, articles
        intended for "news.announce.important" would be emailed to
        "news-announce-important@forwardingagent.example".
[If the IDNS people come up with specific proposals before this draft is
finally submitted, we may be able to replace the following paragraph.]

        In the event that the newsgroup-name contains any UTF8-xtra-
        char, this will result in an addr-spec whose local-part is not
        consistent with the present email standards ([RFC 2822]).  It is
        anticipated that extensions to those standards currently under
        consideration will in due course provide means for encoding such
        local-parts but, in the meantime, agencies responsible for
        creating moderated newsgroups with such names will need to make
        special arrangements.

   12.Otherwise, the injecting agent forwards the article to one or more
      relaying or serving agents.
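
   The following Python fragment is a minimal sketch, not a normative
   part of this standard, of three of the checks above: the 24 hour
   Date window of step 3, the choice of the leftmost moderated group in
   step 11, and the conventional forwarding address described in the
   NOTE to step 11. The function names, the treatment of a zoneless
   date as UTC, and the representation of the moderated groups as a set
   are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW = timedelta(hours=24)  # window taken from step 3


def date_is_acceptable(date_header, now=None):
    """Return True if the Date-header lies within 24 hours of 'now'."""
    now = now or datetime.now(timezone.utc)
    try:
        stamp = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return False  # an unparseable date does not have legal contents
    if stamp.tzinfo is None:
        # Assumption: a date without a usable zone is treated as UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    return abs(now - stamp) <= MAX_SKEW


def first_moderated_group(newsgroups_header, moderated):
    """Return the leftmost moderated group in a Newsgroups line, or None."""
    for name in newsgroups_header.split(","):
        name = name.strip()
        if name in moderated:
            return name
    return None


def moderator_submission_address(newsgroup, forwarder):
    """Build the conventional forwarding address: each '.' becomes '-'."""
    return newsgroup.replace(".", "-") + "@" + forwarder
```

   A real injecting agent would obtain the set of moderated groups and
   the forwarding agent's domain from local configuration; neither is
   prescribed here.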

8.3.  Duties of a Relaying Agent

   A Relaying Agent accepts injected articles from injecting and other
   relaying agents and passes them on to relaying or serving agents
   according to mutually agreed policy. Relaying agents SHOULD accept
   articles ONLY from trusted agents.

   A relaying agent processes articles as follows:

   1. It MUST verify the leftmost entry in the Path-header and then
      prepend its own path-identity with a '/' path-delimiter, and
      possibly also the verified path-identity of its source with a '?'
      path-delimiter (5.6.2).


   2. It MUST reject any article whose Date-header is stale (see 5.1).

   3. It MUST reject any article that does not have the correct
      mandatory headers (section 5) present with legal contents.

   4. It SHOULD reject any article whose optional headers (section 6) do
      not have legal contents.

   5. It SHOULD reject any article that has already been sent to it (a
      database of message identifiers of recent messages is usually kept
      and matched against).

   6. It SHOULD reject any article that matches an already received
      cancel message (or an equivalent Supersedes-header) issued by its
      poster or by some other trusted entity.

   7. It MAY reject any article without an Approved-header posted to
      newsgroups known to be moderated (this practice is strongly
      recommended, but the information necessary to do it may not be
      available to all agents).

   8. It then passes articles which match mutually agreed criteria on to
      neighboring relaying and serving agents. However, it SHOULD NOT
      forward articles to sites whose path-identity is already in the
      Path-header.

        NOTE: It is usual for relaying and serving agents to restrict
        the Newsgroups, Distributions, age and size of articles that
        they wish to receive.
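
   As an illustration only, the duplicate suppression of step 5 and the
   forwarding restriction of step 8 might be sketched in Python as
   follows. The set of path-delimiters used to split the Path-header is
   an assumption (the authoritative list is in 5.6.2), as are the class
   and function names and the use of an in-memory set in place of a
   persistent history database.

```python
import re

# Assumption: the path-delimiters of 5.6.2 are '/', '?', '%', ',' and '!'.
PATH_DELIMITERS = re.compile(r"[/?%,!]")


def path_identities(path_header):
    """Split a Path-header value into its path-identities."""
    parts = PATH_DELIMITERS.split(path_header)
    return [p.strip() for p in parts if p.strip()]


class RelayHistory:
    """Remembers message identifiers already received (step 5)."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, message_id):
        """Record the identifier; report whether it was seen before."""
        if message_id in self._seen:
            return True
        self._seen.add(message_id)
        return False


def peers_to_feed(path_header, peers):
    """Return the peers whose path-identity is not yet in the Path (step 8)."""
    already = set(path_identities(path_header))
    return [peer for peer in peers if peer not in already]
```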

   If the article is rejected as being invalid, unwanted or unacceptable
   due to site policy, the agent that passed the article to the relaying
   agent SHOULD be informed (such as via an NNTP 43x response code) that
   relaying failed. In order to prevent a large number of error messages
   being sent to one location, relaying agents MUST NOT inform any other
   external entity that an article was not relayed UNLESS that external
   entity has explicitly requested that it be informed of such errors.

        NOTE: In order to prevent overloading, relaying agents should
        not routinely query an external entity (such as a DNS-server) in
        order to verify an article (though a local cache of the required
        information might usefully be consulted).

   Relaying agents MUST NOT alter, delete or rearrange any part of an
   article except for headers designated as variant (4.2.5.3).

8.4.  Duties of a Serving Agent

   A Serving Agent takes an article from a relaying or injecting agent
   and files it in a "news database". It also provides an interface for
   reading agents to access the news database. This database is normally
   indexed by newsgroup with articles in each newsgroup identified by an
   article-locater (usually in the form of a decimal number - see 6.16).


        NOTE: Since control messages are often of interest, but should
        not be displayed as normal articles in regular newsgroups, it is
        common for serving agents to make them available in a pseudo-
        newsgroup named "control" or in a pseudo-newsgroup in a sub-
        hierarchy under "control." (e.g. "control.cancel").

   A serving agent processes articles as follows:

   1. It MUST verify the leftmost entry in the Path-header and then
      prepend its own path-identity with a '/' path-delimiter, and
      possibly also the verified path-identity of its source with a '?'
      path-delimiter (5.6.2).

   2. It MUST reject any article whose Date-header is stale (see 5.1).

   3. It MUST reject any article that does not have the correct
      mandatory headers (section 5) present, or which contains any
      header that does not have legal contents.

   4. It SHOULD reject any article that has already been sent to it (a
      database of message identifiers of recent messages is usually kept
      and matched against).

   5. It SHOULD reject any article that matches an already received
      cancel message (or an equivalent Supersedes-header) issued by its
      poster or by some other trusted entity.

   6. It MUST reject any article without an Approved-header posted to
      any moderated newsgroup which it is configured to receive, and it
      MAY reject such articles for any newsgroup it knows to be
      moderated.

   7. It MUST remove any Xref-header (6.16) from each article.  It then
      MAY (and usually will) generate a fresh Xref-header.

   8. Finally, it stores the article in its news database.
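
   Step 7 can be pictured with the following Python sketch. It assumes
   that headers are held as a list of (name, value) pairs and that the
   content of an Xref-header is the server's name followed by
   "newsgroup:locater" entries separated by spaces; the authoritative
   syntax is in 6.16, and the function name is hypothetical.

```python
def replace_xref(headers, server_name, locations):
    """Drop any incoming Xref-header and, if possible, build a fresh one.

    'headers' is a list of (name, value) pairs; 'locations' maps each
    newsgroup-name to the article-locater assigned by this server.
    """
    # Header names are compared without regard to case.
    kept = [(n, v) for (n, v) in headers if n.lower() != "xref"]
    if locations:
        body = " ".join(f"{group}:{num}" for group, num in locations.items())
        kept.append(("Xref", f"{server_name} {body}"))
    return kept
```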

8.5.  Duties of a Posting Agent

   A Posting Agent is used to assist the poster in creating a valid
   proto-article and forwarding it to an injecting agent.

   Posting agents SHOULD ensure that proto-articles they create are
   valid news articles according to this standard and other applicable
   policies.

   Posting agents meant for use by ordinary posters SHOULD reject any
   attempt to post an article which cancels or Supersedes another
   article of which the poster is not the author.

8.6.  Duties of a Followup Agent

   A Followup Agent is a special case of a posting agent and as such is
   bound by all the posting agent's requirements plus additional ones.
   Followup agents MUST create valid followups, in particular by
   providing correctly adjusted forms of those headers described as
   inheritable (4.2.5.2), notably the Newsgroups-header (5.5), the
   Subject-header (5.4) and the References-header (6.10), and they Ought
   to observe appropriate quoting conventions in the body (see 4.3.2).

   Followup agents SHOULD initialize the Newsgroups-header from the
   precursor's Followup-To-header, if present, when preparing a
   followup; however posters MAY then change this before posting if they
   wish.

   Followup agents MUST NOT attempt to send email to any address ending
   in ".invalid".  Followup agents SHOULD NOT email copies of the
   followup to the poster of the precursor unless this has been
   explicitly requested by means of a Mail-Copies-To-header (6.8), but
   they SHOULD include a Posted-And-Mailed-header (6.9) whenever a copy
   is so emailed.
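
   A followup agent's handling of the inheritable headers might look
   roughly like the Python sketch below. It is illustrative only: the
   precursor's headers are assumed to be available as a dictionary, the
   "Re: " prefix is the customary adjustment of the Subject-header (the
   precise rules are in 5.4), special values of the Followup-To-header
   are not handled, and both function names are hypothetical.

```python
def followup_headers(precursor):
    """Derive inheritable headers for a followup from the precursor.

    'precursor' is a dict mapping header names to their contents.
    """
    # Followup-To, when present, initializes the Newsgroups-header.
    newsgroups = precursor.get("Followup-To") or precursor["Newsgroups"]
    subject = precursor["Subject"]
    if not subject.startswith("Re: "):
        subject = "Re: " + subject
    # References: the precursor's own References plus its Message-ID.
    references = " ".join(
        part
        for part in (precursor.get("References"), precursor["Message-ID"])
        if part
    )
    return {
        "Newsgroups": newsgroups,
        "Subject": subject,
        "References": references,
    }


def may_email(address):
    """Return False for an address ending in ".invalid"."""
    return not address.strip().lower().endswith(".invalid")
```

   The poster remains free to edit the resulting Newsgroups-header
   before posting, as stated above.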

8.7.  Duties of a Moderator

   A Moderator receives news articles by email, decides whether to
   accept them and, if so, either injects them into the news stream or
   forwards them to further moderators.

   A moderator processes an article, as submitted to any newsgroup that
   he moderates, as follows:

   1. He decides, on the basis of whatever moderation policy applies to
      his group, whether to accept or reject the article. He MAY do this
      manually, or else partially or wholly with the aid of appropriate
      software for whose operation he is then responsible. He MAY modify
      the article if that is in accordance with the applicable
      moderation policy (and in particular he MAY remove redundant
      headers and add Comments and other informational headers). He MAY
      inform the poster as to whether the article has been accepted or
      rejected.

      If the article is rejected, then it fails for all the newsgroups
      for which it was intended (in particular the moderator SHOULD NOT
      resubmit the article, with a reduced Newsgroups-header, to any
      remaining groups, especially if this will break any authentication
      checks present in the article). If the article is accepted, the
      moderator proceeds with the following steps.

   2. The Date-header SHOULD be retained, except that if it is stale
      (5.1) for reasons understood by the moderator (e.g. delays in the
      moderation process) he MAY substitute the current date (but must
      then take responsibility for any loops that ensue). Any variant
      headers (4.2.5.3) MUST be removed, except that a Path-header MAY
      be truncated to only its pre-injection region (5.6.3).  Any
      Injector-Info-header (6.19) or Complaints-To-header (6.20) MUST be
      removed.




   3. He adds an Approved-header (6.14) containing a mailbox identifying
      himself (or, if the article already contains an Approved-header
      from another moderator, he adds that identifying information to
      it). He MAY also include that Approved-header within some digital
      signature scheme (see 7.1).

   4. If the Newsgroups-header contains further moderated newsgroups for
      which approval has not already been given, he forwards the article
      to the moderator of the leftmost such group (which, if this
      standard has been followed correctly, will always be the group
      immediately to the right of the group(s) for which he is
      responsible). However, he MUST NOT alter the order in which the
      newsgroups are listed in the Newsgroups-header.

   5. Otherwise, he causes the article to be injected, having first
      observed all the duties of a posting agent (8.5).

        NOTE: This standard does not prescribe how the moderator or
        moderation policy for each newsgroup is established; rather it
        assumes that whatever agencies are responsible for the relevant
        network or hierarchy (1.1) will have made appropriate
        arrangements in that regard.

   It SHOULD be the case that articles will be received by the moderator
   encapsulated as an object of Content-Type application/news-
   transmission (8.2.2), or possibly encapsulated but without an
   explicit Content-Type-header. In such a case, the complete article is
   immediately available for processing by the moderator.

   However, prior to the introduction of this standard, it was more
   common for injecting agents to transform proto-articles into email
   messages, mixing the Netnews headers with the Email headers.
   Moderators SHOULD therefore be prepared to accept submission in this
   format, although they need then to be aware of the Duties of an
   Incoming Gateway (8.8.2) (and, in particular, they SHOULD adopt the
   Message-ID- and Date-headers of the email message, though they SHOULD
   NOT add any Sender-header).

8.8.  Duties of a Gateway

   A Gateway transforms an article into the native message format of
   another medium, or translates the messages of another medium into
   news articles. Encapsulation of a news article into a message of MIME
   type application/news-transmission, or the subsequent undoing of that
   encapsulation, is not gatewaying, since it involves no transformation
   of the article.

   There are two basic types of gateway, the Outgoing Gateway that
   transforms a news article into a different type of message, and the
   Incoming Gateway that transforms a message from another medium into a
   news article and injects it into a news system. These are handled
   separately below.



   The primary dictate for a gateway is:

        Above all, prevent loops.

   Transformation of an article into another medium stands a very high
   chance of discarding or interfering with the protection inherent in
   the news system against duplicate articles. The most common problem
   caused by gateways is "spews," gateway loops that cause previously
   posted articles to be reinjected repeatedly into Usenet. To prevent
   this, a gateway MUST take precautions against loops, as detailed
   below.

   If bidirectional gatewaying (both an incoming and an outgoing
   gateway) is being set up between Netnews and some other medium, the
   incoming and outgoing gateways SHOULD be coordinated to avoid
   reinjection of gated articles. Circular gatewaying (gatewaying a
   message into another medium and then back into Netnews) SHOULD NOT be
   done; encapsulation of the article SHOULD be used instead where this
   is necessary.

   A second general principle of gatewaying is that the transformations
   applied to the message SHOULD be as minimal as possible while still
   accomplishing the gatewaying. Every change made by a gateway
   potentially breaks a property of one of the media or loses
   information, and therefore only those transformations made necessary
   by the differences between the media should be applied.

   It is worth noting that safe bidirectional gatewaying between a
   mailing list and a newsgroup is far easier if the newsgroup is
   moderated. Posts to the moderated group and submissions to the
   mailing list can then go through a single point that does the
   necessary gatewaying and then sends the message out to both the
   newsgroup and the mailing list at the same time, eliminating most of
   the possibility of loops. Bidirectional gatewaying between a mailing
   list and an unmoderated newsgroup, in contrast, is difficult to do
   correctly and is far more fragile.

   Newsgroups intended to be bidirectionally gated to a mailing list
   SHOULD therefore be moderated where possible, even if the moderator
   is a simple gateway and injecting agent that correctly handles
   crossposting to other moderated groups and otherwise passes all
   traffic.

8.8.1.  Duties of an Outgoing Gateway

   From the perspective of Netnews, an outgoing gateway is just a
   special type of reading agent. The exact nature of what the outgoing
   gateway will need to do to articles depends on the medium to which
   the articles are being gated. The operation of the outgoing gateway
   is only subject to additional constraints in the presence of one or
   more corresponding incoming gateways back from that medium to
   Netnews, since this opens the possibility of loops.



   Where the format of the news article is incompatible with that of the
   target medium, it may be necessary to apply transformations. In
   particular, the presence of UTF8-xtra-chars in headers may be a
   source of such incompatibility when gatewaying into Email. On the
   other hand, some email systems (especially those supporting the
   8BITMIME extensions [RFC 2821]) may well transport such material
   correctly, and some user agents may even display it.

   It is not the purpose of this standard to set requirements to be
   followed by implementors of outgoing gateways. Those implementors are
   in the best position to know the capabilities of the systems to which
   the article is to be sent, the purposes for which it is being sent,
   and the extent to which those purposes will be vitiated if the
   content of some header is mutilated en route, or fails to display
   correctly upon arrival; this is a matter for their judgement.
   Nevertheless, it is useful to draw attention to a few transformations
   which such implementors might find useful.

    o Encapsulating the whole article as a message/rfc822 (6.21.2.2) may
      make it less likely to be mutilated during transport, especially
      where 8BITMIME is supported. Alternatively, encapsulating as an
      application/news-transmission (6.21.6.1) will guarantee correct
      transmission and is the method of choice where the intent is to
      gateway it back into Netnews later on.
    o Encoding words containing UTF8-xtra-chars according to [RFC 2047],
      where permitted by that standard (i.e. within phrases and
      unstructured headers), and preferably using the charset utf-8,
      should ensure their correct display upon arrival. Indeed, many
      user agents will display this encoding correctly in contexts not
      allowed by [RFC 2047].
    o In particular, treating a newsgroup-name as an encoded word
      according to [RFC 2047] is recommended (see also 5.5).  Even if it
      is not decoded at the far end, it is preferable to display the
      encoded form than to display nothing at all. Note, however, that
      such encoded newsgroup-names MUST be restored to their canonical
      form before reinjection into any Netnews system.
    o Parameters whose values contain UTF8-xtra-chars may use the
      encoding defined in [RFC 2231], again preferably using the charset
      utf-8.
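
   By way of illustration, the [RFC 2047] encoding suggested above can
   be produced with the Python standard library as sketched below. The
   two function names are hypothetical, and the sketch applies only to
   the content of unstructured headers; structured headers need the
   more selective treatment that [RFC 2047] requires.

```python
from email.header import Header, decode_header, make_header


def encode_unstructured(value):
    """Encode a header value as encoded-words if it is not pure ASCII."""
    try:
        value.encode("ascii")
        return value  # plain ASCII needs no encoding
    except UnicodeEncodeError:
        # The charset utf-8 is the one preferred in the text above.
        return Header(value, charset="utf-8").encode()


def decode_unstructured(value):
    """Restore the original text, e.g. before reinjection into Netnews."""
    return str(make_header(decode_header(value)))
```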

   In general, the following practices are recommended for all outgoing
   gateways, regardless of whether there is known to be a related
   incoming gateway, both as a precautionary measure and as a guideline
   to quality of implementation.

   1. The message identifier of the news article should be preserved if
      at all possible, preferably as or within the corresponding unique
      identifier of the other medium, but if not at least as a comment
      in the message. This helps greatly with preventing loops.

   2. The Date of the news article should also be preserved if possible,
      for similar reasons.



   3. The message should be tagged in some way so as to prevent its
      reinjection into Netnews. This may be impossible to do without
      knowledge of potential incoming gateways, but it is better to try
      to provide some indication even if not successful; at the least, a
      human-readable indication that the article should not be gated
      back to Netnews can help locate a human problem.

   4. Netnews control messages should not be gated to another medium
      unless they would somehow be meaningful in that medium.

8.8.2.  Duties of an Incoming Gateway

   The incoming gateway has the serious responsibility of ensuring that
   all of the requirements of this standard are met by the articles that
   it forms. In addition to its special duties as a gateway, it bears
   all of the duties and responsibilities of an injecting agent as well,
   and additionally has the same responsibility as a relaying agent to
   reject articles that it has already gatewayed.

   An incoming gateway MUST NOT gate the same message twice. It may not
   be possible to ensure this in the face of mangling or modification of
   the message, but at the very least a gateway, when given a copy of a
   message it has already gated, identical except for trace headers
   (like Received in Email or Path in Netnews), MUST NOT gate the
   message again.  An incoming gateway SHOULD take precautions against
   having this rule bypassed by modifications of the message that can be
   anticipated.
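
   One possible way for a gateway to recognize a message that is
   identical except for its trace headers is to keep a digest computed
   over everything else, as in the Python sketch below. This is an
   assumption about implementation, not a requirement: the choice of
   SHA-256, the in-memory set, the names used, and the assumption that
   the message uses LF line endings are all illustrative.

```python
import hashlib
from email import message_from_string

# Trace headers named in the text; other media may need further names.
TRACE_HEADERS = {"received", "path"}


def gating_fingerprint(raw_message):
    """Hash a message while ignoring its trace headers."""
    msg = message_from_string(raw_message)
    digest = hashlib.sha256()
    for name, value in msg.items():
        if name.lower() in TRACE_HEADERS:
            continue  # trace headers do not affect the fingerprint
        line = f"{name.lower()}:{str(value).strip()}\n"
        digest.update(line.encode("utf-8", "replace"))
    # Assumption: headers and body are separated by an empty LF line.
    body = raw_message.partition("\n\n")[2]
    digest.update(b"\n" + body.encode("utf-8", "replace"))
    return digest.hexdigest()


class GatedLog:
    """Remembers the fingerprints of messages already gated."""

    def __init__(self):
        self._fingerprints = set()

    def should_gate(self, raw_message):
        """Return True the first time a message is seen, else False."""
        fingerprint = gating_fingerprint(raw_message)
        if fingerprint in self._fingerprints:
            return False
        self._fingerprints.add(fingerprint)
        return True
```

   Such a digest does not survive rewriting of other headers or of the
   body, so it complements rather than replaces the use of message
   identifiers described below.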

   News articles prepared by gateways MUST be legal news articles. In
   particular, they MUST include all of the mandatory headers, MUST
   fully conform to the restrictions on said headers, and SHOULD exclude
   any deprecated headers (4.2.1).  This often requires that a gateway
   function not only as a relaying agent, but also partly as a posting
   agent, aiding in the synthesis of a conforming article from non-
   conforming input.

   Incoming gateways MUST NOT pass control messages (articles containing
   a Control- or Supersedes-header) without removing or renaming that
   header. Gateways MAY, however, generate their own cancel messages,
   under the general allowance for injecting agents to cancel their own
   messages (7.3).  If a gateway receives a message that it can
   determine is a valid equivalent of a cancel message in the medium it
   is gatewaying, it SHOULD discard that message without gatewaying it,
   generate a corresponding cancel message of its own, and inject that
   cancel message.

   Incoming gateways MUST NOT inject control messages other than
   cancels.  Encapsulation SHOULD be used instead of gatewaying, when
   direct posting is not possible or desirable.

        NOTE: It is not unheard of for mail-to-news gateways to be used
        to post control messages, but encapsulation should be used for
        these cases instead. Gateways by their very nature are
        particularly prone to loops. Spews of normal articles are bad
        enough; spews of control messages with special significance to
        the news system, possibly resulting in high processing load or
        even email sent for every message received, are catastrophic. It
        is far preferable to construct a system specifically for posting
        control messages that can do appropriate consistency checks and
        authentication of the originator of the message.

   If there is a message identifier that fills a role similar to that of
   the Message-ID-header in news, it SHOULD be used in the formation of
   the message identifier of the news article, perhaps with
   transformations required to meet the uniqueness requirement of
   Netnews. This transformation SHOULD be designed so that two messages
   with the same identifier generate the same Message-ID-header.

        NOTE: Message identifiers play a central role in the prevention
        of duplicates, and their correct use by gateways will do much to
        prevent loops. Netnews does, however, require that message
        identifiers be unique, and therefore message identifiers from
        other media may not be suitable for use without modification. A
        balance must be struck by the gateway between preserving
        information used to prevent loops and generating unique message
        identifiers.
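
   The requirement that two messages with the same identifier yield the
   same Message-ID-header suggests a deterministic mapping. The Python
   sketch below shows one such mapping under stated assumptions: the
   pattern used to decide whether a foreign identifier is usable as it
   stands is deliberately conservative and is NOT the full syntax of
   this standard, and the "gw-" prefix, the use of SHA-1 and the
   function name are hypothetical.

```python
import hashlib
import re

# Assumption: a conservative test, not the full msg-id grammar.
SIMPLE_MSG_ID = re.compile(
    r"^<[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9.-]+>$"
)


def news_message_id(foreign_id, gateway_domain):
    """Map a foreign identifier to a Message-ID, deterministically."""
    foreign_id = foreign_id.strip()
    if SIMPLE_MSG_ID.match(foreign_id):
        return foreign_id  # already usable, so preserve it unchanged
    # Otherwise derive an identifier that depends only on the input, so
    # that a second copy of the same message maps to the same result.
    digest = hashlib.sha1(foreign_id.encode("utf-8")).hexdigest()
    return f"<gw-{digest}@{gateway_domain}>"
```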

   Exceptionally, if there are multiple incoming gateways for a
   particular set of messages, each to a different newsgroup(s), each
   one SHOULD generate a message identifier unique to that gateway. Each
   incoming gateway nonetheless MUST ensure that it does not gate the
   same message twice.

        NOTE: Consider the example of two gateways of a given mailing
        list into the world-wide Usenet newsgroups, both of which
        preserve the email message identifier. Each newsgroup may then
        receive a portion of the messages (different sites seeing
        different portions).  In these cases, where there is no one
        "official" gateway, some other method of generating message
        identifiers has to be used to avoid collisions. It would
        obviously be preferable for there to be only one gateway which
        crossposts, but this may not be possible to coordinate.

   If no date information is available, the gateway MAY supply a Date-
   header with the gateway's current date. If only partial information
   is available (e.g.  date but not time), this SHOULD be fleshed out to
   a full Date-header by adding default values rather than discarding
   this information. Only in very exceptional circumstances should Date
   information be discarded, as it plays an important role in preventing
   reinjection of old messages.
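
   A gateway might flesh out partial date information along the lines
   of the following Python sketch. The default of 00:00:00 in UTC for a
   missing time is an assumption chosen for illustration, as is the
   function name; this standard does not prescribe particular default
   values.

```python
from datetime import date, datetime, time, timezone
from email.utils import format_datetime


def full_date_header(day=None, clock=None, now=None):
    """Build the content of a Date-header from what is available."""
    if day is None:
        # No date information at all: use the gateway's current date.
        stamp = now or datetime.now(timezone.utc)
    else:
        # Assumption: a missing time defaults to 00:00:00, zone UTC.
        stamp = datetime.combine(day, clock or time(0, 0),
                                 tzinfo=timezone.utc)
    return format_datetime(stamp)
```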

   An incoming gateway MUST add a Sender-header to the news article it
   forms containing the mailbox of the administrator of the gateway.
   Problems with the gateway may be reported to this address. The
   display-name portion of this mailbox SHOULD indicate that the entity
   responsible for injection of the message is a gateway. If the
   original message already had a Sender-header, it SHOULD be renamed so
   that its contents can be preserved.

8.8.3.  Example

   To illustrate the type of precautions that should be taken against
   loops, here is an example of the measures taken by one particular
   combination of mail-to-news and news-to-mail gateways at Stanford
   University designed to handle bidirectional gatewaying between
   mailing lists and unmoderated groups.

   1. The news-to-mail gateway preserves the message identifier of the
      news article in the generated email message. The mail-to-news
      gateway likewise preserves the email message identifier provided
      that it is syntactically valid for Netnews.  This allows the news
      system's built-in suppression of duplicates to serve as the first
      line of defense against loops.

   2. The news-to-mail gateway adds an X-Gateway-header to all messages
      it generates. The mail-to-news gateway discards any incoming
      messages containing this header. This is robust against mailing
      list managers that replace the message identifier, and against any
      number of email hops, provided that the other message headers are
      preserved.

   3. The mail-to-news gateway inserts the host name from which it
      received the email message in the pre-injection region of the Path
      (5.6.3).  The news-to-mail gateway refuses to gateway any message
      that contains the list server name in the pre-injection region of
      its Path-header. This is robust against any amount of munging of
      the message headers by the mailing list, provided that the email
      only goes through one hop.

   4. The mail-to-news gateway is designed never to generate bounces to
      the envelope sender. Instead, articles that are rejected by the
      news server (for reasons not warranting silent discarding of the
      message) result in a bounce message sent to an errors address
      known not to forward to any mailing lists, so that they can be
      handled by the news administrators.
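
   Precautions 2 and 3 can be sketched in Python as below. This is not
   the code of the gateways described, which is not reproduced in this
   document; it is an illustration under assumptions. In particular it
   assumes that the pre-injection region is whatever follows the
   rightmost '%' path-delimiter (or the whole Path if there is none),
   that the path-delimiters are '/', '?', '%', ',' and '!', and that
   mail headers arrive as (name, value) pairs. All function names are
   hypothetical.

```python
import re

GATEWAY_TAG = "x-gateway"  # header name from precaution 2, lowercased


def mail_to_news_should_gate(mail_headers):
    """Precaution 2: refuse mail that already carries the gateway tag."""
    return not any(name.lower() == GATEWAY_TAG for name, _ in mail_headers)


def pre_injection_identities(path_header):
    """Return the path-identities in the assumed pre-injection region."""
    _, sep, tail = path_header.rpartition("%")
    region = tail if sep else path_header
    return [p for p in re.split(r"[/?%,!]", region) if p]


def news_to_mail_should_gate(path_header, list_server):
    """Precaution 3: refuse articles that came from the list server."""
    return list_server not in pre_injection_identities(path_header)
```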

   These precautions have proven effective in practice at preventing
   loops for this particular application (bidirectional gatewaying
   between mailing lists and locally distributed newsgroups where both
   gateways can be designed together). General gatewaying to world-wide
   newsgroups poses additional difficulties; one must be very wary of
   strange configurations, such as a newsgroup gated to a mailing list
   which is in turn gated to a different newsgroup.

9.  Security and Related Considerations

   There is no security. Don't fool yourself. Usenet is a prime example
   of an Internet Adhocratic-Anarchy; that is, an environment in which
   trust forms the basis of all agreements.  It works.





9.1.  Leakage

   Articles which are intended to have restricted distribution are
   dependent on the goodwill of every site receiving them.  The
   "Archive: no" header (6.12) is available as a signal to automated
   archivers not to file an article, but that cannot be guaranteed.

   The Distribution-header makes provision for articles which should not
   be propagated beyond a cooperating subnet. The key security word here
   is "cooperating". When a machine is not configured properly, it may
   become uncooperative and tend to distribute all articles.

   The flooding algorithm is extremely good at finding any path by which
   articles can leave a subnet with supposedly restrictive boundaries,
   and substantial administrative effort is required to avoid this.
   Organizations wishing to control such leakage are strongly advised to
   designate a small number of official gateways to handle all news
   exchange with the outside world (however, making such gateways too
   restrictive can also encourage the setting up of unofficial paths
   which can be exceedingly hard to track down).

   The sendme control message (7.4), insofar as it is still used, can be
   used to request an article with a given message identifier, even one
   that is not supposed to be supplied to the requester.

9.2.  Attacks

9.2.1.  Denial of Service

   The proper functioning of individual newsgroups can be disrupted by
   the massive posting of "noise" articles, by the repeated posting of
   identical or near identical articles, by posting followups which are
   unrelated to their precursors or which quote their precursors in
   full with the addition of minimal extra material (especially if this
   process is iterated), and by crossposting to, or setting followups
   to, totally unrelated newsgroups.

   Many have argued that "spam", massively multiposted (and to a lesser
   extent massively crossposted) articles, usually for advertising
   purposes, also constitutes a Denial of Service attack in its own
   right.  This may be so.

   Such articles intended to deny service, or other articles of an
   inflammatory nature, may also have their From or Reply-To addresses
   set to valid but incorrect email addresses, thus causing large
   volumes of email to descend on the true owners of those addresses.

   Similar effects could be caused by any email header which could cause
   every reading agent receiving it to take some externally visible
   action.  For example, the Disposition-Notification-To-header defined
   in [RFC 2298] could cause huge numbers of acknowledgements to be
   emailed to an unsuspecting third party (for which reason [RFC 2298]
   declares that that header SHOULD NOT be used in Netnews).


   It is a violation of this standard for a poster to use as his address
   a mailbox which he is not entitled to use.  Even addresses with an
   invalid local-part but a valid domain can cause disruption to the
   administrators of such domains.  Posters who wish to remain anonymous
   or to prevent automated harvesting of their addresses, but who do not
   care to take the additional precautions of using more sophisticated
   anonymity measures, should avoid that violation by the use of
   addresses ending in the ".invalid" top-level-domain (see 5.2).
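
   As a non-normative illustration, software that needs to recognize
   such addresses (for example, to avoid sending mail to them) might use
   a check along the following lines. This is a minimal Python sketch:
   the function name is hypothetical, and taking everything after the
   last "@" as the domain is a simplifying assumption rather than a full
   parse of an addr-spec.

```python
def uses_invalid_tld(addr_spec: str) -> bool:
    """Return True if the domain of addr_spec ends in ".invalid"."""
    # Simplification (assumption): the domain is whatever follows the
    # last "@"; quoted local-parts and comments are not handled.
    _, at_sign, domain = addr_spec.rpartition("@")
    if not at_sign:
        return False
    # Domain names are case-insensitive; ignore one trailing root dot.
    labels = domain.rstrip(".").lower().split(".")
    return labels[-1] == "invalid"
```

   Only the final label is examined, so an address such as
   "jerry@invalid.example" is not treated as being in the ".invalid"
   top-level-domain.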

   A malicious poster may also prevent his article being seen at a
   particular site by preloading that site into the Path-header (5.6.1)
   and may thus prevent the true owner of a forged From or Reply-To
   address from ever seeing it.

   A malicious complainer may submit a modified copy of an article (e.g.
   with an altered Injector-Info-header) to the administrator of an
   injecting agent in an attempt to discredit the author of that article
   and even to have his posting privileges removed. Administrators
   should therefore obtain a genuine copy of the article from their own
   serving agent before taking such precipitate action.

   Administrative agencies with responsibility for establishing policies
   in particular hierarchies can and should set bounds upon the
   behaviour that is considered acceptable within those hierarchies (for
   example by promulgating charters for individual newsgroups, and other
   codes of conduct).

   Whilst this standard places an onus upon injecting agents to bear
   responsibility for the misdemeanours of their posters (which includes
   non-adherence to established policies of the relevant hierarchies as
   provided in section 8.2), and to provide assistance to the rest of
   the network by making proper use of the Injector-Info- (6.19) and
   Complaints-To- (6.20) headers, it makes no provision for enforcement,
   which may in consequence be patchy. Nevertheless, injecting sites
   which persistently fail to honour their responsibilities or to comply
   with generally accepted standards of behaviour are likely to find
   themselves blacklisted, with their articles refused propagation and
   even subject to cancellation, and other relaying sites would be well
   advised to withdraw peering arrangements from them.

9.2.2.  Compromise of System Integrity

   The posting of unauthorized (as determined by the policies of the
   relevant hierarchy) control messages can cause unwanted newsgroups to
   be created on, or wanted ones removed from, serving agents.
   Administrators of such agents SHOULD therefore take steps to verify
   the authenticity of such control messages, either by manual
   inspection (particularly of the Approved-header) or by checking any
   digital signatures that may be provided (see 7.1).  In addition, they
   SHOULD periodically compare the newsgroups carried against any
   regularly issued checkgroups messages, or against lists maintained by
   trusted servers and accessed by out-of-band protocols such as FTP or
   HTTP.
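
   The periodic comparison suggested above amounts to two set
   differences. The following Python sketch assumes that both lists
   have already been obtained as plain newsgroup names (how they are
   fetched and authenticated is outside its scope); the function name
   is hypothetical.

```python
from typing import Iterable, List, Tuple


def compare_newsgroups(
    carried: Iterable[str], authoritative: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) relative to the authoritative list."""
    carried_set = set(carried)
    authoritative_set = set(authoritative)
    # Groups the trusted list names but this agent does not carry.
    missing = sorted(authoritative_set - carried_set)
    # Groups this agent carries that the trusted list does not name.
    unexpected = sorted(carried_set - authoritative_set)
    return missing, unexpected
```

   Either result is only a prompt for an administrator to investigate;
   the sketch deliberately creates or removes nothing by itself.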


   Malicious cancel messages (7.3) can cause valid articles to be
   removed from serving agents. Administrators of such agents SHOULD
   therefore take steps to verify that they originated from the
   (apparent) poster, the injector or the moderator of the article, or
   that in other cases they came from a place that is trusted to work
   within established policies and customs. Such steps SHOULD include
   the checking of any digital signatures, or other security devices,
   that may be provided (see 7.1).  Articles containing Supersedes-
   headers (6.15) are effectively cancel messages, and SHOULD be subject
   to the same checks.  Currently, many sites choose to ignore all
   cancel messages on account of the difficulty of conducting such
   checks.

   Improperly configured serving agents can allow articles posted to
   moderated groups onto the net without first being approved by the
   moderator. Injecting agents SHOULD verify that moderated articles
   were received from one of the entities given in their Approved-
   headers and/or check any digital signatures that may be provided (see
   7.1).

   The filename parameter of the Archive-header (6.12) can be used to
   attempt to store archived articles in inappropriate locations.
   Archiving sites should be suspicious of absolute filename parameters,
   as opposed to those relative to some location of the archiver's
   choosing.
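
   One way an archiver might apply that caution is sketched below in
   Python. It is an illustration under stated assumptions, not a
   required algorithm: the function name and the archive root are
   hypothetical, and filename parameters are assumed to use "/" as
   their component separator.

```python
import posixpath
from typing import Optional


def resolve_archive_filename(archive_root: str,
                             filename: str) -> Optional[str]:
    """Map a filename parameter to a path under archive_root, or None."""
    # Refuse empty and absolute filenames outright.
    if not filename or filename.startswith("/"):
        return None
    # Collapse "." and ".." components, then refuse any result that
    # would still climb out of the archiver's chosen location.
    normalized = posixpath.normpath(filename)
    if normalized in (".", "..") or normalized.startswith("../"):
        return None
    return posixpath.join(archive_root, normalized)
```

   A relative name such as "comp.foo/charter" is accepted and placed
   under the archive root, whereas "/etc/passwd" and "../../etc/passwd"
   are both refused.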

   There may be weaknesses in particular implementations that are
   subject to malicious exploitation. In particular, it has not been
   unknown for complete shell scripts to be included within Control-
   headers. Implementors need to be aware of this.

   Reading agents should be chary of acting automatically upon MIME
   objects with an "application" Content-Type that could change the
   state of that agent, except in contexts where such applications are
   specifically expected (see 6.21).  Even the Content-Type "text/html"
   could have unexpected side effects on account of embedded objects,
   especially embedded executable code or URLs that invoke non-news
   protocols such as HTTP [RFC 2616].  It is therefore generally
   recommended that reading agents do not enable the execution of such
   code (since it is extremely unlikely to have a valid application
   within Netnews) and that they only honour URLs referring to other
   parts of the same article.

   Non-printable characters embedded in article bodies may have
   surprising effects on printers or terminals, notably by reconfiguring
   them in undesirable ways which may become apparent only after the
   reading agent has terminated.
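
   A reading agent can guard against this by making control characters
   visible before a line reaches the terminal. The Python sketch below
   is one possible approach and rests on assumptions: the function name
   is hypothetical, it is applied to one already-decoded line at a time
   (so CR and LF are escaped too), and HTAB is treated as harmless.

```python
def sanitize_for_display(line: str) -> str:
    """Replace control characters in a single line with visible escapes."""
    out = []
    for ch in line:
        code = ord(ch)
        is_c0_control = code < 0x20 or code == 0x7F
        is_c1_control = 0x80 <= code <= 0x9F
        if ch == "\t" or not (is_c0_control or is_c1_control):
            out.append(ch)
        else:
            # Show the character as a literal backslash escape.
            out.append("\\x%02x" % code)
    return "".join(out)
```

   With this filter an embedded ESC arrives at the terminal as the four
   visible characters "\x1b" rather than as the start of an escape
   sequence.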

9.3.  Liability

   There is a presumption that a poster who sends an article to Usenet
   intends it to be stored on a multitude of serving agents, and has
   therefore given permission for it to be copied to that extent.
   Nevertheless, Usenet is not exempt from the Copyright laws, and it
   should not be assumed that permission has been given for the article
   to be copied outside of Usenet, nor for its permanent archiving
   contrary to any Archive-header that may be present.

   Posters also need to be aware that they are responsible if they
   breach Copyright, Libel, Harassment or other restrictions relating to
   material that they post, and that they may possibly find themselves
   liable for such breaches in jurisdictions far from their own. Serving
   agents may also be liable in some jurisdictions, especially if the
   breach has been explicitly drawn to their attention.

   Users who are concerned about such matters should seek advice from
   competent legal authorities.

10.  IANA Considerations

   IANA is requested to register the following media types, described
   elsewhere in this standard for use with the Content-Type-header, in
   the IETF tree in accordance with the procedures set out in [RFC
   2048].

      application/news-transmission  (6.21.6.1)
      application/news-groupinfo     (7.2.1.2)
      application/news-checkgroups   (7.2.4.1)

   IANA is also requested to change the status of the following media
   type to "OBSOLETE".

      message/news                   (6.21.6.2)

        NOTE: "Application/news-transmission" is an update, with
        clarification and additional optional parameters, to an existing
        registration. "Message/rfc822" should now be used in place of
        the obsoleted "message/news".

11.  References


   [ANSI X3.4] "American National Standard for Information Systems -
        Coded Character Sets - 7-Bit American National Standard Code for
        Information Interchange (7-Bit ASCII)", ANSI X3.4, 1986.

   [ISO 3166] "Codes for the representation of names of countries and
        their subdivisions -- Part 1: Country codes", ISO 3166, 1997.

   [ISO 8859] International Standard - Information Processing - 8-bit
        Single-Byte Coded Graphic Character Sets.
        Part 1: Latin alphabet No. 1, ISO 8859-1, 1987;
        Part 2: Latin alphabet No. 2, ISO 8859-2, 1987;
        Part 3: Latin alphabet No. 3, ISO 8859-3, 1988;
        Part 4: Latin alphabet No. 4, ISO 8859-4, 1988;
        Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988;
        Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987;
        Part 7: Latin/Greek alphabet, ISO 8859-7, 1987;
        Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.


   [ISO/IEC 10646] "International Standard - Information technology -
        Universal Multiple-Octet Coded Character Set (UCS) - Part 1:
        Architecture and Basic Multilingual Plane", ISO/IEC 10646-
        1:2000, 2000.

   [NNTP] S. Barber, "Network News Transport Protocol", draft-ietf-
        nntpext-base-*.txt.

   [PGPVERIFY] David Lawrence,
        <ftp://ftp.isc.org/pub/pgpcontrol/README.html>.

   [RFC 1034] P. Mockapetris, "Domain Names - Concepts and Facilities",
        RFC 1034, November 1987.

   [RFC 1036] M. Horton and R. Adams, "Standard for Interchange of
        USENET Messages", RFC 1036, December 1987.

   [RFC 1153] F. Wancho, "Digest Message Format", RFC 1153, April 1990.

   [RFC 1847] J. Galvin, S. Murphy, S. Crocker, and N. Freed, "Security
        Multiparts for MIME: Multipart/Signed and Multipart/Encrypted",
        RFC 1847, October 1995.

   [RFC 1864] J. Myers and M. Rose, "The Content-MD5 Header Field", RFC
        1864, October 1995.

   [RFC 2045] N. Freed and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [RFC 2046] N. Freed and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part Two: Media Types", RFC 2046, November
        1996.

   [RFC 2047] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
        Part Three: Message Header Extensions for Non-ASCII Text", RFC
        2047, November 1996.

   [RFC 2048] N. Freed, J. Klensin, and J. Postel, "Multipurpose
        Internet Mail Extensions (MIME) Part Four: Registration
        Procedures", RFC 2048, November 1996.

   [RFC 2119] S. Bradner, "Key words for use in RFCs to Indicate
        Requirement Levels", RFC 2119, March 1997.

   [RFC 2142] D. Crocker, "Mailbox Names for Common Services, Roles and
        Functions", RFC 2142, May 1997.

   [RFC 2156] S. Kille, "MIXER (Mime Internet X.400 Enhanced Relay):
        Mapping between X.400 and RFC 822/MIME", RFC 2156, January 1998.

   [RFC 2183] R. Troost, S. Dorner, and K. Moore, "Communicating
        Presentation Information in Internet Messages: The Content-
        Disposition Header Field", RFC 2183, August 1997.

   [RFC 2231] N. Freed and K. Moore, "MIME Parameter Value and Encoded
        Word Extensions: Character Sets, Languages, and Continuations",
        RFC 2231, November 1997.

   [RFC 2234] D. Crocker and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

   [RFC 2279] F. Yergeau, "UTF-8, a transformation format of ISO 10646",
        RFC 2279, January 1998.

   [RFC 2298] R. Fajman, "An Extensible Message Format for Message
        Disposition Notifications", RFC 2298, March 1998.

   [RFC 2373] R. Hinden and S. Deering, "IP Version 6 Addressing
        Architecture", RFC 2373, July 1998.

   [RFC 2440] J. Callas, L. Donnerhacke, H. Finney, and R. Thayer,
        "OpenPGP Message Format", RFC 2440, November 1998.

   [RFC 2557] J. Palme, A. Hopmann, and N. Shelness, "MIME Encapsulation
        of Aggregate Documents, such as HTML (MHTML)", RFC 2557, March
        1999.

   [RFC 2606] D. Eastlake and A. Panitz, "Reserved Top Level DNS Names",
        RFC 2606, June 1999.

   [RFC 2616] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
        P. Leach, and T. Berners-Lee, "Hypertext Transfer Protocol --
        HTTP/1.1", RFC 2616, June 1999.

   [RFC 2821] John C. Klensin and Dawn P. Mann, "Simple Mail Transfer
        Protocol", RFC 2821, April 2001.

   [RFC 2822] P. Resnick, "Internet Message Format", RFC 2822, April
        2001.

   [RFC 3156] M. Elkins, D. Del Torto, R. Levien, and T. Roessler, "MIME
        Security with OpenPGP", RFC 3156, August 2001.

   [RFC 820] J. Postel and J. Vernon, "Assigned Numbers", RFC 820,
        January 1983.

   [RFC 822] D. Crocker, "Standard for the Format of ARPA Internet Text
        Messages.", STD 11, RFC 822, August 1982.

   [RFC 850] Mark R. Horton, "Standard for interchange of Usenet
        messages", RFC 850, June 1983.

   [RFC 976] Mark R. Horton, "UUCP mail interchange format standard",
        RFC 976, February 1986.

   [Son-of-1036] Henry Spencer, "News article format and transmission",
        <ftp://ftp.zoo.toronto.edu/pub/news.txt.Z>, June 1994.


   [UNICODE 3.0] The Unicode Consortium, "The Unicode Standard - Version
        3.0", Addison-Wesley, 2000.

   [UNICODE 3.1] The Unicode Consortium, "The Unicode Standard - Version
        3.1, being an amendment to [UNICODE 3.0]", Unicode Standard
        Annex #27 <http://www.unicode.org/unicode/reports/tr27>, 2001.

   [USEFOR] This Standard.


12.  Acknowledgements

   The editor wishes to thank the following members of the IETF Usenet
   Format Working Group who made significant contributions to this
   endeavour.

   Per Abrahamsen                Brian Kelly
   Peter Alfredsen               Evan Kirshenbaum
   Russ Allbery                  Brad Knowles
   Greg Andruk                   Kent Landfield
   Ralph Babel                   David C. Lawrence
   Stan Barber                   Simon Lyall
   Dave Barr                     Todd Michel McComb
   Ian Bell                      Denis McKeon
   G. James Berigan              Seymour J. Metz
   Terje Bless                   John Moreno
   Seth Breidbart                Chris Newman
   Buddha Buck                   Dirk Nimmich
   Forrest J. Cavalier III       Paul Overell
   Evan Champion                 Jacob Palme
   Maurizio Codogno              Brian Palmer
   Don Croyle                    Pete Resnick
   Matt Curtin                   Jon Ribbens
   Bill Davidsen                 Dan Ritter
   Ian Davis                     Thomas Roessler
   Jean-Marc Desperrier          Doug Royer
   Martin J. Duerst              Frederic Senis
   Claus Andre Faerber           Erland Sommarskog
   Clive D.W. Feather            Henry Spencer
   David Formosa                 John Stanley
   Marty Fouts                   Brad Templeton
   Benjamin Franz                Florian Weimer
   Andrew Gierth                 Curt Welch
   Jonathan Grobe                Curtis Whalen
   Thomas Gschwind               Leonid Yegoshin
   Kai Henningsen                Jamie Zawinski
   Lars Magne Ingebrigtsen

13.  Contact Address

Editor

        Charles H. Lindsey
        5 Clerewood Avenue
        Heald Green
        Cheadle
        Cheshire SK8 3JU
        United Kingdom
        Phone: +44 161 436 6131
        Email: chl@clw.cs.man.ac.uk

[

Working group chair

        David Barr
        Digital Island
        Email: barr@visi.com
]

   Comments on this draft should preferably be sent to the mailing list
   of the Usenet Format Working Group at

        usenet-format@landfield.com.

   This draft expires six months after the date of publication (see Page
   1) (i.e. in Nov 2002).

Appendix A.1 - A-News Article Format

   The obsolete "A News" article format consisted of exactly five lines
   of header information, followed by the body. For example:

      Aeagle.642
      news.misc
      cbosgd!mhuxj!mhuxt!eagle!jerry
      Fri Nov 19 16:14:55 1982
      Usenet Etiquette - Please Read
      body
      body
      body

   The first line consisted of an "A" followed by an article ID
   (analogous to a message ID and used for similar purposes).  The
   second line was the list of newsgroups. The third line was the path.
   The fourth was the date, in the format above (all fields fixed
   width), resembling an Internet date but not quite the same. The fifth
   was the subject.

   This format is documented for archeological purposes only.  Articles
   MUST NOT be generated in this format.
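
   For software that has to read old archives, the layout described
   above can be unpacked as in the following Python sketch. It only
   reads the format and never generates it. The function and key names
   are hypothetical, and the newsgroups line is returned unsplit
   because its separator is not described here.

```python
from typing import Dict, List, Union

EXAMPLE = "\n".join([
    "Aeagle.642",
    "news.misc",
    "cbosgd!mhuxj!mhuxt!eagle!jerry",
    "Fri Nov 19 16:14:55 1982",
    "Usenet Etiquette - Please Read",
    "body",
    "body",
    "body",
])


def parse_a_news(raw: str) -> Dict[str, Union[str, List[str]]]:
    """Split an "A News" article into its five header lines and body."""
    lines = raw.splitlines()
    if len(lines) < 5 or not lines[0].startswith("A"):
        raise ValueError("not an A News article")
    return {
        "article_id": lines[0][1:],   # text after the leading "A"
        "newsgroups": lines[1],       # kept as one unsplit string
        "path": lines[2],
        "date": lines[3],             # fixed-width date, left unparsed
        "subject": lines[4],
        "body": lines[5:],
    }
```

   Applied to the example article above, the sketch yields the article
   ID "eagle.642" and a body of three lines.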




Appendix A.2 - Early B-News Article Format

   The obsolete pseudo-Internet article format, used briefly during the
   transition between the A News format and the modern format, followed
   the general outline of a MAIL message but with some non-standard
   headers. For example:

      From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
      Newsgroups: news.misc
      Title: Usenet Etiquette -- Please Read
      Article-I.D.: eagle.642
      Posted: Fri Nov 19 16:14:55 1982
      Received: Fri Nov 19 16:59:30 1982
      Expires: Mon Jan 1 00:00:00 1990

      body
      body
      body

   The From-header contained the information now found in the Path-
   header, plus possibly the full name now typically found in the From-
   header. The Title-header contained what is now the Subject-content.
   The Posted-header contained what is now the Date-content. The
   Article-I.D.-header contained an article ID, analogous to a message
   ID and used for similar purposes. The Newsgroups- and Expires-headers
   were approximately as now. The Received-header contained the date
   when the latest relayer to process the article first saw it. All
   dates were in the above format, with all fields fixed width,
   resembling an Internet date but not quite the same.

   This format is documented for archeological purposes only.  Articles
   MUST NOT be generated in this format.
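
   The correspondences described above can be tabulated for software
   that reads such articles from old archives. The Python sketch below
   is illustrative only: all names in it are hypothetical, header
   folding is not handled, and the table records the nearest modern
   counterpart of each header rather than a safe renaming (an article
   ID, for instance, is only analogous to a message ID).

```python
from typing import Dict, Tuple

# Early B News header -> nearest modern counterpart. "From" is omitted
# because its content is now split between Path and From.
EARLY_TO_MODERN = {
    "Title": "Subject",
    "Posted": "Date",
    "Article-I.D.": "Message-ID",   # analogous, not interchangeable
}


def parse_early_b_news(raw: str) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) for an early B News article."""
    head, _, body = raw.partition("\n\n")
    headers: Dict[str, str] = {}
    for line in head.split("\n"):
        # Split at the first colon only; dates contain further colons.
        name, _, value = line.partition(":")
        headers[name] = value.strip()
    return headers, body


def modern_counterpart(name: str) -> str:
    """Name of the modern header closest to an early B News header."""
    return EARLY_TO_MODERN.get(name, name)
```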

Appendix A.3 - Obsolete Headers

   Early versions of news software following the modern format sometimes
   generated headers like the following:

      Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
      Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
      Date-Received: Friday, 19-Nov-82 16:59:30 EST

   Relay-Version contained version information about the relayer that
   last processed the article. Posting-Version contained version
   information about the posting agent that posted the article. Date-
   Received contained the date when the last relayer to process the
   article first saw it (in a slightly nonstandard format).

   In addition, this present standard obsoletes certain headers defined
   in [Son-of-1036] (see 6.22):





      Also-Control: cancel <9urrt98y53@site.example>
      See-Also: <i4g587y@site1.example> <kgb2231+ee@site2.example>
      Article-Names: comp.foo:charter
      Article-Updates: <i4g587y@site1.example>

   Also-Control indicated a control message that was also intended to be
   filed as a normal article. See-Also listed related articles, but
   without the specific relationship with followups that pertains to the
   References-header.  Article-Names indicated some special significance
   of that article in relation to the indicated newsgroup. Article-
   Updates indicated that an earlier article was updated, without at the
   same time being superseded.

   These headers are documented for archeological purposes only.
   Articles containing these headers MUST NOT be generated.

Appendix A.4 - Obsolete Control Messages

   This present standard obsoletes certain control messages defined in
   [RFC 1036] (see 7.5), all of which had the effect of requesting a
   description of a relaying or serving agent's software, or its peering
   arrangements with neighbouring sites, to be emailed to the article's
   reply address. Whilst of some utility when Usenet was much smaller
   than it is now, they had become no more than a tool for the malicious
   sending of mailbombs. Moreover, many organizations now consider
   information about their internal connectivity to be confidential.

      version
      sendsys
      whogets
      senduuname

   "Version" requested details of the transport software in use at a
   site.  "Sendsys" requested the full list of newsgroups taken, and the
   peering arrangements. "Whogets" was similar, but restricted to a
   named newsgroup.  "Senduuname" resembled "sendsys" but was restricted
   to the list of peers connected by UUCP.

   Historically, a checkgroups body consisting of one or two lines, the
   first of the form "-n newsgroup", caused checkgroups to apply to
   only that single newsgroup.

   Historically, an article posted to a newsgroup whose name had exactly
   three components of which the third was "ctl" signified that the
   article was to be taken as a control message.  The Subject-header
   specified the actions, in the same way the Control-header does now.

   These forms are documented for archeological purposes only; they MUST
   NO LONGER be used.






Appendix B - Collected Syntax

Appendix B.1 - Characters, Atoms and Folding

   In the following syntactic rules, numbers in the left hand margin
   indicate rules taken from other documents, specifically:
     2 from  [RFC 2822] with the exception of those elements described
       therein as "obsolete";
     3 from  [RFC 2373];
     4 from  [RFC 2234];
     5 from  [RFC 2045].

   Where the number is followed by an asterisk ('*'), it indicates that
   the rule in question has been modified for the purposes of this
   standard.

4  ALPHA                = %x41-5A /        ; A-Z
                          %x61-7A          ; a-z
2  CFWS                 = *([FWS] comment) (([FWS] comment) / FWS )
4  CR                   = %x0D             ; carriage return
4  CRLF                 = CR LF
4  DIGIT                = %x30-39          ; 0-9
4  DQUOTE               = %d34             ; quote mark
2  FWS                  = ([*WSP CRLF] 1*WSP); Folding whitespace
4  HEXDIG               = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
4  HTAB                 = %x09             ; horizontal tab
4  LF                   = %x0A             ; line feed
2  NO-WS-CTL            = %d1-8 /          ; US-ASCII control characters
                          %d11 /           ; which do not include the
                          %d12 /           ; carriage return, line feed,
                          %d14-31 /        ; and whitespace characters
                          %d127
4  SP                   = %x20             ; space
4  WSP                  = SP / HTAB        ; Whitespace characters
   UTF8-xtra-2-head     = %xC2-DF
   UTF8-xtra-3-head     = %xE0 %xA0-BF / %xE1-EC %x80-BF /
                          %xED %x80-9F / %xEE-EF %x80-BF
   UTF8-xtra-4-head     = %xF0 %x90-BF / %xF1-F7 %x80-BF
   UTF8-xtra-5-head     = %xF8 %x88-BF / %xF9-FB %x80-BF
   UTF8-xtra-6-head     = %xFC %x84-BF / %xFD    %x80-BF
   UTF8-xtra-char       = UTF8-xtra-2-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-3-head 1( UTF8-xtra-tail ) /
                          UTF8-xtra-4-head 2( UTF8-xtra-tail ) /
                          UTF8-xtra-5-head 3( UTF8-xtra-tail ) /
                          UTF8-xtra-6-head 4( UTF8-xtra-tail )
   UTF8-xtra-tail       = %x80-BF
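
   The UTF8-xtra-char rules above translate directly into a pattern
   over octets. The following Python sketch is offered as an aid to
   reading the grammar, not as part of it; the constant and function
   names are hypothetical.

```python
import re

TAIL = rb"[\x80-\xBF]"                        # UTF8-xtra-tail

HEAD_2 = rb"[\xC2-\xDF]"
HEAD_3 = (rb"(?:\xE0[\xA0-\xBF]|[\xE1-\xEC][\x80-\xBF]"
          rb"|\xED[\x80-\x9F]|[\xEE-\xEF][\x80-\xBF])")
HEAD_4 = rb"(?:\xF0[\x90-\xBF]|[\xF1-\xF7][\x80-\xBF])"
HEAD_5 = rb"(?:\xF8[\x88-\xBF]|[\xF9-\xFB][\x80-\xBF])"
HEAD_6 = rb"(?:\xFC[\x84-\xBF]|\xFD[\x80-\xBF])"

# One alternative per line of the UTF8-xtra-char rule.
UTF8_XTRA_CHAR = re.compile(
    b"|".join([
        HEAD_2 + TAIL,
        HEAD_3 + TAIL,
        HEAD_4 + TAIL + b"{2}",
        HEAD_5 + TAIL + b"{3}",
        HEAD_6 + TAIL + b"{4}",
    ])
)


def is_utf8_xtra_char(octets: bytes) -> bool:
    """True if octets form exactly one UTF8-xtra-char."""
    return UTF8_XTRA_CHAR.fullmatch(octets) is not None
```

   The head rules exclude overlong encodings (for example %xC0 %x80)
   and the surrogate range (%xED %xA0-BF). Because the grammar still
   admits 5- and 6-octet forms, it accepts some sequences that Python's
   own "utf-8" codec rejects.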









2  atext                = ALPHA / DIGIT /
                          "!" / "#" /      ; Any character except
                          "$" / "%" /      ; controls, SP, and specials.
                          "&" / "'" /      ; Used for atoms
                          "*" / "+" /
                          "-" / "/" /
                          "=" / "?" /
                          "^" / "_" /
                          "`" / "{" /
                          "|" / "}" /
                          "~"
2  atom                 = [CFWS] 1*atext [CFWS]
2  ccontent             = ctext / quoted-pair / comment
2  comment              = "(" *([FWS] ccontent) [FWS] ")"
2* ctext                = NO-WS-CTL /      ; all of <text> except
                          %d33-39 /        ; SP, HTAB, "(", ")"
                          %d42-91 /        ; and "\"
                          %d93-126 /
                          UTF8-xtra-char
2  dcontent             = dtext / quoted-pair
2  dot-atom             = [CFWS] dot-atom-text [CFWS]
2  dot-atom-text        = 1*atext *( "." 1*atext )
2  dtext                = NO-WS-CTL /      ; Non white space controls
                          %d33-90 /        ; The rest of the US-ASCII
                          %d94-126         ; characters not including
                                           ; "[", "]", or "\"
2  phrase               = 1*word
2  qcontent             = qtext / quoted-pair
2* qtext                = NO-WS-CTL /      ; all of <text> except
                          %d33 /           ; SP, HTAB, "\" and DQUOTE
                          %d35-91 /
                          %d93-126 /
                          UTF8-xtra-char
2  quoted-pair          = "\" text
2  quoted-string        = [CFWS] DQUOTE
                             *( [FWS] qcontent ) [FWS]
                             DQUOTE [CFWS]
2  specials             = "(" / ")" /      ; Special characters used in
                          "<" / ">" /      ;  other parts of the syntax
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\" /
                          "," / "." /
                          DQUOTE
   strict-qcontent      = strict-qtext / strict-quoted-pair
   strict-quoted-pair   = "\" strict-text
   strict-quoted-string
                        = [CFWS] DQUOTE
                             *( [FWS] strict-qcontent ) [FWS]
                             DQUOTE [CFWS]
   strict-qtext         = NO-WS-CTL /      ; qtext restricted to
                          %d33 /           ; US-ASCII
                          %d35-91 /
                          %d93-126

   strict-text          = %d1-9 /          ; text restricted to
                          %d11-12 /        ; US-ASCII
                          %d14-127
2* text                 = %d1-9 /          ; all UTF-8 characters except
                          %d11-12 /        ; US-ASCII NUL, CR and LF
                          %d14-127 /
                          UTF8-xtra-char
5  tspecials            = "(" / ")" / "<" / ">" / "@" /
                          "," / ";" / ":" / "\" / DQUOTE /
                          "/" / "[" / "]" / "?" / "="
2* unstructured         = 1*( [FWS] utext ) [FWS]
2* utext                = NO-WS-CTL /      ; Non white space controls
                          %d33-126 /       ; The rest of US-ASCII
                          UTF8-xtra-char
2  word                 = atom / quoted-string

Appendix B.2 - Basic Forms

2  addr-spec            = local-part "@" domain
2  address              = mailbox / group
2  address-list         = address *( "," address )
2  angle-addr           = [CFWS] "<" addr-spec ">" [CFWS]
   article              = 1*( header CRLF ) separator body
5* attribute            = [CFWS] token [CFWS]
   body                 = *( *998text CRLF )
2  display-name         = phrase
2  date                 = day month year
2  date-time            = [ day-of-week "," ] date FWS time [CFWS]
2  day                  = [FWS] 1*2DIGIT
2  day-name             = "Mon" / "Tue" / "Wed" / "Thu" /
                          "Fri" / "Sat" / "Sun"
2  day-of-week          = [FWS] day-name
2  domain               = dot-atom / domain-literal
2  domain-literal       = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
2  group                = display-name ":" [ mailbox-list / CFWS ] ";"
                             [CFWS]
   header-name          = 1*name-character *( "-" 1*name-character )
2  hour                 = 2DIGIT
2* local-part           = dot-atom / strict-quoted-string
2  mailbox              = name-addr / addr-spec
2  mailbox-list         = mailbox *( "," mailbox )
2  minute               = 2DIGIT
2  month                = FWS month-name FWS
2  month-name           = "Jan" / "Feb" / "Mar" / "Apr" /
                          "May" / "Jun" / "Jul" / "Aug" /
                          "Sep" / "Oct" / "Nov" / "Dec"
2  name-addr            = [display-name] angle-addr
   name-character       = ALPHA / DIGIT
   other-header         = header-name ":" 1*SP other-content
   other-content
                        = <the content of a header defined by some
                             other standard>
   other-parameter
                        = <a parameter not defined by this standard>

5  parameter            = attribute "=" value
2  second               = 2DIGIT
   separator            = CRLF
2  time                 = time-of-day FWS zone
2  time-of-day          = hour ":" minute [ ":" second ]
5  token                = 1*<any (US-ASCII) CHAR except SP, CTLs,
                             or tspecials>
5  value                = [CFWS] token [CFWS] / quoted-string
5* x-token              = "x-" token
2  year                 = 4*DIGIT
2* zone                 = (( "+" / "-" ) 4DIGIT) / "UT" / "GMT"
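
   As an illustration of how the date-time rules above fit together,
   the following Python sketch recognises a simplified form of
   date-time. It is not part of this standard. The name
   parse_date_time is hypothetical, and the sketch assumes that FWS is
   a run of spaces or tabs without folding and that the trailing
   [CFWS] contains no comments.

```python
import re

# Assumptions for this sketch (not part of the grammar above): FWS is a
# run of spaces or tabs with no line folding, and the trailing [CFWS]
# holds white space only, never a comment.
DAY_NAME = r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)"
MONTH_NAME = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
ZONE = r"(?:[+-][0-9]{4}|UT|GMT)"

DATE_TIME = re.compile(
    "".join([
        r"(?:[ \t]*(?P<day_name>", DAY_NAME, r"),)?",  # [ day-of-week "," ]
        r"[ \t]*(?P<day>[0-9]{1,2})",                  # day
        r"[ \t]+(?P<month>", MONTH_NAME, r")[ \t]+",   # month
        r"(?P<year>[0-9]{4,})",                        # year = 4*DIGIT
        r"[ \t]+",                                     # FWS before time
        r"(?P<hour>[0-9]{2}):(?P<minute>[0-9]{2})",    # hour ":" minute
        r"(?::(?P<second>[0-9]{2}))?",                 # [ ":" second ]
        r"[ \t]+(?P<zone>", ZONE, r")",                # FWS zone
        r"[ \t]*",                                     # simplified [CFWS]
    ]),
    re.IGNORECASE,  # ABNF quoted strings match case-insensitively
)


def parse_date_time(text):
    """Return the matched fields as a dict, or None if text does not match."""
    match = DATE_TIME.fullmatch(text)
    return match.groupdict() if match else None
```

   Under these assumptions "Wed, 15 May 2002 10:30:00 +0100" is
   accepted, whereas "15 May 2002 10:30 EST" is rejected because zone
   admits only a numeric offset, "UT" or "GMT".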

Appendix B.3 - Headers

Appendix B.3.1 - Header outlines

   header               = other-header /
                          Date-header /
                          From-header /
                          Message-ID-header /
                          Subject-header /
                          Newsgroups-header /
                          Path-header /
                          Reply-To-header /
                          Sender-header /
                          Organization-header /
                          Keywords-header /
                          Summary-header /
                          Distribution-header /
                          Followup-To-header /
                          Mail-Copies-To-header /
                          Posted-And-Mailed-header /
                          References-header /
                          Expires-header /
                          Archive-header /
                          Control-header /
                          Approved-header /
                          Supersedes-header /
                          Xref-header /
                          Lines-header /
                          User-Agent-header /
                          Injector-Info-header /
                          Complaints-To-header

   Approved-content     = From-content
   Approved-header      = "Approved" ":" SP Approved-content
                             *( ";" other-parameter )
   Archive-content      = [CFWS] ( "no" / "yes" ) [CFWS]
   Archive-header       = "Archive" ":" SP Archive-content
                             *( ";" ( Archive-parameter /
                                      other-parameter ) )
   Archive-parameter    = <a parameter with attribute "filename"
                           and any value>
   Complaints-To-content= address-list
   Complaints-To-header = "Complaints-To" ":" SP Complaints-To-content
   Control-content      = [CFWS] control-message [CFWS]
   Control-header       = "Control" ":" SP Control-content
                             *( ";" other-parameter )
   Date-content         = date-time
   Date-header          = "Date" ":" SP Date-content
                             *( ";" other-parameter )
   Distribution-content = distribution *( dist-delim distribution )
   Distribution-header  = "Distribution" ":" SP Distribution-content
                             *( ";" other-parameter )
   Expires-content      = date-time
   Expires-header       = "Expires" ":" SP Expires-content
                             *( ";" other-parameter )
   Followup-To-content  = Newsgroups-content / [FWS] "poster" [FWS]
   Followup-To-header   = "Followup-To" ":" SP Followup-To-content
                             *( ";" other-parameter )
   From-content         = mailbox-list
   From-header          = "From" ":" SP From-content
   Injector-Info-content= [CFWS] path-identity [CFWS]
   Injector-Info-header = "Injector-Info" ":" SP Injector-Info-content
                             *( ";" ( Injector-Info-parameter /
                                      other-parameter ) )
   Injector-Info-parameter
                        = posting-host-parameter /
                          posting-account-parameter /
                          posting-sender-parameter /
                          posting-logging-parameter /
                          posting-date-parameter
   Keywords-content     = phrase *( "," phrase )
   Keywords-header      = "Keywords" ":" SP Keywords-content
                             *( ";" other-parameter )
   Lines-content        = [CFWS] 1*DIGIT [CFWS]
   Lines-header         = "Lines" ":" SP Lines-content
                             *( ";" other-parameter )
   Mail-Copies-To-content
                        = copy-addr / [CFWS] ( "nobody" /
                                      "poster" ) [CFWS]
   Mail-Copies-To-header= "Mail-Copies-To" ":" SP Mail-Copies-To-content
   Message-ID-content   = msg-id
   Message-ID-header    = "Message-ID" ":" SP Message-ID-content
                             *( ";" other-parameter )
   Newsgroups-content   = [FWS] newsgroup-name
                             *( [FWS] ng-delim [FWS] newsgroup-name )
                             [FWS]
   Newsgroups-header    = "Newsgroups" ":" SP Newsgroups-content
                             *( ";" other-parameter )
   Organization-content = unstructured
   Organization-header  = "Organization" ":" SP Organization-content
   Path-content         = [FWS] *( path-identity [FWS]
                                      path-delimiter [FWS]
                                 ) tail-entry [FWS]
   Path-header          = "Path" ":" SP Path-content
                             *( ";" other-parameter )
   Posted-And-Mailed-content
                        = [CFWS] ( "yes" / "no" ) [CFWS]
   Posted-And-Mailed-header
                        = "Posted-And-Mailed" ":" SP
                             Posted-And-Mailed-content
                             *( ";" other-parameter )
   References-content   = msg-id *( CFWS msg-id )
   References-header    = "References" ":" SP References-content
                             *( ";" other-parameter )
   Reply-To-content     = address-list
   Reply-To-header      = "Reply-To" ":" SP Reply-To-content
   Sender-content       = mailbox
   Sender-header        = "Sender" ":" SP Sender-content
                             *( ";" other-parameter )
   Subject-content      = [ [FWS] back-reference ] pure-subject
   Subject-header       = "Subject" ":" SP Subject-content
   Summary-content      = unstructured
   Summary-header       = "Summary" ":" SP Summary-content
   Supersedes-content   = msg-id
   Supersedes-header    = "Supersedes" ":" SP Supersedes-content
                             *( ";" other-parameter )
   User-Agent-content   = product-token *( CFWS product-token )
   User-Agent-header    = "User-Agent" ":" SP User-Agent-content
                             *( ";" other-parameter )
   Xref-content         = [CFWS] server-name 1*( CFWS location ) [CFWS]
   Xref-header          = "Xref" ":" SP Xref-content
                             *( ";" other-parameter )
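
   Every header outline above has the shape of a header-name, a colon,
   an SP and then the header content. The following Python sketch
   splits one unfolded header line along those lines. It is not part
   of this standard. The name split_header is hypothetical, and the
   sketch assumes the line has already been unfolded and stripped of
   its CRLF. It removes only the first SP after the colon, so any
   further white space stays in the content.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
# name-character = ALPHA / DIGIT
HEADER_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def split_header(line):
    """Split an unfolded header line into (header-name, content)."""
    name, colon, rest = line.partition(":")
    if not colon or not HEADER_NAME.fullmatch(name):
        raise ValueError("malformed header-name")
    if not rest.startswith(" "):
        raise ValueError("the colon must be followed by SP")
    # Drop exactly one SP; the rest is left for a content parser.
    return name, rest[1:]
```

   For example, "Subject: Re: hello" yields the name "Subject" and the
   content "Re: hello", since only the first colon ends the
   header-name.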

Appendix B.3.2 - Control-message outlines

   control-message      = <empty> /
                          Newgroup-message /
                          Rmgroup-message /
                          Mvgroup-message /
                          Checkgroup-message /
                          Cancel-message /
                          Ihave-message /
                          Sendme-message

   Cancel-arguments     = CFWS msg-id
   Cancel-message       = "cancel" Cancel-arguments
   Checkgroup-arguments = [ chkscope ] [ chksernr ]
   Checkgroup-message   = "checkgroups" Checkgroup-arguments
   Ihave-arguments      = relayer-name
   Ihave-message        = "ihave" Ihave-arguments
   Mvgroup-arguments    = CFWS newsgroup-name CFWS newsgroup-name
                             [ CFWS newgroup-flag ]
   Mvgroup-message      = "mvgroup" Mvgroup-arguments
   Newgroup-arguments   = CFWS newsgroup-name [ CFWS newgroup-flag ]
   Newgroup-message     = "newgroup" Newgroup-arguments
   Rmgroup-arguments    = CFWS newsgroup-name
   Rmgroup-message      = "rmgroup" Rmgroup-arguments
   Sendme-arguments     = Ihave-arguments
   Sendme-message       = "sendme" Sendme-arguments
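
   The argument counts implied by the rules above can be checked
   mechanically. The following Python sketch does so for the cancel,
   rmgroup, newgroup and mvgroup messages only. It is not part of this
   standard. The names ARITY and parse_control are hypothetical, CFWS
   is approximated by plain white space without comments, and the
   arguments themselves (msg-id, newsgroup-name) are not validated.

```python
# Minimum and maximum argument counts taken from the rules above.
# Only messages whose arguments begin with CFWS are covered; ihave,
# sendme and checkgroups are deliberately left out of this sketch.
ARITY = {
    "cancel": (1, 1),    # CFWS msg-id
    "rmgroup": (1, 1),   # CFWS newsgroup-name
    "newgroup": (1, 2),  # CFWS newsgroup-name [ CFWS newgroup-flag ]
    "mvgroup": (2, 3),   # two newsgroup-names [ CFWS newgroup-flag ]
}


def parse_control(content):
    """Return (verb, arguments) for a Control-content string."""
    words = content.split()  # assumption: CFWS is plain white space
    if not words:
        return None, []      # the <empty> alternative
    verb, args = words[0].lower(), words[1:]  # ABNF literals ignore case
    if verb not in ARITY:
        raise ValueError("verb not covered by this sketch: " + verb)
    low, high = ARITY[verb]
    if not low <= len(args) <= high:
        raise ValueError("wrong number of arguments for " + verb)
    # When the optional last argument is present it must be the flag.
    if high > low and len(args) == high and args[-1].lower() != "moderated":
        raise ValueError("optional flag must be 'moderated'")
    return verb, args
```

   With this sketch "newgroup example.test moderated" is accepted and
   "newgroup example.test bogus" is rejected, because newgroup-flag
   admits only "moderated".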

Appendix B.3.3 - Other header rules

   article-locator      = 1*( %x21-27 / %x29-3A / %x3C-7E )
                                  ; US-ASCII printable characters
                                  ; except '(' and ';'
   article-size         = 1*DIGIT
   back-reference       = %x52.65.3A.20
                                  ; which is a case-sensitive "Re: "
   batch                = 1*( batch-header article )
   batch-header         = "#!" SP rnews SP article-size CRLF
   checkgroups-body     = *( valid-group CRLF )
   chkscope             = 1*( CFWS ["!"] newsgroup-name )
   chksernr             = CFWS "#" 1*DIGIT
   combiner-ASCII       = DIGIT / ALPHA / "+" / "-" / "_"
   combiner-base        = combiner-ASCII / combiner-extended
   combiner-extended    = <any character with a Unicode code value of
                           0080 or greater and a combining class of 0,
                           but excluding any character in Unicode
                           categories Cc, Cf, Cs, Zs, Zl, and Zp>
   combiner-mark        = <any character with a Unicode code value of
                           0080 or greater and a combining class other
                           than 0>
   component            = 1*component-glyph
   component-glyph      = combiner-base *combiner-mark
   copy-addr            = address-list
   dist-delim           = ","
   distribution         = [FWS] distribution-name [FWS]
   distribution-name    = ALPHA 1*distribution-rest
   distribution-rest    = ALPHA / "+" / "-" / "_"
   groupinfo-body       = [ newsgroups-tag CRLF ]
                             newsgroups-line CRLF
3  hex4                 = 1*4HEXDIG
3  hexpart              = hexseq / hexseq "::" [ hexseq ] /
                          "::" [ hexseq ]
3  hexseq               = hex4 *( ":" hex4 )
   host-value           = dot-atom /
                          [ dot-atom ":" ]
                            ( IPv4address / IPv6address )
                            ; see [RFC 2373]
2  id-left              = dot-atom-text / no-fold-quote
2  id-right             = dot-atom-text / no-fold-literal
   ihave-body           = *( msg-id CRLF )
3  IPv4address          = 1*3DIGIT "." 1*3DIGIT "."
                             1*3DIGIT "." 1*3DIGIT
3  IPv6address          = hexpart [ ":" IPv4address ]
   location             = newsgroup-name ":" article-locator
   moderation-flag      = %x28.4D.6F.64.65.72.61.74.65.64.29
                             ; case sensitive "(Moderated)"
2  msg-id               = [CFWS] "<" id-left "@" id-right ">" [CFWS]
   newgroup-flag        = "moderated"
   newsgroup-description
                        = utext *( *WSP utext )
   newsgroup-name       = component *( "." component )
   newsgroups-line      = newsgroup-name
                             [ 1*HTAB newsgroup-description ]
                             [ 1*WSP moderation-flag ]
   newsgroups-tag       = %x46.6F.72 SP %x79.6F.75.72 SP
                             %x6E.65.77.73.67.72.6F.75.70.73 SP
                             %x66.69.6C.65.3A
                             ; case sensitive
                             ; "For your newsgroups file:"
   ng-delim             = ","
2* no-fold-literal      = "[" *( dtext / "\[" / "\]" / "\\" ) "]"
2* no-fold-quote        = DQUOTE
                             *( strict-qtext / "\\" / "\" DQUOTE )
                             qspecial
                             *( strict-qtext / "\\" / "\" DQUOTE )
                          DQUOTE
   path-delimiter       = "/" / "?" / "%" / "," / "!"
   path-identity        = ( ALPHA / DIGIT )
                             *( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   posting-account-parameter
                        = <a parameter with attribute "posting-account"
                           and any value>
   posting-date-parameter
                        = <a parameter with attribute "posting-date"
                           and value some date-time>
   posting-host-parameter
                        = <a parameter with attribute "posting-host"
                           and value some host-value>
   posting-logging-parameter
                        = <a parameter with attribute "logging-data"
                           and any value>
   posting-sender-parameter
                        = <a parameter with attribute "sender"
                           and value some sender-value>
   product-token        = value [ "/" product-version ]
   product-version      = value
   pure-subject         = unstructured
   qspecial             = "(" / ")" /        ; same as specials except
                          "<" / ">" /        ; "\" and DQUOTE quoted
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\\" /
                          "," / "." /
                          "\" DQUOTE
   relayer-name         = path-identity
   rnews                = %x72.6E.65.77.73 ; case sensitive "rnews"
   sender-value         = mailbox / "verified"
   sendme-body          = ihave-body
   server-name          = path-identity
   tail-entry           = 1*( ALPHA / DIGIT / "-" / "." / ":" / "_" )
   valid-group          = newsgroups-line
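
   As an illustration of the path-identity, path-delimiter and
   tail-entry rules above, the following Python sketch splits a
   Path-content into its entries. It is not part of this standard.
   The name parse_path is hypothetical, and the sketch assumes that
   FWS is a run of spaces or tabs without folding.

```python
import re

# Assumption for this sketch: FWS is a run of spaces or tabs, unfolded.
WS = r"[ \t]*"
PATH_IDENTITY = r"[A-Za-z0-9][A-Za-z0-9\-.:_]*"
TAIL_ENTRY = r"[A-Za-z0-9\-.:_]+"
PATH_DELIMITER = r"[/?%,!]"

# Path-content = [FWS] *( path-identity [FWS] path-delimiter [FWS] )
#                tail-entry [FWS]
PATH_CONTENT = re.compile(
    rf"{WS}(?:{PATH_IDENTITY}{WS}{PATH_DELIMITER}{WS})*{TAIL_ENTRY}{WS}"
)


def parse_path(content):
    """Return ([(path-identity, path-delimiter), ...], tail-entry)."""
    if not PATH_CONTENT.fullmatch(content):
        raise ValueError("not a valid Path-content")
    # The capturing group keeps each delimiter in the result list, so
    # identities and delimiters alternate and the tail-entry is last.
    parts = re.split(rf"{WS}({PATH_DELIMITER}){WS}", content.strip(" \t"))
    entries = list(zip(parts[0:-1:2], parts[1::2]))
    return entries, parts[-1]
```

   Keeping each delimiter next to the path-identity it follows lets a
   caller distinguish the different path-delimiter characters rather
   than treating them all alike.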

Appendix C - Notices

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (C) The Internet Society (2002). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.