Network Working Group                                          C. Newman
Internet Draft: Application Protocol Design Principles          Innosoft
Document: draft-newman-protocol-design-01.txt                  July 1997
                                                   Expires in six months


                 Application Protocol Design Principles


Status of this memo

     This document is an Internet-Draft.  Internet-Drafts are working
     documents of the Internet Engineering Task Force (IETF), its areas,
     and its working groups.  Note that other groups may also distribute
     working documents as Internet-Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-Drafts
     as reference material or to cite them other than as "work in
     progress."

     To view the entire list of current Internet-Drafts, please check
     the "1id-abstracts.txt" listing contained in the Internet-Drafts
     Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
     (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
     Coast), or ftp.isi.edu (US West Coast).

Abstract

     There are a number of design principles which come into play over
     and over again when designing application protocols.  Many of these
     are entrenched in IETF lore and spread by word of mouth.  Most have
     been learned the hard way many times.

     This is an attempt to codify some of these principles so they can
     be referenced rather than spread by word of mouth.  The author has
     not invented any of these ideas and while the exercise of finding
     the originator of the ideas would be interesting, it is not deemed
     necessary for this project.

     Many of these principles have a much wider scope than application
     protocol design.  However, the author's primary experience is with
     application protocols and examples provided usually involve
     application protocols or elements.

     [Disclaimer: this is a preliminary draft.  Some of the case studies
     and exceptions need tuning.  Suggestions welcome.]



Newman                                                          [Page i]


Internet Draft   Application Protocol Design Principles        July 1997





                           Table of Contents



Status of this memo
Abstract
1.   K.I.S.S.
2.   Make the Common Case Simple & Uncommon Case Possible
3.   0, 1, N Principle
4.   Be Liberal/Conservative
5.   Avoid Silly States
6.   Text Not Numbers
7.   Avoid Alternative Representations
8.   Announce Features, Not Version
9.   Avoid Unnecessary Layers
10.  Fully Qualify Data
11.  Extensibility
12.  Conclusions Based on Design Principles
13.  Security Considerations
14.  References
15.  Author's Address


1.   K.I.S.S.

     The "Keep It Simple, Stupid" principle or "KISS" principle is well
     known.  The basic idea is not to add complexity if there is any way
     to avoid it.  Sometimes this also involves a decision of where the
     complexity should live (e.g. client implementation, server
     implementation, protocol itself, external layers).  This is a very
     difficult principle to follow in practice.

     Consequences of Violation: design errors, implementation bugs, poor
     deployment, poor maintainability, interoperability problems, poor
     usability, less peer review, protocol has to be "profiled" to
     interoperate.

     Case Study: X.400 vs. SMTP/MIME.  X.400 is very complex and is
     losing ground steadily in the marketplace.  SMTP/MIME is much
     simpler and is gaining ground in the marketplace.

     Case Study: The OSI Virtual Terminal (VT) is much more complex than
     telnet and has had little success in the marketplace by comparison.
     Note that telnet is itself unnecessarily complex.

2.   Make the Common Case Simple & Uncommon Case Possible

     This is largely a corollary of the KISS principle, and is sometimes
     phrased as "design for the common case."  The idea is to make the
     common case very simple without disallowing the useful uncommon
     cases.  This requires identifying the common case, which is a good
     idea in itself.

     Consequences of Violation: Same as KISS.  If a useful uncommon case
     is not possible, then a potentially complex protocol extension
     becomes necessary, which results in more complexity than if the
     uncommon case had been considered from the start.

     Case Study (common case too complex): ASN.1 makes the common case
     far too complex.  While it does provide for unlimited
     extensibility, in practice implementations can't deal with many
     legal structures due to the complexity.

     Case Study (uncommon case not possible): Internet Mail originally
     didn't allow non-text data.  MIME is more complex than it would
     have been if designed in from the beginning.

3.   0, 1, N Principle

     The "0, 1, N principle" is not an obvious principle but is true
     surprisingly often.  In general, any protocol element or object
     should come in quantities of 0, 1, or N where N is an arbitrary
     number.  If a limit is picked, it is likely to be too small.  This
     is especially true of hierarchy, and often true of names.
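
     As a minimal sketch of this principle, a parser can treat a
     hierarchical name as a sequence of N components rather than
     hard-coding a fixed depth.  The function name, the "." delimiter,
     and the sample names are assumptions made for illustration and are
     not taken from any specification.

```python
def parse_hierarchical_name(name, delimiter="."):
    """Split a hierarchical name into an arbitrary number of components.

    Returning a list (0, 1, or N components) avoids baking a fixed
    hierarchy depth into the data structure.
    """
    if name == "":
        return []  # zero components
    return name.split(delimiter)


# The same code handles one level, two levels, or many levels.
for sample in ("", "mail", "mail.archive", "mail.archive.1997.july"):
    print(len(parse_hierarchical_name(sample)), "component(s):", sample)
```

     Because the result is a list, adding a further level of hierarchy
     later needs no new delimiter and no change to the data structure.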

     Consequences of Violation: System has to be extended to allow
     larger values.  This causes a transition with severe
     interoperability problems or a semantic overload of existing
     structure which adds complexity and confusion.

     Case Study: The 640K memory limit of the IBM PC

     Case Study: 32-bit IPv4 address space

     Case Study: MIME media types.  Two levels of hierarchy were defined
     in the initial MIME specification.  This proved inadequate, so a
     new hierarchy delimiter had to be introduced to allow more naming
     hierarchy.

     Case Study: SMTP error codes have 3 levels of hierarchy with 10
     settings each.  This has proved to be insufficient and inflexible,
     requiring the addition of ESMTP ENHANCEDSTATUSCODES [ESMTP-STATUS].

     Exception: A quantity of two is permitted for clearly binary
     situations.

     Exception: Has to be balanced with the KISS principle.  For
     example, current practice limits most numbers in protocols to
     32-bit values so they can be represented as native integers on
     32-bit machines.

4.   Be Liberal/Conservative

     The "Be Liberal in What You Accept, and Conservative in What You
     Generate" principle is well known in the IETF, but controversial in
     some cases.  The intention is to maximize interoperability.  The
     basic idea is that on generation one should follow the standard
     strictly as that will work with all other compliant software.  On
     acceptance if one tolerates minor protocol or format violations, it
     helps work around known bugs in other software.  This principle
     would work well if everyone followed it.  However, when systems
     which follow this rule are mixed with others which don't, the
     exceptions below need to be considered.
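
     A minimal sketch of the two halves of this principle, using line
     endings as the example.  The function names are hypothetical.  The
     reader tolerates several line-ending conventions, while the writer
     only ever emits CRLF.

```python
def read_lines(data):
    """Liberal acceptance: tolerate CRLF, bare LF, or bare CR endings."""
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    lines = normalized.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after a final line ending
    return lines


def write_lines(lines):
    """Conservative generation: always emit CRLF-terminated lines."""
    return b"".join(line + b"\r\n" for line in lines)


# Sloppy input is accepted, but what is sent onward is strict.
sloppy = b"HELO one\nHELO two\r\nHELO three\r"
print(write_lines(read_lines(sloppy)))
```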

     Consequences of Violation: decreased interoperability, customers
     blaming the violator of this principle for bugs in other vendors'
     software, reduced extensibility, loss of functionality.

     Case Study (conservative accept): The X.400 OIW documents specify
     that implementations reject messages which have inconsistent
     global-domain-identifiers in their message-ids and trace
     information fields.  The result of this recommendation to be
     "conservative in what you accept" is that lots of mail is rejected
     unnecessarily, and other implementations now have to force those
     fields to be equal, which may damage the functionality they are
     supposed to provide.

     Case Study (liberal generate): Vendor extensions to HTML have
     caused serious interoperability problems in the field.

     Exception: Don't accept ambiguous interactive input.  If the user
     ends up seeing the data in an indecipherable context, severe
     consequences result.  It's often better to reject the data so the
     problem can be fixed at the source.

     Exception: Interactive servers.  If a server rejects illegal
     behavior, client vendors notice it before deploying, so it gets
     fixed before deployment and overall interoperability is increased.

     Exception Case Study: When netnews was initially deployed, a number
     of clients generated date headers in a variety of illegal formats.
     Fairly early in the deployment, a major implementation was modified
     to discard news messages which had missing or improperly formatted
     date headers.  Very soon after this was deployed, all date headers
     in news were interoperable.

5.   Avoid Silly States

     Whenever possible, design the system so no silly states are
     possible.  A silly state is a combination of options or values
     which contradict each other or are nonsensical.  A common occurrence
     is use of function bits to indicate non-binary values (e.g. the
     Marked and Unmarked mailbox flags in IMAP4).
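
     The flag example can be sketched as follows.  Two independent bits
     allow four combinations, one of which (both set) is a silly state,
     whereas a single three-valued field cannot express it.  The class
     and function names are hypothetical and are not part of IMAP4.

```python
from enum import Enum


class Interest(Enum):
    """One three-valued field: no contradictory combination exists."""
    UNKNOWN = "unknown"
    MARKED = "marked"
    UNMARKED = "unmarked"


def from_flag_bits(marked, unmarked):
    """Convert two independent flag bits into the single field.

    The extra branch exists only because the two-bit form can express
    a contradiction that the receiver must decide how to handle.
    """
    if marked and unmarked:
        raise ValueError("silly state: both Marked and Unmarked are set")
    if marked:
        return Interest.MARKED
    if unmarked:
        return Interest.UNMARKED
    return Interest.UNKNOWN


print(from_flag_bits(True, False))
```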

     Consequences of Violation: Increased complexity to deal with the
     possibility of the silly states occurring.

     Case Study: POP3 [POP3] includes both an octet count in the LIST
     command and an end of text mark in the RETR command.  This results
     in the possibility of the actual octet count differing from the
     LIST octet count and has caused bugs in practice.
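
     A rough sketch of how the two length indications can disagree.  It
     assumes the caller has already split the response into lines with
     the CRLF removed; the function names and the sample message are
     made up for illustration.  The receiver ends up with two sources of
     truth and needs code to handle a mismatch.

```python
def read_dot_terminated(lines):
    """Collect lines up to the lone "." terminator.

    A leading "." added by the sender's byte-stuffing is removed, and
    each line is rejoined with CRLF.
    """
    body = []
    for line in lines:
        if line == b".":
            break  # end of text mark
        if line.startswith(b"."):
            line = line[1:]  # undo byte-stuffing
        body.append(line)
    return b"".join(part + b"\r\n" for part in body)


def size_matches(announced_octets, lines):
    """Report whether the announced count agrees with the data received."""
    return len(read_dot_terminated(lines)) == announced_octets


message = [b"Subject: hi", b"", b"..leading dot", b"."]
received = read_dot_terminated(message)
print(size_matches(len(received), message))      # counts agree
print(size_matches(len(received) + 1, message))  # counts disagree
```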

     Case Study: The Internet mail format [IMAIL] permits the same
     header to appear multiple times even when this doesn't make sense.
     This has resulted in the generation of such broken messages which
     are inconsistently interpreted.  The DRUMS (detailed revision and
     update of messaging standards) working group is addressing this
     problem.

6.   Text Not Numbers

     Whenever possible, text should be used instead of numbers.  Numbers
     almost always have to be looked up in order for humans to interpret
     them.  Text can be read and debugged by a mere mortal.  One common
     counter argument is that numbers are more compact, but if size is a
     serious concern, a general purpose compression layer is usually a
     better solution.  Another counter argument is that the mapping
     tables and parsers to convert to internal numbers add complexity.
     In practice the complexity of debugging a non-text protocol is
     usually greater than the complexity of the parser and tables.

     Consequences of Violation: Protocol is difficult to debug, protocol
     is difficult to understand, examples are hard to provide.  The
     previous three consequences make this equivalent to a KISS violation.
     Results in poor user interface.  Endian problems.
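
     The endian problem can be illustrated with a small sketch.  The
     value 258 is an arbitrary choice for illustration.  The same two
     octets decode to different numbers depending on the byte order the
     reader assumes, while the decimal text form has only one reading.

```python
import struct

value = 258  # 0x0102, chosen so that the two octets differ

big = struct.pack(">H", value)     # 16-bit big endian
little = struct.pack("<H", value)  # 16-bit little endian

# A reader that assumes the wrong byte order gets a different number.
misread = struct.unpack("<H", big)[0]

# The text form is the same octets on every machine.
text = str(value).encode("ascii")

print(big, little, misread, text)
```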

     Case Study: Problems in binary protocols such as X.400 are very
     hard to diagnose.  The protocol trace has to be recorded and run
     through an interpreter to debug.  SMTP can be debugged by observing
     the original protocol trace.

     Case Study: Whenever numeric error codes are used unqualified by
     text, humans are invariably presented with these error codes,
     resulting in a poor user interface and debugging difficulties.
     There is also the problem of poor correlation of numeric codes and
     actual errors -- in the X.400 case, this has resulted in the EMA
     publishing correlation tables between implementations, error
     numbers, and what the errors actually mean.
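
     As a sketch of qualifying a numeric code with text, the following
     parses an SMTP-style reply line and keeps the text together with
     the number when reporting to a human.  The function names and the
     sample reply text are hypothetical.

```python
def parse_reply_line(line):
    """Split an SMTP-style reply line into (code, is_final, text).

    The layout assumed here is three digits, then "-" for a
    continuation line or a space for the final line, then text.
    """
    code = int(line[:3])
    is_final = line[3:4] != "-"
    text = line[4:]
    return code, is_final, text


def describe(line):
    """Build a message for a human that keeps the text with the number."""
    code, _, text = parse_reply_line(line)
    if text:
        return "%d (%s)" % (code, text)
    return "%d (no text supplied)" % code


print(describe("550 No such user here"))
```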

     Case Study: The telnet ENVIRON option had to be replaced with
     NEW-ENVIRON due to endian problems.

     Exceptions: Compression or encryption layers (which make things
     unreadable anyway).  Low-level protocols with high performance
     requirements.  Encapsulated non-text objects.

7.   Avoid Alternative Representations

     Having several ways to represent the same thing results in
     interoperability problems.  In general, implementors will only test
     the representation format they use.  The less often used
     representations will fail to work.  In a worst case scenario, two
     or more representations are widely used, but systems which use one
     often can't talk to systems which use another.

     Consequences of Violation: Serious interoperability problems, more
     bugs, conversion support necessary to interoperate.

     Case Study: The TIFF image format permits both a "big endian"
     version and a "little endian" version.  Some implementations can
     only read one or the other.  Many TIFF applications now have a
     "Macintosh format TIFF" vs "IBM format TIFF" option when saving
     TIFF files.
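
     A minimal sketch of the extra work the two TIFF byte orders impose
     on a reader.  A TIFF file begins with "II" (little endian) or "MM"
     (big endian), followed by the 16-bit number 42 in that byte order.
     The function name is hypothetical, and only the first four octets
     are examined.

```python
import struct


def tiff_byte_order(header):
    """Return the byte order announced by the first four header octets."""
    mark = header[:2]
    if mark == b"II":
        fmt, name = "<H", "little-endian"
    elif mark == b"MM":
        fmt, name = ">H", "big-endian"
    else:
        raise ValueError("not a TIFF byte-order mark")
    (magic,) = struct.unpack(fmt, header[2:4])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return name


# Every later multi-octet field must be decoded with the chosen order,
# so a reader that implements only one branch rejects half the files.
print(tiff_byte_order(b"II*\x00"))
print(tiff_byte_order(b"MM\x00*"))
```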

     Case Study: ASN.1 provides many ways of representing the same
     thing.  This has caused numerous interoperability problems as not
     all systems support all representations of a given field.  Profiles
     of ASN.1 are usually necessary to interoperate at all.

     Case Study: Internet addresses [IMAIL] allow several different ways
     to quote the same address.  The useless ones, like
     "foo"."bar"@do.main, rarely work.

     Exceptions: An alternative representation may be necessary for a
     more expressive case.  For example, quoted strings and literals in
     IMAP.  Rare alternative representations should be avoided.

8.   Announce Features, Not Version

     While version numbers are fine to inform the user of what
     implementation or conformance level they are at, they are usually a
     bad idea in protocols.  A system where the server announces
     available features and the client activates the features it wants
     results in a far better protocol.  If a protocol needs to be
     redesigned from scratch, use of a different port number for the new
     version will allow a parallel transition period -- otherwise when a
     major version number is increased on the server, the old clients
     cease to interoperate with it.
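
     Feature announcement can be sketched with an ESMTP-style EHLO
     response.  The response text below is a made-up sample,
     X-FUTURE-THING is a hypothetical keyword, and the function names
     are assumptions.  The client activates only the features that both
     sides know, and silently skips the rest.

```python
def announced_features(ehlo_response):
    """Collect the keywords a server announces after its greeting line."""
    features = set()
    lines = ehlo_response.splitlines()
    for line in lines[1:]:  # the first line carries the greeting
        if len(line) < 4 or not line.startswith("250"):
            continue
        keyword = line[4:].split(" ", 1)[0].upper()
        if keyword:
            features.add(keyword)
    return features


def features_to_use(announced, supported_by_client):
    """Activate only what both sides understand."""
    return announced & supported_by_client


response = (
    "250-mail.example.com\r\n"
    "250-8BITMIME\r\n"
    "250-SIZE 10240000\r\n"
    "250 X-FUTURE-THING\r\n"
)
client_supports = {"SIZE", "PIPELINING"}
print(features_to_use(announced_features(response), client_supports))
```

     An old client keeps working when the server gains a new keyword,
     and a new client keeps working against an old server, with no
     version comparison on either side.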

     Consequences of Violation: Useless version number fields, painful
     version transition, complexity due to need to support older
     versions, meaning of version number sometimes ambiguous.

     Case Study: MIME [MIME-IMB] has the MIME-Version header.  Since
     MIME also has feature announcement via headers, the version number
     is useless and will never change.

     Case Study: X.400:1988 fails to interoperate with X.400:1993 due to
     certain body part types.  [XXX: need to confirm]

9.   Avoid Unnecessary Layers

     Whenever two layered services can be combined into a single service
     without a significant increase in complexity, it should be done.
     Unnecessary layers result in implementor confusion and more
     complexity.

     Consequences of Violation: same as KISS violations

     Case Study: RFC 822 has a multi-layer parsing model which includes
     unfolding lines, lexing, removal of linear-white-space, and
     parsing.  This has resulted in endless confusion and serious
     interoperability problems.  The DRUMS WG is folding these into a
     single formal syntax and the result looks promising.
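
     To give a feel for just one of these layers, here is a rough
     sketch of header unfolding, assuming the input has already been
     split into lines.  The function name is hypothetical.  Lexing,
     white-space removal, and parsing would each be a further pass over
     the result.

```python
def unfold(lines):
    """Join continuation lines (those starting with space or tab)
    onto the header line that precedes them."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += line  # continuation of the previous header
        else:
            unfolded.append(line)
    return unfolded


print(unfold(["Subject: a long", " subject line", "To: someone"]))
```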

     Case Study: ASN.1 requires additional unnecessary layers -- BER and
     DER.  This results in encoding mistakes when the DER layer is
     forgotten and also makes translating an ASN.1 protocol from its
     formal definition into code much more complex than necessary.

10.  Fully Qualify Data

     Whenever possible, make sure the data sent is fully qualified
     either explicitly or by context.

     Consequences of Violation: serious interoperability problems, a
     painful and expensive transition to fix the problem.

     Case Study: Year 2000 problem.  Failure to fully qualify years (by
     using 4 digits) has resulted in very severe problems.  This problem
     was solved for Internet email in 1989 [HOST-REQ].
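
     A small sketch of why an unqualified year is a problem.  The
     receiver of a two-digit year has to guess the century, and the
     pivot value used below is an arbitrary assumption.  Different
     receivers may choose different pivots and so disagree.

```python
def expand_two_digit_year(yy, pivot=70):
    """Guess a four-digit year from two digits.

    Values at or above the pivot are taken as 19xx and values below it
    as 20xx.  The pivot is a guess, which is exactly the problem.
    """
    if yy >= pivot:
        return 1900 + yy
    return 2000 + yy


# Two receivers with different pivots read the same data differently.
print(expand_two_digit_year(5, pivot=70))
print(expand_two_digit_year(5, pivot=0))
```

     A four-digit year needs no such function at all.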

     Case Study: Characters outside the US-ASCII [US-ASCII] repertoire
     have been used in email without a label.  This results in incorrect
     display of those characters when the email is received on a system
     with a different localized character set.  MIME solved the problem
     by requiring character set labels.  A better solution would be to
     require the use of an international charset such as UTF-8 [UTF8]
     since charset labels violate the alternate representations
     principle.
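
     The unlabelled character problem can be sketched as follows.  The
     sample word and the two legacy charsets (ISO 8859-1 and IBM code
     page 437) are arbitrary choices for illustration.  The same octets
     display differently under each local charset, whereas UTF-8 octets
     have a single interpretation.

```python
word = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# Sent without a label, using the sender's local 8-bit charset.
raw = word.encode("latin-1")

# The receiver decodes using whatever its own local charset happens to be.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

# With one international charset there is nothing to guess.
utf8 = word.encode("utf-8")

print(as_latin1 == word, as_cp437 == word, utf8.decode("utf-8") == word)
```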

11.  Extensibility

     This is a corollary of making the uncommon case possible.  Protocol
     syntax has to be designed to permit extensibility so the protocol
     can evolve with the times.  This includes having servers announce
     features, and leaving simple rules in the formal grammar to skip
     parameters (especially in server responses) which will be defined
     in the future.  A common solution is to use arbitrary
     attribute/value pairs with unknown attributes ignored by older
     clients/servers.  This worked remarkably well in the Internet mail
     format [IMAIL].  PNG [PNG] is also well designed for this aspect
     and distinguishes between extensions which can be ignored, and
     extensions which can't.
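
     Both ideas can be sketched briefly.  The first function parses
     attribute/value pairs and skips attributes it does not know; its
     "name=value; ..." syntax and the attribute names are hypothetical.
     The second relies on the PNG convention that a chunk type whose
     first letter is lowercase is ancillary and may be ignored by a
     decoder that does not recognize it.

```python
KNOWN_ATTRIBUTES = {"size", "charset"}  # hypothetical attribute names


def parse_attributes(text):
    """Return (known, ignored) for a "name=value; name=value" string."""
    known, ignored = {}, []
    for pair in text.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, _, value = pair.partition("=")
        name = name.strip().lower()
        if name in KNOWN_ATTRIBUTES:
            known[name] = value.strip()
        else:
            ignored.append(name)  # skipped, not treated as an error
    return known, ignored


def png_chunk_is_ignorable(chunk_type):
    """True when the first letter is lowercase (bit 5 set): ancillary."""
    return bool(chunk_type[0] & 0x20)


print(parse_attributes("size=10; future-thing=abc; charset=utf-8"))
print(png_chunk_is_ignorable(b"tEXt"), png_chunk_is_ignorable(b"IHDR"))
```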

     Consequences of Violation: Painful transition to add extensibility.
     Duplicate functionality in extended and un-extended form (resulting
     in alternate representations problem).  Complex feature probing.
     In the worst case, a new protocol version is necessary.

     Case Study: The IMAP4 "BODY" fetch item [IMAP4] was originally
     non-extensible and had to be replaced with the extensible
     "BODYSTRUCTURE" fetch item.

     Case Study: SMTP [SMTP] required the addition of the EHLO ESMTP
     command [ESMTP] which was a fairly painful transition.

     Case Study: X.400 has numerous formats for attachments because
     initial attempts were not sufficiently extensible.

     Case Study: ASN.1 provides formats for extensibility, but there is
     no simple way to skip unfamiliar extensions in most cases.  Thus
     implementations of ASN.1 protocols usually have to be revised to
     ignore new extensions.

12.  Conclusions Based on Design Principles

     Note that these are the conclusions of the author and do not
     represent any sort of formal IETF position.

     KISS: Every protocol should go through a "feature cut review"
     before going on the standards track.

     KISS/Text Not Numbers/Alternate Representations/Avoid Unnecessary
     Layers/Extensibility: Use of ASN.1 in new protocols should be
     strongly discouraged.

     0,1,N Principle/Text not Numbers: Use of 3 digit SMTP-style error
     codes in new protocols should be forbidden.

     Announce Features, Not Version: Server feature announcement should
     be required in most standards track protocols.

     Alternate Representations: CRLF line separators should be required.
     Big endian should be required in new binary protocols and formats.
     Use of UTF-8 should be preferred over labelled character sets in
     new protocols.

13.  Security Considerations

     Many of these principles can have profound security implications.
     Violation of KISS makes a security bug more likely.  Alternate
     representations make a security bug more likely in a less
     frequently used representation.  A silly state could introduce a
     security bug if special handling isn't included.  Failure to follow
     the 0,1,N principle when implementing makes buffer overrun problems
     more likely.  Extensibility adds the potential for new security
     holes in the extensions.

     While it's harder to fix a security bug in a binary protocol due to
     the debugging complexity, text protocols tend to be more
     susceptible to buffer overrun security problems.  These two factors
     probably offset each other.

14.  References


     [ESMTP] Klensin, Freed, Rose, Stefferud, Crocker, "SMTP Service
     Extensions", RFC 1869, MCI, Innosoft, Dover Beach Consulting,
     Network Management Associates, Brandenburg Consulting, November
     1995.

         <ftp://ds.internic.net/rfc/rfc1869.txt>

     [ESMTP-STATUS] Freed, "SMTP Service Extension for Returning
     Enhanced Error Codes", RFC 2034, Innosoft, October 1996.

         <ftp://ds.internic.net/rfc/rfc2034.txt>

     [HOST-REQ] Braden, R., "Requirements for Internet Hosts --
     Application and Support", RFC 1123, Internet Engineering Task
     Force, October 1989.

         <ftp://ds.internic.net/rfc/rfc1123.txt>

     [IMAIL] Crocker, D., "Standard for the Format of ARPA Internet Text
     Messages", RFC 822, University of Delaware, August 1982.

         <ftp://ds.internic.net/rfc/rfc822.txt>

     [IMAP4] Crispin, M., "Internet Message Access Protocol - Version
     4rev1", RFC 2060, University of Washington, December 1996.

         <ftp://ds.internic.net/rfc/rfc2060.txt>

     [MIME-IMB] Freed, Borenstein, "Multipurpose Internet Mail
     Extensions (MIME) Part One: Format of Internet Message Bodies", RFC
     2045, Innosoft, First Virtual, November 1996.

         <ftp://ds.internic.net/rfc/rfc2045.txt>

     [MIME-IMT] Freed, Borenstein, "Multipurpose Internet Mail
     Extensions (MIME) Part Two: Media Types", RFC 2046, Innosoft, First
     Virtual, November 1996.

         <ftp://ds.internic.net/rfc/rfc2046.txt>

     [PNG] Boutell, "PNG (Portable Network Graphics) Specification --
     Version 1.0", RFC 2083, Boutell.com, March 1997.

         <ftp://ds.internic.net/rfc/rfc2083.txt>

     [POP3] Myers, J., Rose, M., "Post Office Protocol - Version 3", RFC
     1939, Carnegie Mellon, Dover Beach Consulting, Inc., May 1996.

         <ftp://ds.internic.net/rfc/rfc1939.txt>

     [SMTP] Postel, "Simple Mail Transfer Protocol", RFC 821,
     Information Sciences Institute, August 1982.

         <ftp://ds.internic.net/rfc/rfc821.txt>

     [US-ASCII] "Coded Character Set--7-bit American Standard Code for
     Information Interchange", ANSI X3.4-1986.

     [UTF8] Yergeau, F., "UTF-8, a transformation format of Unicode and
     ISO 10646", RFC 2044, Alis Technologies, October 1996.

         <ftp://ds.internic.net/rfc/rfc2044.txt>

15.  Author's Address

     Chris Newman
     Innosoft International, Inc.
     1050 Lakes Drive
     West Covina, CA 91790 USA

     Email: chris.newman@innosoft.com