Modularity and efficiency in protocol implementation
RFC 817

Document Type RFC - Informational (July 1982; No errata)
Last updated 2016-04-08
Stream Legacy stream
RFC:  817


                             David D. Clark
                  MIT Laboratory for Computer Science
               Computer Systems and Communications Group
                               July, 1982

     1.  Introduction

     Many  protocol implementers have made the unpleasant discovery that

their packages do not run quite as fast as they had hoped.    The  blame

for  this  widely  observed  problem has been attributed to a variety of

causes, ranging from details in  the  design  of  the  protocol  to  the

underlying  structure  of  the  host  operating  system.   This RFC will

discuss  some  of  the  commonly  encountered   reasons   why   protocol

implementations seem to run slowly.

     Experience  suggests  that  one  of  the  most important factors in

determining the performance of an implementation is the manner in  which

that   implementation  is  modularized  and  integrated  into  the  host

operating system.  For this reason, it is useful to discuss the question

of how an implementation is structured at the same time that we consider

how it will perform.  In fact, this RFC will argue  that  modularity  is

one  of  the chief villains in attempting to obtain good performance, so

that the designer is faced  with  a  delicate  and  inevitable  tradeoff

between good structure and good performance.  Further, the single factor

which most strongly determines how well this conflict can be resolved is

not the protocol but the operating system.


     2.  Efficiency Considerations

     There  are  many aspects to efficiency.  One aspect is sending data

at minimum transmission cost, which  is  a  critical  aspect  of  common

carrier  communications,  if  not  in local area network communications.

Another aspect is sending data at a high rate, which may not be possible

at all if the net is very slow, but which may be the one central  design

constraint when taking advantage of a local net with high raw bandwidth.

The  final  consideration is doing the above with minimum expenditure of

computer resources.  This last may be necessary to achieve  high  speed,

but  in  the  case  of  the  slow  net may be important only in that the

resources used up, for example  cpu  cycles,  are  costly  or  otherwise

needed.    It  is  worth  pointing  out that these different goals often

conflict; for example, it is often possible to trade off efficient use of

the computer against efficient use of the network.  Thus, there  may  be

no such thing as a successful general purpose protocol implementation.

     The simplest measure of performance is throughput, measured in bits

per second.  It is worth doing a few simple computations in order to get

a  feeling for the magnitude of the problems involved.  Assume that data

is being sent from one machine to another in packets of 576  bytes,  the

maximum  generally acceptable internet packet size.  Allowing for header

overhead, this packet size permits 4288 bits  in  each  packet.    If  a

useful  throughput  of  10,000  bits  per second is desired, then a data

bearing packet must leave the sending host about every 430 milliseconds,

a little over two per second.  This is clearly not difficult to achieve.

However, if one wishes to achieve 100 kilobits  per  second  throughput,

the packet must leave the host every 43 milliseconds, and to achieve one

megabit  per  second,  which  is not at all unreasonable on a high-speed

local net, the packets must be spaced no more than 4.3 milliseconds.
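
     The arithmetic above can be checked with a short sketch.  It is an
illustration only, not part of the original memo.  The 40-byte header
figure is an assumption (a 20-byte IP header plus a 20-byte TCP header,
no options), chosen because it reproduces the 4288-bit payload quoted
above; the function names are hypothetical.

```python
# Sketch: how often a data-bearing packet must leave the host to
# sustain a given useful throughput.
PACKET_BYTES = 576   # packet size used in the text
HEADER_BYTES = 40    # assumption: 20-byte IP + 20-byte TCP header, no options


def payload_bits(packet_bytes=PACKET_BYTES, header_bytes=HEADER_BYTES):
    """Useful data bits carried by one packet."""
    return (packet_bytes - header_bytes) * 8


def packet_interval_ms(throughput_bps, bits_per_packet=None):
    """Milliseconds between packets needed to sustain throughput_bps."""
    if bits_per_packet is None:
        bits_per_packet = payload_bits()
    return bits_per_packet / throughput_bps * 1000.0


if __name__ == "__main__":
    for rate in (10_000, 100_000, 1_000_000):
        interval = packet_interval_ms(rate)
        print(f"{rate:>9} bit/s -> one packet every {interval:.1f} ms")
```

     The intervals come out at roughly 429, 43 and 4.3 milliseconds,
which the text rounds to 430, 43 and 4.3.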

     These latter numbers are a slightly more alarming goal for which to

set one's sights.  Many operating systems take a substantial fraction of

a millisecond just to service an interrupt.  If the  protocol  has  been

structured  as  a  process,  it  is  necessary  to  go through a process

scheduling before the protocol code can even begin to run.  If any piece

of a protocol package or its data must be fetched from disk,  real  time

delays  of  between  30  and  100 milliseconds  can be expected.  If the

protocol must compete for cpu resources  with  other  processes  of  the

system,  it  may  be  necessary  to wait a scheduling quantum before the

protocol can run.   Many  systems  have  a  scheduling  quantum  of  100

milliseconds  or  more.   Considering these sorts of numbers, it becomes

immediately clear that the protocol must be fitted  into  the  operating

system  in  a  thorough  and  effective manner if anything like reasonable

throughput is to be achieved.
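
     To make the comparison concrete, the sketch below asks what useful
throughput is possible if every packet incurs one of the delays just
mentioned.  It is a rough worst-case model: it assumes the delay is paid
once per packet, with no overlap between packets.  The 0.5 millisecond
interrupt cost is a hypothetical stand-in for "a substantial fraction of
a millisecond"; the other figures are taken from the text.

```python
# Sketch: throughput ceiling when each packet pays a fixed delay.
BITS_PER_PACKET = 4288  # useful bits per 576-byte packet, from the text


def max_throughput_bps(per_packet_delay_ms, bits_per_packet=BITS_PER_PACKET):
    """Upper bound on throughput if each packet costs per_packet_delay_ms."""
    return bits_per_packet / (per_packet_delay_ms / 1000.0)


# Delay figures in milliseconds; the interrupt value is an assumption.
DELAYS_MS = {
    "interrupt service (assumed 0.5 ms)": 0.5,
    "disk fetch, low end": 30.0,
    "disk fetch, high end": 100.0,
    "scheduling quantum": 100.0,
}

if __name__ == "__main__":
    for name, delay in DELAYS_MS.items():
        kbps = max_throughput_bps(delay) / 1000.0
        print(f"{name:<36} caps throughput near {kbps:.0f} kbit/s")
```

     Under these assumptions a single 100 millisecond wait per packet
holds throughput to about 43 kilobits per second, far short of the one
megabit goal, while the interrupt cost alone does not prevent it.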

     There is one obvious conclusion immediately suggested by even  this

simple  analysis.    Except  in  very  special  circumstances, when many

packets are being processed at once, the cost of processing a packet  is

dominated  by  factors, such as cpu scheduling, which are independent of

the  packet  size.    This  suggests  two  general   rules   which   any