INTERNET-DRAFT                                              John Lazzaro
April 11, 2002                                            John Wawrzynek
Expires: October 11, 2002                                    UC Berkeley



              The MIDI Wire Protocol Packetization (MWPP)

                 <draft-ietf-avt-mwpp-midi-rtp-03.txt>


Status of this Memo


This document is an Internet-Draft and is subject to all provisions of
Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

                                Abstract

     The MIDI Wire Protocol Packetization (MWPP) is a general-purpose
     RTP packetization for the MIDI command language. MWPP is suitable
     for use in both interactive applications (such as pseudo-wire
     emulation of MIDI cables) and content-delivery applications (such
     as MIDI file streaming). MWPP is designed for use over unicast and
     multicast UDP, and defines MIDI-specific resiliency tools for the
     graceful recovery from packet loss. A lightweight configuration of
     MWPP supports efficient use over TCP.  MWPP is compatible with the
     MPEG-4 generic RTP payload format, to support MPEG-4 Audio codecs
     that accept MIDI control input.







Lazzaro/Wawrzynek                                               [Page 1]


INTERNET-DRAFT                                             11 April 2002


0. Change Log for <draft-ietf-avt-mwpp-midi-rtp-03.txt>

  o  Entire document rewritten and reorganized.

  o  New system journal protects MIDI Systems commands
     (Section 5, Appendices B.1-5).

  o  Several changes to the MIDI Command Section format
     (Section 3), in response to comments from Dominique
     Fober, Phil Kerr, and Martijn Sipkema:

      -- New default semantics for MIDI command timestamps;
         new SDP parameters for customizing timestamps.

      -- Enforced monotonicity of MIDI command timestamps.

      -- Only System Realtime commands are permitted
         between SysEx command segments.

      -- New termination octets for SysEx command segments.

      -- New method of coding "dropped F7" SysEx construction.

  o  In response to comments from Dominique Fober, the
     standard SDP attribute maxptime is now used to
     request a minimum MWPP sending rate, to simplify
     clock-skew compensation algorithms (Section 2).

  o  In response to comments from Dominique Fober, Phil Kerr,
     and Martijn Sipkema, the new SDP parameter midiport
     associates an arbitrary integer value with an MWPP
     stream, to label the MIDI namespace of the stream.
     Multiple streams can target the same namespace, for
     MIDI merge and hybrid transport schemes; more commonly,
     multiple streams will have unique namespaces, to
     support MIDI applications that use a large number of
     MIDI channels (Section 2).

  o  In response to comments from Dominique Fober, recovery
     journal Chapters C and P no longer redundantly code
     bank select data, and Chapter C acts to code unpaired
     registered/non-registered parameter number commands.

  o  New SDP parameters to customize coverage of the
     recovery journal (Section 6). New definition for the
     G bit in the recovery journal (Sections 4 and 5).

  o  New Acknowledgements section (Section 10).


1. Introduction

The MIDI standard [1] defines a command set that describes sound as a
series of events (NoteOn command to start a musical note event, NoteOff
command to end a note, etc). The command execution time is not specified
in the MIDI command syntax, so that each sub-part of the standard may
customize execution time coding to its requirements.  For example, the
MIDI file format provides an explicit timestamp for each command, but
the MIDI wire protocol codes execution time in an implicit way, as the
time of arrival of commands on an asynchronous serial line.

This memo describes a general-purpose RTP packetization for the MIDI
command set that can code MIDI streams whose original execution time
encoding takes either an implicit or an explicit form. The
packetization is suitable for both interactive applications (such as
pseudo-wire emulation of MIDI cables) and content-delivery applications
(such as MIDI file streaming). The packetization is named the MIDI Wire
Protocol Packetization (MWPP), due to its origins in network musical
performance research [6].

MWPP is a modular packetization. The simplest form of MWPP uses the MIDI
command section (described in Sections 2 and 3) as a complete self-
framed RTP payload. This lightweight version of MWPP is suitable for use
over reliable transport such as TCP.

MWPP is also suitable for use over unreliable transport such as unicast
and multicast UDP. MWPP provides resiliency by inserting a recovery
journal section (described in Sections 4 and 5) into each RTP packet.
The recovery journal codes the recent history of the stream.

MWPP uses the Session Description Protocol (SDP) to configure stream
properties. SDP options (described in Sections 6 and 7) support multiple
MIDI namespaces, control of command execution time semantics, fine-
grained control of recovery journal coverage, and compatibility with the
MPEG-4 generic RTP format [3] [5].

This memo assumes a working knowledge of MIDI networking issues.
Readers unfamiliar with the application domain may wish to examine
introductory materials [6] [7] [8] before reading this memo.


2. MWPP Packet Format

Figure 1 shows the format of an MWPP packet.  An MWPP packet has two or
three sections: the RTP header, the MIDI command section, and the
optional recovery journal. In Figure 1, vertical space delineates the
RTP header and the payload.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |        Sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             CSRCs                             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     MIDI command section ...                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Recovery journal ...                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1 -- MWPP packet format

An MWPP packet begins with an RTP header.  The marker bit is always set
to 1, for compatibility with the MPEG-4 generic payload format [3].  The
RTP sequence number increments by one (modulo 65536) for each packet
sent.  As is standard in RTP, the sequence number is initialized to a
randomly chosen value. MWPP does not use header extensions.

The RTP timestamp sets the base timestamp value for the packet. The
event times coded in the MIDI command section are specified relative to
this timestamp. If the MIDI command section carries no events, the
timestamp indicates the instant the RTP packet was encoded.

The RTP timestamp has the units of the SDP rtpmap parameter srate (see
Section 6). For example, if srate has a value of 44100 (Hz), two MWPP
packets whose base timestamp values differ by 2 seconds have RTP
timestamps that differ by 88200.

MWPP RTP timestamps do not necessarily increment at a fixed rate. The
timestamps for two sequential RTP packets may be identical, or the
second packet may have a timestamp arbitrarily larger than the first
packet (modulo 2^32). As is standard in RTP, the timestamp field is
initialized to a randomly chosen value.
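
The timestamp arithmetic above can be sketched as follows. This is a
minimal illustration, not part of MWPP; the function names are
hypothetical and chosen only for this example.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit values that wrap


def seconds_to_ticks(seconds, srate):
    """Convert a duration in seconds to RTP timestamp units."""
    return round(seconds * srate)


def timestamp_delta(earlier, later):
    """Media time from one RTP timestamp to a later one, modulo 2^32."""
    return (later - earlier) % RTP_TS_MODULUS
```

With srate = 44100, seconds_to_ticks(2, 44100) gives the 88200-tick
difference from the example above. The modulo in timestamp_delta keeps
the result correct when the later timestamp has wrapped past 2^32.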

The optional SDP attribute maxptime (defined in [9]) specifies the
maximum amount of media time an MWPP packet encodes. The media time of
an MWPP packet is the RTP timestamp difference (modulo 2^32) between
consecutively sent packets. Applications set maxptime if a minimum rate
of RTP packet transmission is required, independent of the source rate
of MIDI event data, for the benefit of algorithms performing clock-skew
compensation, network latency estimation, and packet loss recovery.

The MWPP payload begins with the variable-length MIDI command section,
described in detail in Section 3. The commands encoded in this section
reference a single MIDI namespace (16 MIDI channels + MIDI Systems).
The SDP rtpmap parameter midiport (see Section 6) associates this
namespace with an arbitrary integer value.

Applications may support large MIDI namespaces by creating several MWPP
streams, each with a different midiport value. In this case,
applications SHOULD independently choose random initial RTP timestamp
offsets for each stream, and MAY choose different srate values for each
stream.  To synchronize the streams, applications SHOULD use the
standard RTP synchronization tools [2].

In addition, applications may create several MWPP streams that share the
same MIDI namespace, by assigning the same midiport value to each
stream. For example, a unicast application may use a UDP stream to send
real-time oriented MIDI commands, but use a TCP stream for the reliable
transport of MIDI Sample Dump commands. All MWPP streams that share the
same midiport value MUST use the same RTP timestamp timebase (SDP srate
parameter + initial randomly chosen RTP timestamp offset).

If a stream is configured for resiliency, every MWPP packet includes a
variable-length recovery journal section, described in detail in
Sections 4 and 5. If a stream is not configured for resiliency, the
recovery journal never appears in the MWPP payload. The SDP rtpmap
parameter rj (see Section 6) configures an MWPP stream for resiliency.


3. MIDI Command Section

Figure 2 shows the format of the MIDI command section.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|B|Z|  LEN ...  |         MIDI list ...                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 2 -- MIDI command section

The MIDI command section begins with a variable-length header.  The
header field LEN codes the length (in units of octets) of the MIDI list
that follows the header.

If the header flag B is 0, the header is one octet long, and LEN is a
6-bit field, supporting a maximum MIDI list length of 63 octets. If B is
1, the header is two octets long, and LEN is a 14-bit field, supporting
a maximum MIDI list length of 16383 octets.

A LEN value of 0 is legal, and codes an empty MIDI list.  If the MIDI
list is empty, the RTP timestamp indicates the instant the RTP packet
was encoded.
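
A receiver might parse the command section header along these lines.
This is a sketch only: it assumes the usual RFC bit numbering (bit 0 of
Figure 2 is the most significant bit of the first octet), and the
function name and return shape are hypothetical.

```python
def parse_command_section_header(payload):
    """Return (z_flag, midi_list_length, header_length) for an MWPP payload."""
    if not payload:
        raise ValueError("empty payload")
    first = payload[0]
    b_flag = (first >> 7) & 1   # B: 0 = one-octet header, 1 = two-octet
    z_flag = (first >> 6) & 1   # Z: 1 = Delta Time 0 is present
    length = first & 0x3F       # 6-bit LEN, or the high 6 bits of a 14-bit LEN
    header_length = 1
    if b_flag:
        if len(payload) < 2:
            raise ValueError("truncated two-octet header")
        length = (length << 8) | payload[1]
        header_length = 2
    if header_length + length > len(payload):
        raise ValueError("MIDI list runs past the end of the payload")
    return z_flag, length, header_length
```

The final length check is a defensive choice for this sketch; it
rejects a LEN value that claims more octets than the payload holds.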

If LEN is nonzero, the MIDI list has the structure shown in Figure 3.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Delta Time 0 (if Z = 1)   |     MIDI Command 0 ...         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Delta Time 1          |     MIDI Command 1 ...         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Delta Time 2          |     MIDI Command 2 ...         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            .....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Delta Time N          |     MIDI Command N ...         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 3 -- MIDI list structure

If the header flag Z is 1, the MIDI list begins with a complete MIDI
command (MIDI Command 0) preceded by a delta time (Delta Time 0). If Z
is 0, the Delta Time 0 field is not present in the MIDI list, and MIDI
Command 0 has an implicit delta time of 0.  The MIDI list structure may
also optionally encode a list of N additional complete MIDI commands.
Each additional command is preceded by a delta time.

The MWPP delta time syntax is a modified form of the MIDI File delta
time syntax [1]. MWPP delta times use 1-4 octet fields to encode 32-bit
unsigned integers. Figure 4 shows the encoded and decoded forms of delta
times. Note that delta time values may be legally encoded in multiple
formats; for example, there are four legal ways to encode the zero delta
time (0x00, 0x8000, 0x808000, 0x80808000).

  One-Octet Delta Time:

     Encoded form: 0ddddddd
     Decoded form: 00000000 00000000 00000000 0ddddddd

  Two-Octet Delta Time:

     Encoded form: 1ccccccc 0ddddddd
     Decoded form: 00000000 00000000 00cccccc cddddddd

  Three-Octet Delta Time:

     Encoded form: 1bbbbbbb 1ccccccc 0ddddddd
     Decoded form: 00000000 000bbbbb bbcccccc cddddddd

  Four-Octet Delta Time:

     Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd
     Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd

              Figure 4 -- Decoding delta time formats
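
The formats in Figure 4 could be decoded and encoded as sketched below.
The function names are hypothetical, and the encoder's choice of the
shortest legal form is an assumption of this example, not a
requirement stated in this memo.

```python
def decode_delta_time(data, offset=0):
    """Decode one delta time; return (value, octets_consumed)."""
    value = 0
    for count in range(1, 5):
        if offset + count > len(data):
            raise ValueError("truncated delta time")
        octet = data[offset + count - 1]
        value = (value << 7) | (octet & 0x7F)  # each octet carries 7 bits
        if not octet & 0x80:                   # a clear high bit ends the field
            return value, count
    raise ValueError("delta time longer than four octets")


def encode_delta_time(value):
    """Encode a delta time using the shortest of the four forms."""
    if not 0 <= value < (1 << 28):
        raise ValueError("delta time does not fit in four octets")
    octets = [value & 0x7F]                    # final octet: high bit clear
    value >>= 7
    while value:
        octets.append(0x80 | (value & 0x7F))   # earlier octets: high bit set
        value >>= 7
    return bytes(reversed(octets))
```

Because each encoded octet carries seven payload bits, the largest
value a four-octet field can hold is 2^28 - 1.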

MWPP uses delta times to encode a timestamp for each MIDI command. The
timestamp for MIDI Command K is the summation (modulo 2^32) of the RTP
timestamp and decoded delta times 0 through K. All command timestamps in
a packet MUST be less than or equal to the RTP timestamp of the next
packet in the MWPP stream (modulo 2^32).
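
As an illustration, command timestamps can be computed as a running sum
of decoded delta times. The second function below is one possible
reading of the bound on command timestamps, treating it as a limit on
the total delta time relative to the media time of the packet; that
reading, and both function names, are assumptions of this sketch.

```python
TS_MODULUS = 1 << 32


def command_timestamps(rtp_timestamp, delta_times):
    """Timestamp of command K = RTP timestamp + deltas 0..K, modulo 2^32."""
    timestamps = []
    current = rtp_timestamp
    for delta in delta_times:
        current = (current + delta) % TS_MODULUS
        timestamps.append(current)
    return timestamps


def fits_before_next_packet(rtp_timestamp, delta_times, next_rtp_timestamp):
    """Check that no command timestamp passes the next packet's timestamp."""
    span = (next_rtp_timestamp - rtp_timestamp) % TS_MODULUS
    return sum(delta_times) <= span
```

If Z is 0, a caller would pass an explicit 0 as the first entry of
delta_times to represent the implicit delta time of MIDI Command 0.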

By default, a command timestamp indicates the execution time for the
command. The difference between two timestamps indicates the time delay
between the execution of the commands. This difference may be zero,
coding simultaneous execution. MIDI sources that use explicit command
timestamps, such as the MIDI file format, are simple to transcode into
MWPP streams using these default semantics.

MIDI command sources that use implicit command timing, such as the MIDI
wire protocol, must be annotated with timestamps as part of the MWPP
transcoding process. The hardware and systems environment for an
application may dictate a particular approach to timestamps that may
not be a good fit for the default MWPP timestamp semantics. To address
this issue, MWPP timestamp semantics are configurable, via SDP
parameters. Section 6 describes these SDP parameters and their use in
MIDI wire protocol transcoding.

As a rule, each MIDI Command field in the MIDI list contains a complete
MIDI command, in the binary command format defined in the MIDI standard
[1]. In the remainder of this section, we describe exceptions to this
rule.

The first MIDI channel command in the MIDI list MUST include a status
octet; running status coding, as defined in [1], may be used for all
subsequent MIDI channel commands in the MIDI list. As in [1], System
Common messages (0xF0 ... 0xF7) cancel running status state, but System
Realtime messages (0xF8 ... 0xFF) do not affect running status state.

In the MIDI wire protocol [1], a System Realtime command may be embedded
inside of another "host" MIDI command.  This syntactic construction is
not supported in MWPP: a MIDI Command field in the MIDI list codes
exactly one complete MIDI command.

To encode an embedded System Realtime command, senders MUST extract the
command from its host, and code it in the MIDI list as a separate
command. The host command and System Realtime command SHOULD appear in
the same MIDI list. The delta time of the System Realtime command SHOULD
result in a command timestamp that encodes the System Realtime command
placement in its original embedded position.
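
A sender reading raw MIDI wire octets might separate embedded System
Realtime commands from their host as sketched below. The function name
is hypothetical. The returned positions are only a hint from which a
sender could derive delta times; how arrival time is measured is
outside the scope of this sketch.

```python
def extract_realtime(wire_octets):
    """Split wire octets into the host octets and embedded realtime commands.

    Returns (host, realtime), where realtime is a list of
    (original_position, status_octet) pairs.
    """
    # System Realtime commands are the single-octet statuses 0xF8..0xFF.
    host = bytes(o for o in wire_octets if o < 0xF8)
    realtime = [(i, o) for i, o in enumerate(wire_octets) if o >= 0xF8]
    return host, realtime
```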

Two methods are provided for encoding MIDI System Exclusive (SysEx)
commands in the MIDI list. A SysEx command may be encoded in a MIDI
Command field verbatim: an 0xF0 octet, followed by an arbitrary number
of data octets, followed by an 0xF7 octet.  Alternatively, a SysEx
command may be encoded as multiple segments.  The command is divided
into two or more SysEx command segments; each segment is encoded in its
own MIDI Command field in the MIDI list.

MWPP supports segmentation in order to encode SysEx commands that encode
information in the temporal pattern of data octets; by encoding these
commands as a series of segments, each data octet is associated with a
delta time. Segmentation may also be useful in coding very large SysEx
commands across several RTP packets.

To segment a SysEx command, first partition its data octet list into two
or more sublists; each sublist must contain at least one data octet.  To
complete the segmentation, add status octets to the head and tail of
each sublist, as detailed in Figure 5. Figure 6 shows examples.

    -----------------------------------------------------------
   | Sublist Position |  Head Status Octet | Tail Status Octet |
   |-----------------------------------------------------------|
   |    first         |       0xF0        |       0xF0         |
   |-----------------------------------------------------------|
   |    middle        |       0xF7        |       0xF0         |
   |-----------------------------------------------------------|
   |    last          |       0xF7        |       0xF7         |
    -----------------------------------------------------------

           Figure 5 -- Command Segmentation Status Octets

  Original SysEx command:

     0xF0 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

  A two-segment segmentation:

     0xF0 0x01 0x02 0x03 0x04 0xF0

     0xF7 0x05 0x06 0x07 0x08 0xF7

  A different two-segment segmentation:

     0xF0 0x01 0xF0

     0xF7 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

  A three-segment segmentation:

     0xF0 0x01 0x02 0xF0

     0xF7 0x03 0x04 0xF0

     0xF7 0x05 0x06 0x07 0x08 0xF7

  The segmentation with the largest number of segments:

     0xF0 0x01 0xF0

     0xF7 0x02 0xF0

     0xF7 0x03 0xF0

     0xF7 0x04 0xF0

     0xF7 0x05 0xF0

     0xF7 0x06 0xF0

     0xF7 0x07 0xF0

     0xF7 0x08 0xF7


                   Figure 6 -- Example segmentations

The relative ordering of SysEx command segments in a MIDI list must
match the relative ordering of the sublists in the original SysEx
command. Only System Realtime MIDI commands may appear between SysEx
command segments. If the command segments of a SysEx command are placed
in the MIDI lists of two or more RTP packets, the segment ordering rules
apply to the concatenation of all affected MIDI lists.
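
The segmentation rules of Figure 5 can be sketched as follows. The
function name and the sizes parameter (the number of data octets in
each sublist) are hypothetical conventions of this example.

```python
def segment_sysex(command, sizes):
    """Split a verbatim SysEx command (bytes) into SysEx command segments."""
    if len(command) < 2 or command[0] != 0xF0 or command[-1] != 0xF7:
        raise ValueError("not a verbatim SysEx command")
    data = command[1:-1]
    if len(sizes) < 2 or any(n < 1 for n in sizes) or sum(sizes) != len(data):
        raise ValueError("need two or more non-empty sublists covering the data")
    segments = []
    start = 0
    for index, size in enumerate(sizes):
        head = 0xF0 if index == 0 else 0xF7               # first vs. later
        tail = 0xF7 if index == len(sizes) - 1 else 0xF0  # last vs. earlier
        segments.append(bytes([head]) + data[start:start + size] + bytes([tail]))
        start += size
    return segments
```

Applied to the original command of Figure 6, sizes of [4, 4] and
[2, 2, 4] give the first two-segment and the three-segment examples
shown there. Segments are returned in sublist order, as required
above.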

The MIDI wire protocol [1] permits a "dropped 0xF7" construction for
SysEx commands; in this coding method, the 0xF7 octet is dropped from
the end of the SysEx command, and the status octet of the next MIDI
command acts both to terminate the SysEx command and start the next
command. To encode this construction in MWPP, follow these steps:

  o  Determine the appropriate delta times for the SysEx command and
     the command that follows the SysEx command.

  o  Insert the "dropped" 0xF7 octet at the end of the SysEx command,
     to form the standard SysEx syntax.

  o  Code both commands into the MIDI list using the rules above.

  o  Replace the 0xF7 octet that terminates the verbatim SysEx
     encoding or the last segment of the segmented SysEx encoding
     with a 0xF6 command. This substitution informs the receiver
     of the original dropped 0xF7 coding.
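
The final step of the list above might look like this in code. It is a
minimal sketch; the function name is hypothetical, and the argument is
assumed to be either the verbatim SysEx encoding or the last segment of
a segmented encoding.

```python
def mark_dropped_f7(encoded_sysex):
    """Replace the terminating 0xF7 with 0xF6 to flag a dropped-0xF7 source."""
    if not encoded_sysex or encoded_sysex[-1] != 0xF7:
        raise ValueError("expected an encoding that ends in 0xF7")
    return encoded_sysex[:-1] + bytes([0xF6])
```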


4. Recovery Journal Overview

In this section we introduce the recovery journal, the MWPP resiliency
tool for unreliable transport. In Section 5, we define the bitfield
format for the recovery journal; in Section 6, we describe SDP
parameters for recovery journal configuration.

A MIDI stream sent over MWPP is fragile. Consider an MWPP stream in
which one packet codes the start of a trumpet note (via a NoteOn command
in the MIDI command section) and a second packet codes the end of the
note (via a matching NoteOff command). If the second packet is lost, the
trumpet note sustains indefinitely.

One solution to loss recovery is to retransmit lost packets. MWPP over
TCP provides resiliency via packet retransmission (at a lower layer of
the network stack). However, in some MWPP applications packet
retransmission is undesirable. Retransmission adds latency, adding a
round-trip time for lost packets; if TCP is used, head-of-line blocking
latency is also an issue. Simple retransmission is also unsuitable for
multicast applications, due to scaling issues.

A feed-forward approach to resiliency avoids retransmission by using
information encoded in the forward packet stream to guide loss recovery.
Consider this simple resiliency scheme for stuck notes: if a receiver
detects lost RTP packets via sequence number breaks, it issues NoteOff
commands for all active notes as a precaution.  This scheme solves the
problem of notes that sound forever, but the immediate effect on the
stream is jarring: the music stops.
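
For illustration only, the simple scheme above could be sketched as
follows. The function names are hypothetical. The sketch assumes that
packets arrive in order and that the receiver tracks active notes as
(channel, note number) pairs; the release velocity of 64 is an
arbitrary choice.

```python
def packets_lost(previous_seq, current_seq):
    """Number of packets missing between two received sequence numbers."""
    return (current_seq - previous_seq - 1) % 65536


def all_notes_off(active_notes):
    """Build a NoteOff command for every (channel, note) pair."""
    return [bytes([0x80 | channel, note, 0x40])
            for channel, note in sorted(active_notes)]
```

A receiver would call all_notes_off whenever packets_lost returns a
nonzero value, which produces exactly the abrupt silence described
above.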

The MWPP recovery journal system implements feed-forward resiliency in a
more graceful way. Each MWPP packet includes a special section (the
"recovery journal") that codes the recent history of the stream. Upon
detection of a packet loss, a receiver uses the recovery journal history
to guide the stream repair process, fixing long-term problems such as
stuck notes while minimizing audible artifacts.

The recovery journal does not code a literal history of the MIDI stream.
In general, it is not possible to reconstruct the lost MIDI command
stream from the recovery journal contents. Instead, the recovery journal
format codes only the information necessary for the graceful recovery
from packet loss. This coding strategy trades off generality for
bandwidth efficiency [6].

The recovery journal codes the history of the MWPP stream, back to an
earlier packet called the checkpoint packet. The size of this checkpoint
history (a precise term defined in Appendix A.1) is sent in each
recovery journal. A receiver is able to detect if the checkpoint history
is too shallow for a graceful recovery from a particular packet loss
incident.

A sender dynamically controls the size of the recovery journal by
choosing the checkpoint history depth. The sender does not have other
levers for dynamic control, because this memo normatively defines the
length and contents of the recovery journal, given the MIDI stream
contents and checkpoint history depth (static control is provided via
SDP parameters, described in Section 6). Receiver designers rely on the
normative nature of the journal definitions to devise recovery
algorithms, much as audio and video codec designers rely on normative
bitstream definitions to act as a common media language.

Senders may choose a variety of open-loop schemes for choosing a
checkpoint history size for each packet: protection of a constant
increment of media time, protection of a constant number of packets,
maximization of protection for an average payload bandwidth, etc.  These
schemes share a common problem: if a receiver has sustained too many
consecutive lost packets, the checkpoint history of the recovery journal
may be too shallow, forcing the receiver to resort to an "ungraceful"
recovery method.

A closed-loop approach to checkpoint history management avoids this
problem. Senders monitor the last RTP packet received by each receiver,
via the "extended highest sequence number received" field in standard
RTCP RR packets [2]. If senders do not advance the checkpoint packet to
extended sequence number N until all receivers have received an MWPP
packet with extended sequence number M >= (N - 1), the depth of the
checkpoint history is sufficient for receivers to gracefully recover
from an arbitrary packet loss.

We define the term "guaranteed policy" to describe sending algorithms
that obey the M >= (N - 1) inequality for the checkpoint packet.  A
guaranteed policy MAY use the RTCP method described above to implement
its sending policy, or MAY use other means of direct feedback from
receivers. We reference the guaranteed policy in the definition of the
recovery journal bitfield format in Section 5.
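
A sender that tracks RTCP receiver reports could test the M >= (N - 1)
inequality as sketched below. The names are hypothetical, and the
mapping from a receiver identifier to its extended highest sequence
number received is an assumed data structure for this example.

```python
def may_advance_checkpoint(n, highest_received):
    """True if every known receiver reports M >= N - 1."""
    if not highest_received:
        raise ValueError("no receiver reports available")
    return all(m >= n - 1 for m in highest_received.values())


def max_checkpoint_seqnum(highest_received):
    """Largest extended sequence number N the checkpoint may advance to."""
    if not highest_received:
        raise ValueError("no receiver reports available")
    return min(highest_received.values()) + 1
```

The slowest receiver bounds the checkpoint: with reports of 100 and 97,
the checkpoint packet may advance no further than extended sequence
number 98.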

The guaranteed policy is multicast compatible, as it may be implemented
via standard RTCP RR packets. However, the guarantee is only in effect
for a receiver if the sender is aware of the receiver in the session. In
practice, this limitation only impacts the start of a stream, as the RTP
standard provides several mechanisms for a receiver to sense that a
sender is aware of its presence.


5. Recovery Journal Format

This section introduces the structure of the recovery journal, and
defines the bitfields of recovery journal headers. Appendices A.2-8 and
B.1-5 complete the bitfield definition of the recovery journal; Appendix
A.1 provides normative definitions for common terms and bitfield
structures used throughout the recovery journal.

The recovery journal has a three-level structure:

  o Top-level header.

  o Channel and system journal headers. These encode recovery
    information for a single MIDI channel (channel journal)
    and for all MIDI Systems commands (system journal).

  o Chapters. Each chapter describes recovery information for a
    single MIDI command type.

Figure 7 shows the top-level structure of the recovery journal.  A
recovery journal consists of a 3-octet header, optionally followed by a
system journal and a list of channel journals.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|A|G|Y|TOTCHAN|    Checkpoint Packet Seqnum   |     ...       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   ... System journal ...      |  Channel journals ...         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 7 -- Top-level recovery journal format

If the Y bit is set to 1, a system journal follows the recovery journal
header. If the A bit is set to 1, the recovery journal ends with a list
of (TOTCHAN + 1) channel journals. If A and Y are both zero, the
recovery journal only contains the 3-octet header, and is considered to
be an "empty" journal.

The S (single-packet loss) bit appears in most recovery journal
structures. It helps receivers efficiently parse the recovery journal in
the common case of the loss of a single packet.  Appendix A.1 defines S
bit semantics.

The 16-bit Checkpoint Packet Seqnum field codes the sequence number of
the checkpoint packet for this journal. The choice of the checkpoint
packet sets the depth of the recovery journal history, as defined in
Appendix A.1.

If the choice of the checkpoint packet adheres to the guaranteed policy
defined in Section 4, the G ("guaranteed") bit SHOULD be set to 1. If
the choice of the checkpoint packet does not adhere to the guaranteed
policy, the G bit MUST be set to 0.
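
The 3-octet header of Figure 7 could be unpacked as sketched below.
This assumes the usual RFC bit numbering (bit 0 is the most significant
bit) and network byte order for the sequence number; the function name
and dictionary keys are hypothetical.

```python
def parse_journal_header(journal):
    """Unpack the top-level recovery journal header from a bytes object."""
    if len(journal) < 3:
        raise ValueError("recovery journal header needs three octets")
    first = journal[0]
    a_bit = (first >> 6) & 1
    return {
        "S": (first >> 7) & 1,
        "A": a_bit,
        "G": (first >> 5) & 1,
        "Y": (first >> 4) & 1,
        # TOTCHAN + 1 channel journals follow only when A is set.
        "channel_journals": (first & 0x0F) + 1 if a_bit else 0,
        "checkpoint_seqnum": int.from_bytes(journal[1:3], "big"),
    }
```

A header with both A and Y clear describes an empty journal, so a
receiver can stop parsing after these three octets.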

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S| CHAN  |R|      LENGTH       |P|W|N|A|T|C|M|R|  Chapters ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 8 -- Channel journal format

Figure 8 shows the structure of a channel journal: a 3-octet header,
followed by a list of leaf elements called channel chapters. A channel
journal encodes information about MIDI commands on the MIDI channel
coded by the 4-bit CHAN header field.

The 10-bit LENGTH field codes the length of the channel journal; the R
bit is reserved. The semantics for LENGTH and R fields are uniform
throughout the recovery journal, and are defined in Appendix A.1.

The third octet of the channel journal header is the Table of Contents
(TOC) of the channel journal. The TOC is a set of bits that encode the
presence of a chapter in the journal. Each chapter contains information
about a certain class of MIDI channel command:

   o  Chapter P: MIDI Program Change (0xC)
   o  Chapter W: MIDI Pitch Wheel (0xE)
   o  Chapter N: MIDI NoteOff (0x8), NoteOn (0x9)
   o  Chapter A: MIDI Poly Aftertouch (0xA)
   o  Chapter T: MIDI Channel Aftertouch (0xD)
   o  Chapter C: MIDI Control Change (0xB)
   o  Chapter M: MIDI Parameter System (part of 0xB)

Chapters appear in a list following the header, in order of their
appearance in the TOC. Appendices A.2-8 describe the bitfield format for
each chapter, and define the conditions under which a chapter type MUST
appear in the recovery journal. If any chapter types are required for a
channel, an associated channel journal MUST appear in the recovery
journal.
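
The channel journal header of Figure 8 could be unpacked as sketched
below, again assuming that bit 0 is the most significant bit. The names
are hypothetical, and LENGTH is returned uninterpreted because its
semantics are defined in Appendix A.1.

```python
CHANNEL_CHAPTERS = "PWNATCM"  # TOC order; the eighth TOC bit (R) is reserved


def parse_channel_journal_header(data):
    """Unpack the 3-octet channel journal header from a bytes object."""
    if len(data) < 3:
        raise ValueError("channel journal header needs three octets")
    word = int.from_bytes(data[0:2], "big")   # S, CHAN, R, LENGTH
    toc = data[2]                             # P W N A T C M R
    chapters = [name for bit, name in enumerate(CHANNEL_CHAPTERS)
                if toc & (0x80 >> bit)]
    return {
        "S": word >> 15,
        "CHAN": (word >> 11) & 0x0F,
        "LENGTH": word & 0x03FF,
        "chapters": chapters,
    }
```

The chapters list preserves TOC order, which is also the order in which
the chapters appear after the header.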

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|D|V|Q|E|X|      LENGTH       |  System chapters ...          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 9 -- System journal format

Figure 9 shows the structure of the system journal: a 2-octet header,
followed by a list of system chapters.  Each system chapter codes
information about a specific class of MIDI Systems command:

   o  Chapter D: Song Select (0xF3), Tune Request (0xF6), Reset (0xFF)
   o  Chapter V: Active Sense (0xFE)
   o  Chapter Q: Sequencer State (0xF2, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC)
   o  Chapter E: MTC Tape Position (0xF1, 0xF0 0x7F 0xcc 0x01 0x01)
   o  Chapter X: System Exclusive (all other 0xF0)

If header bits D, V, Q, or E are set to 1, one chapter for each chapter
type whose associated bit is set appears in a list following the header.
The chapter ordering follows the ordering of chapter header bits in the
header bitfield. If header bit X is set to 1, one or more Chapter X
bitfields appear at the end of the chapter list.
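
As an informal illustration (not part of the normative text), the
header in Figure 9 can be unpacked as sketched below in Python. The
function name and the layout of the returned dictionary are assumptions
made for this example; only the bit positions are taken from Figure 9.

```python
def parse_system_journal_header(data: bytes) -> dict:
    """Unpack the 2-octet system journal header shown in Figure 9."""
    if len(data) < 2:
        raise ValueError("system journal header needs 2 octets")
    word = (data[0] << 8) | data[1]
    # Bits 0-5 (most significant first) are the S, D, V, Q, E, X flags.
    flags = {name: bool((word >> (15 - i)) & 1)
             for i, name in enumerate("SDVQEX")}
    # The remaining 10 bits hold the LENGTH field.
    length = word & 0x03FF
    # Chapters follow the header in header-bit order; Chapter X
    # bitfields, if present, come last.
    chapter_order = [name for name in "DVQEX" if flags[name]]
    return {"flags": flags, "length": length,
            "chapter_order": chapter_order}
```

For the octets 0xA4 0x05, this sketch reports the S, V, and X bits as
set, a LENGTH of 5, and a chapter ordering of V followed by X.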

Appendices B.1-5 describe the bitfield format for the system chapters,
and define the conditions under which a chapter type MUST appear in the
recovery journal. If any system chapter type is required to appear in
the recovery journal, the system journal MUST appear in the recovery
journal.


6. Session Description Protocol for RTP Transport

This section describes Session Description Protocol (SDP) [9]
definitions for MWPP transport directly over RTP. Section 7 describes
the SDP definitions for MWPP transport over the MPEG-4 generic RTP
payload format.

The MIME name for this packetization is mwpp. The SDP rtpmap attribute
is declared as:

a=rtpmap: <payload> mwpp/<srate>/<midiport>/<rj>

The integer parameter <srate> codes the sampling rate used for the RTP
timestamp field, and has the units of Hz.

The integer parameter <midiport> codes an arbitrary identification
number for the MIDI namespace (16 MIDI channels + MIDI Systems) coded by
an MWPP stream. See Section 2 for details on midiport usage.

The binary parameter <rj> codes the presence (1) or absence (0) of the
recovery journal section in MWPP packets.

For example, the following lines bind the packetization to dynamic
payload number 96, and specify an srate of 44100 Hz, a midiport value
of 56, and the presence of a recovery journal in each RTP packet:

m=audio 5004 RTP/AVP 96
c=IN IP4 169.229.60.64
a=rtpmap: 96 mwpp/44100/56/1
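
A receiver might extract these parameters as sketched below in Python.
The function name, the regular expression, and the returned dictionary
keys are assumptions made for illustration. The sketch tolerates
optional whitespace after "rtpmap:" because the examples in this
section are written that way.

```python
import re

# payload number, then mwpp/<srate>/<midiport>/<rj>
_RTPMAP = re.compile(r"a=rtpmap:\s*(\d+)\s+mwpp/(\d+)/(\d+)/([01])\s*$")

def parse_mwpp_rtpmap(line: str) -> dict:
    """Split an mwpp rtpmap attribute line into its four parameters."""
    match = _RTPMAP.match(line)
    if match is None:
        raise ValueError("not an mwpp rtpmap line: %r" % line)
    payload, srate, midiport, rj = match.groups()
    return {
        "payload": int(payload),         # RTP payload number
        "srate": int(srate),             # RTP timestamp rate, in Hz
        "midiport": int(midiport),       # MIDI namespace identifier
        "recovery_journal": rj == "1",   # True if the journal is present
    }
```

Two streams whose parsed midiport values are equal share one MIDI
namespace; unequal values indicate independent namespaces.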

The following lines set up 32 channels of MIDI data over two MWPP
streams. As is standard in RTP/AVP, each stream has its own UDP port
number. Each stream has a unique midiport value, coding the independence
of the MIDI namespaces of the two streams.

m=audio 5004 RTP/AVP 96
c=IN IP4 169.229.60.64
a=rtpmap: 96 mwpp/44100/40/1
m=audio 5006 RTP/AVP 97
c=IN IP4 169.229.60.64
a=rtpmap: 97 mwpp/44100/41/1

The following lines set up 16 channels of MIDI transport over two MWPP
streams. Note that both streams share the same midiport value, coding
that the streams share the same MIDI namespace.



m=audio 5004 RTP/AVP 96
c=IN IP4 169.229.60.64
a=rtpmap: 96 mwpp/44100/67/1
m=audio 5006 RTP/AVP 97
c=IN IP4 169.229.60.64
a=rtpmap: 97 mwpp/44100/67/1

MWPP defines SDP format parameters to customize the semantics of the
recovery journal. The presence of channel and system chapters in the
recovery journal is controlled by the normative text in Appendices A.1-8
and B.1-5. These appendices use the MUST keyword to specify the
conditions under which a chapter must appear in the recovery journal.

The SDP format parameters chmay, chnever, and chmust act to change the
inclusion conditions for chapters.  The chmay parameter changes the MUST
keyword conditional for chapter inclusion into a MAY.  The chnever
parameter specifies chapter types that must never appear in the recovery
journal. The chmust parameter reaffirms the default MUST keyword for a
chapter; this parameter simplifies the SDP for complex recovery journal
configurations.

These chmay, chnever, and chmust parameters use the following syntax:

  <parameter> = [optional comma-separated channel list,][chapter list];

The channel list specifies the channel journals for which this parameter
applies; if no channel list is provided, the parameter applies to all
channel journals.  The chapter list specifies the channel and system
chapters for which this parameter applies, using a concatenated list of
one or more upper-case letters corresponding to the chapter types. The
channel list is irrelevant for system chapters.  Multiple assignments to
these parameters have a cumulative effect, and are applied in the order
of parameter appearance.

For example, the following format commands remove protection for poly
and channel aftertouch commands on all channels, weaken note command
protection for channels 14 and 15, and remove pitch wheel protection for
all channels except channel 12:

a=fmtp: 96 chnever=WTA;chmay=14,15,N;chmust=12,W;
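
The cumulative, in-order semantics can be sketched in Python as shown
below. This is an illustration only: the function name, the policy
strings "must", "may", and "never", and the choice to index channels
0 through 15 are assumptions of this example, not definitions made by
this memo. The input is the parameter string that follows the payload
number.

```python
CHANNEL_CHAPTERS = "PWNATCM"
SYSTEM_CHAPTERS = "DVQEX"

def apply_chapter_params(params: str, num_channels: int = 16):
    """Apply chmay/chnever/chmust assignments in order of appearance."""
    # Every chapter starts with the default MUST inclusion rule.
    channel_policy = [{c: "must" for c in CHANNEL_CHAPTERS}
                      for _ in range(num_channels)]
    system_policy = {c: "must" for c in SYSTEM_CHAPTERS}
    for assignment in params.split(";"):
        name, _, value = assignment.strip().partition("=")
        if name not in ("chmay", "chnever", "chmust"):
            continue  # empty trailing field or an unrelated parameter
        mode = name[2:]  # "may", "never", or "must"
        *channel_fields, chapters = [f.strip() for f in value.split(",")]
        # No channel list means the assignment covers all channels.
        channels = [int(f) for f in channel_fields] or range(num_channels)
        for chapter in chapters:
            if chapter in SYSTEM_CHAPTERS:
                system_policy[chapter] = mode  # channel list is irrelevant
            elif chapter in CHANNEL_CHAPTERS:
                for channel in channels:
                    channel_policy[channel][chapter] = mode
            else:
                raise ValueError("unknown chapter letter: %r" % chapter)
    return channel_policy, system_policy
```

Applied to the example above, the sketch leaves Chapter W as "never" on
every channel except channel 12, where the later chmust assignment
restores "must".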

MWPP also defines SDP format parameters to configure timestamp
semantics for the MIDI command section. The tsmode parameter indicates
the timestamp mode, and takes on one of three symbolic values:

  o tsmode = comex. This mode selects the default semantics defined
    in Section 3. The octpos, mperiod, and linerate parameters
    (described below) may not be used in this mode.



  o tsmode = async. The MWPP stream transcodes a MIDI source
    with implicit "time of arrival" time coding. The MWPP sender
    attaches nominally accurate timestamps to each MIDI command
    that code the time of arrival. The octpos and linerate
    parameters may be used with this mode; if these parameters
    do not appear, their values are considered undefined.

  o tsmode = buffer. The MWPP stream transcodes a MIDI source
    with implicit "time of arrival" time coding. The MWPP sender
    examines the MIDI source at periodic intervals, and uses the
    same timestamp value for encoded commands received in the
    interval. The octpos, mperiod, and linerate parameters may be
    used with this mode; if these parameters do not appear, their
    values are considered undefined.

We now describe the secondary format parameters octpos, mperiod, and
linerate.

The octpos parameter associates a timestamp with the first (octpos =
first) or last (octpos = last) octet of the MIDI command field. If
tsmode = buffer, octpos indicates whether a command split across
multiple intervals uses the timestamp of the first interval or the last
interval in which its octets appear.

The mperiod parameter sets the periodic interval for tsmode = buffer.
The mperiod parameter is an integer with units of microseconds.

The linerate parameter describes the underlying bandwidth of the MIDI
source. Linerate is an integer with units of nanoseconds, and codes the
time extent of one MIDI octet in the MIDI source medium. For example, a
standard MIDI cable has a linerate value of 320000 nanoseconds.
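
The 320000 figure follows from the MIDI 1.0 serial format, which runs
at 31250 bits per second and frames each octet with one start bit and
one stop bit, for 10 bits on the wire per octet. The short Python check
below reproduces the arithmetic; the constant and function names are
chosen for this example only.

```python
MIDI_BAUD = 31250        # bits per second on a standard MIDI cable
BITS_PER_OCTET = 10      # 1 start bit + 8 data bits + 1 stop bit

def linerate_ns(baud: int = MIDI_BAUD,
                bits_per_octet: int = BITS_PER_OCTET) -> int:
    """Time extent of one octet on the wire, in nanoseconds."""
    return bits_per_octet * 1_000_000_000 // baud
```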

The following format commands set up a buffer mode session for a MIDI
cable, with a 1 ms sampling interval, and end-of-command timestamps:

a=fmtp: 96 tsmode=buffer;linerate=320000;octpos=last;mperiod=1000;
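
The mode restrictions described above can be checked mechanically. The
Python sketch below is one possible reading, under two assumptions that
this memo does not state outright: an absent tsmode is treated as
comex, and mperiod is rejected in async mode because the text lists
only octpos and linerate for that mode. All names are illustrative. The
input is the parameter string that follows the payload number.

```python
# Secondary parameters permitted in each timestamp mode.
ALLOWED = {
    "comex": set(),
    "async": {"octpos", "linerate"},
    "buffer": {"octpos", "mperiod", "linerate"},
}
SECONDARY = ("octpos", "mperiod", "linerate")

def parse_timestamp_params(params: str) -> dict:
    """Parse tsmode and its secondary parameters, rejecting bad mixes."""
    fields = {}
    for item in params.split(";"):
        if item.strip():
            key, _, value = item.partition("=")
            fields[key.strip()] = value.strip()
    mode = fields.get("tsmode", "comex")  # assumption: comex if absent
    if mode not in ALLOWED:
        raise ValueError("unknown tsmode: %r" % mode)
    secondary = {k: v for k, v in fields.items() if k in SECONDARY}
    illegal = set(secondary) - ALLOWED[mode]
    if illegal:
        raise ValueError("%s not usable with tsmode=%s"
                         % (sorted(illegal), mode))
    if secondary.get("octpos", "first") not in ("first", "last"):
        raise ValueError("octpos must be 'first' or 'last'")
    result = {"tsmode": mode}
    for key, value in secondary.items():
        # octpos is symbolic; mperiod (us) and linerate (ns) are integers.
        result[key] = value if key == "octpos" else int(value)
    return result
```

Parameters that do not appear are simply left out of the result,
mirroring the rule that their values are considered undefined.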


7. Session Description Protocol for MPEG-4 generic transport

This section describes Session Description Protocol (SDP) definitions
for the MPEG-4 generic RTP payload format [3] [4] [5]. Note that MWPP as
defined in this memo creates valid MPEG-4 generic RTP packets; only SDP
customization is necessary.







The MIME name for this packetization is mpeg4-generic. The SDP rtpmap
attribute is declared as:

a=rtpmap: <payload> mpeg4-generic/<srate>/<midiport>/<rj>

The definitions of srate and rj are identical to the descriptions in
Section 6. All format parameters defined in Section 6 are supported.  In
addition, mpeg4-generic uses format parameters for transport
configuration, as shown below:

a=fmtp: <payload> streamtype=5; profile-level-id=15; mode=mwpp;

To signal SingleSL mode, we omit the ConstantSize and SizeLength format
parameters from the fmtp command. If the MPEG 4 audio codec requires
configuration data to be sent via SDP, AudioSpecificConfig() may be
added.


8. Security Considerations

Cryptographic authentication of incoming RTP and RTCP packets is highly
recommended when using MWPP. Without such protections, attackers could
forge MIDI commands into an ongoing stream, potentially damaging
speakers and eardrums. An attacker could also craft RTP and RTCP packets
to exploit known bugs in the client, and take effective control of a
client machine.


9. Congestion Control

MWPP has congestion control issues that are unique for an RTP audio
packetization. In certain applications such as network musical
performance [6], the packet rate is linked to the gestural rate of a
human performer.

MWPP implementations SHOULD sense the MIDI wire protocol stream for
command patterns that result in excessive packet rates, and filter these
streams as part of MWPP to reduce the packet rate.


10. Acknowledgements

We thank the networking, media compression, and computer music community
members who have contributed to the MWPP standardization effort,
including Steve Casner, Robin Davies, Dominique Fober, Philippe Gentric,
Phil Kerr, Young-Kwon Lim, Colin Perkins, Larry Rowe, Dave Singer, and
Martijn Sipkema.





Appendix A.1. Recovery Journal Definitions

In this Appendix, we define the terminology and the coding idioms that
are used in the recovery journal bitfield descriptions in Section 5
(journal header structure), Appendices A.2-8 (channel journal chapters)
and Appendices B.1-5 (system journal chapters).

These descriptions assume that the recovery journal resides in an RTP
packet with sequence number I ("packet I") and that the Checkpoint
Packet Seqnum field in the top-level recovery journal header refers to a
packet with sequence number C. Sequence number algorithms defined for
the recovery journal system use modulo 2^16 arithmetic.
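
As an illustration of the modulo 2^16 arithmetic, the Python sketch
below enumerates the sequence numbers from packet C up to, but not
including, packet I, wrapping past 65535 where needed. The function
name is an assumption of this example, and the sketch assumes that C
does not lie ahead of I in the stream.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide

def packets_from_checkpoint(c: int, i: int) -> list:
    """Sequence numbers C, C+1, ..., I-1 under modulo 2^16 arithmetic."""
    count = (i - c) % SEQ_MOD
    return [(c + k) % SEQ_MOD for k in range(count)]
```

For C = 65534 and I = 1 the sketch yields 65534, 65535, 0; for C equal
to I it yields an empty list.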

Several bitfield coding idioms appear throughout the recovery journal
system, with consistent semantics. Most recovery journal elements begin
with an "S" (Single-packet loss) bit. S bits are designed to help
receivers efficiently parse through the recovery journal hierarchy in
the common case of the loss of a single packet.

The default value of the S bit is 1. An S bit for a recovery journal
element in packet I is set to 0 if the element encodes data about a MIDI
command stored in the MIDI command section of packet I - 1. If an
element has its S bit set to 0, all higher-level recovery journal
elements that contain it also have S bits that are set to 0, including
the top-level recovery journal header (Figure 7 in Section 5).

Other coding idioms that appear with consistent semantics throughout the
recovery journal system are described below.

  o R flag bit. R flag bits are reserved for future use by MWPP.
    Senders MUST set R bits to 0; receivers MUST ignore R bit values.

  o LENGTH field. All fields named LENGTH (as distinct from LEN)
    code the number of octets in the structure that contains it,
    including the header it resides in and all hierarchical levels
    below it. This definition simplifies parsing, as receivers may
    skip over the entire structure with an addition operation.

We now define normative terms used to describe recovery journal
semantics.

  o Checkpoint history. The checkpoint history of a recovery journal
    is the concatenation of the MIDI command sections of packets C
    through I - 1. The last MIDI command in the MIDI command section
    for packet I - 1 is considered the most recent command; the first
    MIDI command in the MIDI command section for packet C is
    the oldest command. A checkpoint history with no MIDI commands
    is considered to be empty. The checkpoint history never contains
    the MIDI Command section of the packet I (the packet containing
    the recovery journal), so if C == I, the checkpoint history is
    empty by definition.

  o Session history. The session history of a recovery journal is
    the concatenation of MIDI command sections from the first
    packet of the session up to packet I - 1. The definitions of
    MIDI command recency and history emptiness are the same as in
    the checkpoint history. The session history never contains the
    MIDI command section of packet I, and so the session history of
    the first packet in the session is empty by definition.

  o Active commands (default). For most types of MIDI commands,
    an active MIDI command is defined to be a MIDI command that does
    not appear before one of the following MIDI commands in the session
    history:  System Reset (0xFF), General MIDI System Enable
    (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable
    (0xF0 0x7E 0xcc 0x09 0x00 0xF7). A few types of MIDI commands
    use a modified meaning of active (see below).

  o Active commands (NoteOn, NoteOff, Poly Aftertouch). For MIDI NoteOn,
    NoteOff, and Poly Aftertouch commands, an active MIDI command is
    defined to be a MIDI command that does not appear before one of the
    following MIDI commands in the session history: System Reset (0xFF),
    General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General
    MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7), MIDI Control
    Change number 120 (All Sound Off) or 123 (All Notes Off).

  o Active commands (MIDI Control Change). For MIDI Control Change
    commands, an active MIDI command is defined to be a MIDI command
    that does not appear before one of the following MIDI commands in
    the session history: System Reset (0xFF), General MIDI System Enable
    (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable
    (0xF0 0x7E 0xcc 0x09 0x00 0xF7), MIDI Control Change number 121
    (All Controllers Off).

The chapter definitions in Appendices A.2-8 and B.1-5 reflect the
default recovery journal behavior of MWPP. The chmay, chmust, and
chnever SDP parameters modulate these definitions, as described in
Section 6.

Finally, we note that channel journals only encode information about
MIDI commands appearing on the MIDI channel the journal protects. All
references to MIDI commands in Appendices A.2-8 should be read as "MIDI
commands appearing on this channel."






Appendix A.2. Chapter P: MIDI Program Change

A channel journal MUST contain Chapter P if an active Program Change
(0xC) command appears in the checkpoint history.  Figure A.2.1 shows the
format for Chapter P.

         0                   1                   2
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |S|   PROGRAM   |C| BANK-COARSE |F| BANK-FINE   |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure A.2.1 -- Chapter P Format

The chapter has a fixed size of 24 bits.  The PROGRAM field indicates
the program value of the most recent Program Change command in the
checkpoint history.

By default, bits 8-23 of Chapter P are set to 0.  However, if an active
Control Change (0xB) command for controller 0 (Bank Select Coarse)
appears before this Program Change command in the session history, the C
bit is set to 1, and the BANK-COARSE field is set to the 7-bit data
value for the most recent Control Change command for controller 0. The F
bit and BANK-FINE field code the Control Change command for controller
32 (Bank Select Fine) in an identical manner.
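
A receiver might unpack Chapter P as sketched below in Python. The
function name and the use of None for an uncoded bank value are
assumptions of this illustration; the bit positions follow Figure
A.2.1.

```python
def decode_chapter_p(data: bytes) -> dict:
    """Unpack the fixed 24-bit Chapter P bitfield of Figure A.2.1."""
    if len(data) < 3:
        raise ValueError("Chapter P needs 3 octets")
    return {
        "S": bool(data[0] & 0x80),
        "program": data[0] & 0x7F,
        # BANK-COARSE is meaningful only when the C bit is set.
        "bank_coarse": (data[1] & 0x7F) if data[1] & 0x80 else None,
        # BANK-FINE is meaningful only when the F bit is set.
        "bank_fine": (data[2] & 0x7F) if data[2] & 0x80 else None,
    }
```

For the octets 0x85 0x81 0x00, the sketch reports program 5 with a
coarse bank value of 1 and no fine bank value.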


Appendix A.3. Chapter W: MIDI Pitch Wheel

A channel journal MUST contain Chapter W if an active MIDI Pitch Wheel
(0xE) command appears in the checkpoint history.  Figure A.3.1 shows the
format for Chapter W.

                 0                   1
                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |S|     FIRST   |R|    SECOND   |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure A.3.1 -- Chapter W Format

The chapter has a fixed size of 16 bits.  The FIRST and SECOND fields
are the 7-bit values of the first and second data octets of the most
recent active Pitch Wheel command in the checkpoint history.
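
The Python sketch below unpacks Chapter W. Combining FIRST and SECOND
into a single 14-bit wheel value relies on the MIDI 1.0 convention that
the first data octet of a Pitch Wheel command is the least significant
7 bits and the second is the most significant; that convention is not
restated in this Appendix. The function and key names are illustrative.

```python
def decode_chapter_w(data: bytes) -> dict:
    """Unpack the fixed 16-bit Chapter W bitfield of Figure A.3.1."""
    if len(data) < 2:
        raise ValueError("Chapter W needs 2 octets")
    first = data[0] & 0x7F    # first data octet of the command
    second = data[1] & 0x7F   # second data octet; the R bit is ignored
    return {
        "S": bool(data[0] & 0x80),
        "first": first,
        "second": second,
        "wheel": (second << 7) | first,  # 14-bit value, 8192 = centered
    }
```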







Appendix A.4. Chapter N: MIDI NoteOff and NoteOn

In this Appendix, we consider NoteOn commands with zero velocity to be
NoteOff commands.

A channel journal MUST contain Chapter N if an active MIDI NoteOn (0x9)
or NoteOff (0x8) command appears in the checkpoint history. Figure A.4.1
shows the format for Chapter N.

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |B|     LEN     |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |S|   NOTENUM   |Y|  VELOCITY   | ....                          |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |   BITFIELD    |   BITFIELD    |     ....      |   BITFIELD    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure A.4.1 -- Chapter N Format

Chapter N codes the most recent active NoteOn or NoteOff reference to a
MIDI note number in the checkpoint history.  Chapter N consists of a
2-octet header, followed by at least one of the following data
structures:

   o A list of note logs to code NoteOn commands.
   o A NoteOff bitfield structure to code NoteOff commands.

The note log list MUST contain an entry for all note numbers whose most
recent checkpoint history appearance is in a NoteOn command, except in
cases where 128 note logs would be required (Chapter N codes a maximum
of 127 note logs). The NoteOff bitfield structure MUST contain a set bit
for all note numbers whose most recent checkpoint history appearance is
in a NoteOff command. A note number is never coded in both structures.

The header for Chapter N, reproduced in Figure A.4.2, codes the size of
the note list and bitfield structures.

                 0                   1
                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |B|     LEN     |  LOW  | HIGH  |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure A.4.2 -- Chapter N Header

The 7-bit LEN field codes the number of 2-octet note logs in the note
list. Zero is a valid value for LEN, and codes the empty note list.



The 4-bit LOW and HIGH fields code the number of NoteOff bitfield octets
that follow the note log list. LOW and HIGH are unsigned integer values.
If LOW is less than or equal to HIGH, there are (HIGH - LOW + 1) NoteOff
bitfield octets in the chapter. An empty NoteOff bitfield structure is
coded by setting LOW to 15 and HIGH to 0.

The B bit is set to 1 if the MIDI command section of packet I - 1 does
not include a NoteOff command for this channel. The B bit, like the S
bit (Appendix A.1), helps receivers efficiently parse recovery journals
in the common case of the loss of a single packet.

We now describe the 2-octet note log structure, reproduced in Figure
A.4.3.

                 0                   1
                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |S|   NOTENUM   |Y|  VELOCITY   |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure A.4.3 -- Chapter N Note Log

The 7-bit NOTENUM field codes the note number for the log; a note number
may not be represented by multiple note logs in the note list.  The
7-bit VELOCITY field codes the velocity value for the most recent NoteOn
command for the note number in the checkpoint history. VELOCITY is never
zero; NoteOn commands with zero velocity are coded as NoteOff commands
in the NoteOff bitfield structure.

The note log does not code the execution time of the NoteOn command;
however, the Y bit codes information about the execution time.  The Y
bit is set to 1 if MIDI Command N in the MIDI command section of packet
I is considered to be simultaneous with the NoteOn command coded by the
note log. If the MIDI command section contains no events, Y is set to 1
if a hypothetical MIDI command occurring at the RTP timestamp time would
be considered simultaneous. The definition of simultaneity is
implementation dependent.

We now describe the NoteOff bitfield structure.  A NoteOff bitfield
octet codes NoteOff information for eight consecutive MIDI note numbers,
with the MSB representing the lowest note number. The MSB of the first
bitfield octet codes the note number 8*LOW; the MSB of the last bitfield
octet codes the note number 8*HIGH.

A set bit codes a NoteOff command for the note number; Chapter N does
not code NoteOff velocity data.  In the most efficient coding for the
NoteOff bitfield structure, the first and last octets of the structure
contain at least one set bit.
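
The pieces of Chapter N described above can be walked in a single pass.
The Python sketch below is an illustration under assumed names; it
returns the B bit, a mapping from note number to note log contents, the
note numbers flagged in the NoteOff bitfield, and the number of octets
consumed.

```python
def parse_chapter_n(data: bytes):
    """Walk the Chapter N header, note logs, and NoteOff bitfield."""
    b_bit = bool(data[0] & 0x80)
    log_count = data[0] & 0x7F               # LEN: number of note logs
    low, high = data[1] >> 4, data[1] & 0x0F
    pos = 2
    note_ons = {}
    for _ in range(log_count):
        first, second = data[pos], data[pos + 1]
        note_ons[first & 0x7F] = {
            "S": bool(first & 0x80),
            "Y": bool(second & 0x80),
            "velocity": second & 0x7F,
        }
        pos += 2
    note_offs = []
    # LOW > HIGH (for example LOW = 15, HIGH = 0) codes an empty bitfield.
    if low <= high:
        for octet_index in range(low, high + 1):
            octet = data[pos]
            pos += 1
            for bit in range(8):
                # The MSB of each octet codes the lowest note number.
                if octet & (0x80 >> bit):
                    note_offs.append(8 * octet_index + bit)
    return b_bit, note_ons, note_offs, pos
```

For the octets 0x81 0x77 0xBC 0x64 0x82, the sketch finds one note log
(note 60, velocity 100) and NoteOff bits for notes 56 and 62.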



Appendix A.5. Chapter A: MIDI Poly Aftertouch

A channel journal MUST contain Chapter A if an active Poly Aftertouch
(0xA) command appears in the checkpoint history.  Figure A.5.1 shows the
format for Chapter A.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|    LEN      |S|   NOTENUM   |R|  PRESSURE   |S|   NOTENUM   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|  PRESSURE   |  ....                                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure A.5.1 -- Chapter A format

The chapter consists of a 1-octet header, followed by a variable length
list of 2-octet note logs. A note log MUST appear for a note number if
an active Poly Aftertouch command for the note number appears in the
checkpoint history.  A note number may not be represented by multiple
note logs in the note list.

The 7-bit LEN field codes the number of note logs in the list, minus
one. Figure A.5.2 reproduces the note log structure of Chapter A.

                 0                   1
                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |S|   NOTENUM   |R|  PRESSURE   |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure A.5.2 -- Chapter A Note Log

The 7-bit PRESSURE field codes the pressure value of the most recent
Poly Aftertouch command in the checkpoint history. The MIDI note number
for this command is coded in the 7-bit NOTENUM field.


Appendix A.6. Chapter T: MIDI Channel Aftertouch

A channel journal MUST contain Chapter T if an active MIDI Channel
Aftertouch (0xD) command appears in the checkpoint history.  Figure
A.6.1 shows the format for Chapter T.








                        0
                        0 1 2 3 4 5 6 7
                       +-+-+-+-+-+-+-+-+
                       |S|   PRESSURE  |
                       +-+-+-+-+-+-+-+-+

                Figure A.6.1 -- Chapter T Format

The chapter has a fixed size of 8 bits.  The 7-bit PRESSURE field holds
the pressure value of the most recent active Channel Aftertouch command
sent on this channel.


Appendix A.7. Chapter C: MIDI Control Change

A channel journal MUST contain Chapter C if an active Control Change
(0xB) command appears in the checkpoint history (excepting controller
numbers 0, 6, 32, 38, 96, 97, 98, 99, 100, and 101). In certain cases
(defined later in this Appendix) this rule also applies to the excepted
controller numbers. Figure A.7.1 shows the format for Chapter C.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|     LEN     |S|   NUMBER    |A|  VALUE/ALT  |S|   NUMBER    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |A| VALUE/ALT   |  ....                                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure A.7.1 -- Chapter C format

The chapter consists of a 1-octet header, followed by a variable length
list of 2-octet controller logs.  The list MUST contain an entry for a
controller number if an active Control Change command for the number
appears in the checkpoint history (excepting numbers 0, 6, 32, 38, 96,
97, 98, 99, 100, 101, 124, 125, 126, and 127). In certain cases (defined
later in this Appendix) this rule also applies to the excepted
controller numbers.

The 7-bit LEN field codes the number of controller logs in the list,
minus one.  A controller number may not appear in multiple controller
logs in the list. Figure A.7.2 reproduces the controller log structure
of Chapter C.








                 0                   1
                 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |S|    NUMBER   |A|  VALUE/ALT  |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure A.7.2 -- Chapter C Controller Log

The 7-bit NUMBER field identifies the controller number. The 7-bit
VALUE/ALT field codes recovery information for the most recent Control
Change command for this number in the checkpoint history.

Chapter C provides three tools for coding recovery information for a
command in the VALUE/ALT field: the value tool, the toggle tool, and the
count tool. Implementations may choose among the tools to code a Control
Change command.

In the value tool, the 7-bit VALUE field codes the control value of the
most recent Control Change command for this controller number.  This
tool works best for controllers that code a continuous quantity, such as
number 1 (Modulation Wheel). If the value tool is chosen, the A bit is
set to 0.

The A bit is set to 1 to code the toggle or count tool. These tools work
best for controllers that code discrete actions.  Figure A.7.3 shows the
controller log for these tools.

                0                   1
                0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               |S|    NUMBER   |1|T|    ALT    |
               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure A.7.3 -- Controller Log for ALT tools

The T flag is set to 1 to code the toggle tool; T is set to 0 to code
the count tool. Both methods use the 6-bit ALT field as an unsigned
integer.
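
The three controller log layouts can be told apart from the A and T
bits alone, as the Python sketch below illustrates. The function name
and the "tool" labels in the result are assumptions of this example.

```python
def decode_controller_log(data: bytes) -> dict:
    """Unpack one 2-octet Chapter C controller log."""
    if len(data) < 2:
        raise ValueError("a controller log needs 2 octets")
    log = {"S": bool(data[0] & 0x80), "number": data[0] & 0x7F}
    if not data[1] & 0x80:
        # A = 0: value tool, 7-bit VALUE field.
        log.update(tool="value", value=data[1] & 0x7F)
    else:
        # A = 1: the T flag selects toggle (1) or count (0); ALT is 6 bits.
        tool = "toggle" if data[1] & 0x40 else "count"
        log.update(tool=tool, alt=data[1] & 0x3F)
    return log
```

For example, the octets 0xC0 0xC3 decode as a toggle tool log for
controller 64 with an ALT value of 3.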

The toggle tool works best for controllers that act as on/off switches,
such as 64 (Hold Pedal). These controllers code the "off" state with
control values 0-63 and the "on" state with 64-127. The ALT field codes
the total number of toggles (off->on and on->off) due to Control Change
commands in the session history. Toggle counting is performed modulo 64,
and the controller is assumed to be off at the start of a session.

The Hold Pedal controller illustrates the benefit of the toggle tool
over the value tool for switch controllers. As often used in piano
applications, the "on" state of the Hold Pedal lets notes resonate,
while the "off" state immediately damps notes to silence. The loss of
the "off" command in an "on->off->on" sequence results in ringing notes
that should have been damped silent.  The toggle tool lets receivers
detect this lost "off" command but the value tool does not.
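
The Hold Pedal case can be worked through with the Python sketch below,
which counts toggles as described for the ALT field. The function name
is an assumption of this example, and the scenario assumes that the
receiver saw only the first "on" command while the session history
holds the full "on->off->on" sequence.

```python
def toggle_count(values, modulo: int = 64) -> int:
    """Count off->on and on->off transitions, starting from 'off'."""
    state_on = False   # the controller is assumed off at session start
    toggles = 0
    for value in values:
        now_on = value >= 64   # 0-63 code "off", 64-127 code "on"
        if now_on != state_on:
            toggles = (toggles + 1) % modulo
            state_on = now_on
    return toggles
```

With a session history of control values 127, 0, 127, the sender codes
a toggle count of 3, while a receiver that saw only the first 127 holds
a count of 1. The mismatch exposes the missed "off". Under the value
tool, the journal would code 127, which matches the receiver's current
state and reveals nothing.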

The count tool is similar to the toggle tool, but is optimized for
controllers whose value octet is ignored, such as 120 (All Sound Off).
For the count tool, the ALT field codes the total number of Control
Change commands in the session history. Command counting is performed
modulo 64, and the command count is set to 0 at the start of the
session.

We now describe normative coding rules for the controller numbers that
are excepted from the general rules presented in the beginning of this
Appendix. For each excepted controller number, we define the conditions
under which a control log MUST appear in Chapter C for the controller
number. By extension, these conditions imply that Chapter C MUST appear
in the recovery journal.

If active Control Change commands for controller numbers 0 (Bank Select
Coarse) or 32 (Bank Select Fine) appear in the checkpoint history, the
most recent commands for these numbers MUST appear as entries in the
controller list only if the data values for these commands are not coded
in the BANK-COARSE (0) or BANK-FINE (32) fields of the Chapter P
(Appendix A.2) for the channel journal. This rule avoids redundant
coding in Chapters C and P.

Several controller number pairs are defined to be mutually exclusive.
Controller numbers 124 (Omni Off) and 125 (Omni On) form a mutually
exclusive pair, as do controller numbers 126 (Mono) and 127 (Poly).  If
active Control Change commands for one or both members of a mutually
exclusive pair appear in the checkpoint history, one and only one
controller log MUST appear in the controller list to code the pair.
This controller log MUST code the controller number of the most recent
Control Change command of the pair.

Appendix A.8 defines Chapter M, the MIDI Parameter chapter, to provide
resiliency for the MIDI registered/non-registered parameter system.
Here, we define the Chapter C rules for coding Control Change commands
related to the registered/non-registered parameter system. These Chapter
C rules serve to minimize redundancy with Chapter M.

Control Change commands for controller numbers 6 and 38 (Data Slider)
and 96 and 97 (Data Button) may be used as part of the parameter system,
or may be used as general-purpose controllers. Control Change commands
for controller numbers 6, 38, 96, or 97 that appear in the checkpoint
history, and that are used in the parameter system, MUST NOT appear as
entries in the controller list.

However, if active Control Change commands for controller numbers 6, 38,
96, or 97 appear in the checkpoint history, and these commands are used
as general-purpose controllers, the most recent general-purpose command
instance for these numbers MUST appear as entries in the controller
list.

A parameter system transaction begins with paired Control Change
commands for numbers 98 and 99 (Non-Registered Parameter LSB and MSB) or
100 and 101 (Registered Parameter LSB and MSB). Chapter M codes these
paired Control Change commands. The Chapter C rule below acts to code
"unpaired" commands for these controller numbers, which appear in the
checkpoint history if a (98, 99) or (100, 101) pair is split across the
MIDI command sections of two MWPP packets.

If the most recent active Control Change command for controller 98, 99,
100, or 101 in the checkpoint history is part of a (98, 99) or (100,
101) command pair that begins a parameter system transaction, the
command MUST NOT appear in the controller list. However, if the most
recent active Control Change command for controller 98, 99, 100, or 101
in the checkpoint history does not form part of a (98, 99) or (100, 101)
command pair, an entry MUST appear in the controller list.


Appendix A.8. Chapter M: MIDI Parameter System

A channel journal MUST contain Chapter M if an active Control Change
command that forms part of an initiated parameter system transaction (as
defined below) appears in the checkpoint history.

We begin by defining the terms "parameter system", "parameter system
transaction", and "initiated parameter system transaction" as used in
this Appendix.

  o  Parameter system. This phrase refers to a MIDI feature that
     provides two sets of 16,384 parameters to augment the
     Control Change controller number space. The Registered Parameter
     Names (RPN) system and the Non-Registered Parameter Names
     (NRPN) system each provide 16,384 parameters.

  o  Parameter system transaction. The values of RPNs and NRPNs are
     changed by a series of Control Change commands that form a
     transaction. A transaction begins with two Control Change
     commands to set the parameter number (controller numbers
     98 and 99 for NRPNs, controller numbers 100 and 101 for RPNs).
     The transaction continues with an arbitrary number of
     Data Entry (controller numbers 6 and 38) and Data Button
     (controller numbers 96 and 97) Control Change commands to
     set the parameter value. The transaction ends with a second
     pair of (98, 99) or (100, 101) Control Change commands. These
     terminal commands are considered a part of the transaction.
     In addition, the terminal commands may start a second
     parameter system transaction; in this case, these commands
     belong to two transactions.

  o  Initiated parameter system transaction. An initiated parameter
     system transaction is a transaction whose (98, 99) or (100, 101)
     initial active Control Change command pair appears in the session
     history. Under certain conditions, unpaired active Control Change
     commands for controller numbers 98, 99, 100, or 101 are coded in
     Chapter C, as described in Appendix A.7.

Figure A.8.1 shows the variable-length format of Chapter M.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|P|N|R|R|R|      LENGTH       |  Transaction log list ...     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.8.1  Top-level Chapter M format

Chapter M consists of a 2-octet header, followed by a list of
transaction log entries. The 10-bit LENGTH field codes the length of
Chapter M, and conforms to the semantics described in Appendix A.1.

If an active Control Change command that forms part of an initiated
parameter system transaction appears in the checkpoint history, a log
entry for the transaction MUST appear in the transaction list.

The relative order of transaction list entries MUST reflect the relative
position of parameter transactions in the session history: the first log
entry codes the most recent parameter transaction in the history, the
second log entry codes a transaction that appears before the first
parameter transaction in the history, etc.

The P header bit is set to 1 if an active Control Change command pair to
terminate the first RPN transaction in the log list does not appear in
the session history. The N header bit has the same role for the first
NRPN transaction in the log list.
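
As a non-normative illustration, the Python sketch below shows one way
a receiver might split the 2-octet Chapter M header into its fields. It
assumes that bit 0 in Figure A.8.1 is the most-significant bit of the
first octet. The function name and dictionary keys are hypothetical and
are not defined by this memo.

```python
def parse_chapter_m_header(octets: bytes) -> dict:
    """Split the 2-octet Chapter M header (Figure A.8.1) into fields.

    Hypothetical helper; assumes bit 0 of the figure is the MSB.
    """
    if len(octets) < 2:
        raise ValueError("Chapter M header requires 2 octets")
    word = (octets[0] << 8) | octets[1]   # network byte order
    return {
        "S": (word >> 15) & 0x1,
        "P": (word >> 14) & 0x1,          # 1: first RPN transaction has no terminating pair
        "N": (word >> 13) & 0x1,          # 1: first NRPN transaction has no terminating pair
        "reserved": (word >> 10) & 0x7,   # the three R bits
        "LENGTH": word & 0x3FF,           # 10-bit chapter length
    }
```

For example, the header octets 0x60 0x05 would decode to P = 1, N = 1,
and LENGTH = 5 under these assumptions.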

Figure A.8.2 shows the structure of a transaction log.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|T|       PARAM-NUMBER        |     KEY       |  DATA   ...   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   ...         |      KEY      |   DATA ...                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure A.8.2  Transaction Log Structure

The transaction log consists of a 2-octet header, followed by a
compressed enumeration of the Control Change commands for controller
numbers 6, 38, 96, and 97 for this transaction in the session history.
The presence of Control Change commands to terminate the transaction log
is coded implicitly by the P and N header bits of the top-level chapter
format (Figure A.8.1).

A transaction log header codes the parameter identity. If T is set to 1,
the log codes an NRPN parameter; if T is set to 0, the log codes an RPN
parameter. The 14-bit PARAM-NUMBER header field codes the parameter
number.
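
The transaction log header can be unpacked in a similar way. The sketch
below is illustrative only: it assumes the bit order drawn in Figure
A.8.2 (S first, then T, then PARAM-NUMBER), and the function name and
result keys are hypothetical.

```python
def parse_transaction_log_header(octets: bytes) -> dict:
    """Decode the 2-octet transaction log header (Figure A.8.2).

    Hypothetical helper; assumes bit 0 of the figure is the MSB.
    """
    if len(octets) < 2:
        raise ValueError("transaction log header requires 2 octets")
    word = (octets[0] << 8) | octets[1]
    t_flag = (word >> 14) & 0x1
    return {
        "S": (word >> 15) & 0x1,
        "kind": "NRPN" if t_flag else "RPN",   # T = 1 codes NRPN, T = 0 codes RPN
        "param_number": word & 0x3FFF,         # 14-bit PARAM-NUMBER
    }
```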

The KEY and DATA fields that follow the log header encode the
compressed enumeration of the Control Change commands for numbers 6, 38,
96, and 97. The ordering of this enumeration matches the ordering of
commands in the transaction: the first transaction command appears as
the first command in the enumeration, the second transaction command
appears as the second command in the enumeration, etc.

KEY and DATA fields always appear in pairs in the transaction log; at
least one KEY-DATA pair MUST appear in a transaction log, even if no
Control Change commands need to be coded. The KEY field has a fixed
1-octet size, and acts as a directory for the KEY-DATA pair; the DATA
field has a variable size of 0-3 octets. Figure A.8.3 shows the format
of the KEY octet.

                        0
                        0 1 2 3 4 5 6 7
                       +-+-+-+-+-+-+-+-+
                       |S|M|IN1|IN2|IN3|
                       +-+-+-+-+-+-+-+-+

                   Figure A.8.3 -- Key Octet

The two-bit fields IN1, IN2, and IN3 code the appearance and meaning of
the first, second, and third DATA octet that may follow the KEY octet.

The IN fields code the following information:

  o  IN_k = 0x0. The DATA octet for this position is not present. The
     permitted placements of the 0x0 value are: IN1 = IN2 = IN3 = 0x0
     (no DATA octets follow the KEY octet), IN2 = IN3 = 0x0 (one DATA
     octet follows the KEY octet), IN3 = 0x0 (two DATA octets follow
     the KEY octet).

  o  IN_k = 0x1. Indicates an active Control Change command for
     controller number 6 (Data Entry Slider Coarse); the DATA
     octet codes the third octet of the Control Change command.

  o  IN_k = 0x2. Indicates an active Control Change command for
     controller number 38 (Data Entry Slider Fine); the DATA
     octet codes the third octet of the Control Change command.

  o  IN_k = 0x3. Indicates one or more active Control Change commands
     for controller number 96 (Data Button Increment) and/or 97
     (Data Button Decrement), without an intervening Control Change
     command for controller 6 or 38. The DATA octet codes the
     cumulative effect of the Data Button commands, as a two's
     complement 8-bit value: controller 96 commands increment the
     value by 1, controller 97 commands decrement the value by 1.

The M flag is 1 if another KEY octet follows the DATA octet(s). If M is
0, another transaction log may follow the DATA octet(s), or the DATA
octet(s) may mark the end of Chapter M, depending on the LENGTH field of
the top-level Chapter M header shown in Figure A.8.1.
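
The KEY-DATA coding described above might be decoded as in the
following sketch. It is a minimal, non-normative example that assumes
the IN field values 1, 2, and 3 select controller 6, controller 38, and
the cumulative Data Button value respectively, and that bit 0 of Figure
A.8.3 is the most-significant bit of the KEY octet. The function name
and the tuple labels ("cc6", "cc38", "buttons") are hypothetical.

```python
def decode_key_data_pairs(buf: bytes, pos: int = 0):
    """Decode the KEY-DATA pairs of one transaction log.

    Returns (commands, next_pos), where commands is a list of
    (label, value) tuples in transaction order. Hypothetical helper.
    """
    commands = []
    while True:
        key = buf[pos]
        pos += 1
        more = (key >> 6) & 0x1                               # M flag
        in_fields = [(key >> s) & 0x3 for s in (4, 2, 0)]     # IN1, IN2, IN3
        seen_absent = False
        for code in in_fields:
            if code == 0:
                seen_absent = True        # no DATA octet at this position
                continue
            if seen_absent:
                raise ValueError("DATA octet coded after an absent position")
            data = buf[pos]
            pos += 1
            if code == 1:
                commands.append(("cc6", data))       # Data Entry coarse
            elif code == 2:
                commands.append(("cc38", data))      # Data Entry fine
            else:
                # Two's complement 8-bit sum of Data Button commands.
                delta = data - 256 if data >= 128 else data
                commands.append(("buttons", delta))
        if not more:
            return commands, pos
```

Whether the octet at next_pos starts another transaction log or lies
beyond Chapter M would still be decided by the LENGTH field of the
chapter header.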

In comparison with other recovery journal chapters, Chapter M is
inefficient: each transaction for a parameter number in the checkpoint
history is listed in the transaction list, and each Control Change
command for a transaction is enumerated in a transaction log. This
design decision trades off recovery journal size for design simplicity.
In practice, parameter system commands rarely appear in MIDI streams,
and this design decision does not have a significant impact on MWPP
bandwidth requirements.


Appendix B.1. System Chapter D: Reset, Song Select, Tune Request

The system journal MUST contain Chapter D if an active MIDI Reset
(0xFF), MIDI Tune Request (0xF6), or MIDI Song Select (0xF3) command
appears in the checkpoint history.  Note that General MIDI reset
commands are coded in Chapter X (Appendix B.5), not in Chapter D.
Figure B.1.1 shows the variable-length format for Chapter D.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|E|T|G|R|R|R|R|  Command logs ...                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure B.1.1 -- System Chapter D Format

The chapter consists of a 1-octet header, followed by one or more
command logs. Header flag bits indicate the presence of command logs for
the Reset (E = 1), Tune Request (T = 1), and Song Select (G = 1)
commands. Command logs appear in a list following the header, in the
order that their flag bits appear in the header.

Figure B.1.2 shows the 1-octet command log format for the Reset and Tune
Request commands.

                         0
                         0 1 2 3 4 5 6 7
                        +-+-+-+-+-+-+-+-+
                        |S|    COUNT    |
                        +-+-+-+-+-+-+-+-+

       Figure B.1.2 -- Command Log for Reset and Tune Request

Chapter D MUST contain the Reset command log if an active Reset command
appears in the checkpoint history. The 7-bit COUNT field codes the total
number of Reset commands (modulo 128) present in the session history.

Chapter D MUST contain the Tune Request command log if an active Tune
Request command appears in the checkpoint history. The 7-bit COUNT field
codes the total number of Tune Request commands (modulo 128) present in
the session history.

Figure B.1.3 shows the 1-octet command log format for the Song Select
command.

                         0
                         0 1 2 3 4 5 6 7
                        +-+-+-+-+-+-+-+-+
                        |S|    VALUE    |
                        +-+-+-+-+-+-+-+-+

           Figure B.1.3 -- Song Select Command Log Format

Chapter D MUST contain the Song Select command log if an active Song
Select command appears in the checkpoint history. The 7-bit VALUE field
codes the song number of the most recent Song Select command in the
checkpoint history.
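
The Chapter D layout described above can be sketched as follows. This
example is non-normative: it assumes bit 0 of each figure is the
most-significant bit, and the function name and result keys are
hypothetical.

```python
def parse_chapter_d(buf: bytes) -> dict:
    """Parse a Chapter D header and the command logs that follow it.

    Hypothetical helper; "length" reports the octets consumed.
    """
    header = buf[0]
    result = {"S": (header >> 7) & 0x1}
    pos = 1
    # Logs follow the header in flag order: E (Reset), T (Tune Request),
    # G (Song Select). Each log is one octet: an S bit and 7 data bits.
    layout = (("reset", 6, "count"),
              ("tune_request", 5, "count"),
              ("song_select", 4, "value"))
    for name, bit, field in layout:
        if (header >> bit) & 0x1:
            log = buf[pos]
            pos += 1
            result[name] = {"S": (log >> 7) & 0x1, field: log & 0x7F}
    result["length"] = pos
    return result
```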


Appendix B.2. System Chapter V: Active Sense Command

The system journal MUST contain Chapter V if an active MIDI Active Sense
(0xFE) command appears in the checkpoint history.  Figure B.2.1 shows
the format for Chapter V.

                         0
                         0 1 2 3 4 5 6 7
                        +-+-+-+-+-+-+-+-+
                        |S|    COUNT    |
                        +-+-+-+-+-+-+-+-+

               Figure B.2.1 -- System Chapter V Format

The 7-bit COUNT field codes the total number of Active Sense commands
(modulo 128) present in the session history.


Appendix B.3. System Chapter Q: Sequencer State Commands

The system journal MUST contain Chapter Q if an active MIDI Song
Position Pointer (0xF2), MIDI Clock (0xF8), MIDI Tick (0xF9), MIDI Start
(0xFA), MIDI Continue (0xFB) or MIDI Stop (0xFC) command appears in the
checkpoint history.  Figure B.3.1 shows the variable-length format for
Chapter Q.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|N|D|C|T|Q|TOP|          CLOCK                |   TICKS       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      ...                       |             QNOTE            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ...          |
+-+-+-+-+-+-+-+-+

               Figure B.3.1 -- System Chapter Q Format

Chapter Q encodes the most recent sequencer system state held in the
session history. In a temporal sense, the fields of Chapter Q reflect
system state up to (but not including) the moment encoded by the RTP
timestamp of packet I.

Chapter Q consists of a 1-octet header followed by several optional
fields, in the order shown in Figure B.3.1.  Three header bits (C, T,
and Q) indicate the presence of fields following the header.  Two header
bits (N and D) encode aspects of the sequencer system state directly.

Header flag bits C, T, and Q signal the presence of the 16-bit CLOCK
field (C set to 1), the 24-bit TICKS field (T set to 1) and the 24-bit
QNOTE field (Q set to 1).

The N header bit encodes the relative occurrence of the Start, Continue
and Stop commands in the session history.  If an active Start or
Continue command appears most recently, N is set to 1.  If an active
Stop appears most recently, or if no active instances of these commands
appear in the session history, N is set to 0.

The D header bit encodes the presence of the downbeat.  If N is set to
1, D is set to 1 if at least one Clock or Tick command follows the most
recent Start or Continue command in the session history. If this
condition does not hold, or if N is 0, then D is set to 0.
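
The rules for the N and D bits can be illustrated with a short sketch.
It assumes the session history is available as a list of command names,
oldest first, and that every listed command is active; the function
name and the command labels are hypothetical.

```python
def sequencer_flags(history):
    """Return (N, D) for a list of active commands, oldest first.

    Hypothetical helper; labels such as "start" are illustrative.
    """
    n = 0
    d = 0
    for cmd in history:
        if cmd in ("start", "continue"):
            n, d = 1, 0        # running, downbeat not yet seen
        elif cmd == "stop":
            n, d = 0, 0        # stopped: D is always 0 when N is 0
        elif cmd in ("clock", "tick") and n == 1:
            d = 1              # a timing pulse follows Start or Continue
    return n, d
```

With this sketch, the history ["start", "clock"] yields N = 1 and
D = 1, while ["start", "clock", "continue"] yields N = 1 and D = 0.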

If N is set to 0 (coding a stopped sequence), or if N is set to 1 and D
is set to 0 (coding a sequence on the verge of beginning), Chapter Q
MUST encode the starting song position of the sequence. The C and T
header flags, the optional CLOCK (if C is set to 1) and TICKS (if T is
set to 1) fields, and the TOP header field, act to code the starting
song position, via the methods described below.

   o If C = 0 and T = 0, the starting song position is at the
     beginning of the song.

   o If C = 1 and T = 0, the 2-bit TOP header field and the 16-bit
     CLOCK field are combined to form the 18-bit unsigned quantity
     65536*TOP + CLOCK. This value encodes the starting song
     position, in units of clocks (24 clocks per quarter note).
     Use this method if the MIDI source uses Clock commands as
     timing pulses.

   o If C = 0 and T = 1, the 24-bit TICKS field codes the starting
     song position, in units of milliseconds. Use this method
     if the MIDI source uses Tick commands as timing pulses
     (10 ms per Tick). The song position MUST be encoded using
     sub-Tick (i.e. sub-10ms) resolution.

   o If C = 1 and T = 1, the starting song position is the sum of
     the positions encoded by the CLOCK, TOP and TICKS fields, as
     described above. Use this method if the MIDI stream
     uses Tick commands as timing pulses and also uses the
     clock-based Song Position Pointer commands to reposition
     the sequence.
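
The four cases above can be sketched in code. The example below is
non-normative and assumes bit 0 of Figure B.3.1 is the most-significant
bit of the header octet. Because adding a clock count to a millisecond
count requires a tempo, the sketch returns the two components
separately rather than a single sum; that choice, and all function and
key names, are assumptions of this example.

```python
def parse_chapter_q(buf: bytes) -> dict:
    """Parse the Chapter Q header and its optional fields."""
    header = buf[0]
    fields = {
        "S": (header >> 7) & 0x1,
        "N": (header >> 6) & 0x1,
        "D": (header >> 5) & 0x1,
        "C": (header >> 4) & 0x1,   # CLOCK present
        "T": (header >> 3) & 0x1,   # TICKS present
        "Q": (header >> 2) & 0x1,   # QNOTE present
        "TOP": header & 0x3,
    }
    pos = 1
    # Optional fields appear in figure order: CLOCK, TICKS, QNOTE.
    for name, flag, size in (("CLOCK", "C", 2), ("TICKS", "T", 3), ("QNOTE", "Q", 3)):
        if fields[flag]:
            fields[name] = int.from_bytes(buf[pos:pos + size], "big")
            pos += size
    return fields


def song_position(fields: dict):
    """Return (clocks, milliseconds) coded by TOP/CLOCK and TICKS."""
    clocks = 65536 * fields["TOP"] + fields["CLOCK"] if fields["C"] else 0
    millis = fields["TICKS"] if fields["T"] else 0
    return clocks, millis
```

For instance, the octets 0x11 0x00 0x10 set C = 1 and TOP = 1 with
CLOCK = 16, giving a position of 65536 + 16 = 65552 clocks.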

If the N and D header bits are both set to 1, the sequence is playing,
and Chapter Q MUST encode the current song position in the sequence.
The current song position is coded using the same fields and methods as
the starting song position (see above). If the TICKS field is used to
code the current song position, the field value counts time up to the
moment encoded by the RTP timestamp of packet I.

Chapter Q MAY encode an estimate of the current tempo, by setting the Q
header bit to 1, and placing the estimated tempo value in the 24-bit
QNOTE field. The QNOTE field has units of microseconds per quarter note.
This memo does not define a normative algorithm for tempo estimation for
the QNOTE field.  Note that Q may be set to 1 even if N is set to 0,
providing a method for coding current tempo while the sequence is
stopped.
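
Since QNOTE is expressed in microseconds per quarter note, converting
to and from beats per minute is simple arithmetic. The helpers below
are a hypothetical sketch, not part of MWPP; they treat one beat as one
quarter note and reject values that do not fit the 24-bit field.

```python
MICROSECONDS_PER_MINUTE = 60_000_000
QNOTE_MAX = 0xFFFFFF   # largest value a 24-bit field can hold


def qnote_to_bpm(qnote: int) -> float:
    """Convert a QNOTE value to quarter notes per minute."""
    if not 0 < qnote <= QNOTE_MAX:
        raise ValueError("QNOTE out of range")
    return MICROSECONDS_PER_MINUTE / qnote


def bpm_to_qnote(bpm: float) -> int:
    """Convert quarter notes per minute to a QNOTE value."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    qnote = round(MICROSECONDS_PER_MINUTE / bpm)
    if not 0 < qnote <= QNOTE_MAX:
        raise ValueError("tempo does not fit the 24-bit QNOTE field")
    return qnote
```

A tempo of 120 quarter notes per minute corresponds to a QNOTE value of
500000.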


Appendix B.4. System Chapter E: MIDI Time Code Tape Position

The system journal MUST contain Chapter E if an active MIDI System
Common Quarter Frame message (0xF1) or an active System Exclusive
Universal Real Time MIDI Time Code Full Frame command (F0 7F cc 01 01 hr
mn sc fr F7) appears in the checkpoint history.  Figure B.4.1 shows the
variable-length format for Chapter E.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|Q|C|P|D|POINT|                COMPLETE                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 PARTIAL                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


               Figure B.4.1 -- System Chapter E Format

Chapter E holds information about the most recent MIDI Time Code (MTC)
tape position coded in the session history. Chapter E consists of a
1-octet header followed by two optional fields, in the order shown in
Figure B.4.1.

Two header bits (C and P) indicate the presence of fields following the
header. The other header fields (bit flags Q and D, and the 3-bit field
POINT) directly code information about the tape position.

MTC tape position updates may occur atomically, via the Full Frame
command, or incrementally, via a series of Quarter Frame commands. The Q
bit codes if the Quarter Frame (Q set to 1) or the Full Frame (Q set to
0) appears most recently in the session history.

At any moment in time, the session history may hold a sequence of zero
or more complete MTC frame values. A partially complete MTC frame value
may also appear in the session history (after the most recent complete
MTC frame value, if one exists).

If the session history holds a complete MTC frame, and if the Quarter
Frame or Full Frame command that completes this frame encoding appears
in the checkpoint history, Chapter E MUST include the 24-bit COMPLETE
field to encode the frame value. The C header bit is set to 1 to signal
the presence of the COMPLETE field.

If a partially complete MTC frame value appears in the session history
(after the most recent complete MTC frame value, if one exists), and if
at least one Quarter Frame command coding this partial value appears in
the checkpoint history, Chapter E MUST include the 24-bit PARTIAL field
to encode the frame value in progress. The P header bit is set to 1 to
signal the presence of the PARTIAL field.

The D header flag bit signals the direction the tape is moving.  D is
set to 0 for forward or no movement; D is set to 1 for reverse movement.
If Q is set to 1, the relative motion of the upper nibble of the Quarter
Frame data value determines D. If Q is set to 0, the relative tape
motion from its last position determines D.

The 3-bit POINT field holds information about the incremental Quarter
Frame encoding in the session history. If Q is set to 1, POINT codes the
upper nibble of the most recent Quarter Frame data value in the session
history. If Q is set to 0, POINT is reserved for future use; senders
MUST set POINT to 0x0, and receivers MUST ignore its value.

Figure B.4.2 shows the common format for the COMPLETE and PARTIAL
fields.

          0                   1                   2
          0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |TYP|  HOURS  |  MINUTES  | SECONDS   | FRAMES  |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure B.4.2 -- COMPLETE and PARTIAL format

The 5-bit HOURS, 6-bit MINUTES, 6-bit SECONDS, and 5-bit FRAMES fields
encode the SMPTE values encoded in Full Frame and Quarter Frame
commands.  The bit allocations are sufficient to encode legal SMPTE
values; note that for some fields, the associated MIDI commands use
larger encodings. The 2-bit TYP field encodes the SMPTE frame type,
using the same encoding as the Quarter Frame and Full Frame commands.

If used in the COMPLETE field, the TYP, HOURS, MINUTES, SECONDS, and
FRAMES fields hold the most recent complete frame value, encoded by a
Full Frame command or a series of 8 Quarter Frame commands in the
session history.

If used in the PARTIAL field, the TYP, HOURS, MINUTES, SECONDS, and
FRAMES fields do not all contain valid values.  Recall that the PARTIAL
field encodes a partially complete SMPTE value encoded by a series of
Quarter Frame commands in the session history. The size and direction of
the Quarter Frame command series may be inferred from the POINT and D
header values. For each bit position in Figure B.4.2, the bit contains
valid data if its associated command appears in the session history;
otherwise, the bit position holds 0.

If a COMPLETE field represents a Quarter Frame command series, its coded
value MUST include the 2-frame offset adjustment for Quarter-Frame
transmission. However, the PARTIAL field MUST NOT include this offset.
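
The Chapter E layout described in this Appendix can be summarized by
the following non-normative sketch. It assumes bit 0 of Figures B.4.1
and B.4.2 is the most-significant bit, and it does not apply or remove
the 2-frame Quarter Frame offset. All function and key names are
hypothetical.

```python
def unpack_mtc_field(field: bytes) -> dict:
    """Unpack a 24-bit COMPLETE or PARTIAL field (Figure B.4.2)."""
    if len(field) != 3:
        raise ValueError("COMPLETE and PARTIAL fields are 3 octets")
    value = int.from_bytes(field, "big")
    return {
        "TYP": (value >> 22) & 0x3,       # 2 bits
        "HOURS": (value >> 17) & 0x1F,    # 5 bits
        "MINUTES": (value >> 11) & 0x3F,  # 6 bits
        "SECONDS": (value >> 5) & 0x3F,   # 6 bits
        "FRAMES": value & 0x1F,           # 5 bits
    }


def parse_chapter_e(buf: bytes) -> dict:
    """Parse the Chapter E header and any fields it announces."""
    header = buf[0]
    out = {
        "S": (header >> 7) & 0x1,
        "Q": (header >> 6) & 0x1,   # 1: Quarter Frame most recent
        "C": (header >> 5) & 0x1,   # COMPLETE field present
        "P": (header >> 4) & 0x1,   # PARTIAL field present
        "D": (header >> 3) & 0x1,   # 1: reverse tape movement
        "POINT": header & 0x7,
    }
    pos = 1
    # COMPLETE precedes PARTIAL when both are present.
    for flag, name in (("C", "COMPLETE"), ("P", "PARTIAL")):
        if out[flag]:
            out[name] = unpack_mtc_field(buf[pos:pos + 3])
            pos += 3
    return out
```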


Appendix B.5. System Chapter X: System Exclusive

MIDI System Exclusive (opcode 0xF0, abbreviation SysEx) commands may
have arbitrary length. In this Appendix, we describe System Chapter X,
whose encoding is optimized for the short SysEx commands that signal
real-time events.

Note that Chapter X is not suitable for use with the longer SysEx
commands used in bulk data transport.  A MIDI session that combines
real-time and bulk-data functions SHOULD be sent over two MWPP streams:
a bulk-data stream sent over reliable transport, and a real-time
unreliable stream for shorter commands. The midiport SDP parameter
(Sections 2 and 6) supports split-stream operation.

We now describe Chapter X in detail. The system journal MUST contain at
least one Chapter X entry if an active SysEx command (excluding the MTC
Full Frame command) appears in the checkpoint history.  A SysEx command
is said to "appear" in the checkpoint history if the history contains a
verbatim encoding of the SysEx command, or if the history contains at
least one segment of the segmental encoding of the SysEx command.

Note that the structure of the system journal (Figure 9 in Section 5)
permits multiple entries for Chapter X. Each Chapter X entry codes
information about exactly one SysEx command. The relative ordering of
Chapter X entries MUST reflect the relative position of commands in the
checkpoint history: the first Chapter X entry codes the most recent
SysEx command in the history, the second Chapter X entry codes a SysEx
command that appears before the first coded SysEx command in the
history, etc.

Chapter X provides two tools for encoding multiple SysEx commands of the
same type. Each command of a certain type may be encoded in a separate
Chapter X entry (the list tool) or only the most recent command of a
certain type may be encoded (the recency tool).

Each active SysEx command that appears in the checkpoint history MUST be
associated with a Chapter X entry via the list or recency tool
(excluding the MTC Full Frame command).  For each SysEx command type, an
implementation may choose either coding tool. Simple implementations may
use the list tool for all command types; sophisticated implementations
may reduce bandwidth by using the recency tool for some command types.

Figure B.5.1 shows the variable length format for System Chapter X.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|IDC|L|T| LEN |  DATA ...                                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure B.5.1 -- System Chapter X Format

Chapter X consists of a 1-octet header, followed by an arbitrary-length
DATA field. The DATA field encodes a modified version of the data octets
of the SysEx command, as described below. The leading 0xF0 and trailing
0xF7 SysEx octets never appear in the DATA field.

If the Manufacturer ID value of the SysEx command (coded in the first
data octet of the SysEx command) has the value 0x00, 0x7E, or 0x7F, the
DATA field begins with the second data octet of the SysEx command; for
all other Manufacturer ID values, the DATA field begins with the first
data octet of the SysEx command. The 2-bit IDC header field codes the
0x00, 0x7E, and 0x7F ID values, using the method shown in Figure B.5.2.

-----------------------------------------------------------------------
| IDC | Manufacturer ID                | First DATA octet is:         |
|-----|--------------------------------|------------------------------|
| 0x0 | 0x7E (Universal Non-Real-Time) | 2nd SysEx data octet         |
|-----|--------------------------------|------------------------------|
| 0x1 | 0x7F (Universal Real-Time)     | 2nd SysEx data octet         |
|-----|--------------------------------|------------------------------|
| 0x2 | 0x00 (Extension Escape Code)   | 2nd SysEx data octet         |
|-----|--------------------------------|------------------------------|
| 0x3 | in the range 0x01--0x7D        | 1st SysEx data octet         |
-----------------------------------------------------------------------

                Figure B.5.2 -- IDC Header Field Encoding
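
The mapping in Figure B.5.2 amounts to a small lookup. In the
illustrative sketch below, the input is assumed to be the SysEx data
octets with the leading 0xF0 and the trailing end-of-exclusive octet
already removed; the function name is hypothetical.

```python
# IDC codes for the three Manufacturer ID values that are elided
# from the DATA field (Figure B.5.2).
IDC_BY_MANUFACTURER_ID = {0x7E: 0x0, 0x7F: 0x1, 0x00: 0x2}


def idc_and_data(sysex_data: bytes):
    """Return (IDC, octets to place in the DATA field).

    Hypothetical helper; sysex_data starts with the Manufacturer ID.
    """
    if not sysex_data:
        raise ValueError("SysEx command has no data octets")
    manufacturer_id = sysex_data[0]
    if manufacturer_id in IDC_BY_MANUFACTURER_ID:
        # ID is implied by IDC, so DATA starts at the 2nd data octet.
        return IDC_BY_MANUFACTURER_ID[manufacturer_id], bytes(sysex_data[1:])
    # All other IDs: DATA starts at the 1st data octet.
    return 0x3, bytes(sysex_data)
```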

The 3-bit LEN header field codes the exact length of short, complete
SysEx commands, and signals alternative coding techniques for longer
commands and truncated commands.

The LEN values 0x0 through 0x5 indicate that the length of the DATA
field is 1-6 octets. For these LEN values, the DATA field encodes a
complete SysEx command, as a verbatim copy of the SysEx data octets
(possibly skipping the first octet, as detailed in Figure B.5.2).

The LEN value 0x6 indicates that the DATA field contains 7 or more
octets. The DATA field encodes a complete SysEx command, as a verbatim
copy of the data octets of the SysEx command (possibly skipping the
first octet, as detailed in Figure B.5.2), with one exception: bit 7
(the most-significant bit) of the final data octet is set to one. This
set bit implicitly codes the length of the DATA field (MIDI data octets,
by definition, clear bit 7).

The LEN value 0x7 indicates that the DATA field encodes a truncated
SysEx command. This coding option is only to be used for SysEx commands
encoded using the segmented method, for the case where not all segments
appear in the session history.

If LEN is 0x7, the DATA field encodes the data octets of the SysEx
command segments that appear in the session history. The DATA field
holds a verbatim copy of the data octets of the coded portion of the
SysEx command, with two exceptions: the first octet may be skipped (as
detailed in Figure B.5.2) and bit 7 (the most-significant bit) of the
final coded data octet is set to one (to provide an implicit field
length, as in the case where LEN is 0x6).
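
A sender and receiver might handle the LEN field as in the following
sketch. It is non-normative and assumes the DATA octets have already
been selected according to the IDC field; the function names are
hypothetical, and the handling of an empty DATA field (rejected here)
is an assumption of this example.

```python
def encode_len_and_data(data: bytes, truncated: bool = False):
    """Return (LEN, DATA field octets) for one Chapter X entry."""
    if not data:
        raise ValueError("no DATA octets to encode")
    if not truncated and len(data) <= 6:
        return len(data) - 1, bytes(data)       # LEN 0x0-0x5: 1-6 octets
    # LEN 0x6 (complete, 7+ octets) or 0x7 (truncated): set bit 7 of
    # the final octet so the field length is implicit.
    marked = bytes(data[:-1]) + bytes([data[-1] | 0x80])
    return (0x7 if truncated else 0x6), marked


def data_field_length(len_code: int, buf: bytes, pos: int) -> int:
    """Return the number of DATA octets that start at buf[pos]."""
    if len_code <= 0x5:
        return len_code + 1
    end = pos
    while not buf[end] & 0x80:    # scan for the octet with bit 7 set
        end += 1
    return end - pos + 1
```

A receiver that needs the original SysEx data octets would clear bit 7
of the final DATA octet when LEN is 0x6 or 0x7.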

The L and T header flags describe the coding tool used for the Chapter X
bitfield. If L is set to 1 (the list tool), all SysEx commands of this
type have an associated Chapter X bitfield in the system journal.  If L
is set to 0 (the recency tool), only the most recent SysEx command of
this type has an associated Chapter X bitfield in the system journal.

The T flag defines the meaning of the word "type" in the previous
paragraph. The T flag has different semantics for MIDI Universal SysEx
commands (Manufacturers ID 0x7E and 0x7F) and for generic SysEx commands
(all other Manufacturers ID values).

We first define the T flag for Universal SysEx commands. The first four
data octets of Universal commands have defined semantics in the MIDI
standard; we symbolically represent these four octets as: ID cc SubID
SubID1. If T is set to 0, all Universal commands with the same ID, cc,
SubID, and SubID1 values are considered the same type. If T is set to 1,
all Universal commands with the same ID, cc, and SubID values are
considered the same type.
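
One way to apply this definition is to derive a comparison key from the
leading data octets of each Universal command, as sketched below. The
function name is hypothetical, and the sketch assumes the input begins
with the ID octet (0x7E or 0x7F).

```python
def universal_type_key(sysex_data: bytes, t_flag: int) -> bytes:
    """Return the octets that define the "type" of a Universal SysEx.

    Hypothetical helper; two commands with equal keys are of the
    same type for the given T flag.
    """
    if not sysex_data or sysex_data[0] not in (0x7E, 0x7F):
        raise ValueError("not a Universal SysEx command")
    # T = 0: ID, cc, SubID, SubID1 must match (4 octets).
    # T = 1: ID, cc, SubID must match (3 octets).
    return bytes(sysex_data[:3 if t_flag else 4])
```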

For generic SysEx commands (all Manufacturers ID values except 0x7E and
0x7F), we define the T flag as follows. The first data octet of a
generic SysEx command is the Manufacturers ID; the remaining data octets
may have an arbitrary organization, but often have a set of octets
coding device and sub-command, followed by data octets for the command.

If T is set to 0, all generic SysEx commands with the same ID value are
considered to be of the same type. If T is set to 1, the SysEx command
is assumed to have a device/sub-command/data organization, and all
generic SysEx commands with the same ID value, device, and sub-command
values are considered to be of the same type. If the SysEx command has a
multi-level sub-command structure, these semantics require identical
sub-command values at all levels.


Appendix C. Author Addresses

John Lazzaro (corresponding author)
UC Berkeley
CS Division
315 Soda Hall
Berkeley CA 94720-1776
Email: lazzaro@cs.berkeley.edu

John Wawrzynek
UC Berkeley
CS Division
631 Soda Hall
Berkeley CA 94720-1776
Email: johnw@cs.berkeley.edu


Appendix D. References

[1] MIDI Manufacturers Association. The complete MIDI 1.0
detailed specification, 1996. http://www.midi.org

[2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
RFC 1889: RTP: A transport protocol for real-time applications,
1996.

[3] Internet Engineering Task Force. RTP Payload Format for MPEG-4
Streams.  Work in progress, draft-ietf-avt-mpeg4-multisl-02.txt.

[4] Internet Engineering Task Force. Use of "RFC-generic" for MPEG-4
Elementary Streams with no SL layer. Work in progress,
draft-ietf-avt-mpeg4-simple-00.txt.

[5] International Standards Organization. ISO 14496 MPEG-4,
Part 3 (Audio) Subpart 5 (Structured Audio) 1999.

[6] John Lazzaro and John Wawrzynek. A Case for Network
Musical Performance. The 11th International Workshop on Network
and Operating Systems Support for Digital Audio and Video
(NOSSDAV 2001) June 25-26, 2001, Port Jefferson, New York.
http://www.cs.berkeley.edu/~lazzaro/sa/pubs/pdf/nossdav01.pdf

[7] Sfront source code release, includes a Linux networking
client that implements the MIDI RTP packetization.
http://www.cs.berkeley.edu/~lazzaro/sa/

[8] Dominique Fober, Yann Orlarey, Stephane Letz. Real Time Musical
Events Streaming over Internet. Proceedings of the International
Conference on WEB Delivering of Music 2001, pages 147-154
http://www.grame.fr/~fober/RTESP-Wedel.pdf

[9] M. Handley and V. Jacobson. RFC 2327: SDP: Session Description
Protocol.  1998.


Appendix E. Expiration Notice

This document expires October 11, 2002.