INTERNET-DRAFT                         Laile L. Di Silvestro (Microsoft)
Expires in 6 months                           Greg Baribault (Microsoft)
                                                   Microsoft Corporation
                                                           June 20, 1999


                       Waveform Audio File Format
                       MIME Sub-type Registration
                       <draft-ema-vpim-wav-00.txt>


Status of this memo:

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   This document is an Internet Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also
   distribute working documents as Internet Drafts.

   Internet Drafts are valid for a maximum of six months and may be
   updated, replaced, or obsoleted by other documents at any time.
   It is inappropriate to use Internet Drafts as reference material
   or to cite them other than as a "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   This draft is being discussed by the Electronic Messaging
   Association VPIM work group. To subscribe to the mailing list, send
   a message to EMA Listserv Requests [listserv@listmail.ema.org] with
   the line "subscribe VPIM-L" in the body of the message.




















Di Silvestro, Baribault      Expires 12/20/99                   [Page 1]


Internet Draft                 audio/wav                          4/1/99


Abstract

   This document describes the registration of the MIME sub-type
   audio/wav for Waveform Audio File Format. This audio file format
   is based on RIFF and is defined by Microsoft in the Platform SDK.

1. Introduction
   This document describes the registration of the MIME sub-type
   audio/wav for the encapsulation of toll-quality audio in the
   Waveform Audio File Format. This audio file format is based on
   Resource Interchange File Format (RIFF), and is defined by Microsoft
   in the Platform SDK.

   The MIME subtype "wav" is being defined primarily for use in
   multimedia and voice messaging standards. The Voice Profile for
   Internet Mail, version 3 [VPIM3] working draft specifies that
   all VPIM version 3 compliant implementations MAY generate
   audio/wav bodyparts and MUST receive audio/wav bodyparts. The VPIM
   version 3 specification further states that all compliant
   implementations MUST support receipt of wav-encapsulated 32KADPCM
   (g.726 ADPCM), BASIC (g.711 mu-law), and MS-GSM (Microsoft g.610
   GSM) encoded audio.

   Because the Waveform Audio File format is not well-defined and has
   not undergone a process of standardization, this document briefly
   defines the format that will be supported by VPIM version 3. For
   more detailed information, refer to the specification.

   This document does not obsolete the Informational RFC 2361 [WAVE],
   which describes audio/vnd.wave. Whereas RFC 2361 describes
   a mechanism for indicating a codec registered in the wav or avi
   vendor tree registries, this document proposes a standard for
   specifying wav-encapsulated audio content in a MIME stream.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [REQ].


2. WAV Definition

   Waveform Audio File Format is a file format for the storing of
   audio data in data chunks according to the Resource Interchange File
   Format (RIFF). Although the Waveform format is described in detail
   in xxxxxxx, lack of standardization and a proliferation of
   interpretations and enhancements make the format difficult to
   implement and support in an interoperable fashion. This document
   seeks to rectify the situation by defining the Waveform Audio
   File Format features that MUST be implemented and supported for
   conformance with the proposed VPIM version 3 standard.



2.1 Data Organization
   Data MUST be stored in 8-bit bytes in little-endian order.
   Multi-byte values MUST be stored with the low-order bytes first,
   and the bits left-justified:

   (lsb = least-significant bit, msb = most-significant bit)

         7  6  5  4  3  2  1  0
       +-----------------------+
 char: | msb               lsb |
       +-----------------------+

         7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8
       +-----------------------+-----------------------+
short: |       byte 0      lsb | msb      byte 1       |
       +-----------------------+-----------------------+
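
   The byte order described above can be illustrated with a short
   sketch. It is not part of the format definition; the helper names
   are hypothetical, and Python's standard struct module is assumed.

```python
import struct

# "<" selects little-endian byte order; "H" is an unsigned 16-bit value
# and "I" is an unsigned 32-bit value.
def pack_u16(value: int) -> bytes:
    return struct.pack("<H", value)

def pack_u32(value: int) -> bytes:
    return struct.pack("<I", value)

def unpack_u16(raw: bytes) -> int:
    return struct.unpack("<H", raw)[0]

print(pack_u16(0x0140).hex())   # 4001  (low-order byte first)
print(pack_u32(8000).hex())     # 401f0000
print(unpack_u16(b"\x40\x01"))  # 320
```

   In each case the low-order byte is the first byte written.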

2.2 File Format
   The Waveform Audio File Format follows the Resource Interchange File
   Format (RIFF) standard in which all data is organized into 'chunks'
   and 'sub-chunks.' Each chunk MUST comprise a 4-byte chunk ID, a
   4-byte length field specifying the size of the data, and the chunk
   data.

   To be compliant with this proposed standard, wav-formatted audio
   data MUST include the following chunks:
        RIFF header chunk: ID = 'RIFF'
        Format chunk:      ID = 'fmt '
        Sound data chunk:  ID = 'data'
        Fact chunk:        ID = 'fact'

   The chunks MAY appear in any order except that the Format chunk
   MUST be placed before the Sound data chunk (but not necessarily
   contiguous to the Sound data chunk). Any additional chunks
   MUST be expected and MAY be ignored.
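
   A receiver might check these rules roughly as follows. This is a
   minimal sketch, not a reference implementation: the function names
   are hypothetical, the chunk payloads in the usage lines are
   placeholders rather than valid field contents, and the padding of
   odd-sized chunks to an even boundary is a RIFF convention that this
   document does not itself state.

```python
import struct

def iter_chunks(buf: bytes):
    """Yield (chunk_id, payload) for each sub-chunk of the RIFF chunk."""
    pos = 12  # skip 'RIFF', the 4-byte length, and 'WAVE'
    while pos + 8 <= len(buf):
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        yield chunk_id, buf[pos + 8 : pos + 8 + size]
        # Assumption: RIFF pads odd-sized chunks to an even boundary.
        pos += 8 + size + (size & 1)

def check_required_chunks(buf: bytes) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if buf[:4] != b"RIFF":
        problems.append("missing RIFF header chunk")
    # Unknown chunk IDs are simply carried along and ignored.
    ids = [cid for cid, _ in iter_chunks(buf)]
    for needed in (b"fmt ", b"fact", b"data"):
        if needed not in ids:
            problems.append("missing chunk " + needed.decode("ascii"))
    if b"fmt " in ids and b"data" in ids:
        if ids.index(b"fmt ") > ids.index(b"data"):
            problems.append("Format chunk must precede Sound data chunk")
    return problems

def chunk(cid: bytes, payload: bytes) -> bytes:
    """Build one chunk: ID, little-endian length, payload, optional pad."""
    pad = b"\x00" if len(payload) & 1 else b""
    return cid + struct.pack("<I", len(payload)) + payload + pad

body = (b"WAVE" + chunk(b"fmt ", bytes(16)) + chunk(b"LIST", b"abc")
        + chunk(b"fact", bytes(4)) + chunk(b"data", bytes(10)))
sample = b"RIFF" + struct.pack("<I", len(body)) + body
print(check_required_chunks(sample))  # []
```

   The extra 'LIST' chunk in the sample is skipped without error,
   matching the rule that additional chunks MAY be ignored.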

2.2.1 The RIFF Header Chunk
   The RIFF header corresponds to the outermost chunk. In an audio/wav
   file, it MUST adhere to the following format:

        OFFSET  LENGTH  VALUE   DESCRIPTION
        0       4 bytes 'RIFF'  The file format ID.
        4       4 bytes         Length of the file minus (-) 8 bytes.
        8       4 bytes 'WAVE'  The data format ID.
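
   The header layout above can be written and checked as in the
   following sketch. The function names are hypothetical and are used
   for illustration only.

```python
import struct

def build_riff_header(file_length: int) -> bytes:
    """Return the 12-byte RIFF header for a file of file_length bytes."""
    # The length field holds the file length minus 8 bytes.
    return struct.pack("<4sI4s", b"RIFF", file_length - 8, b"WAVE")

def parse_riff_header(buf: bytes) -> int:
    """Validate the header and return the file length it declares."""
    if len(buf) < 12:
        raise ValueError("too short for a RIFF header")
    file_id, length, data_id = struct.unpack_from("<4sI4s", buf, 0)
    if file_id != b"RIFF" or data_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    return length + 8

header = build_riff_header(1000)
print(parse_riff_header(header))  # 1000
```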


2.2.2 The Format Chunk
   The Format chunk specifies the characteristics of the audio data
   necessary to decompress it and play it. Each audio/wav file MUST
   include one and only one Format chunk. This chunk MUST include
   the following fields:



        OFFSET  LENGTH  VALUE   DESCRIPTION
        12      4 bytes 'fmt '  The chunk ID.
        16      4 bytes 16      Length of the chunk excluding the 8
                                bytes for the ID and length (16 for
                                the fields listed here).
        20      2 bytes         The codec ID.
        22      2 bytes         The number of channels.
        24      4 bytes         Samples per second.
        28      4 bytes         Average bytes per second.
        32      2 bytes         Block alignment.
        34      2 bytes         Bits per sample.

   Codec ID: The codec ID indicates what codec was used to compress
   the audio data. Three codecs are supported by the proposed VPIM
   version 3 standard, and one of them SHOULD be specified in the
   Codec ID field. The Codec ID field MAY indicate a codec other
   than the three listed below only in situations where it is certain
   that the recipient has the corresponding capabilities.

        CODEC                   ID
        g.711 mu-law            0x0007
        g.610 MS-GSM            0x0031
        g.726 32kADPCM          0x0064

   Number of Channels: To preserve network bandwidth and minimize
   memory requirements, the Format chunk SHOULD specify and the Data
   chunk SHOULD provide only one channel (mono) unless it is certain
   that the recipient supports multi-channel playback.

        CHANNELS                VALUE
        one (mono)              1

   Samples per Second: This field indicates the rate at which the
   audio is to be played (once uncompressed), expressed in sample
   frames per second. The following table specifies the samples per
   second that correspond to each VPIM version 3 codec:

        CODEC                   RATE (samples per second)
        g.711 mu-law            8000
        g.610 MS-GSM            8000
        g.726 32kADPCM          8000


   Average Bytes per Second: This field specifies the number of
   bytes that play per second. It provides an indication of
   the buffer size needed to store the audio in order to avoid
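   latency. It SHOULD be calculated according to the following
   formula: samples/second * block alignment / samples per block,
   where a block is one sample frame of the size given by the Block
   Alignment field (rounded up to nearest whole number).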

        CODEC                   RATE (average bytes per second)
        g.711 mu-law            8000
        g.610 MS-GSM            1625
        g.726 32kADPCM          4000
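
   The table values can be reproduced when the number of sample points
   carried in one block is taken into account. The sketch below is
   illustrative only. The rates and block alignments come from the
   tables in this section; the samples-per-block figures are
   assumptions not stated in this document: 320 for MS-GSM (the bytes
   0x40 0x01 read as a little-endian 16-bit value) and 4 for 32kADPCM
   (a 2-byte block of 4-bit samples).

```python
# name: (samples per second, block alignment in bytes, samples per block)
# The third column is an assumption made for this illustration.
CODECS = {
    "g.711 mu-law":   (8000, 1, 1),
    "g.610 MS-GSM":   (8000, 65, 320),
    "g.726 32kADPCM": (8000, 2, 4),
}

def average_bytes_per_second(rate: int, block_align: int,
                             samples_per_block: int) -> int:
    # Ceiling division in integer arithmetic: round up to a whole number.
    return -(-(rate * block_align) // samples_per_block)

for name, params in CODECS.items():
    print(name, average_bytes_per_second(*params))
```

   Under these assumptions the results are 8000, 1625, and 4000 bytes
   per second, matching the table.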

   Block Alignment: This field indicates the size of a sample frame
   in bytes. It SHOULD be calculated according to the following
   formula: number of channels * (bits per sample / 8)

        CODEC                   SIZE
        g.711 mu-law            1
        g.610 MS-GSM            65
        g.726 32kADPCM          2
              since there are 4 bits per sample, the frames will not
              align on one byte.  It is customary to add silence bits
              (0xF) to the end of the sample to make the frame end on
              a byte boundary.

   Bits per Sample: This field specifies the bit resolution of a
   sample point.

        CODEC                   BITS (bits per sample)
        g.711 mu-law            8
        g.610 MS-GSM            0
              data immediately followed by: 0x40 0x01
        g.726 32kADPCM          4

2.2.3 The Data Chunk
   The Data chunk contains the compressed audio data. This chunk
   MUST be preceded (though not immediately) by the Format chunk.
   The Data chunk MUST adhere to the following format:

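        OFFSET  LENGTH  VALUE   DESCRIPTION
                4 bytes 'data'  The chunk ID.
                4 bytes         Length of the data
                                (chunk size minus (-) 8 bytes).
                                The compressed audio.

   The offset of the Data chunk depends on the chunks that precede
   it, since the Format chunk need not be contiguous to it.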

2.2.4 The Fact Chunk
   All audio/wav files MUST include a Fact chunk as they contain
   compressed data. The Fact chunk MUST contain one field indicating
   the size (in sample points) of the audio data after decompression.
   The Fact chunk MUST adhere to the following format:

        OFFSET  LENGTH  VALUE   DESCRIPTION
                4 bytes 'fact'  The chunk ID.
                4 bytes  4      Chunk size minus (-) 8 bytes.
                4 bytes         Sample length.
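
   Because the sample length is expressed in sample points after
   decompression, a receiver can estimate playback time without
   decoding the audio. The following sketch is illustrative only; the
   function names are hypothetical, and the default rate of 8000
   samples per second is taken from the table in section 2.2.2.

```python
def seconds_from_fact(sample_length: int,
                      samples_per_second: int = 8000) -> float:
    """Playback time implied by the Fact chunk's sample length."""
    return sample_length / samples_per_second

def seconds_from_data(data_length: int,
                      average_bytes_per_second: int) -> float:
    """Playback time implied by the length of the Data chunk."""
    return data_length / average_bytes_per_second

# Three seconds of MS-GSM: 24000 sample points, or 4875 bytes of
# compressed audio at 1625 bytes per second.
print(seconds_from_fact(24000))       # 3.0
print(seconds_from_data(4875, 1625))  # 3.0
```

   Agreement between the two estimates is a simple consistency check
   on a received file.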











3. MIME Definition

3.1 audio/wav

   [Specification] describes a file format for the encapsulation of
   raw and compressed audio data. This Waveform Audio File Format (WAVE)
   is based on the Resource Interchange File Format specification
   developed by Microsoft and IBM in 1991. The WAVE format organizes
   audio data and the information needed to decompress and play it in
   chunks.

   The MIME sub-type audio/wav is defined to hold binary audio data
   encoded in 32 kbit/s ADPCM (g.726), mu-law (g.711), or MS-GSM
   (g.610), and encapsulated in the WAVE format. The content transfer
   encoding is typically either binary or base64.
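
   As an illustration of the base64 case, the sketch below attaches
   WAVE data to a message using the email package of the Python
   standard library. It is not a VPIM-conformant message: the function
   name, addresses, subject, and filename are placeholders, and the
   surrounding message structure is an assumption.

```python
from email.message import EmailMessage

def build_voice_message(wav_bytes: bytes,
                        filename: str = "message.wav") -> EmailMessage:
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Voice message"
    msg.set_content("A voice message is attached.")
    # maintype/subtype yield "Content-Type: audio/wav"; bytes content
    # is base64-encoded by default.
    msg.add_attachment(wav_bytes, maintype="audio", subtype="wav",
                       filename=filename)
    return msg

msg = build_voice_message(b"RIFF\x04\x00\x00\x00WAVE")
part = next(msg.iter_attachments())
print(part.get_content_type())            # audio/wav
print(part["Content-Transfer-Encoding"])  # base64
```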

3.2 VPIM Usage

   The audio/wav sub-type is a component of the proposed VPIM version
   3 specification [VPIM3]. In this context, the Content-Description
   header is used to succinctly describe the contents of the audio
   body.

   All VPIM Version 3 systems MUST be capable of receiving audio
   encapsulated in a WAVE file format. Sending systems MAY choose to
   send raw audio data or encapsulate it in the WAVE file format. All
   audio data MUST be compressed in one of the VPIM v3 codecs and
   encapsulated according to the guidelines provided in Section 2 of
   this document.

   Refer to the VPIM specification [VPIM3] for proper usage.

3.3 Relation to RFC 2361

   RFC 2361, "WAVE and AVI Codec Registries," is an Informational
   RFC describing IANA namespaces for codecs registered in
   Microsoft's WAVE and AVI registries. Such codecs may be described
   in the following format: audio/vnd.wave; codec = [codec ID].
   This format is not suited to the description of a wave file as
   defined in this document, as it does not indicate the format
   standard that audio/wav must adhere to for interoperability
   between messaging systems. On desktop-oriented messaging systems,
   audio/wav (rather than audio/vnd.wave) is the de facto standard.











4.  IANA Registration

   To: ietf-types@iana.org
   Subject: Registration of MIME media type audio/wav

   MIME media type name: audio
   MIME subtype name: wav

   Required parameters: none
   Optional parameters: codec = [codec id]

   Encoding considerations:
      Binary or base64 generally preferred

   Security considerations:
      There are no known security risks with the sending or
      playing of audio data. Wav-encapsulated audio data is
      typically interpreted only by a codec supported by a
      wav audio player. Unintended information introduced into
      the data stream will result in noise.

   Interoperability considerations:

   Published specification:
      None

   Applications which use this media type:
      Multimedia and voice messaging applications

   Additional information:
     Magic number(s): 'RIFF' at offset 0, 'WAVE' at offset 8
     File extension(s): .wav
     Macintosh File Type Code(s):  WAVE

   Person & email address to contact for further information:

     Laile L. Di Silvestro
     lailed@microsoft.com

     Greg Baribault
     gregbari@microsoft.com

   Intended usage: COMMON

   Author/Change controller:
     Laile L. Di Silvestro
     Greg Baribault







5. Authors' Addresses

   Laile L. Di Silvestro
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   lailed@microsoft.com

   Greg Baribault
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   gregbari@microsoft.com


6. References

   [G726] CCITT Recommendation G.726 (1990), General Aspects of Digital
          Transmission Systems, Terminal Equipment - 40, 32, 24, 16
          kbit/s Adaptive Differential Pulse Code Modulation (ADPCM).

   [MIME4] Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet
           Mail Extensions (MIME) Part Four: Registration Procedures",
           RFC 2048, November 1996.

   [VPIM1] Vaudreuil, G., "Voice Profile for Internet Mail", RFC 1911,
           February 1996.

   [VPIM2] Vaudreuil, G., and G. Parsons, "Voice Profile for Internet
           Mail - version 2", RFC 2421, September 1998.

   [VPIM3] Vaudreuil, G., "Voice Profile for Internet Mail, Version
           3", Work In Progress, <draft-ema-VPIMv3-00.txt>,
           December 1998.

   [REQ]   Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

   [WAVE]  Fleischman, E., "WAVE and AVI Codec Registries", RFC 2361,
           June 1998.














7.  Full Copyright Statement
   Copyright (C) The Internet Society (1999).  All Rights Reserved.
   This document and translations of it may be copied and furnished
   to others, and derivative works that comment on or otherwise
   explain it or assist in its implementation may be prepared,
   copied, published and distributed, in whole or in part, without
   restriction of any kind, provided that the above copyright notice
   and this paragraph are included on all such copies and derivative
   works.  However, this document itself may not be modified in any
   way, such as by removing the copyright notice or references to the
   Internet Society or other Internet organizations, except as needed
   for the purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.
   The limited permissions granted above are perpetual and will not
   be revoked by the Internet Society or its successors or assigns.

   Microsoft hereby grants to the IETF, a perpetual, nonexclusive,
   non-sublicensable, non assignable, royalty-free, world-wide right
   and license under any Microsoft copyrights in this contribution to
   copy, publish and distribute the contribution, as well as a right
   and license of the same scope to any derivative works prepared by
   the IETF and based on, or incorporating all or part of the
   contribution.
   Microsoft further agrees that, upon adoption of this contribution
   as an Internet Standard, Microsoft will grant to any party a
   royalty-free license on other reasonable and non-discriminatory
   terms under applicable Microsoft intellectual property rights to
   implement and use the technology proposed in this contribution for
   the purpose of supporting the Internet Standard.  Microsoft
   expressly reserves all other rights it may have in the material and
   subject matter of this contribution.
   Microsoft expressly disclaims any and all warranties regarding this
   contribution including any warranty that (a) this contribution does
   not violate the rights of others, (b) the owners, if any, of other
   rights in this contribution have been informed of the rights and
   permissions granted to IETF herein, and (c) any required
   authorizations from such owners have been obtained.
   This document and the information contained herein is provided on
   an "AS IS" basis and MICROSOFT DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE
   OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
   IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
   PURPOSE.
   IN NO EVENT WILL MICROSOFT BE LIABLE TO ANY OTHER PARTY INCLUDING
   THE IETF AND ITS MEMBERS FOR THE COST OF PROCURING SUBSTITUTE GOODS
   OR SERVICES, LOST PROFITS, LOSS OF USE, LOSS OF DATA, OR ANY
   INCIDENTAL, CONSEQUENTIAL, INDIRECT, OR SPECIAL DAMAGES WHETHER
   UNDER CONTRACT, TORT, WARRANTY, OR OTHERWISE, ARISING IN ANY WAY
   OUT OF THIS OR ANY OTHER AGREEMENT RELATING TO THIS DOCUMENT,
   WHETHER OR NOT SUCH PARTY HAD ADVANCE NOTICE OF THE POSSIBILITY OF
   SUCH DAMAGES.
