The VCDIFF Generic Differencing and Compression Data Format
RFC 3284

Document type: RFC - Proposed Standard (July 2002; No errata)
Was draft-korn-vcdiff (individual)
Document stream: IETF
Last updated: 2013-03-02

Network Working Group                                            D. Korn
Request for Comments: 3284                                     AT&T Labs
Category: Standards Track                                   J. MacDonald
                                                             UC Berkeley
                                                                J. Mogul
                                                 Hewlett-Packard Company
                                                                   K. Vo
                                                               AT&T Labs
                                                               June 2002

      The VCDIFF Generic Differencing and Compression Data Format

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This memo describes VCDIFF, a general, efficient and portable data
   format suitable for encoding compressed and/or differencing data so
   that they can be easily transported among computers.

Korn, et al.                Standards Track                     [Page 1]
RFC 3284                         VCDIFF                        June 2002

Table of Contents

    1.  Executive Summary ...........................................  2
    2.  Conventions .................................................  4
    3.  Delta Instructions ..........................................  5
    4.  Delta File Organization .....................................  6
    5.  Delta Instruction Encoding .................................. 12
    6.  Decoding a Target Window .................................... 20
    7.  Application-Defined Code Tables ............................. 21
    8.  Performance ................................................. 22
    9.  Further Issues .............................................. 24
   10.  Summary ..................................................... 25
   11.  Acknowledgements ............................................ 25
   12.  Security Considerations ..................................... 25
   13.  Source Code Availability .................................... 25
   14.  Intellectual Property Rights ................................ 26
   15.  IANA Considerations ......................................... 26
   16.  References .................................................. 26
   17.  Authors' Addresses .......................................... 28
   18.  Full Copyright Statement .................................... 29

1.  Executive Summary

   Compression and differencing techniques can greatly improve storage
   and transmission of files and file versions.  Since files are often
   transported across machines with distinct architectures and
   performance characteristics, such data should be encoded in a form
   that is portable and can be decoded with little or no knowledge of
   the encoders.  This document describes Vcdiff, a compact portable
   encoding format designed for these purposes.

   Data differencing is the process of computing a compact and
   invertible encoding of a "target file" given a "source file".  Data
   compression is similar, but without the use of source data.  The UNIX
   utilities diff, compress, and gzip are well-known examples of data
   differencing and compression tools.  For data differencing, the
   computed encoding is called a "delta file", and for data compression,
   it is called a "compressed file".  Delta and compressed files are
   well suited to storage and transmission because they are often
   smaller than the originals.

   Data differencing and data compression are traditionally treated as
   distinct types of data processing.  However, as shown in the Vdelta
   technique by Korn and Vo [1], compression can be thought of as a
   special case of differencing in which the source data is empty.  The
   basic idea is to unify the string parsing scheme used in the Lempel-
   Ziv'77 (LZ'77) style compressors [2] and the block-move technique of
   Tichy [3].  Loosely speaking, this works as follows:

      a. Concatenate source and target data.
      b. Parse the data from left to right as in LZ'77, but make sure
         that a parsed segment begins exactly at the start of the target
         data.
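
   The two steps above can be illustrated with a minimal Python sketch.
   It is not the Vcdiff format or the Vdelta algorithm: the tuple
   shapes ("ADD", bytes) and ("COPY", address, length), the MIN_MATCH
   threshold, and the function names encode and decode are all
   assumptions made for illustration, and the brute-force match search
   is chosen for brevity rather than speed.

```python
from typing import List, Tuple, Union

# Hypothetical instruction shapes for this sketch only (not the VCDIFF
# wire format): ("ADD", literal_bytes) or ("COPY", address, length).
Instruction = Union[Tuple[str, bytes], Tuple[str, int, int]]

MIN_MATCH = 4  # assumed threshold: shorter matches become literals


def encode(source: bytes, target: bytes) -> List[Instruction]:
    """Greedily parse target against source plus already-parsed target."""
    data = source + target      # step a: concatenate source and target
    pos = len(source)           # step b: a segment begins at the target
    out: List[Instruction] = []
    literal = bytearray()
    while pos < len(data):
        best_len, best_addr = 0, 0
        for addr in range(pos):  # brute-force search over earlier data
            length = 0
            while (pos + length < len(data)
                   and data[addr + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_addr = length, addr
        if best_len >= MIN_MATCH:
            if literal:          # flush pending unmatched bytes first
                out.append(("ADD", bytes(literal)))
                literal.clear()
            out.append(("COPY", best_addr, best_len))
            pos += best_len
        else:
            literal.append(data[pos])
            pos += 1
    if literal:
        out.append(("ADD", bytes(literal)))
    return out


def decode(source: bytes, instructions: List[Instruction]) -> bytes:
    """Rebuild the target from the source and the instruction list."""
    buf = bytearray(source)
    for inst in instructions:
        if inst[0] == "ADD":
            buf += inst[1]
        else:
            _, addr, length = inst
            for i in range(length):  # byte-wise: copies may overlap
                buf.append(buf[addr + i])
    return bytes(buf[len(source):])
```

   Calling encode with an empty source makes every COPY address refer
   to earlier target data, which corresponds to the observation that
   compression is the special case of differencing with no source.  In
   both cases decode(source, encode(source, target)) is expected to
   return the original target.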
