Common Format and MIME Type for Comma-Separated Values (CSV) Files
draft-shafranovich-rfc4180-bis-00

Document Type Active Internet-Draft (individual)
Author Yakov Shafranovich 
Last updated 2021-03-26
Network Working Group                                    Y. Shafranovich
Internet-Draft                                  Nightwatch Cybersecurity
Intended status: Informational                             26 March 2021
Expires: 27 September 2021

   Common Format and MIME Type for Comma-Separated Values (CSV) Files
                   draft-shafranovich-rfc4180-bis-00

Abstract

   This RFC documents the common format used for Comma-Separated Values
   (CSV) files and updates the associated MIME type "text/csv".

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 September 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Motivation For and Status of This Document
   2.  Definition of the CSV Format
     2.1.  High level description
     2.2.  Default charset and line break values
     2.3.  ABNF Grammar
   3.  Common implementation concerns
     3.1.  Null values
     3.2.  Empty files
     3.3.  Empty lines
     3.4.  Fields spanning multiple lines
     3.5.  Unique header names
     3.6.  Whitespace outside of quoted fields
     3.7.  Other field separators
     3.8.  Escaping double quotes
     3.9.  BOM header
   4.  Update to MIME Type Registration of text/csv
     4.1.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Major changes since RFC4180
   Appendix B.  Note to Readers
   Author's Address

1.  Introduction

   The comma-separated values (CSV) format has been used as a common way
   to exchange data between disparate systems and applications for many
   years.  Surprisingly, although this format is very popular, it had
   never been formally documented and had no registered media type.
   This was addressed in 2005 via publication of [RFC4180] and the
   concurrent registration of the "text/csv" media type.

   Since the publication of [RFC4180], the CSV format has evolved and
   this specification seeks to reflect these changes as well as update
   the "text/csv" media type registration.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2.  Motivation For and Status of This Document

   The original motivation of [RFC4180] was to provide a reference in
   order to register the media type "text/csv".  It tried to document
   existing practices at the time based on the approaches used by most
   implementations.  This document continues to do the same, and updates
   the original document to reflect current practices for generating and
   consuming CSV files.

   Both [RFC4180] and this document are published as informational RFCs
   for the benefit of the Internet community and are not intended to be
   used as formal standards.  Implementers should consult [RFC1796] and
   [RFC2026] for crucial differences between IETF standards and
   informational RFCs.

2.  Definition of the CSV Format

   While there had been various specifications and implementations for
   the CSV format (for example, [CREATIVYST], [EDOCEO], [CSVW], and
   [ART]), prior to the publication of [RFC4180] there was no attempt to
   provide a common specification.  This section documents the format
   that seems to be followed by most implementations (incorporating
   changes since the publication of [RFC4180]).

2.1.  High level description

   1.  Each record is located on a separate line, ended by a line break