Common Format and MIME Type for Comma-Separated Values (CSV) Files
draft-shafranovich-rfc4180-bis-00
Document type: Active Internet-Draft (individual)
Author: Yakov Shafranovich
Last updated: 2021-03-26
Stream: (None)
Intended RFC status: (None)
Stream state: (No stream defined)
Consensus Boilerplate: Unknown
RFC Editor Note: (None)
IESG state: I-D Exists
Responsible AD: (None)
Send notices to: (None)
Network Working Group Y. Shafranovich
Internet-Draft Nightwatch Cybersecurity
Intended status: Informational 26 March 2021
Expires: 27 September 2021
Abstract
This RFC documents the common format used for Comma-Separated Values
(CSV) files and updates the associated MIME type "text/csv".
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 27 September 2021.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Terminology
1.2. Motivation For and Status of This Document
2. Definition of the CSV Format
2.1. High level description
2.2. Default charset and line break values
2.3. ABNF Grammar
3. Common implementation concerns
3.1. Null values
3.2. Empty files
3.3. Empty lines
3.4. Fields spanning multiple lines
3.5. Unique header names
3.6. Whitespace outside of quoted fields
3.7. Other field separators
3.8. Escaping double quotes
3.9. BOM header
4. Update to MIME Type Registration of text/csv
4.1. IANA Considerations
5. Security Considerations
6. Acknowledgments
7. References
7.1. Normative References
7.2. Informative References
Appendix A. Major changes since RFC4180
Appendix B. Note to Readers
Author's Address
1. Introduction
The comma-separated values (CSV) format has been used for many years
as a common way to exchange data between disparate systems and
applications. Surprisingly, although the format was very popular, it
had never been formally documented and had no registered media type.
This was addressed in 2005 via publication of [RFC4180] and the
concurrent registration of the "text/csv" media type.
Since the publication of [RFC4180], the CSV format has evolved and
this specification seeks to reflect these changes as well as update
the "text/csv" media type registration.
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
1.2. Motivation For and Status of This Document
The original motivation of [RFC4180] was to provide a reference in
order to register the media type "text/csv". It tried to document
existing practices at the time based on the approaches used by most
implementations. This document continues to do the same, and updates
the original document to reflect current practices for generating and
consuming CSV files.
Both [RFC4180] and this document are published as informational RFCs
for the benefit of the Internet community and are not intended to be
used as formal standards. Implementers should consult [RFC1796] and
[RFC2026] for crucial differences between IETF standards and
informational RFCs.
2. Definition of the CSV Format
While there had been various specifications and implementations for
the CSV format (e.g., [CREATIVYST], [EDOCEO], [CSVW] and [ART]),
prior to the publication of [RFC4180] there was no attempt to provide
a common specification. This section documents the format that seems
to be followed by most implementations (incorporating changes since
the publication of [RFC4180]).
2.1. High level description
1. Each record is located on a separate line, ended by a line break
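The record-per-line rule in item 1 can be illustrated with a minimal
sketch using Python's standard csv module. It assumes CRLF as the line
break, which is what Python's csv.writer emits by default; the draft's
own default line break is defined in Section 2.2, which is not shown
here. The sample field values are made up for illustration.

```python
import csv
import io

# Two illustrative records; the field values are arbitrary sample data.
rows = [["aaa", "bbb", "ccc"], ["zzz", "yyy", "xxx"]]

# Write each record on its own line. newline="" keeps the writer's
# line terminator ("\r\n" by default) from being translated.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Read the text back: each line break ends one record.
parsed = list(csv.reader(io.StringIO(text, newline="")))

print(repr(text))
print(parsed)
```

Here every record, including the last one, is ended by a line break,
and reading the text back yields one list of fields per line.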