Network Working Group                                     T. Hansen, Ed.
Internet-Draft                                         AT&T Laboratories
Intended status: Informational                               L. Masinter
Expires: November 18, 2016                                      M. Hardy
                                                                   Adobe
                                                            May 17, 2016


              PDF for an RFC Series Output Document Format
                      draft-iab-rfc-use-of-pdf-02

Abstract

   This document discusses options and requirements for the PDF
   rendering of RFCs in the RFC Series, as outlined in RFC 6949.  It
   also discusses the use of PDF for Internet-Drafts, and available or
   needed software tools for producing and working with PDF.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 18, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of




Hansen, et al.          Expires November 18, 2016               [Page 1]


Internet-Draft                PDF for RFCs                      May 2016


   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Choosing PDF versions and Standards . . . . . . . . . . . . .   3
   3.  Options and Requirements for PDF RFCs . . . . . . . . . . . .   4
     3.1.  "Visible" Requirements  . . . . . . . . . . . . . . . . .   4
       3.1.1.  General Visible Requirements  . . . . . . . . . . . .   4
       3.1.2.  Page Size, Margins  . . . . . . . . . . . . . . . . .   5
       3.1.3.  Headers and Footers . . . . . . . . . . . . . . . . .   5
       3.1.4.  Paragraph Numbering . . . . . . . . . . . . . . . . .   5
       3.1.5.  Paged Content Layout  . . . . . . . . . . . . . . . .   6
       3.1.6.  Typeface Choices  . . . . . . . . . . . . . . . . . .   6
       3.1.7.  Hyphenation and Line Breaks . . . . . . . . . . . . .   7
       3.1.8.  Hyperlinks  . . . . . . . . . . . . . . . . . . . . .   8
       3.1.9.  Similarity to Other Outputs . . . . . . . . . . . . .   8
     3.2.  "Invisible" Options and Requirements  . . . . . . . . . .  10
       3.2.1.  Internal Text Representation  . . . . . . . . . . . .  10
       3.2.2.  Unicode Support . . . . . . . . . . . . . . . . . . .  11
       3.2.3.  Image Processing (Artwork)  . . . . . . . . . . . . .  11
       3.2.4.  Text Description of Images (Alt-Text) . . . . . . . .  11
       3.2.5.  Metadata Support  . . . . . . . . . . . . . . . . . .  12
       3.2.6.  Document Structure Support  . . . . . . . . . . . . .  12
       3.2.7.  Embedded Files  . . . . . . . . . . . . . . . . . . .  12
     3.3.  Digital Signatures  . . . . . . . . . . . . . . . . . . .  13
   4.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  14
     4.1.  References  . . . . . . . . . . . . . . . . . . . . . . .  14
     4.2.  Informative References  . . . . . . . . . . . . . . . . .  14
   Appendix A.  History and Current Use of PDF with RFCs and
                Internet-Drafts  . . . . . . . . . . . . . . . . . .  15
     A.1.  RFCs  . . . . . . . . . . . . . . . . . . . . . . . . . .  15
     A.2.  Internet-Drafts . . . . . . . . . . . . . . . . . . . . .  16
   Appendix B.  Paged Content Layout Quality . . . . . . . . . . . .  16
   Appendix C.  Tooling  . . . . . . . . . . . . . . . . . . . . . .  17
     C.1.  PDF Viewers . . . . . . . . . . . . . . . . . . . . . . .  17
     C.2.  Printers  . . . . . . . . . . . . . . . . . . . . . . . .  17
     C.3.  PDF Generation Libraries  . . . . . . . . . . . . . . . .  17
     C.4.  Typefaces . . . . . . . . . . . . . . . . . . . . . . . .  18
     C.5.  Other Tools . . . . . . . . . . . . . . . . . . . . . . .  18
   Appendix D.  Acknowledgements . . . . . . . . . . . . . . . . . .  18
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  19








1.  Introduction

   The RFC Series is evolving, as outlined in [RFC6949].  Future
   documents will use a canonical format, XML, with renderings in
   various formats, including PDF.

   Because PDF has a wide range of capabilities and alternatives, not
   all PDFs are "equal".  For example, visually similar documents could
   consist of scanned or rasterized images, or include text layout
   options, hyperlinks, embedded fonts, and digital signatures.  (See
   [I-D.hardy-pdf-mime] for a history of PDF.)

   This document explains some of the relevant options and makes
   recommendations, both for the RFC series and Internet-Drafts.

   The PDF format and the tools to manipulate it are not as well known
   as those for the other RFC formats, at least in the IETF community.
   This document discusses some of the processes for creating and using
   PDFs with both open source and commercial products.

   The details described in this document are expected to change based
   on experience gained in implementing the RFC production center's
   toolset.  Revised documents will be published capturing those changes
   as the toolset is completed.  Other implementers must not expect
   those changes to remain backwards-compatible with the details
   described in this document.

   NOTE: [RFC-EDITOR: This note should be removed before publication.]
   See <https://github.com/masinter/pdfrfc> for XML source, related
   files, and an issue tracker for this document.

2.  Choosing PDF versions and Standards

   PDF [PDF] has gone through several revisions, primarily for the
   addition of features.  PDF features have generally been added in a
   way that lets older viewers 'fail gracefully'.  Even so, there is a
   tradeoff: the older the PDF version produced, the more legacy viewers
   will support that version, but the fewer features will be available.

   As PDF has evolved a broad set of capabilities, additional standards
   for PDF files have become applicable.  These standards establish
   ground rules that are important for specific applications.  For
   example, PDF/X was specifically designed for prepress digital data
   exchange, with careful attention to color management and printing
   instructions.  The PDF/E standard was designed for engineering
   documents with dynamic workflows (where a document continues to be
   revised after publication) and allows interactive media (including
   animation and 3D).



   Two additional standards families are important to the RFC format,
   though: long-term preservation (PDF/A) and user accessibility
   (PDF/UA [PDFUA]).  PDF/A has sub-profiles (PDF/A-1, PDF/A-2 [PDFA2],
   PDF/A-3 [PDFA3]), each of which has conformance levels.  These
   standards are supported by various software libraries and tools.

   It is effective and useful to use these standards to capture PDF for
   RFC requirements, and they will make the PDF files useful in
   workflows that expect them.

   Recommendations:

      Use PDF 1.7; although relatively recent, it is well supported by
      widely available viewers.

      For RFCs, require PDF/A-3 with conformance level "U".  This
      captures the archivability and long-term stability of PDF 1.7
      files, mandatory Unicode mapping, and many of the required
      features.

      Use PDF/A-3 for embedding additional data (including the XML
      source file) in RFCs and Internet-Drafts.

      Use PDF/UA for user accessibility.
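
   As an illustration of how these recommendations surface in a file,
   conformance to PDF/A and PDF/UA is declared in the document's XMP
   metadata.  The following Python sketch builds such a declaration for
   PDF/A-3 level "U" and PDF/UA-1.  The function name is hypothetical,
   and the "pdfaid" and "pdfuaid" properties come from the PDF/A and
   PDF/UA standards rather than from this document.  A real generator
   would place the result inside a complete XMP packet.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"     # PDF/A identification schema
PDFUAID = "http://www.aiim.org/pdfua/ns/id/"   # PDF/UA identification schema


def conformance_description(pdfa_part=3, pdfa_level="U", pdfua_part=1):
    """Return an rdf:RDF element (as text) declaring PDF/A and PDF/UA."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("pdfaid", PDFAID)
    ET.register_namespace("pdfuaid", PDFUAID)

    rdf = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(rdf, "{%s}Description" % RDF,
                         {"{%s}about" % RDF: ""})
    # PDF/A part (1, 2, or 3) and conformance level (for example "U").
    ET.SubElement(desc, "{%s}part" % PDFAID).text = str(pdfa_part)
    ET.SubElement(desc, "{%s}conformance" % PDFAID).text = pdfa_level
    # PDF/UA part.
    ET.SubElement(desc, "{%s}part" % PDFUAID).text = str(pdfua_part)
    return ET.tostring(rdf, encoding="unicode")


print(conformance_description())
```

   Declaring conformance does not by itself make a file conformant; a
   validator is still needed to check the rest of the file.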

3.  Options and Requirements for PDF RFCs

   This section lays out options and requirements for PDFs produced by
   the RFC Editor for RFCs.  There are two main groups: "Visible"
   options are related to how the PDF appears when it is viewed with a
   PDF viewer.  "Invisible" options affect the ability to process PDFs
   in other ways, but do not control the way the document appears.  (Of
   course, a viewer UI might display processing capabilities, such as
   showing whether a document has been digitally signed.)

   In many cases, the choice of PDF requirements is heavily influenced
   by the capabilities of available tools to create PDFs.  Most of the
   discussion of tooling is to be found in Appendix C.

3.1.  "Visible" Requirements

   PDF supports rich visible layout of fixed-sized pages.

3.1.1.  General Visible Requirements

   For a consistent "look" of RFCs and good style, the PDFs produced by
   the RFC Editor should have a clear, consistent, identifiable, and
   easy-to-read style.  They should print well on the widest range of
   printers, and look good on displays of varying resolution.

3.1.2.  Page Size, Margins

   PDF files are laid out for a particular size of page and margins.
   There are two paper sizes in common use: "US Letter" (8.5 x 11
   inches, 216x279 mm, in popular use in North America) and "A4"
   (210x297 mm, 8.27x11.7 inches, standard for the rest of the world).
   Usually PDF printing software is used in a "shrink to fit" mode where
   the printing is adjusted to fit the paper in the printer.  There is
   some controversy over which size to target, but the argument that A4
   is an international standard is compelling.  Moreover, if the margins
   and header positioning are chosen appropriately, the document can be
   printed on either paper size without any scaling.

   Recommendation: The Internet-Draft and RFC processors should produce
   A4 size by default.  However, the margins and header positioning need
   to be chosen to look good on both paper sizes without scaling.
   Following the advice found in [RFC2346], this means that we should
   use A4 portrait mode with left and right margins of 20 mm, and top
   and bottom margins of 33 mm.
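
   The arithmetic behind this recommendation can be checked with a short
   Python sketch.  The function names are hypothetical; the page sizes
   and margins are the ones given above, and 72 points per inch is the
   default PDF user space unit.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72.0

A4_MM = (210.0, 297.0)
US_LETTER_MM = (8.5 * MM_PER_INCH, 11.0 * MM_PER_INCH)  # about 215.9 x 279.4


def text_area_mm(page, side_margin, top_bottom_margin):
    """Width and height left for content after removing the margins."""
    width, height = page
    return (width - 2 * side_margin, height - 2 * top_bottom_margin)


def centered_margins_mm(page, area):
    """Margins that result when the same text area is centered on a page."""
    return ((page[0] - area[0]) / 2, (page[1] - area[1]) / 2)


def mm_to_points(mm):
    """Convert millimetres to PDF points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


area = text_area_mm(A4_MM, side_margin=20.0, top_bottom_margin=33.0)
letter_side, letter_top = centered_margins_mm(US_LETTER_MM, area)

print("text area (mm): %.1f x %.1f" % area)
print("US Letter margins (mm): %.2f side, %.2f top and bottom"
      % (letter_side, letter_top))
print("text area (pt): %.1f x %.1f"
      % (mm_to_points(area[0]), mm_to_points(area[1])))
```

   The text area is 170 x 231 mm.  Centered on US Letter, that area
   leaves margins of roughly 23 mm at the sides and 24 mm at the top and
   bottom, which is why no scaling is needed on either paper size.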

3.1.3.  Headers and Footers

   Page headers and footers are part of the page layout.  There are a
   variety of options.  Note that page headers and footers in PDF can be
   typeset in a way that the entire (longer) title might fit.

   Recommendation: Page headers and footers should contain information
   similar to that in the headers and footers of the current text
   versions of documents, including page numbers, title, author, and
   working group.  However, the page headers and footers should be
   typeset so as to be unobtrusive, and should be placed into the PDF in
   a way that does not interfere with screen readers.

3.1.4.  Paragraph Numbering

   One common feature of the Internet-Draft output formats is optional
   visible paragraph numbers, to aid in discussions.  In the PDF, and
   thus the printed rendition, it is possible to make paragraph numbers
   unobtrusive, and even to impinge on the margins.

   Recommendation: When the XML "editing=yes" option has been chosen,
   show paragraph numbers in the right margin, typeset so as to be
   unobtrusive.  (Using the right margin instead of the left margin
   prevents the paragraph numbers from being confused with the section
   numbers.)  If possible, the paragraph numbers should be coded in a
   way that they do not interfere with screen readers.

3.1.5.  Paged Content Layout

   By its nature, PDF is paginated, so pagination issues must be
   considered.  This is reflected in two areas: running headers and
   footers, and how text is laid out on a page for optimal reading.

   Appendix B describes the process of creating a paged document from
   running text such that related material is present on the same page
   together and artifacts of pagination don't interfere with easy
   reading of the document.

   Layout engines differ in the quality of the algorithms used to
   automate these processes.  In some cases, the automated processes
   require some manual assistance to ensure, for example, that a text
   line intended as a heading is "kept" with the text it introduces.

   Recommendations:

   o  Headers and footers should be printed on each page.  The
      information should include the RFC number or Internet-Draft name,
      the page number, the category (informational, etc.), a shortened
      version of the authors' names, the date of the RFC or Internet-
      Draft, and the short form of the document title.

   o  Choose a layout engine so that manual intervention is minimized,
      and so that widow and orphan processing, as well as keeping
      headings and titles together with the text that follows them, is
      automatic.

3.1.6.  Typeface Choices

   A PDF may refer to a font by name, or it may use an embedded font.
   When a font is not embedded, a PDF viewer will attempt to locate a
   locally installed font of the same name.  If it cannot find an exact
   match, it will look for a "close match".  If a close match is not
   available, it will fall back to something implementation dependent
   and usually undesirable.

   In addition, the PDF/A standards mandate the embedding of fonts.
   Instead of using additional software to embed the fonts, the software
   generating the PDF files should produce PDF/A-conforming files
   directly, thus ensuring that all glyphs include Unicode mappings and
   embedded fonts from the outset.






   If the HTML version of the document is being visually mimicked, the
   font(s) chosen should have both variable width and constant width
   components, as well as bold and italic representations.

   The typefaces used by Internet-Drafts and by RFCs need not be
   identical.

   Few fonts have glyphs for the entire repertoire of Unicode
   characters; for this reason, the PDF generation tool may need a set
   of fonts and a way of choosing among them.  The RFC Editor is
   defining where Unicode characters may be used within RFCs
   [I-D.flanagan-nonascii].
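
   One possible "way of choosing" is an ordered fallback list, where
   each character is set in the first font that covers it.  The Python
   sketch below illustrates the idea.  The font names and coverage
   ranges are hypothetical, and a real tool would read coverage from the
   font files themselves.

```python
def pick_font(char, fonts):
    """Return the name of the first font whose coverage includes char."""
    code_point = ord(char)
    for name, ranges in fonts:
        if any(low <= code_point <= high for low, high in ranges):
            return name
    return None  # no embeddable font has a glyph for this character


def split_into_runs(text, fonts):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for char in text:
        font = pick_font(char, fonts)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + char)
        else:
            runs.append((font, char))
    return runs


# Hypothetical font names and coverage ranges, for illustration only.
FONTS = [
    ("BodyLatin", [(0x0020, 0x024F)]),  # Basic Latin to Latin Extended-B
    ("BodyGreek", [(0x0370, 0x03FF)]),  # Greek and Coptic
    ("BodyCJK", [(0x4E00, 0x9FFF)]),    # CJK Unified Ideographs
]

for font, run in split_into_runs("alpha is \u03b1", FONTS):
    print(font, repr(run))
```

   A result of None for some character signals that no available font
   can render it, which is the situation that would bound the range of
   characters allowed in the XML source.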

   Typefaces are typically licensed and, in many cases, there is a fee
   for use by PDF creation tools; however, there is no such fee for
   displaying or printing the embedded fonts.

   Recommendations:

   o  For consistent viewing, all fonts should be embedded.  The fonts
      used must be available for use by the IETF community.  Some
      discussion of available typefaces can be found in Appendix C.4.

   o  The choice of typefaces with respect to serif, sans serif,
      monospace, etc., should follow the recommendations for HTML and
      CSS rendering in [I-D.hildebrand-html-rfc] and
      [I-D.flanagan-rfc-css].

   o  The range of Unicode characters allowed in the XML source for
      Internet-Drafts and RFCs may be bounded by the availability of
      embeddable fonts with appropriate glyphs [I-D.flanagan-nonascii].

3.1.7.  Hyphenation and Line Breaks

   When laying out running text, especially with a narrow page width
   and long words, layout processors for English text often have the
   option of hyphenating words or of using existing hyphens as places to
   introduce line breaks.  However, inserting line breaks mid-word can
   be harmful when the "word" is actually a sequence of characters
   representing a protocol element or protocol sequence.

   Recommendation: Avoid introducing hyphenated line breaks mid-word
   into the visual display, consistent with requirements for plain text
   and HTML.
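
   The effect of this recommendation can be illustrated with plain text
   wrapping in Python.  This is only a sketch: it uses the standard
   "textwrap" module rather than a PDF layout engine, and the sample
   sentence is made up, but a layout engine would need equivalent
   settings.

```python
import textwrap


def wrap_without_hyphenation(text, width=72):
    """Wrap text at whitespace only, leaving every token intact."""
    wrapper = textwrap.TextWrapper(
        width=width,
        break_long_words=False,  # never split a token longer than width
        break_on_hyphens=False,  # never break at an existing hyphen
    )
    return wrapper.wrap(text)


sample = ("The header field is Content-Transfer-Encoding and the draft "
          "is draft-iab-rfc-use-of-pdf-02.")
for line in wrap_without_hyphenation(sample, width=30):
    print(line)
```

   With both options disabled, a token that is longer than the line
   width overflows the line instead of being split, which is the safer
   failure mode for protocol elements.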







3.1.8.  Hyperlinks

   PDF supports hyperlinks both to sections of the same document and to
   other documents.

   The conversion to PDF can generate:

   o  hyperlinks within the document

   o  hyperlinks to other RFCs and Internet-Drafts

   o  hyperlinks to external locations

   o  hyperlinks within a table of contents

   o  hyperlinks within an index

   Recommendations:

   o  All hyperlinks available in the HTML rendition of the RFC should
      also be visible and active in the PDF produced.  This includes
      both internal hyperlinks and hyperlinks to external resources.

   o  The table of contents, including page numbers, is useful when
      printed.  Its entries should also be hyperlinked to their
      respective sections.

   o  As specified in the section on Referencing RFCs in [RFC7322],
      hyperlinks to RFCs from the references section should point to the
      RFC "info" page, which then links to the various formats
      available.

   o  Hyperlinks to Internet-Drafts from the references section should
      point to the datatracker entry page for the draft, which then
      links to the various formats available.
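
   The last two recommendations amount to a small mapping from a
   reference to a link target, sketched below in Python.  The RFC "info"
   URL form is the one used in Section 4.2 of this document.  The
   datatracker URL form and the function name are assumptions made for
   illustration.

```python
import re

RFC_INFO_BASE = "http://www.rfc-editor.org/info/"       # as in Section 4.2
DATATRACKER_BASE = "https://datatracker.ietf.org/doc/"  # assumed URL form


def reference_target(anchor, draft_name=None):
    """Return the hyperlink target for a references-section entry."""
    rfc = re.fullmatch(r"RFC(\d+)", anchor)
    if rfc:
        # Link to the RFC "info" page rather than to one specific format.
        return RFC_INFO_BASE + "rfc" + str(int(rfc.group(1)))
    if anchor.startswith("I-D.") and draft_name:
        # Drop the two-digit version so the link reaches the entry page.
        unversioned = re.sub(r"-\d\d$", "", draft_name)
        return DATATRACKER_BASE + unversioned + "/"
    return None  # other references keep whatever target the XML supplies


print(reference_target("RFC7322"))
print(reference_target("I-D.flanagan-nonascii",
                       "draft-flanagan-nonascii-06"))
```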

3.1.9.  Similarity to Other Outputs

   There is some advantage to having the PDF files look like the text or
   HTML renderings of the same document.  Even so, there are several
   options.  The PDF

   1.  could look like the text version of the document, or

   2.  could look like the text version of the document but with
       pictures rendered as pictures instead of using their ASCII-art
       equivalent, or

   3.  could look like the HTML version.

   Recommendation: The PDF rendition should look like the HTML
   rendition, at least in spirit.  Some differences from the HTML
   rendition would include a different typeface and size (chosen for
   printing), page numbers in the table of contents and index, and the
   use of page headers and footers.

   Most of the choices used for the [I-D.hildebrand-html-rfc] rendering
   and [I-D.flanagan-rfc-css] are thus applicable.  See those documents
   for details on the rendering of specific XML elements.  Some notes
   are:

      Every place in the document that would receive an HTML ID will be
      given an identical PDF named destination.  In addition, a named
      destination will be created for each page with the form "pg-#", as
      in "pg-35".

      No pilcrows are generated or made visible.

      The table of contents (generated if the XML's <rfc> element's
      tocInclude attribute has the value "true") will have the section
      number linked to that section's named destination, but will also
      include a page number that is linked to the page's named
      destination.  The section title and the page number will be
      separated by a visually appropriate separator, and the page
      numbers will be aligned with each other.

      The index (generated if the XML's <rfc> element's indexInclude
      attribute has the value "true") will have the section number
      linked to that section's named destination, but will also include
      a page number that is linked to the page's named destination.

      The running header in one line (on page 2 and all subsequent
      pages) has the RFC number on the left (RFC NNNN), the (possibly
      shortened form) title centered, and the date (Month Year) on the
      right.  The text is rendered in a way that is visually
      unobtrusive.

      The running footer in one line (on all pages) has the author's
      last name on the left, category centered, and the page number on
      the right ([Page N]).  The text is rendered in a way that is
      visually unobtrusive.

      We should not attempt to replicate in PDF the feature of the HTML
      format that includes a dynamic block that displays up-to-date
      information on updates, obsoletions and errata.
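
   Several of the notes above can be sketched in Python.  The helper
   names, the dictionary layout, and the "section-3.1.8" identifier are
   hypothetical.  Only the "pg-#" naming and the left, centered, and
   right arrangement of the running lines come from the text.  The
   running lines are shown in fixed-width text purely to illustrate the
   arrangement.

```python
def page_destination(page_number):
    """Name of the per-page named destination, for example "pg-35"."""
    return "pg-%d" % page_number


def toc_entry(section_number, title, section_id, page_number):
    """Describe one table of contents line and its two link targets."""
    return {
        "label": "%s.  %s" % (section_number, title),
        "section_link": section_id,  # same name as the HTML ID
        "page_label": str(page_number),
        "page_link": page_destination(page_number),
    }


def running_line(left, center, right, width=72):
    """Place left, centered, and right fields on one fixed-width line."""
    line = [" "] * width
    line[:len(left)] = left
    start = (width - len(center)) // 2
    line[start:start + len(center)] = center
    line[width - len(right):] = right
    return "".join(line)


print(toc_entry("3.1.8", "Hyperlinks", "section-3.1.8", 8))
print(running_line("RFC NNNN", "PDF for RFCs", "May 2016"))
print(running_line("Hansen, et al.", "Informational", "[Page 9]"))
```

   The sketch does not guard against fields that are long enough to
   overlap; a real implementation would shorten the title first.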




3.2.  "Invisible" Options and Requirements

   PDF offers a number of features which improve the utility of PDF
   files in a variety of workflows, at the cost of extra effort in the
   xml2rfc conversion process; the tradeoffs may be different for the
   RFC editor production of RFCs and for Internet-Drafts.

3.2.1.  Internal Text Representation

   The contents of a PDF file can be represented in many ways.  The PDF
   file could be generated by:

   o  storing an image of the visual representation, such as a JPEG
      image of the word "IETF".  That is, there might be no internal
      representation of letters, words, or paragraphs at all.

   o  placing individual characters in position on the page, such as
      saying "put an 'F' here", then "put a 'T' before it", then "put
      an 'E' before that", then "put an 'I' before that" to render the
      word "IETF".  That is, there might be no internal representation
      of words or paragraphs at all.

   o  placing words in position on the page, such that the word "IETF"
      would be kept together.  That is, there might be no internal
      representation of paragraphs at all.

   o  ensuring that the running order of text in the content stream
      matches the logical reading order.  That is, a sentence such as
      'The Internet Engineering Task Force (IETF) supports the
      Internet.' would be kept together as a sentence, and multiple
      sentences within a paragraph would be kept together.

   All of these end up with essentially the same visual representation
   of the output.  However, each level has tradeoffs for auxiliary uses,
   such as searching or indexing, commenting and annotation, and
   accessibility (text-to-speech).  Keeping the running order of text in
   the content stream in the proper order supports all of these
   auxiliary uses.
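
   The difference between these representations can be made concrete
   with a small Python model.  It is a sketch only: a content stream is
   reduced to a list of hypothetical (x position, text) pairs, and the
   "extractor" simply reads the stream in order, as simple search or
   text-to-speech tools might.

```python
def extract_in_stream_order(runs):
    """Concatenate text runs in the order they appear in the stream."""
    return "".join(text for _x, text in runs)


def visual_order(runs):
    """Concatenate text runs sorted by horizontal position on the line."""
    return "".join(text for _x, text in sorted(runs))


# Each run is (x position, text).  Both streams paint the same glyphs
# at the same positions, so they look identical on the page.
right_to_left = [(30, "F"), (20, "T"), (10, "E"), (0, "I")]
logical = [(0, "I"), (10, "E"), (20, "T"), (30, "F")]

print(visual_order(right_to_left), visual_order(logical))
print(extract_in_stream_order(right_to_left))
print(extract_in_stream_order(logical))
```

   Both streams render as "IETF", but only the second yields "IETF" when
   read in stream order; the first yields "FTEI".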

   In addition, the "role map" feature of PDF
   (<http://help.adobe.com/en_US/acrobat/X/pro/using/
   WS58a04a822e3e50102bd615109794195ff-7cd8.w.html>) would allow the
   logical tags found in the original XML to be mapped into tags in the
   PDF.

   Recommendations:

   o  Text in content streams should follow the XML document's logical
      order (in the order of tags) to the extent possible.  This will
      provide optimal reuse by software that does not understand Tagged
      PDF.  (PDF/UA requires this.)

   o  It might be possible to use the "role map" feature to capture
      enough of the xml2rfc source structure to reconstruct the XML
      source structure completely.  However, there is not a compelling
      case to do so over embedding the original XML, as described in
      Section 3.2.7.

3.2.2.  Unicode Support

   PDF itself does not require use of Unicode.  Text is represented as a
   sequence of glyphs which then can be mapped to Unicode.

   Recommendations:

      PDF files generated must have the full text, as it appears in the
      original XML.

      Unicode normalization may occur.

      Text within SVG for SVG images should also have Unicode mappings.

      Alt-text for images should also support Unicode.
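
   Because normalization may occur, a tool that checks the PDF text
   against the XML source should compare normalized forms rather than
   raw code points.  The Python sketch below shows one way to do so.
   The function name and the choice of NFC are assumptions; this
   document does not select a normalization form.

```python
import unicodedata


def same_text(xml_text, pdf_text, form="NFC"):
    """Compare two strings after applying the same normalization form."""
    return (unicodedata.normalize(form, xml_text)
            == unicodedata.normalize(form, pdf_text))


decomposed = "Andre\u0301"  # "e" followed by COMBINING ACUTE ACCENT
precomposed = "Andr\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

print(decomposed == precomposed)
print(same_text(decomposed, precomposed))
```

   The two strings differ code point by code point but are equal once
   both are normalized.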

3.2.3.  Image Processing (Artwork)

   The XML allows both ASCII art and SVG to be used for artwork.

   Recommendations:

      If both ASCII art and SVG are available for a picture, the SVG
      artwork should be preferred over the ASCII artwork.

      ASCII artwork must be rendered using a monospace font.

3.2.4.  Text Description of Images (Alt-Text)

   Guidelines for accessibility of PDF <http://www.w3.org/TR/WCAG20-
   TECHS/PDF1.html> recommend that images, formulas, and other non-text
   items provide textual alternatives, using the '/Alt' Tag in PDF to
   provide human-readable text that can be vocalized by text-to-speech
   technology.

   Recommendation: Any alt-text for artwork and figures available in the
   XML source should be stored using the PDF /Alt property.
   Internet-Draft authors and the RFC Editor should ensure inclusion of
   alt-text for all SVG or other images within the XML source.

3.2.5.  Metadata Support

   Metadata encodes information about the document authors, the document
   series, date created, etc.  Having this metadata within the PDF file
   allows it to be used by search engines, viewers and other reuse
   tools.  PDF supports embedded metadata in a variety of ways,
   including the Extensible Metadata Platform (XMP) [XMP].  The RFC
   Editor maintains metadata about an RFC on its info page.

   Recommendation: The PDFs generated should have all of the metadata
   from the XML version embedded directly as XMP metadata, including the
   author, date, the document series, and a URL for where the document
   can be retrieved.  This information should be consistent with the RFC
   editor info page at the time of publication.
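
   As a sketch of what such embedded metadata might look like, the
   following Python builds the RDF portion of an XMP packet using Dublin
   Core properties.  The function names are hypothetical, the mapping of
   the retrieval URL to dc:identifier is an assumption, and the document
   series is omitted because this document does not say which property
   should carry it.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def qname(namespace, local):
    """Build the "{namespace}local" form that ElementTree expects."""
    return "{%s}%s" % (namespace, local)


def xmp_description(title, authors, date, url):
    """Return an rdf:RDF element (as text) with basic document metadata."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    rdf = ET.Element(qname(RDF, "RDF"))
    desc = ET.SubElement(rdf, qname(RDF, "Description"),
                         {qname(RDF, "about"): ""})

    # dc:title is a language alternative; "x-default" marks the default.
    alt = ET.SubElement(ET.SubElement(desc, qname(DC, "title")),
                        qname(RDF, "Alt"))
    ET.SubElement(alt, qname(RDF, "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list of author names.
    seq = ET.SubElement(ET.SubElement(desc, qname(DC, "creator")),
                        qname(RDF, "Seq"))
    for author in authors:
        ET.SubElement(seq, qname(RDF, "li")).text = author

    # dc:date is an ordered list of dates; only one is recorded here.
    dates = ET.SubElement(ET.SubElement(desc, qname(DC, "date")),
                          qname(RDF, "Seq"))
    ET.SubElement(dates, qname(RDF, "li")).text = date

    # Assumption: the retrieval URL is carried in dc:identifier.
    ET.SubElement(desc, qname(DC, "identifier")).text = url
    return ET.tostring(rdf, encoding="unicode")


packet = xmp_description(
    title="PDF for an RFC Series Output Document Format",
    authors=["T. Hansen", "L. Masinter", "M. Hardy"],
    date="2016-05-17",
    url="https://example.org/placeholder",  # placeholder, not a real URL
)
print(packet)
```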

3.2.6.  Document Structure Support

   PDF supports an "outline" feature where sections of the document are
   marked; this could be used in addition to the table of contents as a
   navigation aid.

   The section structure of an RFC can be mapped into the PDF elements
   for the document structure.  This will allow the bookmark feature of
   PDF readers to be used to quickly access sections of the document.

   Recommendation: The section structure of an RFC should be mapped into
   the PDF elements for the document structure.  This would include
   section headings for the boilerplate sections such as the Abstract,
   Status of This Memo, Table of Contents, and Authors' Addresses, plus
   the section headings that are normally included in the Table of
   Contents.  If possible, this should be done in a way that the same
   fragment identifiers for the HTML version of the RFC will work for
   the PDF version.
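
   The mapping from a flat list of section headings to a nested outline
   can be sketched in Python as follows.  The function name and the
   dictionary layout are hypothetical; a PDF library would turn such a
   tree into outline (bookmark) entries.

```python
def build_outline(sections):
    """Nest (number, title) pairs into a tree keyed on section number.

    Unnumbered boilerplate sections use None as their number and are
    placed at the top level.
    """
    root = {"title": None, "children": []}
    by_path = {(): root}
    for number, title in sections:
        if number is None:
            node = {"title": title, "children": []}
            parent = root
        else:
            node = {"title": "%s.  %s" % (number, title), "children": []}
            path = tuple(number.split("."))
            # "3.1.2" is filed under "3.1"; fall back to the top level
            # if the parent section was not seen.
            parent = by_path.get(path[:-1], root)
            by_path[path] = node
        parent["children"].append(node)
    return root


outline = build_outline([
    (None, "Abstract"),
    ("3", "Options and Requirements for PDF RFCs"),
    ("3.1", '"Visible" Requirements'),
    ("3.1.1", "General Visible Requirements"),
    ("3.2", '"Invisible" Options and Requirements'),
    ("4", "References"),
])
for top in outline["children"]:
    print(top["title"], len(top["children"]))
```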

3.2.7.  Embedded Files

   PDF has the capability of including other files; the files may be
   labeled both by a media type and by a role (the AFRelationship key
   [PDFA3]).  In this way, the PDF file also acts as a container.

   Embedded content may be compressed.

   Many PDF viewers support the ability to view and extract embedded
   files, although this capability is not universal.




   Embedding content in the PDF file allows the PDF to act as a complete
   package, which can be transformed, archived, and digitally signed.
   (Some sample code illustrating how items can be attached to a PDF
   file and subsequently extracted can be found at
   <https://github.com/Aiybe/xmptest>.)  Useful possibilities:

      Embed the source XML input file itself within the PDF.  If the
      source SVG and images for illustrations are also embedded, this
      would make the PDF file totally self-contained.

      Embed directly extractable components that are useful for
      independent processing, including ABNF, MIBs, and source code for
      reference implementations.  This capability might be supported
      through other mechanisms from the XML source files, but could also
      be supported within the PDF.

      Finding, extracting and embedding other components may require
      additional markup to clearly identify them, and additional review
      to ensure the correctness of embedded files that are not visible.

   Recommendations:

      For RFCs, embed the XML source and all illustrations as a
      standard feature of xml2rfc's PDF output.

      If possible, make this a standard feature for Internet-Drafts as
      well.

      Named <sourcecode> entries should be embedded.

      Images (SVG sources, JPEGs, PNGs, etc.) should be embedded.
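
   The labeling of embedded files by media type and role can be modeled
   in Python as below.  This is a sketch of the data involved, not of
   any particular library's API.  The file names are hypothetical, and
   the choice of relationship for each kind of file is an assumption
   that this document does not make.

```python
# Relationship values defined for the AFRelationship key.
AF_RELATIONSHIPS = {
    "Source", "Data", "Alternative", "Supplement", "Unspecified",
}


def file_spec(filename, media_type, relationship, description=""):
    """Model a file specification dictionary for one embedded file."""
    if relationship not in AF_RELATIONSHIPS:
        raise ValueError("unknown AFRelationship: %r" % relationship)
    return {
        "Type": "Filespec",
        "F": filename,
        "UF": filename,  # Unicode form of the file name
        "Desc": description,
        "AFRelationship": relationship,
        # The embedded file stream records the media type as its Subtype.
        "EF": {"F": {"Type": "EmbeddedFile", "Subtype": media_type}},
    }


# Hypothetical attachments for one RFC.
attachments = [
    file_spec("rfcNNNN.xml", "application/xml", "Source",
              "XML source for this document"),
    file_spec("figure-1.svg", "image/svg+xml", "Supplement",
              "Source for Figure 1"),
    file_spec("example.abnf", "text/plain", "Supplement",
              "Named sourcecode entry"),
]
for spec in attachments:
    print(spec["F"], spec["EF"]["F"]["Subtype"], spec["AFRelationship"])
```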

3.3.  Digital Signatures

   The RFC Editor and staff are at times called upon to provide evidence
   that a particular RFC is the "original" and has not been modified;
   digital signatures can provide that verification.  As signatures also
   apply to embedded content, embedding the XML source will provide a
   way of signing the source XML that was used to produce the PDF file
   as well.

   PDF has supported digital signatures since PDF 1.2, and there are
   multiple methods and options available for signing PDF files.  The
   signing of Internet-Drafts and RFCs will be guided by
   [I-D.housley-rfc-and-id-signatures].







4.  References

4.1.  References

   [PDF]      ISO, "Portable document format -- Part 1: PDF 1.7",
              ISO 32000-1, 2008.

              Also available free from Adobe.

   [XMP]      ISO, "Extensible metadata platform (XMP) specification --
              Part 1: Data model, serialization and core properties",
              ISO 16684-1, 2012.

              Not available free, but there are a number of descriptive
              resources, e.g., <http://en.wikipedia.org/wiki/
              Extensible_Metadata_Platform>

   [PDFA2]    ISO, "Electronic document file format for long-term
              preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2)",
              ISO 19005-2, 2011.

   [PDFA3]    ISO, "Electronic document file format for long-term
              preservation -- Part 3: Use of ISO 32000-1 with support
              for embedded files (PDF/A-3)", ISO 19005-3, 2012.

   [PDFUA]    ISO, "Electronic document file format enhancement for
              accessibility -- Part 1: Use of ISO 32000-1 (PDF/UA-1)",
              ISO 14289-1, 2012.

4.2.  Informative References

   [RFC2346]  Palme, J., "Making Postscript and PDF International",
              RFC 2346, DOI 10.17487/RFC2346, May 1998,
              <http://www.rfc-editor.org/info/rfc2346>.

   [RFC6949]  Flanagan, H. and N. Brownlee, "RFC Series Format
              Requirements and Future Development", RFC 6949,
              DOI 10.17487/RFC6949, May 2013,
              <http://www.rfc-editor.org/info/rfc6949>.

   [RFC7322]  Flanagan, H. and S. Ginoza, "RFC Style Guide", RFC 7322,
              DOI 10.17487/RFC7322, September 2014,
              <http://www.rfc-editor.org/info/rfc7322>.

   [I-D.flanagan-nonascii]
              Flanagan, H., "The Use of Non-ASCII Characters in RFCs",
              draft-flanagan-nonascii-06 (work in progress), November
              2015.



Hansen, et al.          Expires November 18, 2016              [Page 14]


   [I-D.flanagan-rfc-css]
              Flanagan, H., "CSS Requirements for RFCs", draft-flanagan-
              rfc-css-04 (work in progress), September 2015.

   [I-D.hardy-pdf-mime]
              Hardy, M., Masinter, L., Markovic, D., Johnson, D., and M.
              Bailey, "The application/pdf Media Type", draft-hardy-pdf-
              mime-01 (work in progress), April 2016.

   [I-D.hildebrand-html-rfc]
              Hildebrand, J. and P. Hoffman, "HyperText Markup Language
              Request For Comments Format", draft-hildebrand-html-rfc-10
              (work in progress), August 2015.

   [I-D.housley-rfc-and-id-signatures]
              Housley, R., "Digital Signatures on RFC and Internet-Draft
              Documents", draft-housley-rfc-and-id-signatures-02 (work
              in progress), May 2016.

4.3.  URIs

   [1] https://sourceforge.net/projects/
       sourcesans.adobe/?source=directory

   [2] https://sourceforge.net/projects/
       sourceserifpro.adobe/?source=directory

   [3] https://sourceforge.net/projects/
       sourcecodepro.adobe/?source=directory

   [4] https://www.rosettatype.com/Skolar

   [5] https://www.google.com/get/noto/

Appendix A.  History and Current Use of PDF with RFCs and Internet-
             Drafts

   NOTE: This appendix is an overview intended to give some background.

A.1.  RFCs

   The RFC Series has long accepted PostScript renderings of RFCs,
   either in addition to or instead of the text renderings of those
   same RFCs.  These have usually been produced when the document
   contained complicated figures or mathematics.  For example,
   consider the figures and mathematics found in RFC 1119 and RFC 1142,
   and compare the figures found in the text version of RFC 3550 with
   those in the PostScript version.  The RFC Editor has also provided
   PDF renderings of RFCs.  Usually, the PDF has been a print of the
   text file that does not take advantage of any of the broader PDF
   functionality; when a PostScript version of the RFC existed, the
   RFC Editor would instead use it to generate the PDF.

A.2.  Internet-Drafts

   In addition to the PDFs generated and published by the RFC Editor,
   the IETF tools community has long supported PDF for Internet-Drafts.
   Most RFCs start as Internet-Drafts, edited by individual authors.
   The Internet-Drafts submission tool at https://datatracker.ietf.org/
   submit/ accepts PDF and PostScript files in addition to the
   (required) text submission and (currently optional) XML.  If a PDF
   was not submitted for a particular version of an Internet-Draft, the
   tools would generate one from the PostScript, HTML, or text.

Appendix B.  Paged Content Layout Quality

   The process of creating a paged document from running text typically
   involves ensuring that related material is present on the same page
   together, and that artifacts of pagination don't interfere with easy
   reading of the document.  Typical high-quality layout processors do
   several things:

   Widow and Orphan Management:  Widows and orphans
      (<https://en.wikipedia.org/wiki/Widows_and_orphans>) should be
      avoided automatically (unless the entire paragraph is only one
      line).  Ensure that a page break does not occur after the first
      line of a paragraph (an orphan), if necessary by allowing a
      slightly longer page.  Similarly, ensure that a page break does
      not occur before the last line of a paragraph (a widow).

   Keep Section Heading Contiguous:  Do not insert a page break
      immediately after a section heading.  If there isn't room on a
      page for the first (two) lines of a section after the section
      heading, insert a page break before the heading.

   Avoid Splitting Artwork:  Figures should not be split from figure
      titles.  If possible, keep the figure on the same page as the
      (first) mention of the figure.

   Headers for Long Tables after Page Breaks:  If a table cannot be
      displayed on a single page, repeat its column headings at the top
      of each page on which the table continues; this is a common
      option when producing paginated documents.  Similarly, tables
      should not be split from their titles.

   keepWithNext and keepWithPrevious:  The XML attributes of
      "keepWithNext" and "keepWithPrevious" should be followed whenever
      possible.

   Whitespace Preservation:  XML entities such as NBSP and NBHYPHEN
      should be honored whenever possible.
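
   As an illustration only, the widow/orphan, keep-with-next, and
   unsplittable-artwork rules above can be sketched in Python.  The
   "Block" type, the "min_start" and "paginate" functions, and the
   two-line minimum on each side of a page break are assumptions made
   for this sketch; they are not part of xml2rfc or of any particular
   layout processor.  The sketch always moves material to the next page
   rather than lengthening the current page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical layout unit: a paragraph, heading, or figure."""
    lines: int                     # height of the block in text lines
    keep_with_next: bool = False   # e.g., a section heading or figure title
    splittable: bool = True        # False for artwork that must stay whole


def min_start(block: Block) -> int:
    """Fewest lines that may open a block at the bottom of a page."""
    # Splitting needs 2 lines before and 2 after the break, so blocks
    # shorter than 4 lines (and all artwork) are placed whole.
    if not block.splittable or block.lines < 4:
        return block.lines
    return 2


def paginate(blocks: List[Block], page_len: int) -> List[List[Tuple[int, int]]]:
    """Return pages as lists of (block index, lines placed) pairs."""
    if page_len < 4:
        raise ValueError("page_len must be at least 4 lines")
    pages: List[List[Tuple[int, int]]] = [[]]
    free = page_len
    for i, block in enumerate(blocks):
        need = min_start(block)
        if block.keep_with_next and i + 1 < len(blocks):
            # The whole block plus the start of its successor must fit.
            need = block.lines + min_start(blocks[i + 1])
        if need > free and free < page_len:
            pages.append([])           # break before the block
            free = page_len
        remaining = block.lines
        while remaining > free:
            head = free
            if remaining - head < 2:   # would leave a widow on the next page
                head = remaining - 2
            pages[-1].append((i, head))
            remaining -= head
            pages.append([])
            free = page_len
        pages[-1].append((i, remaining))
        free -= remaining
    return pages
```

   For example, with 10-line pages, an 8-line paragraph followed by a
   one-line heading marked keep_with_next and a 5-line paragraph yields
   a break before the heading, so the heading is not stranded at the
   bottom of the first page.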

Appendix C.  Tooling

   This appendix discusses tools for viewing, comparing, creating,
   manipulating, and transforming PDF files, including those currently
   used by the RFC Editor and for Internet-Drafts, and outlines
   available PDF tools for various processes.

C.1.  PDF Viewers

   As with most file formats, PDF files are experienced through a PDF
   reader or viewer.  For most of the common platforms in use (iOS,
   OS X, Windows, Android, ChromeOS, Kindle) and for most browsers
   (Edge, Safari, Chrome, Firefox), PDF viewing is built in.  In
   addition, many PDF viewers are available for download and
   installation.

   PDF viewers vary in capabilities, and it is important to note which
   PDF viewers support the features utilized in PDF RFCs and Internet-
   Drafts (features such as links, digital signatures, Tagged PDF and
   others mentioned in Section 3).

C.2.  Printers

   Printing is one of the most important use cases for PDF, and almost
   all viewers support printing of PDF files.  Some printers also have
   direct PDF support.

C.3.  PDF Generation Libraries

   Because the xml2rfc format is unique, software will be needed to
   convert XML source documents to the various output formats,
   including PDF.

   One promising direction is suggested in
   <http://greenbytes.de/tech/webdav/rfc2629xslt/
   rfc2629xslt.html#output.pdf.fop>: using XSLT to generate XSL-FO which
   is then processed by a formatting object processor such as Apache
   FOP.
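
   A minimal sketch of driving that pipeline from Python follows.  It
   assumes that Apache FOP's command-line launcher is installed as
   "fop" and that a stylesheet producing XSL-FO is available; the
   helper names and the file names "draft.xml", "to-fo.xslt", and
   "draft.pdf" are placeholders, and the page referenced above may call
   for additional processing steps that are not shown here.

```python
import shutil
import subprocess
from typing import List


def fop_command(xml_source: str, stylesheet: str, pdf_out: str) -> List[str]:
    """Build a FOP invocation that applies an XSLT and renders PDF."""
    # -xml/-xsl ask FOP to run the transformation itself before
    # formatting the resulting XSL-FO; -pdf names the output file.
    return ["fop", "-xml", xml_source, "-xsl", stylesheet, "-pdf", pdf_out]


def render_pdf(xml_source: str, stylesheet: str, pdf_out: str) -> None:
    """Run FOP, failing loudly if it is missing or exits with an error."""
    if shutil.which("fop") is None:
        raise RuntimeError("Apache FOP launcher 'fop' not found on PATH")
    subprocess.run(fop_command(xml_source, stylesheet, pdf_out), check=True)


if __name__ == "__main__":
    # Placeholder file names; substitute real paths before running.
    print(" ".join(fop_command("draft.xml", "to-fo.xslt", "draft.pdf")))
```

   Keeping the command construction separate from its execution makes
   the invocation easy to log or inspect on systems where FOP is not
   installed.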

   Several libraries are also available for generating PDF signatures.
   The choice of library to use for xml2pdf will depend on many factors:
   programming language, quality of implementation, quality of PDF
   generated, support, cost, availability, and so forth.

C.4.  Typefaces

   This section discusses available typefaces that might satisfy the
   requirements.  Some openly available typefaces (without extensive
   Unicode support, however) include the following; of these, only
   Source Code Pro is fixed-width:

   o  Source Sans [1]

   o  Source Serif Pro [2]

   o  Source Code Pro [3]

   A font that looks promising for its broad Unicode support is Skolar
   [4], but it requires licensing.  Another potentially useful set of
   typefaces is the Noto [5] family from Google.

C.5.  Other Tools

   In addition to generating and viewing PDF, other categories of PDF
   tools are available and may be useful both during specification
   development and for published RFCs.  These include tools for
   comparing two PDFs, checkers that could be used to validate the
   results of conversion, reviewing and commentary tools that attach
   annotations to PDF files, and digital signature creation and
   validation.

   Validation of an arbitrary author-generated PDF file would be quite
   difficult; there are few PDF validation tools.  However, if RFCs and
   Internet-Drafts are generated by conversion from XML via xml2rfc,
   then explicit validation of PDF and adherence to expected profiles
   would mainly be useful to ensure that xml2rfc has functioned
   properly.
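
   As a rough illustration of the kind of automated check that could
   follow conversion, the Python sketch below tests only that the
   output begins with the "%PDF-" header and carries an "%%EOF" marker
   near its end.  The function name is hypothetical, and this is a
   smoke test rather than validation against [PDF], [PDFA3], or any
   other profile.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap structural check on converter output; not a validator."""
    # A PDF file begins with a "%PDF-<version>" header line.
    has_header = data.startswith(b"%PDF-")
    # The end-of-file marker is expected close to the end of the file;
    # this sketch searches the final 1024 bytes.
    has_eof = b"%%EOF" in data[-1024:]
    return has_header and has_eof
```

   A check of this kind can catch a converter that produced an empty or
   truncated file, but it says nothing about fonts, tagging, metadata,
   or embedded files.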

   Recommendations:

   o  Discourage (but allow) submission of a PDF representation for
      Internet-Drafts.  In most cases, the PDF for an Internet-Draft
      should be produced automatically when XML is submitted, with an
      opportunity to verify the conversion.

Appendix D.  Acknowledgements

   The input of the following people is gratefully acknowledged: Nevil
   Brownlee (ISE), Brian Carpenter, Chris Dearlove, Martin Duerst,
   Heather Flanagan (RSE), Joe Hildebrand, Paul Hoffman, Duff Johnson,
   Ted Lemon, Sean Leonard, Henrik Levkowetz, Julian Reschke, Adam
   Roach, Leonard Rosenthol, Alice Russo, Robert Sparks, Andrew
   Sullivan, and Dave Thaler.

Authors' Addresses

   Tony Hansen (editor)
   AT&T Laboratories
   200 Laurel Ave. South
   Middletown, NJ  07748
   USA

   Email: tony@att.com


   Larry Masinter
   Adobe
   345 Park Ave
   San Jose, CA  95110
   USA

   Email: masinter@adobe.com
   URI:   http://larry.masinter.net


   Matthew Hardy
   Adobe
   345 Park Ave
   San Jose, CA  95110
   USA

   Email: mahardy@adobe.com