FTPEXT Working Group R. Elz
Internet Draft University of Melbourne
Expiration Date: January 1998
P. Hethmon
Hethmon Brothers
July 1997
Extended Directory Listing and Restart Mechanism for FTP
draft-ietf-ftpext-mlst-02.txt
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Abstract
In order to overcome the problems caused by the undefined format of
the current FTP LIST command output, a new command is needed to
transfer standardized listing information from Server-FTP to Client-
FTP. Commands to enable this are defined in this document.
This proposal also extends the FTP protocol to allow character sets
other than US-ASCII[1] by allowing the transmission of 8-bit
characters and the recommended use of UTF-8[2] encoding.
Widely implemented, but long undocumented, mechanisms to permit
restarts of interrupted data transfers in STREAM mode are also
included here.
Elz & Hethmon [Expires January 1998] [Page 1]
Internet Draft draft-ietf-ftpext-mlst-02.txt July 1997
This version contains corrections and additions agreed on the mailing
list. Some sections incomplete in the previous draft have been
completed. Several editorial adjustments have been made. This
version is still not nearly complete. This paragraph will be deleted
from the final version of this document.
Table of Contents
Abstract ................................................ 1
1 Introduction ............................................ 3
2 Document Conventions .................................... 3
2.1 Basic Tokens ............................................ 3
2.2 Pathnames ............................................... 4
2.3 Times ................................................... 5
2.4 Server Replies .......................................... 6
3 File Modification Time (MDTM) ........................... 6
3.1 Syntax .................................................. 7
3.2 Error responses ......................................... 7
3.3 FEAT response for MDTM .................................. 7
4 File SIZE ............................................... 8
4.1 Syntax .................................................. 8
4.2 Error responses ......................................... 8
4.3 FEAT response for SIZE .................................. 9
5 Restart of Interrupted Transfer (REST) .................. 9
5.1 Restarting in STREAM Mode ............................... 9
5.2 Error Recovery and Restart .............................. 10
5.3 Syntax .................................................. 11
5.4 FEAT response for REST .................................. 11
6 Listings for Machine Processing (MLST and MLSD) ......... 12
6.1 Format of MLST Request .................................. 13
6.2 Format of MLST Response ................................. 13
6.3 Filename encoding ....................................... 15
6.4 Format of Facts ......................................... 16
6.5 Standard Facts .......................................... 16
6.6 FEAT response for MLST .................................. 23
6.7 OPTS parameters for MLST ................................ 24
7 Interpretation of STAT command output ................... 24
7.1 FEAT response for STAT .................................. 25
8 Impact On Other FTP Commands ............................ 25
8.1 Impact on Pathnames and Filenames ....................... 26
9 Character sets and Internationalisation ................. 26
10 Security Considerations ................................. 26
11 References .............................................. 26
Acknowledgements ........................................ 27
Editors' Addresses ...................................... 28
1. Introduction
This document amends the File Transfer Protocol (FTP) [6]. Four new
commands are added: "SIZE", "MDTM", "MLST", and "MLSD". Two existing
commands, "REST" and "STAT", are modified. Of these, the "SIZE" and
"MDTM" commands, and the modifications to "REST", have been in wide
use for many years. The others are new.
These commands allow a client to restart an interrupted transfer in
transfer modes not previously supported in any documented way, and to
obtain a directory listing in a machine-friendly, predictable format.
2. Document Conventions
This document makes use of the document conventions defined in BCP14
[9]. That provides the interpretation of capitalized imperative
words like MUST, SHOULD, etc.
This document also uses notation defined in STD 9 [6]. In
particular, the terms "reply", "user", "NVFS", "file", "pathname",
"FTP commands", "DTP", "user-FTP process", "user-PI", "user-DTP",
"server-FTP process", "server-PI", "server-DTP", "mode", "type",
"NVT", "control connection", "data connection", and "ASCII", are all
used here as defined there.
Syntax required is defined using the Augmented BNF defined in [3].
Some general ABNF definitions are required throughout the document;
these are defined later in this section. At first reading, it may be
wise simply to note that these definitions exist here, and skip to
the next section.
2.1. Basic Tokens
This document imports the core definitions given in Appendix A of
[3]. There definitions will be found for basic ABNF elements like
ALPHA, DIGIT, SP, etc. To that, the following terms are added for
use in this document.
PRCHAR = %x21-7E ; a printing character, ! to ~
TXTCHAR = PRCHAR / SP / %x09 ; printing plus white space
RCHAR = ALPHA / DIGIT / "," / "." / ":" / "!" /
"@" / "#" / "$" / "%" / "^" /
"&" / "(" / ")" / "-" / "_" /
"+" / "?" / "/" / "\" / "'" /
%x22 ; <"> -- double quote character
The PRCHAR, TXTCHAR, and RCHAR types give basic character types from
varying sub-sets of the ASCII character set for use in various
commands and responses.
token = 1*RCHAR
A "token" is a string whose precise meaning depends upon the context
in which it is used. In some cases it will be a value from a set of
possible values maintained elsewhere. In others it might be a string
invented by one party to an FTP conversation from whatever sources it
finds relevant.
error-response = error-code SP *TXTCHAR CRLF
error-code = ("4" / "5") 2DIGIT
Note that in ABNF, string literals are case insensitive. That
convention is preserved in this document. However note that ALPHA,
in particular, is case sensitive. That implies that a "token" is a
case sensitive value. That implication is correct.
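As a non-normative illustration, the "token" and "error-response"
rules above could be checked along the following lines. This is only
a sketch: the names RCHARS, is_token, and parse_error_response are
hypothetical and are not defined by this document.

```python
import re
import string

# RCHAR as listed above: letters, digits, and the named punctuation.
# Note that "=", ";" and SP are deliberately absent.
RCHARS = frozenset(string.ascii_letters + string.digits +
                   ",.:!@#$%^&()-_+?/\\'\"")


def is_token(text):
    """Return True if text matches token = 1*RCHAR."""
    return len(text) >= 1 and all(ch in RCHARS for ch in text)


# error-response = error-code SP *TXTCHAR CRLF
# error-code     = ("4" / "5") 2DIGIT
ERROR_RESPONSE = re.compile(r"([45][0-9]{2}) ([\x21-\x7e \t]*)\r\n")


def parse_error_response(line):
    """Return (code, text) for a one line error-response, else None."""
    match = ERROR_RESPONSE.fullmatch(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

Because "=" and ";" are not RCHARs, a fact name or value built from
tokens can never be confused with the separators used in a fact list.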
2.2. Pathnames
Various FTP commands take pathnames as arguments, or return pathnames
in responses. When the MLST command is supported, as indicated in
the response to the FEAT command [10], pathnames are to be
transferred in one of the following two formats.
utf-8-name = <a UTF-8 encoded Unicode string>
raw = <any string not being a valid UTF-8 encoding>
Which format is used is at the option of the user-PI or server-PI
sending the pathname. UTF-8 encodings contain enough internal
structure that it is always, in practice, possible to determine
whether a UTF-8 or raw encoding has been used, in the cases where it
matters. Note that ASCII is a subset of UTF-8.
Unless otherwise specified, the pathname is terminated by the CRLF
that terminates the FTP command, or by the CRLF that ends a reply.
Any trailing spaces preceding that CRLF form part of the name.
Exactly one space will precede the pathname and serve as a separator
from the preceding syntax element. Any additional spaces form part
of the pathname. See [4] for a fuller explanation of the character
encoding issues. All implementations supporting MLST MUST support
[4].
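The following Python sketch, offered only as an illustration, shows
one way a receiver might extract a pathname argument and decide which
of the two formats it is in. It assumes that a strict UTF-8 decoder
is an acceptable test for "a valid UTF-8 encoding"; the function
names pathname_argument and classify_pathname are hypothetical.

```python
def pathname_argument(command_line):
    """Return the pathname octets from a command line, or None.

    Exactly one SP separates the command from the pathname; any other
    spaces, leading or trailing, belong to the name itself.
    """
    body = command_line
    if body.endswith(b"\r\n"):
        body = body[:-2]
    _verb, sep, name = body.partition(b" ")
    return name if sep else None


def classify_pathname(name):
    """Return ("utf-8-name", str) or ("raw", bytes) for the octets."""
    try:
        # Python's decoder is strict by default and rejects malformed
        # sequences, so success is taken to mean a UTF-8 name.
        return "utf-8-name", name.decode("utf-8")
    except UnicodeDecodeError:
        return "raw", name
```

Pure ASCII names always classify as UTF-8 names, consistent with
ASCII being a subset of UTF-8.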
Implementations should also beware that the control connection uses
Telnet NVT conventions [11], and that the Telnet IAC character, if
part of a pathname sent over the control connection, MUST be
correctly escaped as defined by the Telnet protocol.
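A minimal sketch of that escaping follows. In the Telnet protocol the
IAC character is octet 255, and a data octet of that value is sent as
two consecutive IAC octets. The helper names are hypothetical, and
unescape_iac assumes the input contains no other Telnet command
sequences.

```python
IAC = b"\xff"  # Telnet "Interpret As Command" octet


def escape_iac(pathname):
    """Double every IAC octet before sending on the control connection."""
    return pathname.replace(IAC, IAC + IAC)


def unescape_iac(data):
    """Undo escape_iac; assumes only escaped IACs occur in data."""
    return data.replace(IAC + IAC, IAC)
```

Octet 255 never occurs in valid UTF-8, so in practice only names sent
in the raw format are changed by this step.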
2.3. Times
The syntax of a time value is:
time-val = 14DIGIT [ "." 1*DIGIT ]
The leading, mandatory, fourteen digits are to be interpreted as, in
order from the leftmost, four digits giving the year, with a range of
1000-9999, two digits giving the month of the year, with a range of
01-12, two digits giving the day of the month, with a range of 01-31,
two digits giving the hour of the day, with a range of 00-23, two
digits giving minutes past the hour, with a range of 00-59, and
finally, two digits giving seconds past the minute, with a range of
00-60 (with 60 being used only at a leap second). Years in the tenth
century, and earlier, cannot be expressed. This is not considered a
serious defect of the protocol.
[ Ed-Note: Should we permit 14*DIGIT (or maybe
14*15DIGIT) so times in the 101st century and beyond can
be represented? ]
The optional digits, which must be preceded by a period, give decimal
fractions of a second. These may be given to whatever precision is
appropriate to the circumstance, however implementations MUST NOT add
precision to time-vals where that precision does not exist in the
underlying value being transmitted.
Symbolically, a time-val may be viewed as
YYYYMMDDHHMMSS.sss
The "." and subsequent digits are optional.
Time values are always represented in UTC (GMT), and in the Gregorian
calendar regardless of what calendar may have been in use at the date
and time indicated at the location of the server-PI.
The technical differences between GMT, UTC, UT1, UT2, etc, are not
considered here. A server-FTP process should always use the same
time reference, so the times it returns will be consistent. Clients
are not expected to be time synchronised with the server, so the
possible difference in times that might be reported by the different
time standards is not considered important.
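By way of illustration only, a client might parse a time value laid
out as YYYYMMDDHHMMSS.sss roughly as follows. The function name
parse_time_val is hypothetical. Two simplifications are assumed:
Python's datetime cannot hold second 60, so a leap second is reported
through a separate flag, and fractional digits beyond microseconds
are truncated.

```python
import re
from datetime import datetime, timezone

# YYYYMMDDHHMMSS followed by an optional "." and fractional digits.
_TIME_VAL = re.compile(r"([0-9]{14})(?:\.([0-9]+))?")


def parse_time_val(text):
    """Return (UTC datetime, leap_second flag) for a time-val string."""
    match = _TIME_VAL.fullmatch(text)
    if match is None:
        raise ValueError("not a time-val: %r" % (text,))
    digits, fraction = match.group(1), match.group(2) or ""
    year, month, day = int(digits[0:4]), int(digits[4:6]), int(digits[6:8])
    hour, minute = int(digits[8:10]), int(digits[10:12])
    second = int(digits[12:14])
    if year < 1000:
        raise ValueError("years before 1000 cannot be expressed")
    leap_second = second == 60
    if leap_second:
        second = 59  # datetime has no second 60; the flag records it
    # Keep at most six fractional digits (microseconds).
    microsecond = int(fraction[:6].ljust(6, "0"))
    stamp = datetime(year, month, day, hour, minute, second,
                     microsecond, tzinfo=timezone.utc)
    return stamp, leap_second
```

Out of range months, days, hours, or minutes are rejected by the
datetime constructor, which raises ValueError.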
2.4. Server Replies
Section 4.2 of [6] defines the format and meaning of replies by the
server-PI to FTP commands from the user-PI. Those reply conventions
are used here without change. Implementors should note that the ABNF
syntax (which was not used in [6]) in this document, and other FTP
related documents, sometimes shows replies using the one line format.
Unless otherwise explicitly stated, that is not intended to imply
that multi-line responses are not permitted. Implementors should
assume that, unless stated to the contrary, any reply to any FTP
command (including QUIT) may be of the multiline format described in
[6].
Throughout this document, replies will be identified by the three
digit code that is their first element. Thus the term "500 reply"
means a reply from the server-PI using the three digit code "500".
3. File Modification Time (MDTM)
The FTP command, MODIFICATION TIME (MDTM), can be used to determine
when a file in the server NVFS was last modified. This command has
existed in many FTP servers for many years, as an adjunct to the REST
command for STREAM mode, and thus is widely available. However, where
supported, the "modify" fact which can be provided in the result from
the new MLST command is recommended as a superior alternative.
When attempting to restart a RETRieve, if the User FTP makes use of
the MDTM command, it can check and see if the modification time of
the source file is more recent than the modification time of the
partially transferred file. If it is, then most likely the source
file has changed and it would be unsafe to restart in the middle of
the file transfer.
When attempting to restart a STORe, the User FTP can use the MDTM
command to discover the modification time of the partially
transferred file. If it is older than the modification time of the
file that is about to be STORed, then most likely the source file has
changed and it would be unsafe to restart in the middle of the file
transfer.
Note that using MLST (described below) where available, can provide
this information, and much more, thus giving an even better
indication that a file has changed, and that restarting a transfer
would not give valid results.
Note that this is applicable to any RESTart attempt, regardless of
the mode of the file transfer.
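The check described above reduces to a single comparison, sketched
here in Python. The name restart_looks_safe is hypothetical. The
result is only a heuristic: it assumes the two modification times are
comparable, which need not hold when client and server clocks differ.

```python
from datetime import datetime, timezone


def restart_looks_safe(source_mtime, partial_mtime):
    """Return False when the source is newer than the partial copy.

    For RETR, source_mtime comes from MDTM and partial_mtime from the
    local partial file. For STOR the roles are reversed: MDTM gives
    partial_mtime and the local file gives source_mtime.
    """
    return source_mtime <= partial_mtime
```

For example, a source last modified on 16 February 1997 and a partial
copy written on 15 February 1997 would be reported as unsafe to
restart.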
3.1. Syntax
The syntax for the MDTM command is:
mdtm = "MdTm" SP ( utf-8-name / raw ) CRLF
The server-PI will respond to the MDTM command with a 213 reply
giving the last modification time of the file whose pathname was
supplied, or an error response if the file does not exist, the
modification time is unavailable, or some other error has occurred.
mdtm-response = "213" SP time-val CRLF /
error-response
3.2. Error responses
Where the command is correctly parsed, but the modification time is
not available, either because the pathname identifies no existing
entity, or because the information is not available for the entity
named, then a 550 reply should be sent. Where the command cannot be
correctly parsed, a 500 or 501 reply should be sent, as specified in
[6].
3.3. FEAT response for MDTM
When replying to the FEAT command [10], an FTP server process that
supports the MDTM command MUST include a line containing the single
word "MDTM". This MAY be sent in upper or lower case, or a mixture
of both (it is case insensitive) but SHOULD be transmitted in upper
case only. That is, the response SHOULD be
C> Feat
S> 211- <any descriptive text>
S>  ...
S>  MDTM
S>  ...
S> 211 End
The ellipses indicate placeholders where other features may be
included, and are not required. The one space indentation of the
feature lines is mandatory [10].
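A client could collect the advertised features from such a reply as
sketched below. The sketch assumes only the layout described here (a
first and last line carrying the 211 code, with each feature line
indented by one space); the FEAT command itself is defined in [10].
The name parse_feat_reply is hypothetical.

```python
def parse_feat_reply(lines):
    """Map feature names (upper-cased) to their parameter strings.

    lines is the complete reply, one string per line, without CRLF.
    """
    features = {}
    for line in lines[1:-1]:
        if not line.startswith(" "):
            continue  # not a feature line; ignored by this sketch
        name, _sep, params = line[1:].partition(" ")
        # Feature names are case insensitive, so normalise them.
        features[name.upper()] = params
    return features
```

With the reply lines "211- Extensions", " MDTM", " REST STREAM", and
"211 End", the result maps "MDTM" to an empty string and "REST" to
"STREAM".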
4. File SIZE
The FTP command, SIZE OF FILE (SIZE), is used to obtain the transfer
size of a file from the server-FTP process. That is, the exact
number of octets (8 bit bytes) which would be transmitted over the
data connection should that file be transmitted. This value will
change depending on the current STRUcture, MODE and TYPE of the data
connection, or a data connection which would be created were one
created now. Thus, the result of the SIZE command is dependent on
the currently established STRU, MODE and TYPE parameters.
The SIZE command returns how many octets would be transferred if the
file were to be transferred using the current transfer structure,
mode and type. This command is normally used in conjunction with the
RESTART (REST) command. The server-PI might need to read the
partially transferred file, do any appropriate conversion, and count
the number of octets that would be generated when sending the file in
order to correctly respond to this command. Estimates of the file
transfer size MUST NOT be returned, only precise information is
acceptable.
4.1. Syntax
The syntax of the SIZE command is:
size = "Size" SP ( utf-8-name / raw ) CRLF
The server-PI will respond to the SIZE command with a 213 reply
giving the transfer size of the file whose pathname was supplied, or
an error response if the file does not exist, the size is
unavailable, or some other error has occurred. The value returned is
in a format suitable for use with the RESTART (REST) command for mode
STREAM, provided the transfer mode and type are not altered.
size-response = "213" SP 1*DIGIT CRLF /
error-response
4.2. Error responses
Where the command is correctly parsed, but the size is not available,
either because the pathname identifies no existing entity, or because
the entity named cannot be transferred in the current MODE and TYPE
(or at all), then a 550 reply should be sent. Where the command
cannot be correctly parsed, a 500 or 501 reply should be sent, as
specified in [6].
4.3. FEAT response for SIZE
When replying to the FEAT command [10], an FTP server process that
supports the SIZE command MUST include a line containing the single
word "SIZE". This word is case insensitive, and MAY be sent in any
mixture of upper or lower case, however it SHOULD be sent in upper
case. That is, the response SHOULD be
C> FEAT
S> 211- <any descriptive text>
S>  ...
S>  SIZE
S>  ...
S> 211 END
The ellipses indicate placeholders where other features may be
included, and are not required. The one space indentation of the
feature lines is mandatory [10].
5. Restart of Interrupted Transfer (REST)
To avoid having to resend the entire file if the file is only
partially transferred, both sides need some way to be able to agree
on where in the data stream to restart the data transfer.
The FTP specification [6] includes three modes of data transfer,
Stream, Block and Compressed. In Block and Compressed modes, the
data stream that is transferred over the data connection is
formatted, allowing the embedding of restart markers into the stream.
The sending DTP can include a restart marker with whatever
information it needs to be able to restart a file transfer at that
point. The receiving DTP can keep a list of these restart markers,
and correlate them with how the file is being saved. To restart the
file transfer, the receiver just sends back that last restart marker,
and both sides know how to resume the data transfer. Note that there
are some flaws in the description of the restart mechanism in RFC 959
[6]. See section 4.1.3.4 of RFC 1123 [7] for the corrections.
5.1. Restarting in STREAM Mode
In Stream mode, the data connection contains just a stream of
unformatted octets of data. Explicit restart markers thus cannot be
inserted into the data stream, they would be indistinguishable from
data. For this reason, the FTP specification [6] did not provide the
ability to do restarts in stream mode. However, there is not really
a need to have explicit restart markers in this case, as restart
markers can be implied by the octet offset into the data stream.
Because the data stream defines the file in STREAM mode, a different
data stream would represent a different file. Thus, an offset will
always represent the same position within a file. On the other hand,
in other modes than STREAM, the same file can be transferred using
quite different octet sequences, and yet be reconstructed into the
one identical file. Thus an offset into the data stream in transfer
modes other than STREAM would not give an unambiguous restart point.
If the data representation TYPE is IMAGE, and the STRUcture is File,
for many systems the file will be stored exactly in the same format
as it is sent across the data connection. It is then usually very
easy for the receiver to determine how much data was previously
received, and notify the sender the offset where the transfer should
be restarted. In other representation types and structures more
effort will be required, but it remains always possible to determine
the offset with finite, but perhaps non-negligible, effort. In the
worst case an FTP process may need to open a data connection to
itself, set the appropriate transfer type and structure, and actually
transmit the file, counting the transmitted octets.
If the user-FTP process is intending to restart a retrieve, it will
directly calculate the restart marker, and send that information in
the RESTart command. However, if the user-FTP process is intending
to restart sending the file, it needs to be able to determine how
much data was previously sent, and correctly received and saved. A
new FTP command is needed to get this information. This is the
purpose of the SIZE command, as documented in section 4.
5.2. Error Recovery and Restart
STREAM MODE transfers with FILE STRUcture may be restarted even
though no restart marker has been transferred in addition to the data
itself. This is done by using the SIZE command, if needed, in
combination with the RESTART (REST) command, and one of the standard
file transfer commands.
When using TYPE ASCII or IMAGE, the SIZE command will return the
number of octets that would actually be transferred if the file were
to be sent between the two systems. I.e. with type IMAGE, the SIZE
normally would be the number of octets in the file. With type ASCII,
the SIZE would be the number of octets in the file including any
modifications required to satisfy the TYPE ASCII CR-LF end of line
convention.
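The difference between the two types can be sketched as follows. This
is an illustration under a stated assumption, not a rule of this
document: the server is assumed to store text with a bare LF at the
end of each line and no CR octets, so that TYPE ASCII adds one octet
per line. The name transfer_size is hypothetical.

```python
def transfer_size(content, ftp_type):
    """Octets sent in STREAM mode with File structure for content.

    content is the file as stored (bytes); ftp_type is "I" for IMAGE
    or "A" for ASCII.
    """
    if ftp_type == "I":
        return len(content)
    if ftp_type == "A":
        # Assumption: each stored LF is transmitted as CR LF.
        return len(content) + content.count(b"\n")
    raise ValueError("unsupported TYPE in this sketch: %r" % (ftp_type,))
```

A stored file of two short lines, "a" LF "b" LF, would thus have a
SIZE of 4 in TYPE IMAGE and 6 in TYPE ASCII.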
5.3. Syntax
The syntax for the REST command when the current transfer mode is
STREAM is:
rest = "Rest" SP 1*DIGIT CRLF
The numeric value gives the number of octets of the immediately
following transfer to not actually send, effectively causing the
transmission to be restarted at a later point. The server-PI will
respond to the REST command with a 350 reply, indicating that the
REST parameter has been saved, and that another command, which should
be either RETR or STOR, should then follow to complete the restart.
rest-response = "350" SP *TXTCHAR CRLF /
error-response
Server-FTP processes may permit transfer commands other than RETR and
STOR, such as APPE and STOU, to complete a restart, however, this is
not recommended. STOU (store unique) is undefined in this usage, as
storing the remainder of a file into a unique filename is rarely
going to be useful. If APPE (append) is permitted, it MUST act
identically to STOR when a restart marker has been set. That is, in
both cases, octets from the data connection are placed into the file
at the location indicated by the restart marker value.
An error-response will follow a REST command only when the server
does not implement the command, or the restart marker value is
syntactically invalid for the current transfer mode. That is, in
STREAM mode, if something other than one or more digits appears in
the parameter to the REST command. Any other errors, including such
problems as restart marker out of range, should be reported when the
following transfer command is issued.
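A client side restart of a retrieve might look like the following
Python sketch. It assumes TYPE IMAGE and File structure, so that the
size of the local partial file is the octet offset to resume from.
The ftp argument is assumed to behave like an ftplib.FTP object,
whose retrbinary method transfers in binary mode and accepts a rest
argument. The names rest_command and resume_retrieve are
hypothetical.

```python
import os


def rest_command(offset):
    """Build the STREAM mode REST command line (without CRLF)."""
    if offset < 0:
        raise ValueError("offset must not be negative")
    return "REST %d" % offset


def resume_retrieve(ftp, remote_name, local_path):
    """Resume a RETR into local_path; return the offset that was used."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with open(local_path, "ab") as out:
        # When rest is given, ftplib issues a REST command with that
        # value before the RETR command.
        ftp.retrbinary("RETR " + remote_name, out.write,
                       rest=offset if offset else None)
    return offset
```

Errors such as an offset beyond the end of the file would, as noted
above, surface when the RETR is issued rather than at the REST.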
5.4. FEAT response for REST
Where a server-FTP process supports RESTart in STREAM mode, as
specified here, it MUST include in the response to the FEAT command
[10], a line containing exactly the string "REST STREAM". This
string is not case sensitive, but SHOULD be transmitted in upper
case. Where REST is not supported at all, or supported only in block
or compressed modes, the REST line MUST NOT be included in the FEAT
response. Where required, the response SHOULD be
C> feat
S> 211- <any descriptive text>
S>  ...
S>  REST STREAM
S>  ...
S> 211 end
The ellipses indicate placeholders where other features may be
included, and are not required. The one space indentation of the
feature lines is mandatory [10].
6. Listings for Machine Processing (MLST and MLSD)
The MLST and MLSD commands are intended to standardize the file and
directory information returned by the Server-FTP process. These
commands differ from the LIST command in that the format of the
replies is strictly defined although extensible.
Two commands are defined. MLST provides data about exactly the
object named on its command line, and no others. MLSD, on the other
hand, will list the contents of a directory if a directory is named,
otherwise a 501 error reply will be returned. In either case, if no
object is named, the current directory is assumed. That will cause
MLST to send a one line response, describing the current directory
itself, and MLSD to list the contents of the current directory.
[ Ed-Note: Do we need something for recursive listings? ]
In the sequel only MLST will be described. Other than as previously
mentioned, MLSD is identical.
The MLST and MLSD commands also extend the FTP protocol as presented
in RFC 959 [6] and RFC 1123 [7] to allow the transmission of 8-bit
data over the control connection. Note this is not specifying
character sets which are 8-bit, but specifying that FTP
implementations are to specifically allow the transmission and
reception of 8-bit bytes, with all bits significant, over the control
connection. That is, all 256 possible octet values are permitted.
The MLST command allows both UTF-8/Unicode and "raw" forms as
arguments, and in responses to the MLST and MLSD commands, and all
other FTP commands which take pathnames as arguments.
6.1. Format of MLST Request
The MLST and MLSD commands each allow a single optional argument.
This argument may be either a directory name or a filename. For
these purposes, a "filename" is the name of any entity in the server
NVFS which is not a directory. If a directory name is given then
MLSD must return a listing of the contents of the named directory,
otherwise it issues a 501 reply, and does not open a data connection.
In all cases for MLST, only a single fact line containing the
information about the named file or directory shall be returned.
If no argument is given then MLSD must return a listing of the
contents of the current working directory, and MLST must return a
listing giving information about the current working directory
itself.
No title, or header, lines, or any other formatting, other than as is
specified below, is ever returned in the output of an MLST or MLSD
command.
If the Client-FTP sends an invalid argument, the Server-FTP MUST
reply with an error code of 501.
The syntax for the MLST command is:
mlst = "MLst" [ SP ( utf-8-name / raw ) ] CRLF
6.2. Format of MLST Response
The format of a response to the MLST command is as follows:
mlst-response = initial-response final-response
initial-response = "150" [ SP response-message ] CRLF /
error-response
response-message = *TXTCHAR
final-response = "226" SP response-message CRLF
data-response = *( entry CRLF )
entry = [ facts ] SP ( utf-8-name / raw )
facts = fact *( ";" fact )
fact = factname "=" value
factname = "Size" / "Modify" / "Create" /
"Type" / "Unique" / "Perm" /
"Lang" / "Media-Type" / "CharSet" /
os-depend-fact / local-fact
os-depend-fact = <IANA assigned OS name> "." 1*RCHAR
local-fact = "X." 1*RCHAR
value = 1*RCHAR
end-token = "End"
Upon receipt of an MLST or MLSD command, the server will verify the
parameter, and if invalid return an error-response. If valid, the
server will open a data connection as indicated in section 3.2 of
RFC 959 [6]. If that fails, the server will return an error-response.
If all is OK, the server will return the initial-response, send the
appropriate data-response over the new data connection, close that
connection, and then send the final-response.
The data connection opened for an MLST or MLSD response shall be a
connection as if the "TYPE L 8", "MODE S", and "STRU F" commands had
been given, whatever FTP transfer type, mode and structure had
actually been set, and without causing those settings to be altered
for future commands. That is, this transfer type shall be set for
the duration of the data connection established for this command
only. While the content of the data sent can be viewed as a series
of lines, implementations should note that there is no maximum line
length defined. Implementations should be prepared to deal with
arbitrarily long lines.
The facts part of the specification would contain a series of "file
facts" about the file or directory named on the same line. Typical
information to be presented would include file size, last
modification time, creation time, a unique identifier, and a
file/directory flag.
The complete format for a successful reply to the MLSD command would
be:
facts SP utf-8-name CRLF
facts SP utf-8-name CRLF
facts SP utf-8-name CRLF
...
Note that the format is intended for machine processing, not human
viewing, and as such the format is very rigid. Implementations MUST
NOT vary the format by, for example, inserting extra spaces for
readability, including header or title lines, or inserting blank
lines, or in any other way alter this format. Exactly one space is
always required after the set of facts (which may be empty). More
spaces may be present on a line if, and only if, the file name
presented contains significant spaces. The set of facts must not
contain any spaces anywhere inside it.
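Because the set of facts contains no spaces, the first space on a
line always separates the facts from the name. A receiver could rely
on that as in the sketch below, which works on octets so that raw
names survive unchanged. The names split_entry and split_listing are
hypothetical, and split_listing assumes that no name contains a CRLF
pair.

```python
def split_entry(line):
    """Split one entry (bytes, no CRLF) into (facts, name)."""
    facts, sep, name = line.partition(b" ")
    if not sep:
        raise ValueError("entry lacks the SP separator")
    # Any further spaces, leading or trailing, belong to the name.
    return facts, name


def split_listing(data):
    """Split a whole data-response into a list of (facts, name)."""
    return [split_entry(line) for line in data.split(b"\r\n") if line]
```

An entry with an empty set of facts begins with the separating space,
so split_entry returns an empty facts value for it.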
6.3. Filename encoding
An FTP implementation supporting the MLST command must be 8-bit clean.
This is necessary in order to transmit UTF-8 encoded filenames. This
specification recommends the use of UTF-8 encoded filenames. FTP
implementations SHOULD use UTF-8 whenever possible to encourage the
maximum interoperability.
Filenames are not restricted to UTF-8, however treatment of arbitrary
character encodings is not specified by this standard. Applications
are encouraged to treat non-UTF-8 encodings of filenames as octet
sequences.
Note that this encoding is unrelated to that of the contents of the
file, even if the file contains character data.
Further information about filename encoding for FTP may be found in
"Internationalization of the File Transfer Protocol" [4].
6.3.1. Notes about the Filename
The filename returned in the MLST response should be the same name as
was specified in the MLST command. Where no argument was given to
the MLST command, the server-PI may either include an empty filename
in the response, or it may supply a name that refers to the current
directory, if such a name is available.
Filenames returned in the output from an MLSD command should be
unqualified names within the directory named, or the current
directory if no argument was given. That is, the directory named in
the MLSD command should not appear as a component of the filenames
returned.
If the server-FTP process is able, and the "type" fact is being
returned, it MAY return in the MLSD response, an entry whose type is
"cdir", which names the directory from which the contents of the
listing were obtained. Where more than one name exists, multiple of
these entries may be returned. The server MUST return type "cdir"
names in a format such that if the user-PI takes a name of type
"cdir", and appends a name of type which is not "cdir", and which
appeared in the same MLSD response as the type=cdir name, with no
intervening separators, then a valid pathname will be produced, using
which the user-PI can reference the file indicated from its current
working directory.
Alternatively, the user-PI can issue a CWD command ([6]) giving the
name of type "cdir", and from that point reference the files returned
in the MLSD response from which the cdir was obtained by using the
filename components of the listing. Once having attempted any CWD
command however, it is no longer guaranteed that a file can be
referenced by the combination of type "cdir" and other names, whether
using CWD or name concatenation.
[ Ed-Note: This whole scheme is (yet again) open to
revision or removal - more discussion of its worth and if
worthwhile, the details of the scheme is needed ]
6.3.2. Examples
Once upon a (future) time, examples existed here.
6.4. Format of Facts
The "facts" for a file in a reply to a MLST command consist of
information about that file. The facts are a series of keyword=value
pairs separated by a semi-colon (";") character. The complete series
of facts may not contain the space character.
A sample of a typical series of facts would be: (spread over two
lines for presentation here only)
size=4161;lang=en-us;modify=19970214165800;create=19961001124534;
type=file;x.myfact=foo,bar
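A user-PI might split such a series of facts as sketched below. This
is a minimal illustration in Python under stated assumptions: the
function name is hypothetical, a trailing ";" is tolerated although
this section does not require one, and fact names are lower-cased
because section 6.5 declares them case-insensitive.

```python
def parse_facts(fact_string):
    """Split a series of keyword=value facts into a dict.

    Values are kept exactly as received.  Only the first "=" separates
    the name from the value, so a value may itself contain "=".
    """
    facts = {}
    for item in fact_string.split(";"):
        if not item:
            continue  # tolerate a trailing or doubled ";" (an assumption)
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("fact without '=': %r" % item)
        facts[name.lower()] = value
    return facts


sample = ("size=4161;lang=en-us;modify=19970214165800;"
          "create=19961001124534;type=file;x.myfact=foo,bar")
facts = parse_facts(sample)
```

Splitting on the first "=" only matters for values such as the
operating system dependent types of section 6.5.1.5, which contain a
second "=".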
6.5. Standard Facts
This document defines a standard set of facts as follows:
size -- Size in octets
modify -- Last modification time
create -- Creation time
type -- Entry type
unique -- Unique id of file/directory
perm -- File permissions, whether read, write, execute is
allowed for the login id.
lang -- Language of the filename per IANA[5] registry.
media-type -- MIME media-type of file contents per IANA registry.
charset -- Character set per IANA registry (if not UTF-8)
Fact names are case-insensitive. Size, size, SIZE, and SiZe are the
same fact.
Further operating system specific keywords could be specified by
using the IANA operating system name as a prefix (examples only):
OS/2.ea -- OS/2 extended attributes
MACOS.rf -- Macintosh resource forks
UNIX.mode -- Unix file modes (permissions)
Implementations may define keywords for experimental, or private use.
All such keywords MUST begin with the two character sequence "x.".
As fact names are case independent, "x." and "X." are equivalent.
For example:
x.ver -- Version information
x.desc -- File description
x.type -- File type
6.5.1. The type Fact
The type fact needs a special description. Part of the problem with
current practices is deciding when a file is a directory. If it is a
directory, is it the current directory, a regular directory, or a
parent directory? The MLST specification makes this unambiguous
using the type fact. The type fact given specifies information about
the object listed on the same line of the MLST response.
Five values are possible for the type fact:
file -- a file entry
cdir -- the current directory
pdir -- the parent directory
dir -- a directory or sub-directory
OS.name=type -- an OS or file system dependent file type
The syntax is defined to be:
type-fact = type-label "=" type-val
type-label = "Type"
type-val = "File" / "cdir" / "pdir" / "dir" /
os-type
6.5.1.1. type=file
The presence of the type=file fact indicates the listed entry is a
file containing non-system data. That is, it may be transferred from
one system to another of quite different characteristics, and perhaps
still be meaningful.
6.5.1.2. type=cdir
The type=cdir fact indicates the listed entry is the pathname of the
directory whose contents are listed. The value of this entry (the
filename part) plus the value of a type=file entry from the same MLSD
listing together should represent a complete pathname suitable for a
RETR command. The value for the type=cdir entry should include any
necessary system delimiters used between path components. An example
would be the forward slash "/" on a UNIX(TM) system, or a back slash
"\" on an OS/2 or Windows system.
Note: systems for which no suitable delimiter valid in all situations
exists can still make use of the "cdir" type by formatting it in a
way which is recognisable when returned as a pathname, and then
reformatting the supplied pathname as appropriate for the command it
is being used as an argument to.
6.5.1.3. type=dir
If present, the type=dir entry is the name of a directory. When
executed with the current directory in the same place in the NVFS as
it was when the MLST or MLSD command was issued, a CWD whose argument
is the pathname formed by appending the name with type=dir to a name
with type=cdir should succeed (assuming the user has the appropriate
access rights).
6.5.1.4. type=pdir
If present, which will occur only in the response to a MLSD command,
the type=pdir entry represents a pathname of the parent directory of
the listed directory. As well as having the properties of a
type=dir, a CWD command with the appropriate value should change the
user to the parent directory of the listed directory. A CDUP command
may also have the effect of changing to that parent directory.
User-FTP processes should note that not all responses will include
this information, and that some systems may provide multiple
type=pdir responses.
For the purposes of this type value, a "parent directory" is any
directory in which there is an entry of type=dir which refers to the
directory in which the type=pdir entity was found. Thus it is not
required that all entities with type=pdir refer to the same
directory. The "unique" fact can be used to determine whether there
is a relationship between the type=pdir entries or not.
6.5.1.5. System defined types
File types that are specific to a particular operating system, or file
system, can be encoded using the "OS." type names. The format is:
os-type = "OS." os-name "=" localtype
os-name = <an IANA registered operating system name>
localtype = 1*RCHAR
The "os-name" indicates the specific system type which supports the
particular localtype. It should be taken from the IANA maintained
list of operating systems wherever possible. The "localtype"
provides the system dependent information as to the type of the file
listed. The os-name and localtype strings in an os-type are case
independent. "OS.unix=block" and "OS.Unix=BLOCK" represent the same
type.
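The value of a type fact could be classified as in the sketch below.
It is illustrative only. The function and constant names are
hypothetical, and the os-name is not checked against the IANA list.

```python
STANDARD_TYPES = {"file", "cdir", "pdir", "dir"}


def classify_type(type_val):
    """Return (kind, os_name, localtype) for the value of a type fact.

    kind is one of the four standard types, or "os" for a value of the
    form "OS.name=type".  Comparison is case independent, so all parts
    of the result are lower-cased.
    """
    lowered = type_val.lower()
    if lowered in STANDARD_TYPES:
        return (lowered, None, None)
    if lowered.startswith("os."):
        os_name, sep, localtype = lowered[3:].partition("=")
        if sep and os_name and localtype:
            return ("os", os_name, localtype)
    raise ValueError("unrecognised type value: %r" % type_val)
```

With this approach "OS.unix=block" and "OS.Unix=BLOCK" yield the same
result, matching the case independence described above.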
6.5.2. The unique Fact
The unique fact is used to present a unique identifier for a file or
directory in the NVFS accessed via a server-FTP process. The value
of this fact should be the same for any number of filenames that
refer to the same underlying file. The fact should have different
values for names which reference distinct files. The mapping between
files, and unique fact tokens should be maintained, and remain
consistent, for at least the lifetime of the control connection from
user-PI to server-PI.
unique-fact = "Unique" "=" token
This fact would be expected to be used by Server-FTPs whose host
system allows things such as symbolic links so that the same file may
be represented in more than one directory on the server. The only
conclusion that should be drawn is that if two different names each
have the same value for the unique fact, they refer to the same
underlying object. The value of the unique fact (the token) should
be considered an opaque string for comparison purposes, and is a case
dependent value. The tokens "A" and "a" do not represent the same
underlying object.
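One way a user-PI might use the unique fact to detect names that refer
to the same underlying object is sketched below. The function name and
the input shape, (facts, name) pairs with lower-cased fact names, are
assumptions made for the illustration.

```python
from collections import defaultdict


def group_aliases(entries):
    """Map each unique token to the names that share it.

    Tokens are compared as opaque, case-dependent strings, so no case
    folding is applied.  Entries without a unique fact are skipped.
    """
    groups = defaultdict(list)
    for facts, name in entries:
        token = facts.get("unique")
        if token is not None:
            groups[token].append(name)
    # Only tokens carried by two or more names reveal aliases.
    return {token: names for token, names in groups.items()
            if len(names) > 1}
```

Nothing is concluded about names whose tokens differ beyond what the
text above allows, and the result should not be relied on after the
control connection closes.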
6.5.3. The modify Fact
The modify fact is used to determine the last time the content of the
file (or directory) indicated was modified. Any change of substance
to the file should cause this value to alter. That is, if a change
is made to a file such that the results of a RETR command would
differ, then the value of the modify fact should alter. User-PIs
should not assume that a different modify fact value indicates that
the file contents are necessarily different than when last retrieved.
Some systems may alter the value of the modify fact for other
reasons, though this is discouraged wherever possible. Also a file
may alter, and then be returned to its previous content, which would
often be indicated as two incremental alterations to the value of the
modify fact.
For directories, this value should alter whenever a change occurs to
the directory such that different filenames would (or might) be
included in MLSD output of that directory.
modify-fact = "Modify" "=" time-val
6.5.4. The create Fact
The create fact indicates when a file, or directory, was first
created. Exactly what "creation" is for this purpose is not
specified here, and may vary from server to server. About all that
can be said about the value returned is that it can never indicate a
later time than the modify fact.
create-fact = "Create" "=" time-val
Implementation Note: Implementors of this fact on UNIX(TM) systems
should note that the unix "stat" "st_ctime" field does not give
creation time, and that unix filesystems do not record creation
time at all. Unix (and POSIX) implementations will normally not
include this fact.
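The relationship between the creation time and the last modification
time can be checked as sketched below. The time-val syntax is not
defined in this part of the document, so the sketch assumes the
14-digit YYYYMMDDHHMMSS form seen in the sample of section 6.4 and
ignores anything after those 14 digits. Both function names are
hypothetical.

```python
from datetime import datetime


def parse_time_val(value):
    """Parse the assumed YYYYMMDDHHMMSS form; later characters are ignored."""
    return datetime.strptime(value[:14], "%Y%m%d%H%M%S")


def create_is_plausible(facts):
    """True unless both facts are present and create is later than modify."""
    if "create" not in facts or "modify" not in facts:
        return True
    return parse_time_val(facts["create"]) <= parse_time_val(facts["modify"])
```

A user-PI might apply such a check purely as a sanity test on the
values it receives.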
6.5.5. The perm Fact
The perm fact is used to indicate access rights the current FTP user
has over the object listed. Its value is always an unordered
sequence of alphabetic characters.
perm-fact = "Perm" "=" *pvals
pvals = "a" / "c" / "d" / "e" / "f" /
"l" / "m" / "p" / "r" / "w"
There are ten permission indicators currently defined. Many are
meaningful only when used with a particular type of object. The
indicators are case independent; "d" and "D" are the same indicator.
The "a" permission applies to objects of type=file, and indicates
that the APPE (append) command may be applied to the file named.
The "c" permission applies to objects of type=dir (and type=pdir,
type=cdir). It indicates that files may be created in the directory
named. That is, that a STOU command is likely to succeed, and that
STOR and APPE commands might succeed if the file named did not
previously exist, but is to be created in the directory object that
has the "c" permission. It also indicates that the RNTO command is
likely to succeed for names in the directory.
The "d" permission applies to all types. It indicates that the
object named may be deleted, that is, that the RMD command may be
applied to it if it is a directory, and otherwise that the DELE
command may be applied to it.
The "e" permission applies to the directory types. When set on an
object of type=dir, type=cdir, or type=pdir it indicates that a CWD
command naming the object should succeed, and the user should be able
to enter the directory named. For type=pdir it also indicates that
the CDUP command should succeed.
The "f" permission for objects indicates that the object named may be
renamed - that is, may be the object of an RNFR command.
The "l" permission applies to the directory file types, and indicates
that the listing commands, LIST, NLST, and MLSD may be applied to the
directory in question, and that MLST, LIST, NLST, and STAT may be
applied to objects in the directory.
The "m" permission applies to directory types, and indicates that the
MKD command may be used to create a new directory within the
directory under consideration.
The "p" permission applies to directory types, and indicates that
objects in the directory may be deleted, or (stretching naming a
little) that the directory may be purged. Note: it does not indicate
that the RMD command may be used to remove the directory named, the
"d" permission indicator indicates that.
The "r" permission applies to type=file objects, and for some
systems, perhaps to other types of objects, and indicates that the
RETR command may be applied to that object.
The "w" permission applies to type=file objects, and for some
systems, perhaps to other types of objects, and indicates that the
STOR command may be applied to the object named.
Note: That a permission indicator is set can never imply that the
appropriate command is guaranteed to work - just that it might.
Other system specific limitations, such as limitations on
available space for storing files, may cause an operation to
fail, where the permission flags may have indicated that it was
likely to succeed. The permissions are a guide only.
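A user-PI could consult the perm fact before offering an operation to
the user, as in the sketch below. The names are hypothetical. The
table covers only commands that name the listed object itself, and
unrecognised indicators are ignored, which is an assumption rather
than a stated rule.

```python
KNOWN_INDICATORS = frozenset("acdeflmprw")

# Indicator hinting that a command naming the object itself may succeed.
COMMAND_INDICATOR = {
    "APPE": "a",
    "DELE": "d",
    "RMD": "d",
    "CWD": "e",
    "RNFR": "f",
    "LIST": "l",
    "NLST": "l",
    "MLSD": "l",
    "RETR": "r",
    "STOR": "w",
}


def parse_perm(value):
    """Return the recognised indicators in a perm value, lower-cased."""
    return frozenset(value.lower()) & KNOWN_INDICATORS


def worth_trying(command, perm_value):
    """True if the perm value hints that `command` might succeed."""
    indicator = COMMAND_INDICATOR.get(command.upper())
    if indicator is None:
        raise ValueError("no indicator known for %r" % command)
    return indicator in parse_perm(perm_value)
```

A True result is a hint only. The command may still fail for reasons
the permission indicators cannot express.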
Implementation note: The permissions are described here as they apply
to FTP commands. They may not map easily into particular
permissions available on the server's operating system. Servers
are expected to synthesize these permission bits from the
permission information available from the operating system. For
example, to correctly determine whether the "d" permission bit
should be set on a directory for a server running on the
UNIX(TM) operating system, the server should check that the
directory named is empty, and that the user has write permission
on both the directory under consideration, and its parent
directory.
Some systems may have more specific permissions than those
listed here. Such systems should map those to the flags defined
as best they are able. Other systems may have only broader
access controls. They will generally have just a few possible
permutations of permission flags, but they should still attempt
to correctly represent what is permitted.
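The UNIX example in the implementation note above might be sketched
as follows. This is one possible approach, not a required
implementation. The function name is hypothetical, and os.access()
tests the real user id of the calling process, so the sketch assumes
the server process is already running as the logged-in user.

```python
import os


def directory_may_be_deleted(path):
    """Apply the three checks from the note above to a directory path.

    The directory must be empty, and the process must have write
    permission on both the directory and its parent.
    """
    parent = os.path.dirname(os.path.abspath(path))
    try:
        is_empty = not os.listdir(path)
    except OSError:
        return False  # unreadable or not a directory: do not claim "d"
    return (is_empty
            and os.access(path, os.W_OK)
            and os.access(parent, os.W_OK))
```

Even when all three checks pass, a later RMD can still fail, for
example if another process creates a file in the directory first.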
6.5.6. The lang Fact
The lang fact describes the natural language of the filename for use
in display purposes. Values used here should be taken from the
language registry of the IANA.
lang-fact = "Lang" "=" token
Server-FTP implementations MUST NOT guess language values. Language
values must be determined in an unambiguous way such as filesystem
tagging of language or by user configuration. Note that the lang
fact provides no information at all about the content of a file, only
about its name.
6.5.7. The size Fact
The size should always reflect the approximate size of the file.
This should be as accurate as the server can make it, without going
to extraordinary lengths, such as reading the entire file. The size
is expressed in units of octets.
Given limitations in some systems, Client-FTP implementations must
understand this size may not be precise and may change between the
time of a MLST and RETR operation.
Clients that need highly accurate size information for some
particular reason should use the SIZE command as defined in section
4. The most common need for this accuracy is likely to be in
conjunction with the REST command described in section 5. The size
fact, on the other hand, should be used for purposes such as
indicating to a human user the approximate size of the file to be
transferred, and perhaps to give an idea of expected transfer
completion time.
size-fact = "Size" "=" 1*DIGIT
6.5.8. The media-type Fact
The media-type fact represents the IANA media type of the file. The
list of values used must follow the guidelines set by the IANA
registry.
media-type = "Media-Type" "=" <per IANA guidelines>
Server-FTP implementations MUST NOT guess media type values. Media
type values must be determined in an unambiguous way such as
filesystem tagging of media-type or by user configuration.
6.5.9. The charset Fact
The charset fact represents the IANA character set name for the
encoded names in a MLST response. The default character set is UTF-8
unless specified otherwise. FTP implementations SHOULD use UTF-8 if
possible to encourage maximum interoperability.
charset-type = "Charset" "=" token
6.6. FEAT response for MLST
When responding to the FEAT command, a server-FTP process that
supports MLST, and the related commands, MLSD, and the modified STAT,
plus internationalisation of pathnames, MUST indicate that this
support exists. It does this by including a MLST feature line. As
well as indicating the basic support, the MLST feature line indicates
which MLST facts are available from the server, and which of those
will be returned if no subsequent "OPTS MLST" command is sent.
mlst-feat = SP "MLST" [SP factlist] CRLF
factlist = factname ["*"] *( ";" factname ["*"] )
The initial space shown in the mlst-feat response is that required by
the FEAT command; two spaces are not permitted. If no factlist is
given, then the server-FTP process is indicating that it supports
MLST, but implements no facts. Only pathnames can be returned. This
would be a minimal MLST implementation, and useless for most
practical purposes. Where the factlist is present, the factnames
included indicate the facts supported by the server. Where the
optional asterisk appears after a factname, that fact will be
included in MLST format responses, until an "OPTS MLST" is given to
alter the list of facts returned.
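A user-PI might read the MLST feature line as in the sketch below. The
function name is hypothetical, and matching the "MLST" keyword without
regard to case is an assumption of the sketch.

```python
def parse_mlst_feature(line):
    """Parse one MLST feature line taken from a FEAT reply.

    Returns (supported, default): the fact names the server implements,
    and the subset marked with "*" that is returned until an OPTS MLST
    command changes the selection.  Names are lower-cased.
    """
    text = line.rstrip("\r\n")
    if not text.startswith(" ") or text.startswith("  "):
        raise ValueError("feature line must begin with exactly one space")
    keyword, _, factlist = text[1:].partition(" ")
    if keyword.upper() != "MLST":
        raise ValueError("not an MLST feature line")
    supported, default = [], []
    for item in factlist.split(";"):
        if not item:
            continue  # empty factlist, or a stray ";"
        name = item.rstrip("*").lower()
        supported.append(name)
        if item.endswith("*"):
            default.append(name)
    return supported, default
```

For a line with no factlist both lists are empty, which corresponds to
the minimal implementation described above.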
[ Ed-Note: Perhaps the sense of the "*" should be
reversed? That is, make the asterisk indicate those
facts not returned? ]
6.7. OPTS parameters for MLST
For the MLST command, the Client-FTP may specify a list of facts it
wishes to be returned in all subsequent MLST commands until another
OPTS MLST command is sent. The format is specified by:
mlst-opts = "OPTS" SP "MLST"
[ SP factname *(";" factname) ]
By sending the "OPTS MLST" command, the client requests the server to
include only the facts listed as arguments to the command in
subsequent output from MLST commands. Facts not included in the
"OPTS MLST" command must not be returned by the server. Facts that
are included should be returned for each entry returned from the MLST
command where they apply. Facts requested that are not supported, or
which are inappropriate to the file or directory being listed, should
simply be omitted from the MLST output. This is not an error. Note
that where no factname arguments are present, the client is
requesting that only the file names be returned. In this case, and
in any other case where no facts are included in the result, the
space that separates the fact names and their values from the file
name is still required. That is, the first character of the output
line will be a space, and the file name will start immediately
thereafter.
Note that there is no "OPTS MLSD" command. The fact names set with
the "OPTS MLST" command apply to both MLST and MLSD commands, and to
the STAT command when used with a file name argument and no transfer
in progress.
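The command syntax and the resulting line layout can be sketched as
follows. Both function names are hypothetical. The split at the first
space relies on the rule in section 6.4 that the series of facts
contains no space character.

```python
def opts_mlst_command(factnames):
    """Build the OPTS MLST command line for the given fact names."""
    command = "OPTS MLST"
    if factnames:
        command += " " + ";".join(factnames)
    return command + "\r\n"


def split_entry(line):
    """Split one MLST format line into (fact_string, filename).

    The first space ends the facts.  Everything after it is the file
    name, which may itself contain spaces.
    """
    text = line.rstrip("\r\n")
    fact_string, sep, filename = text.partition(" ")
    if not sep:
        raise ValueError("no space separating facts from file name")
    return fact_string, filename
```

When no facts are returned, the line begins with the separating space
and split_entry yields an empty fact string.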
7. Interpretation of STAT command output
Where a server-FTP process supports the MLST and MLSD commands, it
MUST also support the format of the STAT command that allows a
pathname to be given ([6] section 4.1.3). Further, the response to
that STAT command MUST be in MLST format, just as if an MLST command
for the same argument had been given, but slightly modified for
transport over the control connection rather than over a data
connection.
To construct the response to this form of the STAT command, the
server-PI should first construct the MLST output for the file named
by the argument. That output should then be broken into
segments no longer than 79 octets each. Each segment should have a
space prepended, and CRLF appended. The server-PI then sends a
multi-line reply. The first line is "211-<any text at all>". The
subsequent lines are those created above, with a NUL inserted after
each CR (other than the CR in the end of line CRLF) and with IAC
escaping applied, as required. Finally a terminating line
"211 <any text at all>" is sent.
The leading space on each line guarantees that none of the MLST
output can be mis-interpreted as the terminating line. Server-PIs
are free to split the MLST output in creative ways should they
desire, however this should be relevant only to human end-users
who happen to see the raw form of the output. User-PIs
receiving this form of STAT output should simply reconstruct the MLST
format response by ignoring the leading and terminating lines, after
checking that no error occurred of course, deleting the leading space
from each interior line, deleting the terminating CRLF, and
performing escape character reduction (remove the NUL after each CR,
and delete any IAC escapes) then join the remaining lines in order,
to produce the original MLST response.
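The procedure can be sketched in both directions as follows. This is
an illustration under stated assumptions, not a definitive
implementation. The function names and the reply text are
hypothetical, the MLST output is treated as an opaque octet string,
and IAC escaping is taken to mean doubling each octet with value 255.

```python
IAC = b"\xff"


def _escape(segment):
    # Double each IAC octet, then insert a NUL after each CR.
    return segment.replace(IAC, IAC + IAC).replace(b"\r", b"\r\x00")


def _unescape(segment):
    # Reverse of _escape: drop the NUL after CR, collapse doubled IACs.
    return segment.replace(b"\r\x00", b"\r").replace(IAC + IAC, IAC)


def encode_stat_reply(mlst_output, text=b"Status follows"):
    """Server side: wrap MLST output in a 211 multi-line reply."""
    segments = [mlst_output[i:i + 79]
                for i in range(0, len(mlst_output), 79)]
    lines = [b"211-" + text + b"\r\n"]
    lines.extend(b" " + _escape(seg) + b"\r\n" for seg in segments)
    lines.append(b"211 " + text + b"\r\n")
    return b"".join(lines)


def decode_stat_reply(reply):
    """User side: recover the original MLST output from the reply."""
    lines = reply.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # empty element left by the final CRLF
    if (len(lines) < 2 or not lines[0].startswith(b"211-")
            or not lines[-1].startswith(b"211 ")):
        raise ValueError("not a complete 211 multi-line reply")
    interior = lines[1:-1]
    if any(not line.startswith(b" ") for line in interior):
        raise ValueError("interior line lacks the leading space")
    return b"".join(_unescape(line[1:]) for line in interior)
```

Splitting the reply at CRLF is safe here because every CR inside a
segment is followed by NUL, so only the end of line sequences match.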
[ Ed-Note: this is a very cumbersome description of a
very simple procedure... ]
7.1. FEAT response for STAT
There is no output in the FEAT command that specifically indicates
that the STAT command behaves as described above. Implementations
must infer this from support of the MLST command by the server, which
is indicated in the FEAT output.
8. Impact On Other FTP Commands
Along with the introduction of MLST, traditional FTP commands must be
extended to allow for the use of more than US-ASCII or EBCDIC
character sets. In general, the support of MLST requires support for
arbitrary character sets wherever filenames and directory names are
allowed. This applies equally to both arguments given to the
following commands and to the replies from them, as appropriate.
CWD
RETR
STOR
STOU
APPE
RNFR
RNTO
DELE
RMD
MKD
PWD
STAT
The arguments to all of these commands should be processed the same
way that MLST commands and responses are processed with respect to
handling embedded CRs and NULs. See section 2.2.
8.1. Impact on Pathnames and Filenames
The design of MLST requires the Server-FTP to allow concatenation of
certain elements of a MLST response. Specifically, a typical
response would include an element which indicates the current
directory and one or more elements which are files in the indicated
directory. A Server-FTP must be able to accept a simple
concatenation of these two names even if the underlying operating
system does not accept a simple concatenation. The Server-FTP must
perform any translation of the concatenated name to local
equivalents.
9. Character sets and Internationalisation
This section will set out just what is going on with char sets, what
data is part of the protocol, and always appears exactly as is
specified (and could almost as easily be numbers, or any other kind
of encoding), and what is text for users, which should be able to
appear in their language of choice, or otherwise be handled in some
kind of rational way. That is, it will once it is written. This is
merely a placeholder.
10. Security Considerations
This memo does not yet discuss security. It is possible that no new
security concerns are raised in this memo above what already exists
within the FTP protocol. However, the working group needs to
consider this carefully.
11. References
[1] Coded Character Set--7-bit American Standard Code for Information
Interchange, ANSI X3.4-1986.
[2] Yergeau, F., "UTF-8, a transformation format of Unicode and ISO
10646", RFC 2044, October 1996.
[3] Crocker, D., "Augmented BNF for Syntax Specifications: ABNF",
Work In Progress <draft-ietf-drums-abnf-03.txt>, March 1997.
[4] Curtin, W., "Internationalization of the File Transfer Protocol",
Work In Progress <draft-ietf-ftpext-itln-02.txt>, June 1997.
[5] Internet Assigned Numbers Authority. http://www.isi.edu/div7/iana/
Email: iana@iana.org.
[6] Postel, J., Reynolds, J., "File Transfer Protocol (FTP)",
STD 9, RFC 959, October 1985.
[7] Braden, R., "Requirements for Internet Hosts -- Application
and Support", STD 3, RFC 1123, October 1989.
[8] ISO 3307 (need a citation for this please!)
[9] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[10] Hethmon, P., Elz, R., "Feature negotiation mechanism for the
File Transfer Protocol", Work in progress,
<draft-ietf-ftpext-feat-01.txt> July 1997.
[11] Postel, J., Reynolds, J., "Telnet Protocol Specification",
STD 8, RFC 854, May 1983.
Acknowledgements
The following people have contributed to this document:
Alex Belits
D. J. Bernstein
Martin J. Duerst
Mark Harris
Alun Jones
James Matthews
Keith Moore
Stephen Tihor
and the entire FTPEXT working group of the IETF.
The description of the modifications to the REST command and the MDTM
and SIZE commands comes from a set of modifications suggested for
RFC959 by Rick Adams in 1989. A draft containing just those
commands, edited by David Borman, has been merged with this document.
Editors' Addresses
Robert Elz
University of Melbourne
Department of Computer Science
Parkville, Vic 3052
Australia
Email: kre@munnari.OZ.AU
Paul Hethmon
Hethmon Brothers
2305 Chukar Road
Knoxville, TN 37923 USA
Phone: +1 423 690 8990
Email: phethmon@hethmon.com