Telechat Review of draft-ietf-nfsv4-rfc3530bis-26

Request Review of draft-ietf-nfsv4-rfc3530bis
Requested rev. no specific revision (document currently at 35)
Type Telechat Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2013-05-28
Requested 2013-03-28
Draft last updated 2013-05-09
Completed reviews Genart Last Call review of -25 by Martin Thomson (diff)
Genart Telechat review of -26 by Elwyn Davies (diff)
Genart Telechat review of -26 by Elwyn Davies (diff)
Genart Telechat review of -33 by Elwyn Davies (diff)
Genart Last Call review of -34 by Elwyn Davies (diff)
Secdir Last Call review of -25 by Yoav Nir (diff)
Assignment Reviewer Elwyn Davies
State Completed
Review review-ietf-nfsv4-rfc3530bis-26-genart-telechat-davies-2013-05-09
Reviewed rev. 26 (document currently at 35)
Review result Ready
Review completed: 2013-05-09



Gen-art additional review of draft-ietf-nfsv4-rfc3530bis-25

    I am another Gen-ART reviewer for this draft. For background on 

    Gen-ART, please see the FAQ at 



    Please resolve these comments along with any other Last Call

    you may receive. 

    Document: draft-ietf-nfsv4-rfc3530bis-25

    Reviewer: Elwyn Davies

    Review Date: 8 May 2013

    IETF LC End Date: -

    IESG Telechat date: 30 May 2013 


    This is a mammoth document.  The comments run to 650 lines.  Clearly
    there ae issues in modifying a -bis document given that it has to be
    pretty much compatible with its progenitor  - but that is now going
    on 14 years old. I believe that there are a significant number of
    minor issues to address and parts of the document (despite the
    overall size) need to provide a bit more explanation of what is
    going on.  In particular I think that a little bit more rationale
    over the way in which an NFS server attempts to provide something
    like a common interface to the two big player filesystem types (*ix
    and Windows) [Aside: I am disappointed that it doesn't try to handle
    versioning file systems like the venerable VAX filing system! ;-) ] 
    There are some areas where the addition of named attributes hasn't
    really been properly dealt with AFAICS. One major area that I felt
    ought to have been better addressed in the -bis was
    internationalisation.  As mentioned below, I don't understand why
    additional atrributes couldn't be added to help clients understand
    what the server is doing with internationalisation - the client has
    to do some complex acrobatics which seem close to guesswork to work
    out what is going on in the server.   As ever, I am not an expert in
    NFSv4 (although I used v2 extensively almost 30 years ago...
    arrrgghhh!) so there may well be comments where I don't understand
    the problem.


    Major issues:

    Overall section comments:

    s6: AccessControl Attributes

    Combining two different access control paradigms (UNIX-clone mode
    control bitmasks and access control lists) is an uphill struggle and
    provides a deal of complexity in situations where both, either one
    or neither may be specified.  The detailed text deals reasonably
    with the various cases.  There are, in my view, a couple of
    significant problems:

    - The first is essentially editorial:  I would say that rather more
    than basic understanding of UNIX-like file systems is needed.  An
    explanation or at least a suitable reference to explain the use of
    the mode bits in s6.2.2, especially the first three (SUID, SGID,
    SVTX) would help, and a clearer linkage to the explanations of
    owner, group and other sets of users that are distributed in parts
    of the text.

    - The second is more fundamental:  The intended functionality of
    inheritance is not explicitly documented.  To come to any sort of
    understanding of how inheritance is supposed to work you have to
    read through to the next to the last section (s6.4.3).  I have now
    read through the section two or three times and I have a model in my
    mind, but there is no text that would allow me to verify my
    understanding.  I *think* i have understood that (1) inheritance
    only applies when you initially create an object rather than being
    fed through to all 'inferior' objects in the tree if a superior
    object has a heritable ACE added; (2) inheritance only applies if
    you don't explicitly give any ACEs when the object is created, and
    (3) if you give a mode but no ACL then some complex combination of
    inherited ACL and mode is applied.  An upfront explanation of what
    is trying to be achieved would have helped a lot.  However, none of
    this explains what happens with either hard or symbolic links.  It
    is unclear to me whether inheritance of ACLs is applied when
    creating a hard or symbolic link to an existing object.

    s7, Multi-Server Namespace:

    This section is very verbose and contains a number of areas that
    bring in topics that  are actually forward references without
    indicating where the information can be found, particularly in
    relation to various 'change classes'.  Some reordering might help. 
    The whole concept of change classes seems a little odd, because when
    they are finally defined (s7.9.1), it appears that servers are
    rarely in the same class during any of the interesting transitions
    (s7.9.1 says that in most cases they have to be taken as in
    different classes).  Overall, this section contains an awful lot of
    discursive specification relating to servers that may not have any
    coordination which makes it very difficult both from the
    specification and the implementation angles to check that everything
    is working.

    s8, NFS Server namespace:  This is a important section for overall
    understanding and I think it would help to have it much earlier,
    certainly before s7.

    s9, File Locking and Share Reservations:

    Generally very well written, but the treatment of share reservations
    is very skimpy,

    It would also be good to be more 'honest' about the interaction with
    the semantics of the underlying file system as regards locks as
    described in s15.12.

    s10, Clientside Caching:

    Clearly very complex to implement but seems to be well-described.  A
    bit of earlier explanation of the use of CLAIM_PREVIOUS vs
    CLAIM_DELEGATION_PREV in client/server reboots would help.

    s11, Minor Versions:

    No particular issues.

    s12, Internationalisation:

    A minefield.  There seems to be an awful lot of guess work for
    clients trying to decide how the sever behaves.  On reflection, I am
    surprised that several of the things that clients are supposed to
    deduce (e.g., what normalisation the server does) were not supplied
    as attributes of the server.  Is there some good reason for this?  I
    was also surprised that there didn't seem to be a requirement that
    servers working on a particular filesystem all did the same
    internationalisation things.  It appears to me that if a filesystem
    gets migrated the client doesn't have any sort of guarantee that the
    internationalisation behaviour won't change under its feet - and it
    has no easy way to check if this is or is not the case because of
    the lack of attributes.  I have no idea to what extent the
    internationalisation is implemented in current servers.

    s13, Error Codes:

    Generally OK but with a few code value inconsistencies.  I haven't
    checked the cross-consistency of Table 9 and Table 11 or whether the
    sets of error codes per operation are really right - they are just
    too voluminous.

    s14, NFSv4 Requests:

    No problems.

    s15/16, Operations:

    Generally in good shape apart from some issues with the meaning of
    'regular files'.

    s17: Security:

    I have some doubts about the Kerberos algorithms suggested.  Maybe
    should be moving on to better algorithms even if this is a bis.


    Minor issues: 

    S1.1, bullet 7: Are there issues with backwards compatibility after
    removal of LIPKEY and SPKM-3 completely?

    s1.5.1: Should NFSv4 be using RPCSEC_GSS v2 (RFC 5403) rather than
    v1 (RFC 2203)?

    s., para 4, and s5.3: It would be useful to say upfront in
    s1.5.3.2 whether the attribute names are fully internationalized or
    whether this is still an ASCII name.

    s2.1:  This section contains a number of symbolic constants (chiefly
    for opaque element sizes) which have not been defined as yet.  It
    would be highly desirable if these were also defined before this

    s2.1, utf8val_RECOMMENDED4:  

 String SHOULD be sent UTF-8 

    What else could it be sent as in this type?

    s2.2/s6.2.1:  The type nfsace4 is used in the protocol: Accordingly
    it ought to be defined in s2.2 rather than s6.2.1.

    s2.2.3/s5.8.2.34/s5.8.2.40: It would probably be nicer if the union
    case SET_TO_SERVER_TIME4 were put into the union explicitly.  A note
    about being able to set the access and modify times to either client
    or server time would be appropriate in the two items in s5.8.2.  Are
    there any constraints on the values that can be given when setting
    from the client.. like not moving the time backwards (or is this
    explained later?).

    s3.1, para 1:

 Using the registered port for NFS services
   means the NFS client will not need to use the RPC binding protocols
   as described in [


]; this will allow NFS to transit firewalls.

    I think some explanation of why this will definitively allow NFS to
    transit firewalls is needed. 

    s3.1.1: Presumably the requirement to eschew silent dropping of
    'requests' does not apply if security is being negotiated.  I take
    it 'requests' should be interpreted as  'NFSv4 requests' and
    procedures as defined in s14/15 and this requirement should apply
    only once the secure connection is established.  The word
    'established' might help.

    s3.2: (reprise) Should we be using RPCSEC_GSSv2 (RFC 5203) rather
    than v1?


    s3.2.1.1: The Kerberos specifications appear to use rather ancient
    algorithms.  In particular DES, MD5.. should there not be more 
    modern algorithms?  DES and MD5 get a pretty bad press these days. 
    RFC 2623 is also a bit of an antique so I can't say its a modern

    s3.3.1,security poiicy boundaries etc:

 In general, the client will not have to use the SECINFO
   operation except during initial communication with the server or when

   the client crosses policy boundaries at the server.  It is possible
   that the server's policies change during the client's interaction
   therefore forcing the client to negotiate a new security triple.

     The  concept of 'security policy boundary' has not been previously
    discussed in the doc.  How would the client know it had crossed a
    security boundary or that the server had changed the policy under
    its feet? I suppose maybe the next section tells me... but... How
    does this interact with the no silent drop, mostly no retries
    requirement in s3.1.1? or have i to read s15.33.    

    I suspect that some discussion of the interaction of mount points
    and security policy boundaries may be needed at an earlier stage.  I
    suspect that mount points might well be security policy boundaries
    and crossing mount points might interact.  There is nothing in the
    description of mounted_on_fileid (s5.8.2.19).

    s5.6, item lease_time:  I can't find a definition of nfs_lease4 type
    used for this.

    s5.6, item rdattr_error: Just giving 'enum' as the type isn't too


It is understood that this space may be
   consumed by allocations to other files or directories though there is
   a rule as to which other files or directories.

    Where do I find the rule?

    s5.10, para 1:  

although there are variations that occur additional characters with a
   name including "SMALL" or "CAPITAL" are added in a subsequent version
   of Unicode.

    so... should the client be able to find out which version of Unicode
    the server is working to?

    s6.3.1.2: Is there any interaction between delegation and client
    considerations for ACLs? Do clients holding delegations have to do
    any access checks in the client that would normally be done in the
    server if there was no delegation?  I'm not sure.


   If a single location entry designates multiple server IP addresses,
   the client cannot assume that these addresses are multiple paths to
   the same server.  In most cases, they will be, but the client MUST
   verify that before acting on that assumption. 

    How is the verification achieved?  I don't see any attributes within
    NFSv4 that would help.

    s7.7.6.1, para 3: 

   If state has not been transferred transparently because the client ID
   is rejected when presented to the new server, the client should fetch
   the value of lease_time on the new (i.e., destination) server, and
   use it for subsequent locking requests.  However, the server must
   respect a grace period of at least as long as the lease_time on the
   source server, in order to ensure that clients have ample time to
   reclaim their lock before potentially conflicting non-reclaimed locks
   are granted.

    How does the destination server know what the lease_time/grace
    period on the source server was? It doesn't know what server the
    client was previously connected to in many cases and so can't ask. 
    Doesn't this amount to saying all servers for a given file system
    should (MUST) be configured with the same lease time in the
    non-transparent transparent case?

    s9.9: This section is a little skimpy.  There are several points
    that probably ought to be clarified:

    - The section explicitly refers to files:  Does it actually apply to
    all filesystem objects?

    - An explicit statement that the 'file_state' is maintained for each
    file(system object)

    - The pseudo-code applies to the OPEN case; different pseudo-code
    (presumably) applies to OPEN_DOWNGRADE.  I *think* I could guess it
    but it would be good to be explicit.

    - Is there any interaction between byte range locks and share
    reservations?  e.g., does a client holding (say) a byte range write
    lock but with no DENY state prevent another client taking out a
    DENY_WRITE  share reservation on the same file?

    - Does the discussion of placing lock state into stable storage
    across server reboot also apply to the share reservation file_state?

    s9.14:  The text in this section addresses topics also covered in
    s7.  This does  not seem desirable as there may be discrepancies
    (see next comment).  It would seem better to merge the text into s7
    and possibly provide a reference back to s7 from s9.  

    s9.14.1:  This section envisages migration of state from one server
    to another where the client already has state on the other server
    using the same client ID.  Merging of the states is suggested. 
    AFAICS this situation is not discussed in s7 which seems a little

    s12:  It seems the client has to play '20 questions' with the server
    to find out how it treats internationalized component names.  Some
    of it seems little better than guesswork.  I was wondering why some
    of the server choices are not described using attributes so that the
    client doesn't have to thrash around in the dark.  Also there
    doesn't seem to be any requirement for a filesystem to be treated
    identically after migration. As far as I can see the client can't
    really rely on the internationalisation treatment not changing after

    s18.1:  The NFSv4 Named Attribute Definitions registry has already
    been created in exactly the form specified here by RFC 5661
    (NFSv4.1).  This section should be removed and words noting that the
    existing registry is common to all minor versions of NFSv4  should
    be substituted.


    Nits/editorial comments:

    General: Review the remaining instances of 'file' and 'files' to
    determine if they should really be 'filesystem objects'.

    General: Consistency between 'file system' and 'filesystem'. 
    'filesystem' is used first in abstract and s1.3 but only File System
    is defined.  I cannot see whether there is any real distinction
    between the usages... but  am open to argument! OTOH there is
    definitely inconsistency: Consider 'fsid' and 'fsid4' - the type
    definition says fsid4 is for a filesystem (s2.2.5) but s5.8.1.9 says
    this is about a 'file system'

    General: The terms 'regular file' and 'named attribute' are not used
    uniformly.  s5.8.1.2 typifies the confusion.  In the first set of
    bullets, NF4REG is defined as 'a regular file' and NF4NAMEDATTR is
    defined as a Named Attribute.  In the last bullet, "is an[sic]
    regular file" is defined as implying either type NF4REG or

    Thus we have the same term used for a set and a superset of the same
    set (maybe when qualified with 'is a').  Consequently, for example
    in s15.25.4,  we have "the READ operation reads data from the
    regular file...".  Presumably this is supposed to cover both NF4REG
    and NF4NAMEDATTR types but one can't be absolutely sure (to be
    pedantic it doesn't explicitly say 'is a regular file' so perhaps it
    should be read as NF4REG only).  OTOH the OPEN operation (s15.18.5)
    has two separate paragraphs for regular files and named attributes. 
    This needs some tidying up.  One could try using 'data object' as an
    unambiguous term for the union of the actual file (NF4REG) class and
    the named attribute class.

    Abstract: s/owes heritage to/builds on the heritage of/  

    s1.1: Expand IDNA acronym.

    s1.2, bullet 5:s/mounted_on_filed/mounted_on_fileid/

    s1.3: Expand acronyms ONCRPC and RPCSEC_GSS (and/or ref to RFC 2203
    or 5403).

    s1.5: s/the reader that is new/the reader who is new/

    s1.5.1, para 2: s/provides the client a method/provides the client
    with a method/

    s1.5.3:  A couple of sentences outlining the presentational scheme
    for filesystem object names (i.e., /a/b/c etc) would be worthwhile
    even if 'we all know what this means'!  This could include defining
    'component name' used later with no introduction.

    s1.5.3.1, et seq: Would it be useful to identify 'persistent' and
    'volatile' as key words: e.g., 'Persistent' and 'Volatile'. 

    s1.5.3.2, para 3: For consistency the 'acl' example should have a
    ref (s6.2.1) earlier in the para.  So s/((Section 6)/(Section
    6.2.1)/ and move one sentence earlier.

    s1.5.5: There should be a discussion of share reservations in this

    s1.6, definition of Lease: There should be a definition of 'grace
    period' (as per s9.6.2) as there are lots of references to it
    earlier than s9.6.2.

    s1.6, definition of Lock:  There should be a definition of share

    s2.1 et seq:  Whilst the terms 'hard link' and 'symbolic link' are
    pretty commonly known,  they are so fundamental that I think putting
    them in the definitions would be highly desirable.


    s2.1, attrlist4:  In line with comment about opaque max sizes,
    should this have a max size? (Might also apply to bitmap4)

    s2.2: It would be useful to have section references to the places
    where the types are used where these are explicitly mentioned.

    s2.2.8 (and s2.1):  I guess that the bit ordering in bitmap4 is
    implicitly derived from the XDR representation, but it would be
    useful  to remind people of what this bit ordering actually is. 

    s2.2.9, atomic element:  With a name like that are there any special
    handling requirements for this element/structure?

    s2.2.10: Note  to self: verify external refs in this section.


    s2.2.11, cb_program: It should be explained that the cb_program is
    an RPC program number (had to ferret in s15.35.4 to understand
    this).  BTW this would entail an earlier expansion of RPC acronym.

    s2.2.12:  It would be good to have a definition of NFS4_OPAQUE_LIMIT
    before this section (see earlier comment about s2.1) so that all the
    magic number constants are concentrated in one place.

    s2.2.13:  Arrgh! Now we have two definitions of NFS_OPAQUE_LIMIT
    (fortunately they are the same until they aren't).

    s2.2.14: ARRRGGGHHH! THREE defs.

    s2.2.16, 'other':  And this definition contains the magic number 12
    not as a symbolic constant and with no explanation of why 12 is a
    good value! 

    s3.1, para 2:  

   Where an NFSv4 implementation supports operation over the IP network
   protocol, the supported transports between NFS and IP MUST be among
   the IETF-approved congestion control transport protocols, which
   include TCP and SCTP. 

    I know what is meant, but technically, I don't believe the IETF
    actually has a list marked 'congestion control transport
    protocols'.  A better way of phrasing this might be:

       Where an NFSv4 implementation supports operation over the IP

       protocol, the supported transport layer between NFS and IP MUST
    be an IETF

       standardised transport protocol that is specified to avoid

       congestion; such  transports include TCP and SCTP.

    [Check whether SCTP is on the IANA well-known acronym list]. 



    to use a different IETF-approved congestion control transport


    to use a different IETF standardised transport protocol with
    appropriate congestion control.

    s3.1, para 3: Expand WAN acronym. (I think)

    s3.1, para 4:  Where would I find a justification that DCCP or NORM
    are inadequate?

    s3.1, para 5: s/regardless whether/regardless of whether/

    s3.1, para 6: A rather throwaway comment about avoiding timer sync. 
    Not sure yet how many timers, but it would be good to know which
    ones were critical.  After reding the document I am still not sure. 
    It might almost be worth adding a section that summarizes the set of
    timers used.

    s3.2.1/s3.2.1.1: Check section title capitalization.  (Mostly OK -
    spotted also s6.4.1/s6.4.1.x - possibly)

    s4.2.1: Earlier parts of s4 have been generalised to talk about
    'filesystem objects'.  This section still has filehandles referring
    to 'files'.

    s4.2.3, para at top of p33; s4.3, para 4: Ditto talking about

    s4.3, para 1:  

   If possible, the client SHOULD recover from the receipt of an

    I think this is a 'should' rather than a 'SHOULD'. Its not something
    that the protocol designer has any control over.

    s5 title:  Attributes apply to more than just files. 

    s5, para 4: s/new  OPENATTR operation/OPENATTR operation introduced
    in NFSv4/

    s5.1, last sentence: I think the 'must' is a 'MUST' - the server is
    constrained to send the values if asked (but the client has a choice
    to ask or not).


A client MAY ask for the
   set of attributes the server supports and SHOULD NOT request
   attributes the server does not support.

    This should probably be 

       An operation allows the client to determine the set of attributes

       that the server supports; a client should not request attributes

       server does not support.

    Its not something the protocol can do anything about.


    s5.8.2.10, para 2: Not sure what root path has to do with locations.

    s5.9, para 4:

 Servers that do not provide support for all possible values of the
   owner and owner_group attributes SHOULD return an error
   (NFS4ERR_BADOWNER) when a string is presented that has no
   translation, as the value to be set for a SETATTR of the owner,
   owner_group, or acl attributes.

    What might the server do if it doesn't return NFS4ERR_BADOWNER? I
    would have thought this ought to be  MUST.

    s5.9, para 5:

   The "dns_domain" portion of the owner string is meant to be a DNS
   domain name.  For example, 

user at

.  Servers should accept
   as valid a set of users for at least one domain.  A server may treat
   other domains as having no valid translations.  A more general
   service is provided when a server is capable of accepting users for
   multiple domains, or for all domains, subject to security

    I suspect that the 'should' and 'may' ought to be 'SHOULD' and
    'MAY'.  Arguably the 'should' ought to be a 'MUST' if the server
    recognises owner and owner_group as attributes. i.e., a mandatory to

    s5.9, para 6:

In the absence of such a
   configuration, or as a default, the current DNS domain name should be
   the value used for the "dns_domain".

    The term 'current DNS domain' is not very specific. Do you mean the
    DNS domain in which the host resides?  

    s5.9, 1st bullet point on page 50: A reference to (probably) RFC
    5890 is needed to explain U-labels and other IDN stuff is needed
    (probably best in s1.6).

    s5.10: Expand UCS acronym (and add ref to ISO.10646-1.1993).

    s6.1, bullet 3:  An explanation of what is meant by (acl)
    inheritance is needed.

    s6.1, bullet 4, sub-bullet 1:

      *  Setting only the mode attribute should effectively control the
         traditional UNIX-like permissions of read, write, and execute
         on owner, owner_group, and other.

    Does this extend to the interpretation of the x bit as searchability
    on directory objects as per UNIX?  

    s6.1, bullet 5:  Does mode setting affect the inherited ACL
    attributes as well as the those of the object for which the mode is

    s6.2.1, next to last para: Expand SMB (and/or give reference).

    s6.2.1.1, sentence above table:

   All four but types are permitted in the acl attribute.

    'but' is clearly not the right word but I am not quite sure what
    this trying to say.


    Why is the RENAME operation affected if the object being renamed is
    a file rather than a directory?

    s6., ACE4_EXECUTE:  

    Given that directory operation aliases for ACE4_READ_DATA and
    ACE4_WRITE_DATA have been defined, it would seem sensible to define
    an alias for the directory/traverse use case of ACE4_EXECUTE.


    s/for information on/for information on how/ 

    s6., ACE4_WRITE_ATTRIBUTES - file times modification:

    Can the client find out if the server uses client or server times? I
    didn't see an attribute that tells me which is allowed on the server
    - cansettime only allows the client to find out if it can modify the
    times.  What happens (which error code comes back) if the SETATTR
    has the wrong sort of time spec (e.g., client time when the server
    only supports server times)? As previously mentioned, is the server
    expected to do any sanity checks on supplied client times (maybe not
    allowing - at least - ordinary users to set times backwards from
    previous values.)

    s6.2.1.4: A few words on the effect of inheritance would help and
    and clarify the next two comments.  Points that arise from the
    discussions in s6.

    - Is there a time factor in ACEs?  The phrase 'newly created' seems
    to imply that there is.  The document doesn't really explain how it
    is expected that inheritance will be implemented: It could be seen
    as copying the current set of ACEs from the parents up the tree at
    the time an object is created or tracking the ACEs up the tree when
    a access request has to be verified. 

    - Is the inheritance fully recursive down a directory tree?  I guess
    that it is unless the NO_PROPAGATE flag is set.. but what about
    symbolic links?

    - Does the inheritance propagate across mount points?

    Additionally, how does ACE inheritance interact with hard and
    symbolic links?  Does the inheritance attach to the underlying
    object or only to the name path?


    Three issues:

    - presumably this implies the ACE is (primarily) inherited by files
    in the directory to which it applies

    - does 'any sub-directory' imply that the inheritance applies
    recursively down the tree?

    - does 'any non-directory file' include symbolic links?


    Three issues:

    - Does 'should be added to each *new* directory created' imply that
    the ACE does not get inherited by sub-directories are already in
    existence when the ACE is applied? Either way it would be good to be
    explicit about the effect.

    - Does this apply recursively down the tree?

    - Would the inheritance apply to symbolic links to directories?


    It would be worth pointing out that the execute permission bits
    imply searchability for directory objects.  This isn't mentioned in
    READDIR... is this something missing?


    Are there restrictions on the mode bits that can be set on named
    attributes and their directories?  Wouldn't make a lot of sense to
    set the x bits on named attributes  (probably), nor the suid and
    svtx bits.

    s6.3.1.2, para 1: s/interpretation the ACL/interpretation of the

    s6.1, last para: Expand DCE/DFS acronyms.

    s6.2: Expand TLS and SPKM acronyms. (IPsec is 'well-known')


These attributes specify such file system
   instances by specifying a server address target (either as a DNS name
   representing one or more IP addresses or as a literal IP address)
   together with the path of that file system within the associated
   single-server namespace.

    I don't understand 'associated single-server namespace'.  Does this
    mean the aggregated logical file system or something else?  Having
    read further I suspect that this at least needs a forward reference
    to Section 8.

    s7.2 and s7.3:  These sections are very repetitive.  They
    essentially convey the fact that if a file system is absent only
    GETATTR on the absent system and READDIR on a  referring system will
    produce useful answers and only a couple of attributes are valid.


   A potential problem exists if a client were to allow an open owner to
   have state on multiple filesystems on server, in that it is unclear
   how the sequence numbers associated with open owners are to be dealt
   with, in the event of transparent state migration.  A client can
   avoid such a situation, if it ensures that any use of an open owner
   is confined to a single filesystem.

    This paragraph is not very helpful because the concepts of 'open
    owner', associated state and sequence numbers have not been
    introduced up to this point apart from as cryptic references in type
    definitions.  At worst a forward reference is needed.  It might be
    better to have a brief introduction to the state mechanism under the
    NFSv4 basic concepts section (maybe around s1.5.4).

    Without this introduction I have no idea under what circumstances
    the 'open owner' might have state on multiple file systems - and
    hence what I would have to do to avoid going down this rat hole.

    s7.7, last para:  Would be worthwhile to either substitute 'network
    order' for 'big endian' or add it as an alternative. (It turns out
    that s9.14 uses network order for just this requirement -
    consistency would be good!)


   When a single file system may be accessed at multiple locations,
   either because of an indication of file system identity as reported
   by the fs_locations attribute, the client will, depending on specific
   circumstances as discussed below, either:

    The first 'either' has no corresponding 'or'.

    The second 'either' needs an 'or' at the end of the first bullet.

    s7.7.1, 2nd bullet: s/Accesses/Access/

    s7.7.2, para 3 (and  para 2):

   When there is co-operation in filehandle assignment, the two file
   systems are reported as being in the same handle classes.

    It appears that the term 'handle class' is not defined until s7.9.1
    (like fileid class in the s7.7.3 and various other attribute
    classes).  A forward reference to this section is essential  (or
    maybe better a reordering of the sections).  It would be worth a
    note somewhere in this area that a client needs to ensure that it
    knows the fh_expire_type for any filesystem instance it is
    accessing  in case it get forcibly transitioned, since it can't find
    out post-facto from an absent filesystem.

    s7.7.2, paras 2 and 3:

   When there is no such cooperation in filehandle assignment, the two
   file systems are reported as being in different handle classes.  In
   this case, all filehandles are assumed to expire as part of the file
   system transition.  Note that this behavior does not depend on
   fh_expire_type attribute and depends on the specification of the

   When there is co-operation in filehandle assignment, the two file
   systems are reported as being in the same handle classes.  In this
   case, persistent filehandles remain valid after the file system
   transition, while volatile filehandles (excluding those that are only
   volatile due to the FH4_VOL_MIGRATION bit) are subject to expiration
   on the target server.

    I do not understand the final parts of these paras where
    FH4_VOL_MIGRATION is mentioned:

    - In para 2, FH4_VOL_MIGRATION is a bit in fh_expire_type, so I fail
    to understand how the behavior can both be independent of
    fh_expire_type and dependent on a bit within it at the same time.

    - In para 3, the wording appears to imply that filehandles with
    FH4_VOL_MIGRATION set will not expire. My  understanding of the
    definition of this bit was that such filehandles would explicitly
    expire on migration.  Accordingly I would have expected that such
    filehandles would have been guaranteed to expire on transition  -
    but maybe this is some difficulty with the meanings of transition
    and migration?

    s7.7.3, last para:

   Note that when consistent fileids do not exist across a transition
   (either because there is no continuity of fileids or because fileid
   is not a supported attribute on one of instances involved), and there
   are no reliable filehandles across a transition event (either because
   there is no filehandle continuity or because the filehandles are

    I don't understand how there can be no filehandle continuity other
    than because the filehandle is volatile.  If a filehandle is not
    volatile, then it is persistent and s4.2.2 appears to indicate that
    any persistent filehandle would be valid across a transition.

    s7.7.5 and s7.7.6, last paras:  s7.9.1 states that distinct server
    instances are always in different change classes. Since there
    doesn't seem to be anyway to check that two instances of a server
    are actually the same when accessed over different connections, the
    consistent change class mentioned in the last para of s7.7.5 and the
    continuity mentioned in the last para of s7.7.6 can never occur in
    practice and would probably be better omitted.

    s7.7.6, para 7: A reference to s9.6.3.4 where the edge conditions
    are described would be desirable.

    s7,7,8:  s7.9.1 says all server instances have different readdir
    classes, so once again the instances will never be deemed

    s7.9:  Copying the fs_location(s)4 structure definitions here
    creates a double maintenance problem.  Better just to reference
    ss2.2.6/2.2.7 where they are defined.

    s8:  Placing this section much earlier in the document would help
    understanding, especially of Section 7.

    s9:  There is a comment in s15.12.5 (para 2) that exactly how locks
    behave depends on the underlying semantics of the exported  file
    system.  It would be worth reproducing that statement here to ensure
    that this section is interpreted in that context. 

    s9, last para:

To support Win32 share reservations it is necessary to atomically
   OPEN or CREATE files.

    Add 'and apply the appropriate locks in the same operation' to the
    end of this sentence.

    s9.1: I think that for the sake of clarity, it would be worth
    explicitly stating that the client has to hold an open before
    applying locks to the same file.

    s9.1:  What happens if you try to acquire a byte range lock on a
    non-file filesystem object?

    s9.1.1, para after struct nfs_client_id4 definition: s/which the
    server has previously recorded the client/which the server has
    previously recorded for the client/.  Also copying the structure
    definition creates a double maintenance problem.  A ref to s2.2.12
    would be better.

    s9.1.3, first sentence:  Question: Should the 'including' list also
    have share reservations or are these included in byte-range locks or

    s9.1.3.5: Expand acronym I/O - this isn't on the RFC Editors 'well
    known' list - maybe strangely. 

    s9.1.3.2: NFS4_UINT32_MAX should be in the proposed constants

    s9.1.3.2, para 2 and s9.1.6, para 1: The two paragraphs have
    conflicting statements about the initial value of seqid for locks
    (s9.1.3.2 => 1, s9.1.6 => server choice) - not that it really
    matters much unless clients check that they always get 1!  Note that
    s9.1.6 does say anything about missing out 0 when wrapping round.

    s9.1.5, para 9 (bottom of page 116): s/ (e.g., another open specify
    denial of READs)/(e.g., another open specifying denial of READs)/

    s9.4:  Changing:


       The server is not

       required to maintain a list of pending blocked locks as it is
    used to

       increase fairness and not correct operation.


       The server is not

       required to maintain a list of pending blocked locks as it is not
    used to

       provide correct operation but only to increase fairness.

    scans more easily.

    s9.6: If I understand correctly, share reservations with a DENY
    reservation are essentially locks that cover the whole file and
    adjust dynamically to cover the entire byte range of the file if it
    changes. Presumably. therefore, the whole discussion of lock
    recovery applies both to byte range locks and share reservation
    implicit locks.  I think it would be worth pointing out that this is
    the case at the beginning of s9.6 (assuming this is indeed the case
    - or alternatively explaining how things differ with share

    s9.8, para 4: In the last sentence s/may not/MUST NOT/. 

    s9.8, para 5: 

The client may accomplish this task by
   issuing an I/O request, either a pending I/O or a zero-length read,
   specifying the stateid associated with the lock in question.

    I am not sure how you 'issue a pending I/O'.  Does this mean
    checking whether there is already a pending I/O request associated
    with this stateid? 

    s10.2, para 6: s/server times its wait/server times out its wait/

    s10.2, para 7: I think the lat two sentences are requiremenst:

    next to last sentence: s/may/MAY/

    last sentence: s/should not/SHOULD NOT/

    s10.2.1, para 9 (last para on p146):  I think that this para is
    referring to the server not the client: s/client/server/

    s10.2.1, para 12 (2nd bullet): Good to have a ref to the Courtesy
    Locks section (s9.6.3.1)

    s10.2.1, para 17:  "s special semantic adjustments" => either
    "special semantic adjustments" or "a special semantic adjustment"

    s10.2.1, para 19:  It might be better to explain about the
    distinction between the CLAIM_xxx attribute to be used after server
    and client restarts that doesn't come till para 24 currently - the
    changeover to CLAIM_PREVIOUS in this para isn't obvious without this

    s10.2.1, last para, sentence 1: s/Is/is/

    s10.3, 1st bullet: s/and not change/and not the change attribute/
    ... there are a couple of other instances where this might also be

    s10.3, 2nd bullet: s/client OPENS as file/the client OPENS a file/

    s10.3.3:  I need to think a bit more about what mandatory locking

    s10.4, para after 2nd set of bullets: should the 'standard stateid'
    be associated with the open-owner rather than the lock-owner as
    currently stated.

    s10.4.3: Should there be some discussion of wrap round in the
    'change' attribute?

    s10.5.1, para 2: s/a complete cached copy  of file/a complete cached
    copy  of the file/,

    s/In other case/In other cases/

    s10.7, para 2:


(regardless whether the file is local file or is being access remotely)


(regardless of whether the file is local file or is being accessed remotely)

    s11, item 8, first sentence: s/implement/inplemented/

    s11, items 9 and 10: Its probably not that important, but would
    OPTIONAL <=> REQURED be allowed - i.e., two 'levels' of change
    rather then the explicit one currently documented.

    s12.1.1, bullet 2: s/in many case/in many cases/

    s12.1.2, para 1: s/important to understand/important to

    s12.1.2: The quoted sections should be attributed to where they come
    from (including section numbers).

    s12.1.2: There seems to be a spurious ACCENT in this para: 

         ACCENT (U+0xxx, U+0304, U+0301)

    s12.1.2 (Maybe expand IDNA again - its 180 pages since it was last

    s12.2.3, para just after Table 5: s/or it in may/or it may/

    s12.3: (maybe expand UCS again since it is 130 pages since previous

    s12.7, 2nd bullet: s/lintext4/linktext4/;  also the term 'link text'
    hasn't been properly defined.

    s12. Need to expand NFC and NFD and provide a reference.

    s12., para 8, last sentence: s/it should generally mapped/it
    should generally be mapped/

    s12.7.3, bullet 4: Should 'converting a user to group to an internal
    id' be 'converting a user or group to an internal id'?

    s13.1.1.7: s/fit/fitted/

    s13.1.2.1 s/file handle/filehandle/

    s13.1.2.7: should refer to filesystem object rather than file.

    s13.1.4.2: Value of NFS4ERR_DQUOT is 19 here but 69 in table 8

    s13.1.4.3/4/5/7: should refer to filesystem object rather than file.

    s13.1.4.8: Does this apply to any filesystem object rather than just
    file or directory?

    s13,1.4.11: Value of NFS4ERR_NXIO is 5 here but 6 in table 8 (and
    value 5 is already used for NFS4ERR_IO).

    s13.1.5.2: Value of NFS4ERR_BAD_STATEID is 10026 here but 10025 in
    table 8 (and value 10026 is already used for NFS4ERR_BAD_SEQID).

    s13.1.5.2, bullet 1: should refer to filesystem object rather than

    s13.1.7: s/When the strings are not are of length zero./When the
    strings are not of length zero./

    s13.2-13.4: I didn't check these tables.  Table 11 can be checked
    automatically if it is thought worthwhile by inverting tables 9 and

    s14.2: Might be worth stating that COMPOUND doesn't represent a
    'transaction', as is expressed in para 3.

    s14.2, para 4:  The operation descriptions explicitly mention what
    happens to the current filehandle after operations.  Nothing is said
    about the stability of the saved filehandle.  Can this be assumed to
    be preserved in all cases whether the operation succeeds or fails? 
    I guess it would also be useful to explicitly state that there are
    no guarantees about the contents of the current filehandle after an
    operation unless explicitly stated (i.e., after most errors you
    ought to reload the current filehandle).

    s14.3, para 1: s/UNSTABLE/UNSTABLE4/

    s15.2.5, para 2: I am not sure how you *could* mix operations - I
    was under the impression that the COMPOUND is submitted in the
    context of a particular client ID and the current/saved filehandles
    are tied to the client ID/connection. The client ID can't be changed
    in a single compound statement so this seems a spurious statement. 
    Presumably if stateid's etc are used on the wrong client ID the
    server will object. See also s15.18.6.

    s15.3.4: Presumably the access permissions generally apply to named
    attributes.  Currently it only talks about files and directories.

    s15.3.5, para 2, sentence 3: s/which will result/which might

    s15.3.5, para 4: I found the following confusing:

   The client should use the effective credentials of the user to build
   the authentication information in the ACCESS request used to
   determine access rights.

    I am not sure where there is any authentication info (explicitly)
    "in the ACCESS request" - the parameter is just the requested access
    mask.  Where is the auth info?



The server returns an attribute bitmap that
   indicates the attribute values for which it was able to return,


The server returns an attribute bitmap that
   indicates the attributes for which it was able to return values,

    s15.9.5, last sentence: s/returned if while a delegation/returned if
    [or while] a delegation/

    s15.11.4:  This section refers explicitly to files.  Is it ever
    possible to make hard links to:

    - directories?

    - named attributes?

    - device and other special types of 'file'?

    It would be good to be explicit.

    s15.12.4, para 2:

 To lock the file from a specific offset
   through the end-of-file (no matter how long the file actually is) use
   a length field with all bits set to 1 (one).

    Is the end offset fixed at the value given by the length of the file
    at the time of applying the lock, or does this byte range vary as
    the length of the file changes?

    What happens if the length of the file gets to be less than the
    start offset, or indeed is currently less than the length of the

    Actually that is a more general question: Is any special treatment
    needed for locks that are outside the current length of the file? 
    Does the server make any checks on whether a lock is within the
    current length of the file when applied - or automatically delete
    locks off the end of a file.. probably not but would be good to be

    s15.14.5, para 1:

In all of
   these cases, allowed by POSIX locking [


] semantics, a client
   receiving this error, should if it desires support for such
   operations, simulate the operation using LOCKU on ranges
   corresponding to locks it actually holds, possibly followed by LOCK
   requests for the sub-ranges not being unlocked.

    Shouldn't the client apply the sub-range locks before unlocking? 
    Otherwise there is a potential race condition and some other client
    could get in a lock between the unlock and the relock. [After a bit
    of background reading I understand that this rather silly
    arrangement is 'the way it is done'

    s15.16.4/5:  What happens when you apply LOOKUPP to the 'hidden'
    named attributes directory?  Do you get NFS4ERR_NOENT or the
    filehandle of the object that owns the named attributes (inverting
    OPENATTR)?  s15.16.4 only talks about the root of the server's file

    s15.18.6: Do SHARES work on named attributes?  Just checking...

    s15.20.5, para3, sentence 1:  "Servers must not require..." I think
    this is a 'MUST NOT'.

    s15.23.5: Acronm SNEGO needs an expansion and a reference (RFC

    s15.25.4, para 4: If the OPEN didn't use locks or share reservations
    and there is no delegation, is the stateid in the READ request the
    'basic' OPEN stateid returned with all OPENs? It doesn't explicitly
    say this.  (also relevant to WRITE operation - s15.38.4, para 3).

    s15.25.5, para 1:


   is possible that the server reduce the transfer size and so return a
   short read result.  Server resource exhaustion may also occur in a
   short read.


   is possible that the server reduces the transfer size and so returns a
   short read result.  Server resource exhaustion may also result in a
   short read.

    s15.28.4: The specification says nothing about the semantics of
    sub-directory removal.  NFSv3 says that 'some servers won't allow
    removal of non-empty directories'.  What is the story for NFSv4?  Is
    there any standardised strategy for dealing with 'orphaned'
    resources in sub-directories if the server does allow a non-empty
    sub-directory to be deleted?

    s15.29.4, para 5:  It would be clearer to say that the operation
    will fail if the saved and current file handles refer to named
    attribute directories attached to different filesystem objects.

    s15.30.5, para 1 , last sentence: s/such that is could/such that it

    s15.33.5, para 2:

The filehandle is either provided by the client (PUTFH,
   PUTPUBFH, PUTROOTFH) or generated as a result of a name to filehandle
   translation (LOOKUP and OPEN).

    Aren't the PUTxxx filehandles supplied by the server?

    s15.38.4:  Does all the discussion of writing to stable storage
    apply when writing to a named attribute?

    s15.38.5, para next to next to last:

   Some implementations may return NFS4ERR_NOSPC instead of
   NFS4ERR_DQUOT when a user's quota is exceeded.  In the case that the
   current filehandle is a directory, the server will return
   NFS4ERR_ISDIR.  If the current filehandle is not a regular file or a
   directory, the server will return NFS4ERR_INVAL.

     I am not sure what circumstances the last sentence is referring
    to.  One interpretation would be that WRITE is not allowed at all
    for special files.  I guess this is not what is intended - please