Last Call Review of draft-ietf-nfsv4-rfc3530-migration-update-07
review-ietf-nfsv4-rfc3530-migration-update-07-genart-lc-davies-2016-01-18-00
| Request | Review of | draft-ietf-nfsv4-rfc3530-migration-update |
|---|---|---|
| Requested revision | No specific revision (document currently at 08) | |
| Type | IETF Last Call Review | |
| Team | General Area Review Team (Gen-ART) (genart) | |
| Deadline | 2016-01-18 | |
| Requested | 2016-01-07 | |
| Authors | David Noveck , Piyush Shivam , Chuck Lever , Bill Baker | |
| I-D last updated | 2016-10-06 (Latest revision 2016-02-02) | |
| Completed reviews |
Genart IETF Last Call review of -07
by Elwyn B. Davies
(diff)
Genart IETF Last Call review of -07 by Elwyn B. Davies (diff) Opsdir Telechat review of -07 by Victor Kuarsingh (diff) |
|
| Assignment | Reviewer | Elwyn B. Davies |
| State | Completed | |
| Request | IETF Last Call review on draft-ietf-nfsv4-rfc3530-migration-update by General Area Review Team (Gen-ART) Assigned | |
| Reviewed revision | 07 (document currently at 08) | |
| Result | Almost ready | |
| Completed | 2016-01-18 |
review-ietf-nfsv4-rfc3530-migration-update-07-genart-lc-davies-2016-01-18-00
I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair. Please wait for direction from your
document shepherd or AD before posting a new version of the draft.
For more information, please see the FAQ at
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>
.
Document:
draft-ietf-nfsv4-rfc3530-migration-update-07.tx
Reviewer:
Elwyn Davies
Review Date:
2016/01/16
IETF LC End Date:
2016/01/18
IESG Telechat date: 2016/01/18
Summary:
Almost ready. There are a few minor issues and a considerable
number of nits that need attention. The text is extremely complex
and in some places the logic is not entirely clear. I would
encourage the authors to revisit the various usages of RFC 2119
language; a number of the instances, especially several of the
'SHOULDs', appear to refer to operational or administrative
decisions rather than matters that affect the protocol - and are
therefore not actionable by the receiving implementation. If
appropriate the language should be modified.
Major issues:
------------------
None
Minor issues:
------------------
Bits possibly not covered from RFC 7530:
---------------------------------------------------------
With respect to s4:
In s9.1.1, para 1 of RFC 7530 there is:
Breaking
the lease state amounts to the server removing all lock, share
reservation, and, where the server is not supporting the
CLAIM_DELEGATE_PREV claim type, all delegation state associated with
the same client with the same identity. For a discussion of
delegation state recovery, see
Section 10.2.1
.
There is nothing in the update about servers that don't support the
CLAIM_DELEGATE_PREV claim type. Should something be said about
this?
The general description of the distinction between open owners and
lock owners in para 2, 3 and 4 of s9.1.1 is lost if the new text is
taken as a true replacement of s9.1.1. This is undesirable.
------------------------------------------
Items in the text of this draft:
-----------------------------------------
s5.1.3: When a file system is migrated, the source server is queried
about the new location of the file system using
GETATTR(fs_locations). Since the file system involved has migrated,
the server no longer has a connection with the relevant file system
but still needs to respond to the GETATTR appropriately. Does there
need to be some discussion of how long the server needs to hang onto
information that will allow it to respond to these requests,
particularly since the suggested means of discovery uses an
arbitrary file from the file system. It is possible that I have
missed something or my understanding/memory of how fs_locations
works is faulty.
s7.4.5.1: It would be helpful to indicate the if-then structure of
the bullets by making current bullets 3 and 4 sub-bullets of bullet
2. Indeed, on second thoughts, it is not clear which 'if' (either
that in bullet 2 or bullet 3) the 'otherwise' in bullet 4 applies
to. Please make the logic clearer.
s7.5, bullet 1: Surely the decision as to whether privacy is needed
is an administrative/operational deployment decision? Transfer
integrity is vital and provision should be made for transfer privacy
to be possible if the installation requires it, but I can't see that
privacy needs to be ensured in all possible circumstances.
=======================
Nits/editorial comments:
-----------------------------------
General: s/e.g. /e.g., /g (5 instances); s/i.e. /i.e., /g (5
instances).
General (in s5): s/openowner/open-owner/g; s/lockowner/lock-owner/g
General: s/filesystem/file system/g (In line with usage in RFC
7530).
General: Lack of terminology section. There are a number of terms
that might be advantageously incorporated into a terminology section
- lock types, client id, stateid, incarnation verifier (and its
relation to reboots/'boot verifier'), server (address) trunking,
open owners, lock owners spring to mind.
s3 , para 6/s4.1, para 7: The terms 'client boot instance' and
(subsequently) 'boot verifier' are not used in RFC 7530. s9.1.1
uses the term 'client incarnation verifier' (and this term is
actually used in s4.1 , para 10 (second bullet after nfs_client_id4
data structure definition). I personally think is a more
descriptive term. I would be inclined to replace 'boot verifier'
with 'incarnation verifier' throughout, and provide a definition in
a terminology section (see above). This could also cover the case
where the incarnation verifier is changed because the structure is
destroyed rather then the client host rebooting (as noted in s4.6)
although the intention is that it should only change on a reboot.
s3, 2nd set of bullets, bullets 2 and 3; Also s4.9, para 1: s/client
id/client id string/
s3, last bullet: It would be worth pointing out that there is a
complete revision of the definition of the SETCLIENTID operation.
This is relevant because there is a pointer to the definition of
SETCLIENTID in s4.
s4: Correct section names from RFC 7530..
OLD:
The replaced sections are named "client ID" and "Server Release of
Clientid."
NEW:
The replaced sections are named "Client ID" and "Server Release of
Client ID."
END
s4.1, bullet 1: s/client IP address/the client IP address/
s4.1,bullet 2: I would replace 'save' by 'maintain a non-volatile
record across reboots of' to make it clear what is intended.
s4.1, last para: The concept of 'server trunking' needs to be
defined. NFSv4.0 per RFC 7530 does not have this concept - it is
introduced in NFSv4.1, thus somebody looking from the PoV of a 4.0
implementer need not know anything about trunking.
s4.2: Section references to relevant parts of RFC 7530 that provide
definitions of structures and operations would be helpful here.
s4.2, bullet 1 of second set of bullets (under struct
nfs_client_id):
The id field is a variable-length string which uniquely identifies
a specific client. Although, we describe it as a string and it is
often referred to as a "client string," it should be understood
that the protocol defines this as opaque data.
Later on s5.1 recommends that opaque data such as this should be
encoded in network byte order. It would be helpful to repeat or
introduce this recommendation (make it a MUST???) here to ensure
that the id will be identical on servers of whatever endianess.
Also the term 'network byte order' is primarily relevant to data for
which the internal representation is a (binary) number. A better
way of saying this might be 'the encoding and decoding processes
(e.g., using network byte order for the external representation) for
the opaque data result in the same internal representation whatever
the endianness of the originating and receiving machines' even if it
is somewhat more long winded.
s4.2, para 20 (across the boundary between p7 and p8):
This shorthand client identifier (a client ID) is
assigned by the server and should be chosen so that it will not
conflict with a client ID previously assigned by same server. This
applies across server restarts or reboots.
Would there be any advantage to recommending that servers should try
to generate clientid4's that are unique to the server as far as
possible as well as different across reboots, etc.? It would appear
to minimise the risk of clashes during migrations but this may be
unnecessarily complicated and the client has to cope with the remote
possibility anyway. I note that there is a statement in s4.8, para 4
on page 19
Given that it is already highly unlikely that the clientid XC is
duplicated by distinct servers, the probability that SCn is
duplicated as well has to be considered vanishingly small.
Adding the recommendation would support this statement.
s4.2, para 7 on page 8:
s/of administrative action,/of administrative action./ (swap comma
for period/full stop)
s4.2, last bullet:
o Merger of state under the associated lease with another lease
under a different clientid causes the clientid4 serving as the
source of the merge to cease being recognized on its server.
(Always returns NFS4ERR_STALE_CLIENTID)
Should 'clientid causes the clientid4' be 'client ID causes the
clientid4'?
There are 10 other instances of clientid (rather than clientid4)
where the same question applies.
s4.2, para after last bullet on p9:
In
cases of server or client error resulting in this error, use of
SETCLIENTID to establish a new lease is desirable as well.
Which error is meant? There isn't one mentioned in this para - I
guess _STALE_CLIENTID but could be 'any of the above errors'.
s4.2, page 9.
(See the section entitled "Server Failure and Recovery")
and
(see the section entitled "lock-owner" for
details)
I take these are intended to be sections in RFC 7530 (respectively
9.6.2 and 9.1.5) - adding these section numbers and the RFC ref
would be helpful.
s4.2, para last but two:
In the last two cases, different recovery procedures are required.
See Section 5.1.1 for details.
It isn't clear which cases you are referring to here... the last two
bullets two paras before or the last two cases referred to in the
previous para out of
In the event of a server reboot, loss of lease state due to lease
expiration, or administrative revocation of a clientid4, the client
Please make this more explicit.
s4.2, last para:
See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
for a complete specification of these operations.
Section references would help, plus a note that the definition of
SETCLIENTID is now in this document (s7.4) rather than Section 16.33
of [RFC7530]. SETCLIENTID_CONFIRM is at Section 16.34 of [RFC7530].
s4.3, last para:
In that event, when the server gets a SETCLIENTID specifying a client
id string for which the server has a clientid4
Which of the three events in the previous para is being referred
to? Presumably the deliberate change of the primcipal since the
other two are not really recognizable in a controlled way. ;-)
s4.4, 2nd bullet on page 11 (para 7):
the client to be
aware when two server IP addresses are connected to the same
server (they return the same server name in responding to an
EXCHANGE_ID).
The comment about 'return[ing] the same server name' in NFSv4.1 is
not the real reason the client is sure they are the same server.
Would suggest changing this to '(Section 2.10.5.1 of [RFC5661]
explains how the client is able to assure itself that the
connections are to the same logical server.)'
s4.4, 3rd bullet on page 11 (para 9): Suggest adding a reference to
the fs_locations discussions in RFC 7530 - Perhaps '(see Sections
8.1 and 8.42 of [RFC7530])'.
s4.4. 4th bullet on page 11 (para 10):
in that the two different
client id strings sent to different IP addresses may wind up on
the same IP address, adding confusion.
The phrase 'same IP address' doesn't really express what is going on
IMO. Perhaps 'same (logical) server'.
s4.5, para 3: s/per server network addresses./per server network
address./
s4.5, last para:
Therefore, client implementations
that support migration with transparent state migration SHOULD NOT
use the non-uniform client id string approach, except where it is
necessary for compatibility with existing server implementations
Arguably, this is an operational decision rather than something that
affects the protocol... possibly s/SHOULD NOT/should not/.
s4.6, para 10: s/typically the reboot time./typically at reboot
time./
s4.7, para 6: s/the spec/this specification/
s4.7, para 8: Spurious double quote (") at end of para:
enable transparent state migration."
s4.7, para 9: Missing period/full stop at end of para.
s4.8, para 3: s/blind-sided/surprised/ (blind-sided is rather too
idiomatic).
s4.8, para 7: s/Servers MUST NOT do that./Servers MUST NOT behave in
this way./
s4.8, last para on page 17:
In this algorithm, when SETCLIENTID is done it will use the common
nfs_client_id4 and specify the current target IP address as part of
the callback parameters.
I am unclear why (and, indeed, how) the target IP address needs to
be incorporated in the callback parameters. AFAICS, the process
described in this section does not examine a callback (by this I
mean the results of one of the CB_xxx operations) as part of the
process of trunking determination. The choice of callback_ident
(which appears to be the only relevant parameter) seems more likely
to be appropriately selected in line with the statement later in the
process (on page 19):
The specific callback
parameters chosen, in terms of cb_client4 and callback_ident, are
up to the client and should reflect its preferences as to callback
handling for the common clientid, in the event that X and IPn are
trunked together.
but chosen so that they would be appropriate if the server doesn't
end up trunked with any other existing client ID. Clearly the IP
address would be a convenient semi-random value which would work to
identify the callback uniquely eventually but the selected value
doesn't seem to have any effect on the algorithm.
s4.8, para 1 on page 18:
Note that when the client has done previous SETCLIENTID's, to any IP
addresses, with more than one principal or authentication flavor, we
have the possibility of receiving NFS4ERR_CLID_INUSE, since we do not
yet know which of our connections with existing IP addresses might be
trunked with our current one. In the event that the SETCLIENTID
fails with NFS4ERR_CLID_INUSE, one must try all other combinations of
principals and authentication flavors currently in use and eventually
one will be correct and not return NFS4ERR_CLID_INUSE.
I am not clear how this interacts with the variation of the
definition of NFS4ERR_CLID_INUSE given in s7.2. Would I expect the
server to accept authentication flavors other than the one already
used for another address leading to the same server? s7.2 says that
the server MAY accept a different flavor if it thinks the request is
bona fide. I don't understand what might or might not lead the
server to think an alternative flavor was bona fide - and would it
be likely to apply in this case?
s4.9, last set of bullets: Should the draft also advise applying a
one way function to the MAC address for privacy reasons?
s4.8, page 18:
o If one or more matching clientid4's is found, none of which is
marked unresolved, the new IP address X is entered and marked
unresolved. A SETCLIENTID_CONFIRM is done to X using XC and XV.
After applying the steps below to each of the lead IP addresses
with a matching clientid4, the address will have been resolved: It
may have been determined to be part of an already known server as
a new IP address to be added to an existing set of IP addresses
for that server. Otherwise, it will be recognized as a new
server. At the point at which this determination is made, the
unresolved indication is cleared and any suspended SETCLIENTID
processing is restarted
I suspect that the second para in the section quoted here is not
intended to be part of the bullet, and should be unindented.
-----------------
s5.1, para 2: It would be desirable to mention that the change from
big endian in [RFC7530] to network byte order is purely
terminological.
s5.1.1, para 1:
In the case of migration, the servers involved in the migration of a
filesystem SHOULD transfer all server state associated with the
migrating filesystem from source to the destination server.
Arguably this SHOULD is an operational decision. It does not affect
the protocol or the format of the bits on the wire. Accordingly I
suggest s/SHOULD/should/. This corresponds with the discussion in
[RFC7530]. On the other hand:
This
must be done in a way that is transparent to the client.
is probably a MUST: s/This must be done/If state is transferred it
MUST be done/
s5.1.1, para 7: s/NFS version 4.0 protocol/NFS version 4.0 protocol
(either in [RFC7530] or this update)/
s5.1.1, para 12:
When a
server implements migration and it does not transfer state
information, it SHOULD provide a filesystem-specific grace period, to
allow clients to reclaim locks associated with files in the migrated
filesystem.
I believe this is a MUST rather than a SHOULD. The alternative of
not doing it doesn't look good! If so the last sentence of s5..1.1
needs adapting to match.
s5.1.1.1: The {v, X, c} notation needs to be defined (presumably as
used in s7.4 and Section 16.33 of [RFC7530]). Section references in
this doc or [RFC7530] ought to be provided for SETCLIENTID (**s7.4
in this doc**) and SETCLIENTID_CONFIRM (s16.34 of [RFC7530]).
s5.1.1.2, para 5: How does the server know that the client has been
implemented to isolate owners to a particular filesystem? Looking
at the next para, I suspect that the server doesn't know - it merely
observes that owners don't sprerad across multiple file systems at
the time of migration but there is no guarantee - and it doesn't
matter - that the client might not have owners that spread across
filesystems some time.
s5.1.1.2, para 7 on page 29:
Such support only needs to be
provided for requests issued before the migration event whose status
as the last by sequence is invalidated by the migration event.
I can't parse the last line of this sentence.
s5.1.1.2, 1st and 2nd bullets on page 30: s/the last request for the
owner in effect/the sequence number for the last request for the
owner in effect/; s/longer one less the next sequence to be
received./longer one less than the next sequence to be received./
------------------------
s6.2, para 4:
o Some operations which modify locking state are not allowed to
return NFS4ERR_DELAY.
I think it be worth mentioning which operations these are in this
para. Inspection of Table 7 in RFC7530 indicates to me that the
relevant operations are OPEN_CONFIRM and RELEASE_LOCKOWNER plus
possibly RENEW. GETFH, RESTOREFH, SAVEFH and ILLEGAL don't return
NFS4ERR_DELAY but don't appear to affect the locking state. I
observe that all of these are mentioned explicitly except for
OPEN_CONFIRM: Does anything need to be said about this extra case?
s6.2, para 6 on page 37: s/cannot change target filesystem locking
state,/cannot change the target file system locking state,/
s6.2, para 7 on page 37:
o Keeping track of the earliest request started which is still in
execution (for example, by keeping a list of active requests
ordered by request start time). The server can then define T' to
be the first time at which the earliest-started active request
started after time T.
I don't understand what this criterion is driving at. The second
sentence appears to be aimed at identifying the time associated with
some action that is associated with the earliest-started request but
doesn't actually say what it is AFAICS.
s7.1: It would be helpful to identify the sections of RFC 7530 that
are affected by the changes summarized here, identifying
replacements and modifications. It would also be worth reordering
the summary to match with the order of subsections of s7.1.
s7.1, bullets 1 and 2: s/CLID_INUSE/NFS4ERR_CLID_INUSE/
s7.2: See the comment on s4.8, para 1 on page 18 above.
s7.3, para 7: s/Thus, a client that is prepared to receive
NFS4ERR_MOVED/Thus, if a client is prepared to receive NFS4ERR_MOVED
s7.4.5.1: A pointer to Section 9.1.7 of [RFC7530] where the DRC is
introduced would be helpful. BTW I am not sure that a server is
*allowed* to be without a DRC - s9.1.7 says it is critical to have
something that looks like a DRC.
s7.6, para 1: Please reference Section 19 of [RFC7530]. The
affected para is the penultimate rather than the ultimate para of
Section 19 of [RFC7530] judging from the initial words.
s7.6, para 2:
See the section "Client Identity
Definition" for further discussion.
Please add a section reference (Section 4 of this document)
s8: Suggest
OLD:
Is to be modified as specified in Section 7.6.
NEW:
The security considerations of [RFC7530] remain appropriate with the
exception of the modification to the penultimate paragraph
specified in Section 7.6 of this document and the addition of the
material in Section 7.5.
END