Last Call Review of draft-ietf-anima-grasp-api-07
review-ietf-anima-grasp-api-07-genart-lc-kyzivat-2020-10-28-00

Request Review of draft-ietf-anima-grasp-api
Requested rev. no specific revision (document currently at 08)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2020-10-28
Requested 2020-10-14
Authors Brian Carpenter, Bing Liu, Wendong Wang, Xiangyang Gong
Draft last updated 2020-10-28
Completed reviews Genart Last Call review of -07 by Paul Kyzivat (diff)
Secdir Last Call review of -07 by Joseph Salowey (diff)
Assignment Reviewer Paul Kyzivat 
State Completed
Review review-ietf-anima-grasp-api-07-genart-lc-kyzivat-2020-10-28
Posted at https://mailarchive.ietf.org/arch/msg/gen-art/E6hZlRTNnPw4xGWBjzYKY3InYsU
Reviewed rev. 07 (document currently at 08)
Review result Ready with Issues
Review completed: 2020-10-28

Review
review-ietf-anima-grasp-api-07-genart-lc-kyzivat-2020-10-28

I am the assigned Gen-ART reviewer for this draft. The General Area
Review Team (Gen-ART) reviews all IETF documents being processed
by the IESG for the IETF Chair.  Please treat these comments just
like any other last call comments.

For more information, please see the FAQ at

<https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document: draft-ietf-anima-grasp-api-07
Reviewer: Paul Kyzivat
Review Date: 2020-10-28
IETF LC End Date: 2020-10-28
IESG Telechat date: ?

This draft is on the right track but has open issues, described in the 
review.

General:

I began this review without any knowledge of Anima. I did read several 
of the related documents for context, but my overall understanding 
remains somewhat sketchy. So take my comments with with that in mind. 
(At least you get a fresh, untainted perspective.)

Issues:

Major: 5
Minor: 6
Nits:  2

1) MAJOR: Unclear model for API function usage

As I read through sections 2.3.2 through 2.3.6 these I got progressively 
more confused. I finally concluded that a big picture of how API 
functions interact is lacking.

For one thing, there isn't a clear delineation between those calls used 
by requesters and those used by responders. And then valid sequencing of 
calls is hard to tease out.

It would be helpful to have one state model for requesters and another 
for responders. It may also be helpful to break this out for threaded 
and event loop variants.

My subsequent issues regarding these sections are reflective of this 
confusion.

2) MAJOR: What state does GRASP retain for Objectives?

It is clear that the GRASP must retain the names of registered 
objectives. There are implications that it must keep more. Seemingly it 
must retain received flooded objectives, on a per-source basis. It is 
unclear if it is only for registered objectives, or also for 
unregistered objectives. This could potentially become a large resource 
drain if there are lots of nodes flooding values.

Negotiated objectives seem to be treated differently. It isn't clear to 
me if GRASP needs to retain copies of their values.

3) MAJOR: Consistency of Objective definitions

In section 2.3.1.2 and elsewhere, presumably all parties that use a 
particular objective must agree on the values of synch, neg, dry, and 
the format of the value.

Apparently you are relying on each caller getting this right. What 
happens when this isn't the case? How can blame be ascribed so that the 
problem can be fixed?

4) MAJOR: Confusing semantics of 'request_negotiate'

In section 2.3.4 I don't understand the following:

          1.  The 'session_nonce' parameter is null.  In this case the
              negotiation has succeeded immediately (the peer has
              accepted the request).  The returned 'proffered_objective'
              contains the value accepted by the peer.

IIUC this requires a network exchange with the peer. I don't see how 
this can complete *immediately*. ISTM that this could only complete 
immediately if it were satisfied from a local cache. That doesn't seem 
appropriate for this function.

Similarly, in bullet 2 I don't see how the proffered_objective would be 
available in the initial call.

Or is this text only applicable to a threaded implementation with 
synchronous calls?

Bullet 2 also says:

              ... The
              returned 'proffered_objective' contains the first value
              proffered by the negotiation peer.  The contents of this
              instance of the objective must be used in the subsequent
              negotiation call because it contains the updated loop
              count, sent by the negotiation peer.

Do you mean this must be used in the call to negotiate_step? But that 
would mean asking for what has already been granted. For the the 
negotiation to be useful I would expect that the next round would ask 
for a value somewhere between what was originally requested and what was 
initially offered.

I think you need to say more about the whole concept of negotiation and 
how it is expected to work. Also, additional discussion of the purpose 
and semantics of dry run negotiation.

One more thing: you don't explain the semantics of 'timeout'. Is it only 
used locally, to terminate the synchronous form of the call? Or is it 
transmitted and used by the responder to control how long it might spend 
trying to fulfill the request before giving up?

5) MAJOR: Confusing semantics of 'negotiate_step'

In section 2.3.4, I'm confused by:

             Called in the same thread as the
             preceding 'request_negotiate' or 'listen_negotiate'

I understand use of this following request_negotiate, but not with 
listen_negotiate. A negotiation should consist of an ordered sequence of 
"rounds" of bidding. Allowing both requester and responder to initiate a 
next step can lead to a disruption of the ordering. I would expect this 
to only be used by the caller of request_negotiate. What am I missing?

6) MINOR(MAJOR?): Confusing semantics of 'negotiate_wait'

Regarding section 2.3.4: I understand that the GRASP Confirm Waiting 
message (from [I-D.ietf-anima-grasp]) and hence 'negotiate_wait' is only 
used by the responder. How is the effect manifested in the requester? Is 
a synchronous requester just silently extended? What about an event loop 
requester?

7) MINOR(MAJOR?): Role of 'peer' in 'synchronize'

The description of 'synchronize' (in section 2.3.5) says:

       -  If the objective was already flooded, the flooded value is
          returned immediately in the 'result' parameter.  In this case,
          the 'peer' and 'timeout' are ignored.

Do you really want to ignore 'peer' in this case?

I infer from the description of get_flood that the GRASP retains flooded 
values separately for each flood source. So you should have the 
capability to return only a value flooded by the requested peer.

8) MINOR(MAJOR?): Confusing semantics of 'flood'

The descriptions of 'flood' in section 2.3.5 seem to imply that the 
GRASP in each of the nodes receiving the flood saves the value, for use 
in get_flood and synchronize. Is that right? (Actually, the description 
of get_flood implies that flooded values are saved separately for every 
flood source!)

Are only registered objectives saved? Or must the GRASP save all the 
values from every flood that it receives?

What if one flood contains X, Y, and Z. Then a later flood from the same 
source contains only X and Y. Presumably the new values of X and Y 
replace the old ones. Does GRASP retain the value of Z, or discard it?

ISTM (based on my limited understanding) that it would make more sense 
for the GRASP to simply cache all objectives that have been registered 
and/or received as a result of synchronization, together with their TTL. 
And then subsequent requests would be satisfied from the cache as long 
as the TTL hasn't expired.

9) MINOR: Clarify Session

Section 2.2.3 implies that Session is a full fledged entity. It would be 
helpful to flesh this out in more detail. E.g., describe it as a 
distinct entity in the API and include a state model.

10) MINOR: Representing mutually exclusive items with booleans

In section 2.3.1.3, 'synch' and 'neg' are mutually exclusive. By 
representing them as independent booleans you introduce the possibility 
of invalidly setting them both or neither. ISTM it would be better to 
represent these as a single value (enumeration).

A similar situation occurs in section 2.3.1.4.

11) MINOR: Managing thread for 'listen_negotiate'

In section 2.3.4, suppose the ASA wants try to ensure that it always has 
a listener ready for an incoming request. IIUC the only way to achieve 
that is to create the maximum number of threads it is willing to devote 
and set them all to listening. That is sub-optimal. It would be better 
if it could somehow learn that another one is needed and create it. Some 
additional API mechanisms would be needed to support that.

12) NIT: Idempotence

Bullet #2 of section 2.2.1 says:

        To facilitate this, the API implementation would provide non-
        blocking versions of all the functions that otherwise involve
        blocking and queueing.  In these calls, a 'noReply' code will be
        returned by each call instead of blocking, until such time as the
        event for which it is waiting (or a failure) has occurred.  Thus,
        for example, discover() would return 'noReply' instead of waiting
        until discovery has succeeded or timed out.  The discover() call
        would be repeated in every cycle of the main loop until it
        completes.  Effectively, it becomes a polling call.

This seems to contradict the earlier statement (in 2.2) that the calls 
are not idempotent. Section 2.2.2 tries to clarify this, but it is not 
very clear.

13) NIT: Confusing terminology in 'discover'

In the description of 'discover()' (in section 2.3.3) the 'age_limit' 
parameter is confusing. At the least the name seems wrong for the 
described semantics. The implication is that an expiration date/time is 
what GRASP stores, and that remaining time until expiration is what is 
being used here. It is hard to see how it is appropriate to call that an 
"age". ISTM that would better be called a TTL, and that is parameter 
might then be called 'minimum_TTL'.