Minutes IETF114: tcpm
minutes-114-tcpm-01

Meeting Minutes TCP Maintenance and Minor Extensions (tcpm) WG
Date and time 2022-07-29 14:00
Title Minutes IETF114: tcpm
State Active
Last updated 2022-08-11

TCPM meeting, IETF-114, Philadelphia

Friday, July 29, 2022 10:00-12:00 UTC-4 (120 mins)

Agenda

WG Status updates

Time: 15 mins 15/120

Martin: Errata of AO-test vectors. The least work for everyone is to
file a single erratum for all. Tell the RFC Editor to deal with them.
Martin: Retracting that comment: leave as is, 16 individual errata. Will
ask the RFC Editor how to handle this after this WG meeting.
Martin: Prioritize AccECN, because the L4S stuff is going through.
Awkward if AccECN is significantly behind it.
Michael: Which to deprioritize?
Martin: Shepherds, please work on AccECN before other docs if working on
multiple.
Michael: Nothing should block AccECN, can start WGLC.
Bob: Yes, ask to get WGLC on generalized ECN.
Martin: Also part of L4S. If ECN doc in the queue please give it
priority, so that process things can go forward.

Working Group Items

Bob: Mentioned 5: Also mention other problems with beta 0.5, like more
queue variation
Yoshi: Issue 1: I think this is an issue with slow start, not CUBIC.
It is just more visible in the case of CUBIC. So, it should be fixed in
slow start, not CUBIC. Not sure we need to address this in the draft.
Rodney: Is Hystart++ used with CUBIC?
Vidhi: Yes.
Stuart: Largely documenting what is currently running. Fine to suggest
CUBIC 2.0 and discuss in IETF later. An alternative beta would take
years of study to find out how it behaves at large scale.
Yoshi: From my point of view, the conclusion for this discussion is that
it's not an easy issue. We cannot settle this here - we need detailed
analysis. But should we hold the publication for it? I am not sure if
this is what we want. I think creating a new version of CUBIC later is
totally fine, but I am not sure about doing it in this draft. The
consensus was to publish as PS - no proposition to change. This draft is
not a threat to the Internet. No one is arguing this. I just would like
to check if people want to wait for a detailed analysis, or publish.
Vidhi: These are more research questions, and take a lot of time.
Gorry: I don't like the approach to standardising this, it should have
been done differently. We need to note this important discussion in the
spec. We really should specify what has been deployed.
Martin: Point 5 thing:
Yoshi: issue 1, Reno fairness, and beta.
Martin: Case this is too complicated. Issue 2: 0.7 is what is deployed,
0.5 is research, not an issue.
Michael: Spend some time on discussing the issues, if this is a mistake
in the specs. Something like 0.7 vs 0.5: happy documenting that people
are using 0.7, add discussion around 0.5
Richard: I don't see the value in studying 0.5, or whether the deployed
implementations would change anyway just because we change the spec.
Stuart: Strange conversation: CUBIC is the dominant CC, no RFC has been
specified yet. There is no point in documenting another fictional CC
that no one is using. Write down what is in use.
Bob: Switching to the other issue: My model is for tail drop, AQM is
more difficult. My testing when presenting L4S gave these results.
Vidhi: Those results are in the ICCRG slides. With 0.7, slow start may
take two rounds to get below the bottleneck. In CA - with 0.7 you don't
need as deep buffers as with NewReno. This is just about pros/cons. We
cannot look at only one side. Would need some research.
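
The "two rounds" remark can be illustrated with a simplified model. This
is an assumption made for illustration, not something presented in the
meeting: path capacity (BDP plus buffer) is normalized to 1.0, slow start
is assumed to overshoot to about twice that before loss is detected, and
one multiplicative decrease by beta is applied per round. The function
name and the 2x overshoot figure are hypothetical.

```python
def reductions_needed(overshoot: float, beta: float) -> int:
    """Count multiplicative decreases until cwnd is at or below capacity.

    overshoot: cwnd at loss detection, relative to capacity (capacity = 1.0).
    beta: multiplicative decrease factor (cwnd becomes cwnd * beta per loss round).
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must be between 0 and 1 (exclusive)")
    cwnd = overshoot
    rounds = 0
    # Keep reducing while the window still exceeds what the path can hold.
    while cwnd > 1.0:
        cwnd *= beta
        rounds += 1
    return rounds


if __name__ == "__main__":
    for beta in (0.7, 0.5):
        print(f"beta={beta}: {reductions_needed(2.0, beta)} reduction(s)")
```

Under these assumptions beta = 0.7 needs two reductions (2.0 -> 1.4 ->
0.98), while beta = 0.5 needs one (2.0 -> 1.0). Real behavior depends on
the queue discipline, pacing, and loss detection, which this sketch
ignores.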
Neal: To the remark that no one would change beta in practice, I
disagree with this. In the future, I could see someone changing that
after sufficient research. Question on the tests of Reno vs CUBIC -
weren't those run by themselves, and not interacting in the same queue?
Issue 1 is only expected with CUBIC and Reno in a drop-tail queue. More
testing should be put off to a later point; I agree with the room. In CA
- there is likely an issue here as well, but we need long-term research
and should not change the RFC.
Michael: Please document what is out there. If there is a discussion
about what may be better, document this.
Bob: Re Neal's comment - there were results for the Prague CC. CUBIC and
CUBIC vs. Reno. Not just one per queue.
Martin: I see the tension between what is out there vs. an ideal. I
think the right way to think about this - if we reach fast rough
consensus on something, we maybe should include that fix: If we find
this was a bug, we should be able to deploy this. But not make writing
this spec a long research project.
Yoshi (as individual): If we document current discussion points such as
0.5 or 0.7, it might be better to create another draft, not in this
draft.
Yoshi: From these comments, I believe the current consensus is still to
publish it as a PS. Any objections on going forward with this, please
speak up now...

Yoshi: Sent comments a few months ago, some points got addressed. Please
address the remaining points.
Richard: Happy to see this work progressing. Let's move on this quickly
to be able to follow up with an improved implementation.

Martin: Early allocation is a good idea. Good to know if that gets eaten
in the Internet. Did you reach out to the Linux kernel maintainers to
get the code adopted into the Linux mainline? It would be good if Linux
servers supported it.
Bob: Patch set for netdev community. They wait for IETF / RFC? Neal is
actively pushing on this too.
Martin: Will ask Neal.
Bob: Unclear which stage of IETF document status the Linux maintainers
would wait for.
Yuchung: We should get a real option as soon as possible and stop the
experimental one as soon as possible. Speaking from TFO experience.
Generalized ECN: Is this also good for DCTCP cases?
Michael: AD, what is the procedure on early assignment?
Martin: I have to give my approval.
Richard: I informally notified IANA about an upcoming early assignment
request.
Martin: Chairs, please send me a note, and I will approve it.
Yoshi: I see this draft is ready for WGLC.
Bob: Discussion is finished, I believe.
Yoshi: Confirm that discussions have finished.
Yuchung: Double check that AccECN has a working implementation?
Bob: I believe it does, and it should. Testing it yourself is the best
answer, though.
Yuchung: TSO was my concern.
Gorry: Bob solved the issue we discussed.
Neal: Re Yuchung: offload support - The question is whether TSO HW may
be doing the wrong thing with the ACE field.
Bob: When TSO is in use, it is handled in the Linux stack. With
hardware, we haven't had these discussions yet. Need input from HW TSO
NIC driver folks to address concerns in this area.

Michael: You mentioned you changed the examples. Have you checked the
new examples work properly?
Mahesh: Yes, we took the examples directly and validated them.

Other Items

  • NG TCP Yang Model
    Speaker: Gyan Mishra
    Time: 15 mins 90/120

Bob: Just to point out: all TCP options is probably excessive. There are
a lot that aren't pertinent. Parsing the list to see which are pertinent
may be the way to go.
Martin: Use case 1 - confused by the example. Both sides are deadlocked?
Is this a case where we are just waiting for an ACK?
Gyan: Router A's MGMT plane is hung, so it is not able to write to its
buffer. Thus not able to process. Best thing would be a TCP RST or a
session close - converge after this. In this congested state, MGMT plane
is hung.
Martin: Problem is: A is hung, and B is not realizing.
Gyan: As long as the session is up, it takes a long time.
Martin: You want to use B to ascertain the state of A - makes sense.
Also, these are not generic TCP implementations, but routers.
Gyan: True.
Martin: Maybe this is a bit of overkill. But I understand the use case
better, thanks.
Michael: TCP can be up even if the application cannot process. I don't
know too much about YANG - the requests here are for a lot of
implementation-specific values. Continuous monitoring seems difficult in
a generic standard.
Jeffrey: This is a giant list to work through. This is not to implement
tcpdump in YANG. Snapshotting live state on each packet is not the goal.
Many things may cause this to happen. The majority are bugs or unusual
circumstances. In the BGP case, the client is waiting to get out of this
situation. The challenge is how to troubleshoot the situation. For
troubleshooting BGP remotely, we need the status of the session. Zero
window is a common thing. If I see this for a longer period, I can take
action. Most of this is available via CLI, this is to make this more
manageable. None of these are unusual things.
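
The "zero window for a longer period" check described above could be
sketched as follows. This is only an illustration under assumptions not
stated in the meeting: the monitor is assumed to poll the peer's
advertised receive window periodically, and the function name, the
sample format, and the threshold are hypothetical rather than part of
any YANG model.

```python
from typing import Iterable, Optional, Tuple


def zero_window_stuck_since(
    samples: Iterable[Tuple[float, int]], threshold_s: float
) -> Optional[float]:
    """Return the start time of an ongoing zero-window period, if long enough.

    samples: (timestamp_seconds, advertised_window_bytes) in time order.
    threshold_s: how long the window must stay at zero before reporting.
    Returns None if the latest zero-window period is shorter than the
    threshold or the window is currently open.
    """
    start: Optional[float] = None
    last_ts: Optional[float] = None
    for ts, window in samples:
        if window == 0:
            if start is None:
                start = ts  # first sample of a new zero-window period
        else:
            start = None  # window reopened, so the period is over
        last_ts = ts
    if start is not None and last_ts is not None and last_ts - start >= threshold_s:
        return start
    return None


if __name__ == "__main__":
    polled = [(0.0, 65535), (10.0, 0), (20.0, 0), (70.0, 0)]
    print(zero_window_stuck_since(polled, threshold_s=60.0))
```

An operator tool could use the returned timestamp to decide whether to
act, for example by resetting the session as suggested in the
discussion.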
Mahesh: Trying to keep track of live state doesn't make sense as a
replacement to tcpdump. Stuck connections, zero window would be helpful
though, good to have in the model.
Gyan: Exactly, getting a report to act on would be great. E.g., reset
the session.
Martin: The BGP community is using YANG - doing more work for them is
fine. A principle is not to put all of TCP into the YANG model, as the
MIB experience was terrible. Try to have as small a YANG model as
possible, which relates to actionable stuff.
Jeffrey: Don't disagree. The choice is to do enough of the work once in
the TCPM WG, or risk missing an important thing, and forking all vendors
to proprietary models.

Gorry: Flexibility here is not a good thing. One ACK per 100 is not a
big value. Why go to 1000, even on big links?
Carles: I wonder how to make a good future-proof trade-off here.
Bob: The format of this is very close to the AckCC draft we already
have. Maybe we have converged on something we already have. That gives
you 256 as your number. Perhaps see if there are things in that draft
that preclude your use case.
Carles: Please send the info for the draft.
Yoshi: Use the previous style and use a reserved bit for big values.
Then, you can save 1 byte in option space.
Michael: They are all even length.
Richard: Do we really need the high length? Padding is done, but could
be omitted to pack more different options.
Jonathan: Question is whether the value of R is harmful.
Michael: Clarification: This was a student project to get this into the
FreeBSD stack.