ForCES Working Group, Tuesday, July 31, 2012
CHAIR: Jamal Hadi Salim
Minutes: Spyros Denazis

1) Meeting started with the chair presenting the agenda and pointing out details to remote users.

2) Chair presented WG status.
* The WG has a mandate from the ADs to allow new work.
  - Chair encourages any new work as long as it relates to ForCES.
* Since the last meeting the two core documents have undergone some small changes, but the chair wants to move quickly to publication. The interop draft is low priority; once these two documents are published, the move to publish the interop draft will be made.

3) LFB Lib presentation (presented by Jamal on behalf of Weiming and authors)
Since the last meeting, v9 of LFB Lib was published.
* Small editorials and typo fixes. Emphasized that Redirect's metadata should be clearly stated as IANA controlled.
* From a technical side:
  ** A few components are now marked as optional.
  ** Added statistics for Redirect In/Out LFBs.
* Open questions: Ed's response is one. Ed was present in the room and promised to send responses RSN.
* Mistake: etherType was 32 bits, but it is a 16-bit field. Fixed, but this leaves a hole of 16 bits. Attendees were polled on whether a reserved field should be added.
Sue: Go with a reserved field.
Consensus to go with a reserved field.

4) CEHA v3 published in July (presented by Jamal on behalf of Kentaro and authors)
* Earlier version had made changes to backupCEs. New version has reverted backupCEs and added AllCEs. Implementation experience suggested it is needed for backward compatibility.
* New component: HAmode, which selects the HA mode to be used.
* New component: AcceptBackupGets flag, to allow queries from backup CEs.
* MaximumCEAssociation capability: how many backup CEs connect to the master CE (will be removed).
Open Question 1: AcceptBackupGets allows other CEs to do queries. A possibly faulty CE (a malicious CE cannot join an NE) that issues a lot of Gets could deny service to the master CE.
Joel: The spec says backups aren't allowed Gets.
Jamal: True, but they get events, since the master has already subscribed to events.
Joel: If you allow it, you'll need to ensure that the master has precedence - too much complexity. That's my feeling though, not a thorough analysis.
Ed: Backups should be able to do Gets. Issues of rate limiting and protecting yourself from extraneous traffic from a misbehaving CE should probably be mentioned.
Jamal: We'll take the issue to the list.

5) Redirect challenges - Jamal Hadi Salim
An application using ForCES must scale to very high volumes of users. Packets are redirected to the CE. In CEHA we say all redirects go to the master CE; the problem is that the NE can then scale only as far as the master CE can handle redirects. Consider an app where 30-40% of traffic involves redirects. A redirect creates a policy update and one or multiple configs of >1 FE.
A single CE works well. If you watch the graph of how much it can handle, you'll see a linear curve up to the max loss-free rate (MLFR), after which it stays constant. The SCTP-TML works well: regardless of how many redirects come in, and even once the max capacity for redirects is hit, configs still work. On a busy CE, the redirects are dropped, the end user retransmits, and things continue to work.
Real-world problem: want to handle a million users in one NE (past the MLFR). In other words, want to scale horizontally by adding another CE to handle redirects.
First solution: modify the Redirect LFB from the LFB-lib document. This will introduce metadata that directs redirects to specific CEs. The master CE will configure a table of which backup CEs get which redirects. A consequence of such a change is that it will delay publication of both LFBlib and CEHA.
Joel: This is problematic. If you allow redirects to backup CEs, they would be able to send redirects back to the FE; that's not currently allowed. Maybe that should be in the rules, as it's no big deal. Suggestion: use the Redirect LFB to send to other FEs.
Jamal: Problem is multiple FEs may need to be updated.
Sending to an FE rather than a CE would require that FE to update those other FEs.
Joel: Shouldn't send it to backup CEs, as they don't have state.
Jamal: Assuming those CEs talk to each other (yes, that is out of scope for us).
Joel: It is valid to have the CE be multiple boxes, one of which essentially works as a load balancer. That's all invisible to the protocol as long as it behaves regularly to the protocol (on the wire). If you implement a multi-processor CE by doing that outside of the FE, it's fully compliant. It depends on additional behaviour we haven't specified. I get really nervous when we change things so that they can't work without changes in the spec.
Jamal: We are not changing the protocol or the spec, only one LFB that has not been published. Solution 2: we write a proprietary LFB, or derive from it and publish it as a draft later.
Joel: Suppose you define a way to send packets with metadata through an interface. Conceptually there is the link to the CE over which you send to the controller. If you have links to some other FE, in the same or a different NE, which can carry metadata, fine. If you want to define a packet format that carries metadata, I'm happy to allow it to be sent to FEs not handled by the same CE, because interaction control does not cause a problem there, since metadata IDs are consistent. That is a way to get there, but not quickly.
Jamal: That brings back the same issue mentioned earlier: a redirect sent to another FE results in that FE updating other FEs. Let me put the last possible solution. Consider how we handle events today in CEHA. It is an FEM config option; broadcasting all events to all is done at the TML level, totally transparent to any LFB. We could make a config option: if you see metadata x, send to CE 4. That is a load balancer at the FE level, and we would write a draft.
Joel: I can't stop you doing it, but I don't think it belongs in the spec.
Jamal: Your solution is having a proxy CE.
The proxy will be overloaded as well at some point, even if it improves the MLFR.
Joel: There is a limit on the number of connected FEs. A very dumb proxy can forward as fast as possible. You can build a distributor to handle it. There is always a scale where things break.
Jamal: I'll take it back, and you'll probably see us at the next meeting with numbers.
Adrian Farrel: Don't wait for the next meeting. Bring up the discussion sooner.
Jamal: On the list. Probably a draft.

6) FEM LFB (Jamal Hadi Salim)
Jamal has seen a lot of comments that ForCES is only useful for configuring the datapath (including, recently, in the IRS problem statement). He claimed that ForCES can be used to configure any runtime attributes, something ForCES was not originally intended for but which came as a consequence of the model and the agnostic protocol.
He demonstrated the FEM config that is used to bootstrap the FE in the form of a config file. He then showed how the config is modelled as an LFB in their implementation, allowing runtime queries and reconfigurations. Some of the attributes he showed were debug level changes (claimed to be his favorite), syslog levels, etc.
He then compared against NETCONF as used in OF and pointed out that there is no need for a second "config" protocol; ForCES is used in both cases, simplifying the deployment.

7) OpenFlow ForCES LFB (Evangelos Haleplidis)
Motivation: to demonstrate the ability to describe OpenFlow via the ForCES model and create an implementable OpenFlow switch using the ForCES architecture. Additionally, to facilitate building applications to create either an OpenFlow-enabled ForCES switch or a ForCES-enabled OpenFlow switch.
Status: First version on 25th of May; lots of comments and suggestions, especially from David Meyer and Zoltan Lajos Kis, thank you. New version on 9th of July.
* In draft 00, each Flow Table in an OF switch was modelled as an instance of the OFFlowTable. It was hard to illustrate and looked very complex.
Also the OFActionSet was empty, although that is mostly an implementation issue. Additionally there were some misconceptions about the OF spec.
* Illustration of the modelling in the second draft:
  - Put all FlowTables in one OFFlowTables instance. Much clearer graph visualization.
  - Redirect LFBs derived from the original Redirect In/Out from lfb-library.
  - The OFSwitch LFB contains necessary info regarding the switch.
  - The action LFBs are separate classes.
  - Evangelos then described how a packet would flow when entering the OFFlowTables.
* Summary of changes from 00:
  - All Flow Table instances merged into one.
  - Makes metadata and actionset metadata invisible.
  - Decided to remove the Action Set LFB. The action setting would occur inside the OFFlowTables and is an implementation issue.
  - Buffering happens in the OFFlowTables; also an implementation issue. Sometimes, instead of sending packets to the controller, they are buffered and a buffer ID is sent.
  - Additionally there was a need to extend Redirect In and Redirect Out. We could possibly merge both into one LFB, one point to the controller. Logically the buffering would be done there.
Comments?
Ed Crabbe: Keep them split. Question: if you put all tables in a single box and remove the action set table, how do you maintain the action set?
Evangelos: Implementation issue. There is an array of write actions and it can be configured, but how the action set metadata list is stored and executed is implementation specific.
Ed Crabbe: How they are executed is important to the OpenFlow spec. My concern is that you create a box inside OpenFlow, call it an implementation detail, and put stuff around it.
Joel: Let's separate. There are two pieces of a protocol: the information exchange and the semantics of what you do. This model is the information exchange. By folding it into one box, the details of how you make use of it are in the description clause of the box. What they're modelling here is the information exchange that goes with it.
And that is what ForCES is concerned with: things you want to control remotely. ForCES is not for describing behaviour.
Ed: I'm not suggesting we model OSPF. There is overlap; by including it like this you muddy the waters. Let's take it offline.
Jamal: Yes, preferably on the list.
Evangelos: We'll keep them separate. This version also correctly positions the QueueLFBs. However, the PacketID is opaque to the ActionLFBs, which do not consume it but rather return it to the LFB that called them. Therefore we were considering whether the PacketID is actually an implementation issue and not required to be modelled.
Sue: Folds into what Ed said.
Evangelos: OK, we keep it.
Evangelos illustrates the OFSwitch LFB.
Jamal: OFSwitch has no input/output ports.
Evangelos: Yes.
Jamal: Why are port events here then?
Evangelos: The OFSwitch has info about ports, so the event occurs from here.
Jamal: Wondering why that is not in the port LFB.
Evangelos: OFSwitch has config information about the switch.
Slide on OFFlowTables: each table row has an array of flow entries, a flow counter and a miss behaviour definition.
Ed: You maintain arrays. There is a whole bunch of text that alludes to processing progressing serially through the array. At least that is what the text seems to say. There are problems with the text.
Evangelos: We can fix that.
Jamal: Send comments, Ed.
Slide on Group Table LFB: each group table entry is an array of GroupID, GroupBucketType, GroupCounters and ActionBuckets.
Slides on Action LFB: the Action LFB is a template for the action LFBs. We derive all action LFBs from this LFB. For example, the set IP source LFB has only one component, an array of IPv4 addresses, and the input/output ports are described in the Action LFB.
Next activity on this document:
- Finalize the XML of the library after comments and finalize the text.
- Continue with adding OF 1.0, 1.2 and 1.3.
Jamal: Are you going to put all in one document?
Evangelos: Open to discuss.
Jamal: This exercise may prove forward/backward compatibility.
Is it worth the effort to go back to 1.0?
Ed: I don't think so. No need to. You say you have a sufficiently expressive meta-language that you can make them forward/backward compatible. 1.0 is then effectively covered.
Jamal: That's a useful exercise; we want to be pragmatic.
Ed: I don't think there is a need to model the complete set.
Jamal: If the draft goes in 1.2 or 1.3 it should be sufficient.

8) Susan Hares: OF vs ForCES comparison
This question may be philosophical; sometimes two protocols that look the same can teach things: how OF relates to ForCES, and how the implementations look to each other. We got some really good feedback last time. How does this help us get to devices that work in the cloud, or in strong-cloud services, etc.? Something that got me thinking is that, while at the ONF trying to learn and contribute, I reached the conclusion that the ONF is constantly repackaging existing ForCES technology. In this presentation, I try to rise above all the buzz and do a solid comparison.
Starting with the historical context: ForCES work originated with NPs, to create OEM commodity software. The NPF had a common API. When it came to defining a protocol, the NPF moved things to the IETF for standardization, because they believed the IETF has a strong protocol base. OF started with researchers looking at a large-scale NG testbed, and companies like Google picked it up and worked on it.
* In OF 1.x, a bunch of controllers work with a bunch of switches.
  - In ForCES, a bunch of controllers (CEs) are managed by a CE manager, and a bunch of FEs (equivalent to switches) are managed by an FE manager.
* In ForCES, if you have new flow logic, we define one or more LFBs using the ForCES modelling language.
  - OF 1.0 is pretty static; in the later ones a packet goes to one flow, a drop, or a table miss. In 1.4 the table miss has some changes. Ed?
Ed: Couple of discrepancies. 1.0 has one table modelled as a TCAM, with everything in it.
Should have said it's a graph in 1.1.
Susan: Went to the group table and metadata and both. Using modelling, can we do the same things?
* The ships-in-the-night (hybrid) approach in OF builds on the lack of expressibility of other things. I don't think it changes the basic issue that OF is lacking this capability.
  - On the ONF's future list, they thought that 1.1 is too static, and there is discussion to get modelling equivalent to LFBs.
  - Protocol security: ForCES goes with IPsec, whereas OF uses SSL. It is easy to add SSL to ForCES since the transport is independent.
  - Protocol semantics: ForCES separates configuration, events and packet exceptions; in OF they are part of the protocol.
  - Configuration focus: OF focuses on table(s) configuration, whereas ForCES focuses on LFB configuration (which may constitute tables or scalar attributes).
Trying to say how to look at SDN. Can we do some modelling? My group at Huawei looks at that and tries to compare. Question: how does it work for implementation? We have an implementation of OpenFlow in some of our routers.
Conclusions: both ForCES and OF follow the basic idea of separating the forwarding plane and the control plane in network elements. Both are capable of operating with centralized control or distributed control. From an SDN perspective, where SDN is more than flow control, ForCES is more mature and ready. It can be used either to control all components of an NE/switch (as an FE) or to run in hybrid mode. ONF is re-inventing the equivalent of ForCES. It seems this re-invention is driven by code snippets from Google. So I think an implementation is needed that is accessible to both academic and commercial users. As we work through Evangelos's draft it should become clear that we can implement OF with ForCES. I hope an available ForCES implementation will do the same for us. Here's what I would like to do: try an implementation, try multiple controllers, multiple scenarios.
On this draft: I am asking for adoption. If it is too drafty, let's speed it up; I want to continue working on it.
Ed: Needs a little work.
Susan: I agree.
Jamal: It is outside the charter, even if it were ready today. Comments?
Adrian: Given that the ONF is looking at some of the overlap as well, should we try to coordinate work? Should we encourage them to look at ForCES as filling their holes?
Susan: First let's have experts on both sides. Get the conversation started, and I hope it continues. If valuable, it goes both ways. Yes, good idea.
Adrian: On the charter thing: as the WG is close to 10 years old and major publications are completing, Jamal and I have been having discussions on shutting down the working group for some time. The Paris WG meeting had lots of momentum; since Paris, not much on the mailing list. I will be looking for evidence in the next months of people trying to work on ForCES: new ideas posted to the mailing list, activity on the mailing list. And from there we'll decide how to proceed.
Jamal: To clarify to the WG, the drafts or discussions can go outside the current charter.
Adrian: Yes. We'll see if people want to do work, to see if we want to recharter. Threat: the opposite, finding no such motivation, will mean closing the WG.
Jamal: Makes sense.
Ed: An open source implementation is a very good start. Could be cause for zero inertia.
Jamal: An implementation exists, but it is not open source. And thanks for putting me on the spot, Ed.
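
Illustration for item 5 (not presented at the meeting): the FE-level option Jamal described, "if you see metadata x send to CE 4", amounts to a lookup table that the master CE configures and the FE consults for each redirect. The Python sketch below is one minimal way to picture it. The class name, method names and numeric IDs are hypothetical and are not taken from any ForCES draft or implementation; the fallback to the master CE is an assumption chosen to match the CEHA rule that all redirects go to the master.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RedirectDispatcher:
    """Hypothetical FE-side table mapping redirect metadata to a CE ID."""

    master_ce: int
    # metadata value -> CE ID; in this sketch the master CE fills the table.
    table: Dict[int, int] = field(default_factory=dict)

    def configure(self, metadata: int, ce_id: int) -> None:
        """Record that redirects carrying `metadata` go to `ce_id`."""
        self.table[metadata] = ce_id

    def select_ce(self, metadata: int) -> int:
        """Return the CE that should receive a redirect with `metadata`."""
        # Assumption: unmapped metadata falls back to the master CE.
        return self.table.get(metadata, self.master_ce)


dispatcher = RedirectDispatcher(master_ce=1)
dispatcher.configure(metadata=0x10, ce_id=4)
print(dispatcher.select_ce(0x10))  # mapped metadata: goes to CE 4
print(dispatcher.select_ce(0x20))  # unmapped metadata: goes to master CE 1
```

A table like this only spreads the redirect load; it does not address Joel's point that backup CEs hold no state, so the CEs would still need an out-of-scope way to share state with each other.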