Application Performance Metrics BOF
Wednesday, June 25, Afternoon Session II
BOF Chairs: Alan Clark <alan@telchemy.com>, Al Morton <acmorton@att.com>

This report of the APM BOF is divided into two parts:
- Summary
- Detailed Notes of the meeting

Summary

The Application Performance Metrics (APM) BOF was attended by at least 93 people, many of whom offered their opinions at the microphone. The BOF introduced the IETF community to the possibility of a generalized activity to define standardized performance metrics. The existence of a growing list of Internet-Drafts on performance metrics supports the need for this activity, and the majority of people present supported the proposition that the IETF should be working in these areas. No one objected to any of these proposals. However, many people from beyond the performance community sought clarifications about the scope of the activity. The proposal was deliberately made broad to cover the wide variety of performance work that is currently excluded by the IPPM and BMWG charters.

When it came to the solution directions, there were about 25 people willing to work in some kind of working group (these appeared to be mostly people from the performance community, who did not seek clarifications at the microphone). Two people did not want to see a working group formed. The WG supporters were roughly split on the question of whether it should be long-lived or short-lived. The Directorate approach had one or two supporters.

We can conclude that there is strong support from the IETF community to do something on this topic, but the direction to take is less clear. Discussion will continue on the pmol@ietf.org mailing list.

Detailed Notes

0. Preliminary Admin and Agenda Bashing

The APM BOF opened at 3:15 PM CDT in the Crystal room. Al Morton and Alan Clark co-chaired. Tom Alexander was the note taker. Al Morton said that today we would talk about performance metrics that do not fit within the charters of the existing working groups.
There was no interest in a Jabber session, since this was a BoF and interest was generally limited to those attending in person. Al requested everyone to sign the blue sheets. He then reviewed the agenda and requested bashing. No bashing was forthcoming, so he proceeded with the agenda as presented.

1. Introduction: Problem Statement and Goals

Al Morton started off with the problem statement and the goal. He noted that just about everyone in the room had some experience with the problem and that it was a difficult one. Performance metric drafts tended to get less attention, and he had some examples of that. Currently no WG had focused on the APM aspects; our goal was therefore to explore the need for a new WG and/or directorate, and also to capture the consensus on direction.

John Klensin commented that he did not see a clear statement about what this is good for, other than being interesting. Al said that once we got into the details of the existing drafts this would be answered. The general proposal was for an APM Directorate and an APM WG. The output could be a BCP or a framework plus RFCs.

Alan Clark discussed some of the constraints on the proposed APM WG. One was the need to develop in cooperation with relevant WGs, because there were many WGs working on topics that affected application performance metrics. There was also the concept of cooperating with other groups outside the IETF.

Dave Oran said that there were two dimensions. First, we need to coordinate with organizations that understand applications really well. Second, there could be organizations overlapping with what we would be doing in an APM WG. We need somehow to put things into a metrics framework, and we could end up devising a framework that cannot deal with the models that different applications have.

Alan Clark responded to the comments.
He said that the idea was not to duplicate the kinds of work done within groups such as the ITU; within the IETF, though, there was a need to develop some kind of framework for measurement methodologies relating to protocols developed within the IETF.

Dan Romascanu said that there are framework issues for measurements, and a methodology for defining metrics that is specific to the IETF and reflects the IP specification and related protocols. We should be able to reference the metrics from other standards organizations whenever we need to, or when there is overlap. There may be perfect matches, in which case we can reference directly, or there may be deviations, where we could address just the changes.

2. Examination of a perceived gap in IETF coverage

Existing Drafts (< 2-minute, 1-slide overview of each):
- draft-malas-performance-metrics-07.txt
- draft-venna-ippm-app-loss-metrics-00.txt
- draft-ietf-avt-rtcpxr-video-01.txt
- draft-ietf-avt-rtcpxr-audio-00.txt
- draft-ietf-avt-rtcphr-01.txt
- draft-xie-ccamp-lsp-dppm-01.txt
- draft-kikuchi-passive-measure-00.txt

Al Morton listed some of the drafts that had been circulated within the IETF - in some cases for many revisions - relating to application performance metrics. He asked the authors of the drafts to come forward and speak to how well the current IETF WGs handle performance metrics. He cited Daryl Malas's draft on SIP end-to-end performance metrics as a "poster child" for this application performance metrics effort.

Daryl Malas spoke on the motives and progress of the SIP performance metrics effort. He said that the original purpose was to solve several issues, starting with the fact that many people were using SIP but there was no consensus on how to measure its performance. Further, there was industry confusion on "how to measure" - which messages should be measured, and what information really mattered?
Finally, the biggest issue was that different people were relying on a set of legacy metrics defined by the ITU-T for the PSTN and trying to mold them to SIP, but they did not work very well. As a consequence, the SIP performance metrics drafts came into being. The current drafts list about 12 metrics.

Steve Norris said that the architecture has an impact on the metrics; did he therefore have an architecture in mind? For example, what is the router layout, how are the user agents modeled, and so on. Daryl Malas said that there was no intent to go into the details, but these aspects were covered.

Dave Oran said that there were two issues - instrumenting the protocol, and instrumenting the VoIP applications that use the SIP protocol. What was the goal? Daryl Malas said that this was an attempt to measure SIP, regardless of where it was being used. Al Morton said that this was not about instrumentation, but about defining the metrics; the methods of measurement would be a follow-on. Dave said that this just adds another dimension to the matrix: the metrics, as opposed to the instrumentation, could be different depending on the item being measured. Al said that how to measure is beyond the scope; the work dealt with how to define the metric. Daryl further confused the issue and Dave gave up.

Rajiv Papneja said that there is a sister aspect in the BMWG and discussed some of the related work and presentations. Daryl Malas said that the BMWG focused on single-device benchmarking, while the APM group would focus more on end-to-end metrics.

One question was whether the SIPPING group could take this up as part of their normal workload. There were two issues with this, however. First, the SIPPING group had its hands full and probably could not take it up. Second, the SIPPING group probably did not have the metrics expertise to deal with the details of the measurements themselves. The BMWG had the right metrics expertise but did not do end-to-end metrics. Thus the APM group should be formed to address this.
Finally, metrics just weren't sexy - people in the protocol groups preferred to work on new protocol features rather than on testing, and that is a limitation on doing this work in SIPPING.

Alan Clark talked about the RTCP XR video metrics. He noted that there had been a lot of good work and that they were getting the documents finished. Alan discussed RTCP XR and RTCP HR and their respective application situations.

Al Morton discussed another series of performance metrics proposals - application loss patterns, LSP dynamic provisioning, and so on. The application loss pattern metrics were considered by the IPPM group but were quickly discovered to be outside their charter. Nagarjuna Venna from the IPPM group clarified that they had decided that this was out of scope; however, he used these metrics in his work and there was a strong case for standardizing metrics like these.

Pete Resnick asked who or what the consumer of those documents was. What is eventually going to take those documents and do something with them? He further clarified the question of "what" and "who" - is there a metrics engine that is going to take the metrics and use them? This was pertinent to the question of where in the IETF this work ought to be done. Alan Clark responded that metrics are produced by some sort of measurement function, embedded or in test equipment, but the metrics should be defined clearly so that systems could measure them efficiently. Also, there was a need to separate the metric from the protocol used to report it. Al Morton remarked that all these systems had to start out with unambiguous metric definitions.

Daryl Malas said that it was obvious where the SIP performance measurement documents could be used. There were two different levels: one could be in the box itself, with the information communicated back and used to determine healthy/unhealthy status; the other could be in the back office, using data obtained from a snoop or sniff, or even some other method.
The real requirement is to collect the input variables. The user is either a vendor or an operator.

Tom Taylor said that he had 30 years of experience in the standardization of measurements, starting with the Orange Books. The key point he wanted to make was that it is too easy to fall into the syndrome of "if it moves, measure it". You have to define the uses of the measurement first, and then design the measurement to achieve those uses.

Carol Davids said that we were in the infancy of defining tests for the applications we are building, such as SIP, and she felt that the BoF should turn into a forum where the applications we build with SIP can be quantified.

Colin Perkins said that there seemed to be an assumption that the metrics were independent of the protocol, and this was not always true; he would caution against this. Al Morton responded that there is a separation among the protocols being referred to: the network management protocol that extracts the information, and the protocol being measured. Colin cautioned that we should take the whole system into account.

3. Comparison of Possible Solutions
+ APM Directorate (with working process)
+ APM WG (and working process)
+ Short-Lived APM WG

Alan Clark presented the slides describing these three alternatives.

4. Discussion of Pros and Cons

Joerg Ott said that one of the benefits of a WG was the creation of a forum where people could talk about the same things, but the word "applications" in the WG title was overly broad, and people might not be able to talk about the same things. We should be careful in scoping and ensure that there is focus.

Dan Romascanu reminded people of the history leading to the BoF - there was no good home for anything related to performance measurements, as the protocol groups gave these a low priority.

Ravi Raviraj said that there were three things: first, what are the parameters; second, how do we measure them; and third, how do we report them?
If the parameters have already been defined, then there is no issue with defining a reporting framework. However, if the parameters are yet to be defined, then this is an open question. For SIP there was no issue with defining the parameters - an APM group was the best forum. For others, we would have to decide on a case-by-case basis.

Andy Bierman mentioned that there was an application MIB in RMON and other MIBs in IPPM, so there were already mechanisms in place for handling similar work, and we should not overlap.

Al Morton covered other groups with related activities - BMWG (lab characterization only), IPPM (active performance characterization, but no passive characterization), OPSAWG (can take on small projects), and RMON (concluded).

Alan Clark covered the proposals for possible solutions to the application performance measurement work issue. The first possibility was to create a new APM directorate to advise protocol WGs on initiating and developing application performance metrics - i.e., serving in a consulting and review capacity. The second proposal was to create a new APM WG to develop RFCs for metrics concerning layers above IP, in coordination or in partnership with other WGs. It was duly noted that this APM WG would not go off developing ad-hoc drafts, but would work in conjunction with the protocol groups. The third direction was to form a short-lived WG to prepare a BCP or framework RFC, then leave the creation of the actual metrics to the protocol groups.

Barry Leiba asked if this could be a long-standing WG; Alan Clark responded that yes, it could. Al Morton noted that the WG would be structured so that protocol experts would only have to wade through the mail for their applicable application performance metrics areas, instead of having to deal with all issues, even those unrelated to their subject areas.

Colin Perkins asked the chairs to clarify just what the BCP would be for.
Al answered that a BCP would direct the adoption, development, and interaction dependencies of WG items such as the SIP performance metrics. Daryl Malas asked how we would get visibility from a protocol WG standpoint; the answer was that a draft relevant to a WG could be presented during that WG's agenda. Dan Romascanu said that there was agreement on the following: the need to work with the protocol WGs; to invite comments from the protocol WGs; and to invite comments from the IESG standpoint.

David Oran said that there were three aspects. First, we could have some central function in the IETF to ensure that when people define application metrics, the metrics make sense to people versed in metrics, so that they don't define garbage. A second function could be a place for people developing metrics to do the work, because the protocol WG might not have enough thrust behind metrics efforts. The third function could be a sort of post-facto review group (like the MIB Doctors) that takes work that is done, or about to be done, and provides feedback. He understood how the first and third could be done, but wasn't sure about the second. If this would drastically increase the probability of publishing a lot more RFCs that nobody implements, then it is probably not useful.

Alan Clark addressed this: he said that from his perspective, manageability and performance measurements were done as an afterthought and needed to be improved. Carol Davids agreed, saying that some of the prime users of the output documents were the vendors of test and measurement equipment; an APM effort would help their customers understand what is being tested.

Vijay Gurbani asked how we would move this forward; the only reference point he had was the BMWG. He felt that the BMWG was a significant aid when he was doing performance measurement; for example, he said, he knew the protocol side, but having Scott Poretsky of the BMWG present to help on the metrics side was very good.
Dan Romascanu asked whether the BoF mailing list had been put on the slide. Al Morton replied that the mailing list was the OPS area list. Al then asked the BoF attendees several questions.

5. Assess consensus on the Solution/Direction questions

The first question was: should the IETF commit its resources to the formal development of application performance metrics as one or more key areas? Considerable debate took place on the wording of the question. Colin Perkins pointed out that the IETF could not commit resources; only the members could pitch in and do the work. Alan Clark clarified that "IETF" in this context refers to the IETF community. There was a comment from the back of the room that the question was just fine, essentially requesting that we get on with the polling of people. There was some back-and-forth debate as to whether this was a reasonable question or not.

Before getting down to polling the BoF, Al Morton put up all the key questions that were going to be asked on the preferred direction and the method of working. Dan Romascanu confirmed that the initial question could be phrased as "should we work on this?" Colin commented that this should be the actual question, not the one on the slide. Daryl Malas requested a hum on whether people were interested in working on application metrics. He also pointed out that other protocols could benefit as well. David Oran said that the question could only reflect whether the people attending the BoF were interested.

With that, Al Morton polled the BoF attendees on the first question. Many people in the room were in favor (>30). Leslie Nagle said that the next question should really be whether there was support for a WG that would work on some specific examples of application performance metrics. Al Morton then asked if people in the IETF were willing to work in a group chartered for a few drafts like those listed in the examples. About 17 hands went up.
John Klensin wanted to understand where some of the resistance was coming from. He said that there was an issue of IETF resources and subject matter expertise involved. Ravi Raviraj said that there were two parts: are we willing to work on a reporting framework, and are we willing to work on new parameters where we are the experts? Dan Romascanu asked (before Al Morton cut off the queue): are we going to allow this BoF to get input from the community? He said that there were many people in the room who felt that there was a problem, and there were many people who wanted to help solve the problem. He suggested that one or two working samples be taken as proof of concept and added to the materials for the short-lived WG.

Alan Clark asked for a show of hands on the preferred directions. Colin Perkins asked for a clarification: how many times can you vote? Al Morton responded that it was Chicago rules, so anybody could vote for as many options as they liked. The three options were then considered, one after the other.

Question 1: let the work be done in protocol development WGs with advice from an APM Directorate. There was not much support exhibited. It was noted that there was considerable confusion on the split. Daryl Malas suggested that we should work from the WG level down to a Directorate.

Question 2: should an APM WG be formed? About 25 people wanted an APM WG. About 2 were not interested in having an APM WG.

Question 3: should the WG be short-lived or long-lived? The question was duly put to the group. Dan clarified the difference - a short-lived WG would work on the framework and a couple of proof-of-concept documents, while a long-lived WG would continue ad infinitum. About 6 people wanted a short-lived WG and 12 were in favor of a long-lived WG.

The question of a directorate was then raised.
Tom Taylor suggested that there are at least two applications areas, and that a directorate at the area level might be worth considering because it would be tied to their subject matter. Daryl Malas noted that creating another WG is problematic because there are already too many WGs, but if we create a directorate then we have a group of dedicated people assigned to this particular piece of work. Dan Romascanu clarified that when the SIP performance metrics issue surfaced, it was long debated by several people to get the right input on what the better framework would be, and a directorate wasn't favored. The level of discussion here led him to believe that an OPS area mailing list should be created.

Jean-Francois Male asked what the value of such a WG would be to drafts such as Daryl's; this WG would concentrate metrics expertise in one group that lacked expertise in the protocols themselves. He was concerned about how long it would take for Daryl's draft to be done.

In conclusion, Al Morton said that we didn't write a charter because we didn't know the level of support or the preferred direction, and that this had been a very interesting experience. He appreciated all the comments and support, and thanked the members for participating in the BoF. He then closed the BoF at 4:15 PM CDT.