Summary: Has 2 DISCUSSes. Has enough positions to pass once DISCUSS positions are resolved.
Martin Duke (was No Objection) Discuss
Sec 2.3 says: At minimum, the API MUST provide: (1) the state of captivity and (2) a URI for the Captive Portal Server. But in section 5 of capport-api, user-portal-url is an optional field. Both a capport-api author and a WG chair agreed that the architecture doc should be fixed, so I'm moving the DISCUSS here.
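To make the mismatch concrete, here is a minimal sketch of a client that treats the portal URI as optional, as section 5 of capport-api does. The member names `captive` and `user-portal-url` are assumed from the API draft; `CaptivityState` and `parse_captivity` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaptivityState:
    captive: bool                   # state of captivity, always expected
    user_portal_url: Optional[str]  # optional per capport-api section 5


def parse_captivity(body: str) -> CaptivityState:
    """Parse a Captive Portal API JSON body, tolerating a missing portal URL."""
    doc = json.loads(body)
    if not isinstance(doc, dict):
        raise ValueError("response must be a JSON object")
    captive = doc.get("captive")
    if not isinstance(captive, bool):
        raise ValueError("response must carry a boolean 'captive' member")
    url = doc.get("user-portal-url")
    if url is not None and not isinstance(url, str):
        raise ValueError("'user-portal-url' must be a string when present")
    return CaptivityState(captive=captive, user_portal_url=url)
```

Under the "MUST provide" wording in Sec 2.3, the same parser would instead have to reject any response that lacks the portal URI, which is the inconsistency this DISCUSS asks the architecture document to resolve.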
I found the terminology around “Captive Portal API server” and “Captive Portal Server” to be a little confusing, as these are similar terms. The latter also doesn’t get its own discussion in Section 2 and is confusingly called the “web portal server” in Figure 1. After Figure 1, this seems to be consistently called the “web portal” (sec 2.6 and 4). In the API doc it is called a "user portal." It would be great to unify the terminology across the documents as a whole.
Benjamin Kaduk Discuss
(1) and (2) should be easy to fix; (3) may well be "fixed" by telling me I'm too naive :) (1) Given that section 1 describes other options, the abstract should not limit to just DHCP and RA as options for provisioning the API URL. (2) Section 4.1 says that: 5. The Captive Portal API server indicates to the Enforcement Device that the User Equipment is allowed to access the external network. but I believe this should be the "Captive Portal Server" (or, as the previous point has it, the "web portal"). (3) Probably a "discuss discuss", but ... in Section 1 we have: * Solutions SHOULD NOT require the forging of responses from DNS or HTTP servers, or any other protocol. In particular, solutions SHOULD NOT require man-in-the-middle proxy of TLS traffic. I'd like to understand the motivation for this one a little better. Naively, it seems like we could get away with "MUST NOT require" while still allowing it to be done. Am I missing something obvious?
I'd like to see some more discussion of which signals are authenticated and how, and what kind of authorization checks are possible. In well-run networks DHCP and RA signals should be relatively trustworthy, but clients don't always have a good indicator for whether a given network falls into that category. Are there (other) mechanisms that can be used to give trust in the authenticity of a given Captive Portal API URI and that that API is authorized to provide unconstrained access for the network in question? We require TLS for accessing the API server, but (as I note inline) there are more details that can be given about this TLS usage. What can be done to authenticate and authorize the Captive Portal Server? Most importantly (and most appropriately for an architecture document), which of these properties are strictly required vs. merely optional? These are not Discuss-level points because an architecture does not strictly-speaking need to specify all of them, but having some indication of how we plan to achieve them would give greater confidence that this architecture will be a useful one. I'm happy to see the response to the genart reviewer's comment regarding "a" vs. "the" capport architecture; thanks! Abstract This document describes a CAPPORT architecture. DHCP or Router Advertisements, an optional signaling protocol, and an HTTP API are used to provide the solution. The role of Provisioning Domains nit: there's perhaps a bit of a lack of parallelism in the list structure, where we talk about specific mechanisms for provisioning without describing the more abstract concept of provisioning, and list that alongside an abstract mention of "a signaling protocol" and the both-abstract-and-concrete "HTTP API". Section 1 Implementations generally require a web server, some method to allow/block traffic, and some method to alert the user. 
Common methods of nit: I'd suggest clarifying that this is "implementations of captive portals" (or is it "captive networks"?). alerting the user involve modifying HTTP or DNS traffic. nit: perhaps "at present" or "prior to this work"? If I understand correctly one of the goals of this work is to shift the balance of captive portals away from these practices (while acknowledging that fully eliminating them is not feasible in the near future). * Solutions MAY allow a device to be alerted that it is in a captive network when attempting to use any application on the network. I'm also not sure I understand this one, especially in light of the following (paraphrased) "SHOULD allow learning of captivity before application attempts to use the network". What's the alternative to "MAY allow", not-allowing such detection at all? * The architecture MUST provide a path of incremental migration, acknowledging a huge variety of portals and end-user device implementations and software versions. nit: "preexisting" or similar would go a long way here. * Network provisioning protocols provide end-user devices with a side note: using the word "provisioning" to describe things like DHCP and RA feels odd to me, presumably due to my background and what I expect provisioning to be. I can see why it makes sense to use the term for this purpose, though. Perhaps an additional adjective could help clarify what is meant, though I don't have a suggestion at hand. for this purpose are available in [RFC7710bis]. Other protocols (such as RADIUS), Provisioning Domains [I-D.pfister-capport-pvd], or static configuration may also be used. A device MAY query this side note: personally, I'd expand to "may also be used to convey this API URI", though it's probably not required for clarity. The device MAY take immediate action to satisfy the portal (according to its configuration/policy). side note: it's not entirely clear to me that we need a normative MAY for this. Section 2.1 have Internet access). 
The User Equipment communication is typically restricted by the Enforcement Device, described in Section 2.4, until site-specific requirements have been met. It seems like these "site-specific requirements" must be the "Captive Portal Conditions" that we just defined. * SHOULD have a mechanism for notifying the user of the Captive Portal It is pretty important that this mechanism be non-spoofable by, e.g., untrusted websites. I think we should mention something about "non-spoofable" here. * MAY prevent applications from using networks that do not grant full network access. E.g., a device connected to a mobile network may be connecting to a captive WiFi network; the operating system MAY avoid updating the default route until network access restrictions have been lifted (excepting access to the Captive nit: maybe say in which direction the update would go and/or something about why the move to wifi is desirable? None of the above requirements are mandatory because (a) we do not wish to say users or devices must seek full access to the captive network, (b) the requirements may be fulfilled by manually visiting the captive portal web application, and (c) legacy devices must continue to be supported. side note: in my opinion, it's possible to support legacy devices in practice without baking their limitations into the spec. If User Equipment supports the Captive Portal API, it MUST validate the API server's TLS certificate (see [RFC2818]). An Enforcement We should probably cite RFC 6125 here and say something about how the UE gets a name to validate the server's certificate against (and what name type to use). [I-D.ietf-capport-api] for more information. If certificate validation fails, User Equipment MUST NOT proceed with any of the behavior described above. I'm not sure which behavior the "behavior described above" is. "[accessing...] 
OCSP responders, CRLs, and NTP servers" doesn't seem quite right since that's *how* you determine that certificate validation fails, but the bits further up about "navigate [to] the Captive Portal user interface" do not seem to clearly call out a single behavior or set of behaviors by the UE. Section 2.2.2 Although still a work in progress, [I-D.pfister-capport-pvd] proposes a mechanism for User Equipment to be provided with PvD Bootstrap Information containing the URI for the JSON-based API described in Section 2.3. I don't think "JSON-based" is supported by the text of § 2.3 (and isn't really appropriate for an architecture doc in most cases, anyway). Section 2.3 The purpose of a Captive Portal API is to permit a query of Captive Portal state without interrupting the user. This API thereby removes the need for User Equipment to perform clear-text "canary" HTTP queries to check for response tampering. nit: probably don't need to be specific about HTTP, here. At minimum, the API MUST provide: (1) the state of captivity and (2) a URI for the Captive Portal Server. Is there anything useful to say about the URI scheme for the captive portal server URI? I guess I could probably (grudgingly) come up with a case where http-not-s would be tolerable, but given that we admit the possibility of "payment" as a captive portal condition, I don't want us to encourage sending payment or other sensitive information over schemes inappropriate for such information. A caller to the API needs to be presented with evidence that the content it is receiving is for a version of the API that it supports. What about evidence that the content it is receiving is intended to be used with, and authorized to speak for, the network it is joining? When User Equipment receives Captive Portal Signals, the User Equipment MAY query the API to check the state. The User Equipment nit: we seem to use "the state of its captivity" most places. The API MUST use TLS to ensure server authentication. 
The implementation of the API MUST ensure both confidentiality and integrity of any information provided by or required by it. It's a little weird to split the TLS requirements between here and Section 2.1, though I guess if we're splitting things by role it's probably unavoidable. (I made my RFC 6125 comment in Section 2.1 and it probably doesn't need to appear in both places.) Section 2.4 * May signal User Equipment using the Captive Portal Signaling protocol if certain traffic is blocked. nit: I think that "optionally signals" might be a better fit for the list structure as used in the other bullet points. Section 2.5 When User Equipment first connects to a network, or when there are changes in status, the Enforcement Device could generate a signal toward the User Equipment. This signal indicates that the User Equipment might need to contact the API Server to receive updated information. For instance, this signal might be generated when the end of a session is imminent, or when network access was denied. Would this signal also be used when the UE has successfully met the Captive Portal Conditions? Section 2.6 * The User Equipment queries the API to learn of its state of captivity. If captive, the User Equipment presents the portal user interface from the Web Portal Server to the user. [we previously discussed this UE behavior as optional. I don't mind having the text be descriptive like this, since it's describing the diagram, and the diagram is not binding on all UEs, but it seemed worth noting just in case.] Section 3.1 An Identifier is a characteristic of the User Equipment used by the components of a Captive Portal to uniquely determine which specific User Equipment is interacting with them. An Identifier MAY be a Do we want to say anything about what scope within which the uniqueness must hold? ("No" is probably fine.) 
Section 3.2.1 Each instance of User Equipment interacting with the Captive Network MUST be given an identifier that is unique among User Equipment interacting at that time. side note: "MUST be given" gets a knee-jerk "by whom?" response from me. It's probably okay for this document to not specify, though, as it may depend on the nature of the Captive Network. Over time, the User Equipment assigned to an identifier value MAY change. Allowing the identified device to change over time ensures that the space of possible identifying values need not be overly large. Is the identifier assigned to a given UE on the same network expected to be able to change as well? This may have some privacy considerations... Section 3.2.2 are active at the same time. This property is particularly important when the User Equipment is extended externally to devices such as billing systems, or where the identity of the User Equipment could imply liability. nit(?): is it the UE that is extended externally or the identifier thereof? Section 3.2.4 In some situations, the User Equipment may have multiple IP addresses, while still satisfying all of the recommended properties. nit: as written, "while still satisfying all of the recommended properties" is describing the UE, but the context of Section 3.4 suggests that we want to be talking about the recommended properties for identifiers. Section 3.5 Accessing the API MAY depend on contextual information. However, the URIs provided in the API SHOULD be unique to the UE and not dependent on contextual information to function correctly. Should the per-UE APIs and/or the mapping between UE and per-UE API be unguessable? (Do we want to reference Capability URLs [https://www.w3.org/TR/capability-urls/]?) Section 4 I might consider explicitly saying "non-normative" somewhere in here. Section 4.1 4. If necessary, the User navigates the web portal to gain access to the external network. nit: "navigates to" Section 4.2 3. 
The User Equipment's UI indicates that the length of time left for its access has fallen below a threshold 4. The User Equipment visits the API again to validate the expiry time side note: I feel like there's implicitly some User action in here, though I don't know that we need to actually say anything about it. (Otherwise we wouldn't have the UI indicating things.) Section 4.3 Whenever a new Portal URI is received by end User Equipment, it SHOULD discard the old URI and use the new one for future requests to the API. What kind of validation/authorization checks need to be applied to the new Portal URI? (nit: we probably should check the terminology in this section; the Section 1.2 lexicon would call this information the "Captive Portal API Server URI" and not a "Portal URI".) Section 7 This mechanism rather inherently requires having multiple entities track the UE's identity (and, thus, likely be tracking a proxy for the user's identity). It seems appropriate to include some discussion of the privacy considerations of this tracking, and whether/what kind of anonymity support is appropriate! Section 7.1 Given that a user chooses to visit a Captive Portal URI, the URI location SHOULD be securely provided to the user's device. E.g., the DHCPv6 AUTH option can sign this information. I'm not sure that I understand the intent behind the "Given that" construction here. Is it trying to emphasize user choice, and thus the need for informed choice? Section 7.2 [In the vein of my previous remarks, there are many ways to use TLS, and usually we provide more details on how we expect TLS to be used.] Section 7.3 The API MUST ensure the integrity of this information, as well as its confidentiality. Who/what is the attacker(s) that we need to preserve confidentiality from? Section 7.4 * Accesses to the API Server are rate limited, limiting the impact of a repeated attack. 
One might consider a flooding attack that tries to get the UE to use all its (rate-limited) connections to get some information that is not the information that it's most important for the UE to have. If there's only a single operation that can be performed at the API Server (which I believe is the intent?) there is no such attack, but it may be worth mentioning that there is no such attack. Section 8.1 Interestingly, none of the places where we reference 7710bis have surrounding text that clearly incurs a normative dependency. Appendix A We explain the use of the "canary" term here, but have already used it twice (with no forward-reference) in the body of the document. Another test that can be performed is a DNS lookup to a known address with an expected answer. If the answer differs from the expected answer, the equipment detects that a captive portal is present. DNS queries over TCP or HTTPS are less likely to be modified than DNS queries over UDP due to the complexity of implementation. Is the reader supposed to draw the conclusion that DoTCP/DoH provide less-reliable captive-portal detection than Do53? (I assume "TCP" is not a typo for "TLS", here, though am unsure enough to want to check.) Malicious or misconfigured networks with a captive portal present may not intercept these requests and choose to pass them through or decide to impersonate, leading to the device having a false negative. nit: I suggest "these 'canary' requests" to clarify which requests we're talking about.
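The DNS "canary" test quoted from Appendix A reduces to comparing the observed answer with the expected one. A minimal sketch of that comparison step only, with the lookup itself left out; `looks_captive` is a hypothetical helper and the addresses in the usage note come from documentation ranges:

```python
from typing import Iterable, Set


def looks_captive(observed: Iterable[str], expected: Set[str]) -> bool:
    """Return True when the observed DNS answers differ from the expected ones."""
    observed_set = set(observed)
    # An empty answer, or any address outside the expected set, is treated
    # as evidence that the response was modified on path.
    return not observed_set or not observed_set <= expected
```

For example, `looks_captive(["198.51.100.1"], {"192.0.2.10"})` reports captivity, while a network that passes the query through untouched yields `False` even if a portal is present. That second case is the false negative the quoted text describes, and it is also why a transport that is less likely to be modified gives a weaker detection signal.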
Barry Leiba Yes
Deborah Brungard No Objection
Roman Danyliw No Objection
I support Martin and Ben's DISCUSS positions. Thanks for laying out the architecture to explain the subsequent protocol drafts. A few areas of feedback: ** Section 2.1. Per “At this time we consider …”, to what is “at this time” referring (maybe this is referring to the WG scope)? This might not age well as currently framed. ** Section 2.2. The architecture doesn’t explicitly describe which component is responsible for provisioning the user equipment sufficiently so it can access the IP network anywhere. I would have expected it to be the Provisioning Service. Section 2.1, 2.3 and 2.4 describe the role of these components in the architecture and their requirements. Section 2.2 does not. Instead it describes candidate technologies. It would be helpful to explicitly say. ** Section 2.3. Perhaps this is too pedantic, but should the obvious be explicitly called out: the user equipment should only be able to check its own captivity status? This would be some explicit notion of authorization. ** Section 2.3 Per “A caller to the API needs to be presented with evidence that the content it is receiving is for a version of the API that it supports.”, is the caller the User Equipment, the web browser or the end user – does that distinction matter – does each layer need anything different? ** Section 3.2.1. Per “Each instance of User Equipment interacting with the Captive Network MUST be given an identifier that is unique among User Equipment interacting at that time.”, is “unique among user equipment interacting at that time” the same as saying “unique among the identifiers currently in use in the Captive Network”? It might be useful to frame this guidance within the scope of the previous definitions. ** Section 3.2.2. The acceptable workfactor for “hard” still isn’t clear here but I understand the difficulty of pinning it down while remaining flexible. ** Section 4. Does this section provide normative guidance? 
The introductory sentence suggests no by saying that this section describes “possible workflow[s]”. However, Section 4.3 uses a normative SHOULD. ** Section 4.2. Between step #2 and #3, did some kind of signaling happen to indicate that expiration is imminent, or did the UE keep state of some kind? Keeping state isn’t mentioned as a UE requirement in Section 2.1. Section 2.5. notes that a “signal might be generated when the end of a session is imminent”. ** Section 7. This section would benefit from a discussion of the privacy impacts of the implicit identifiers embedded into the architecture (e.g., re-identification). ** Section 7.1. Per “If a user decides to incorrectly trust an attacking network ….”, you have an on-path attacker so additional risks include traffic redirection to arbitrary destinations to serve malicious payloads; traffic analysis and loss of confidentiality; inline traffic modification; etc. ** Section 7.2. Per “The solution described here assumes that when the User Equipment needs to trust the API …”, why is this conditional? Doesn’t the UE have to trust the API server? ** Section 7.3. In addition to integrity and confidentiality, is there an authenticity requirement? I ask because Section 2.1. noted that the UE “SHOULD [be] allow[ed] access to any services that User Equipment could need to contact to perform certificate validation.”
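On the Section 4.2 question, one reading is that the UE keeps the expiry it last learned from the API and re-queries on its own, with no signal involved. A minimal sketch under that assumption; `SessionTimer`, its 300-second default threshold, and the idea of feeding it a remaining-seconds value from the API are all hypothetical and not taken from the architecture document:

```python
from dataclasses import dataclass


@dataclass
class SessionTimer:
    """Hypothetical UE-side state: when the session was last reported to expire."""
    expires_at: float  # seconds on the caller's monotonic clock

    @classmethod
    def from_api(cls, now: float, seconds_remaining: float) -> "SessionTimer":
        # Convert the remaining time reported by the API into an absolute deadline.
        return cls(expires_at=now + seconds_remaining)

    def should_requery(self, now: float, threshold: float = 300.0) -> bool:
        # Re-query the API once the remaining time drops below the threshold.
        return self.expires_at - now < threshold
```

If the workflow is meant to work this way, keeping this state would be a UE requirement that Section 2.1 does not currently list. If the signal from Section 2.5 is meant to drive step #3 instead, no such state is needed and the workflow should say so.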
Murray Kucherawy No Objection
Pretty straightforward. Again, nice work. Some nits: Although I see why you did it, the capitalization of the bullet list in Section 3.2 appears peculiar. Also curious is that "User Equipment" is defined in Section 2.1, but not shortened to "UE" anywhere other than in Section 3.5. In Section 4.1, what's an "RA"?
Warren Kumari No Objection
Alvaro Retana No Objection
I have some minor comments: (1) Please expand CAPPORT. (2) §1: s/This document standardizes an architecture/This document describes an architecture This is not a standards track document. (3) §1: "MAY allow a device to be alerted" Other parts of the document (even in the same section) talk about "devices can be notified" or "informs an end-user", while "alert" is not mentioned anywhere else. Given that "alert" has the normative attachment, it would be nice to use consistent language. (4) §2.1: "E.g....MAY avoid updating..." s/MAY/may This is an example, not a normative statement. (5) §3.1: "An Identifier MAY be a field...Or, an Identifier MAY be an ephemeral property..." s/MAY/may These seem to be statements and not normative statements.
Éric Vyncke No Objection
Thank you for the work put into this document. The document is easy to read. I also appreciate the fact that "devices without user interfaces" are not ignored by this document. Please find below a couple of non-blocking COMMENTs. A response/comment for those COMMENTs will be read with interest. I hope that this helps to improve the document, Regards, -éric == COMMENTS == Is there a reason why the words "captive portal" do not appear in the abstract? This would assist normal human beings (outside of the WG) to find the document. I found no text about what happens to the traffic inside the captive network. Is it allowed even when still in captive mode? -- Section 1.2 -- Even if the document supports "devices without user interfaces", I wonder why the I-D uses "User Equipment" rather than "Client Equipment" (which is also more aligned with "Server"). Nothing dramatic, just curious about the reason. -- Section 2.1 -- "At this time we consider only devices with web browsers" while the previous text was about "devices without user interfaces". Finally, is this document for devices with or without human interface? -- Section 2.6 -- While the components are described as being optionally collocated, what about resiliency? I.e., having two different instances of one component. -- Section 3.4.2 -- While I appreciate that the section contains text about multiple IPv6 addresses, I suggest to mention the dual-stack use case explicitly. -- Section 3.4 -- I was expecting to see the MAC address also used as identifier. Is there any reason why it is not mentioned? If so, may I suggest to document the absence of a MAC address section in the examples?
Robert Wilton No Objection
I found this document easy to read, but have a few comments. I support the 3rd bullet of Ben's discuss. I was surprised by the diagram in section 2.6, since it seems to imply that the Provisioning Service kicks everything off, but I would have expected the User equipment to initiate the flow, which is articulated in the first step of section 4.1. Hence, I think that the diagram could be more clear if it also showed the initial request from the client (as per the first step in 4.1). Finally, I note that this document makes no mention of OAM considerations. Having some text covering these aspects would probably be beneficial.