HTTP Client Hints
draft-ietf-httpbis-client-hints-15
Yes
(Barry Leiba)
No Objection
(Alvaro Retana)
(Deborah Brungard)
Note: This ballot was opened for revision 13 and is now closed.
Murray Kucherawy
(was Discuss)
Yes
Comment
(2020-05-18 for -14)
Sent for earlier
Thanks for fixing my IANA concern and all of my nits!
Erik Kline
No Objection
Comment
(2020-05-19 for -14)
Sent
[[ nits ]]

[ section 1 ]
* "requires external device" -> "requires an external device"?

[ section 2.2 ]
* "indicate user agents" -> "indicate to user agents"?

[ section 3.1 ]
* s/sh-list/sf-list/g

[ section 4.3 ]
* "we can remove" -> perhaps another wording that doesn't require "we"
* "unless the opt-in origin has explicitly delegated permission to another origin": is this delegation mechanism documented somewhere that should be referenced here?
* "is challenging and likely" -> "may be challenging or even"
Roman Danyliw
No Objection
Comment
(2020-05-19 for -14)
Sent
** Section 4.1. Per “Therefore, features relying on this document to define Client Hint headers MUST NOT provide new information that is otherwise not available to the application via other means, such as existing request headers, HTML, CSS, or JavaScript”, would this text allow for a shift in permissiveness if the referenced specs changed? For example, if something was not permissible in JavaScript/HTML/CSS “vX” today, but it was in “vX+1”, would that mean that additional data could be sent as hints? I’m exploring the value of assigning version numbers to HTML, CSS and JavaScript to freeze the security assumptions.

** Section 4.1. Per “User agents need to consider the value provided by a particular feature vs these considerations, and MAY have different policies regarding that tradeoff on a per-feature basis”, IMO more is needed to handle these tradeoffs. User agent implementations SHOULD expose this policy creation process through a rich set of configuration/tuning options and with an API to enable privacy-minded, third-party software to assist the user in making choices.

** Section 4.1. Per “Implementers SHOULD restrict delivery of some or all Client Hints header fields to the opt-in origin only, unless the opt-in origin has explicitly delegated permission to another origin to request Client Hints header fields”, how does this delegation happen?
Barry Leiba Former IESG member
Yes
(for -13)
Unknown
Alissa Cooper Former IESG member
No Objection
(2020-05-21 for -14)
Sent
Section 1: "passively providing such information allows servers to silently fingerprint the user" --> isn't pretty much all fingerprinting silent? Moreover, I think it would be good to explain in Section 1 that Client Hints provides a way for servers to actively fingerprint clients rather than doing it passively.

Section 2.1: "Without such an opt-in, user agents SHOULD NOT send high-entropy hints, but MAY send low-entropy ones [CLIENT-HINTS-INFRASTRUCTURE]." --> Given the use of normative language here, I think either this doc or the referenced doc needs to define what high-entropy hints are. Are all hints not defined as low-entropy considered to be high-entropy? If so, then I think a change along the lines of what Benjamin proposed is warranted. If not (as the text in Section 4.1 sort of indicates), then this doc needs to specify what high-entropy hints are.

Section 4.1: "user choice mechanisms that allow users to balance privacy concerns against bandwidth limitations" --> This is vague enough that I don't understand what it is talking about. What is an example of such a mechanism?

Section 4.1: "The header-based opt-in means that we can remove passive fingerprinting vectors, such as the User-Agent string (enabling active access to that information through User-Agent Client Hints [4]), or otherwise expose information already available through script (e.g. the Save-Data Client Hint [5]), without increasing the passive fingerprinting surface." How about changing that "we can" into a recommendation to do so? In other words, could this document recommend that if User-Agents are sending certain information in Client Hints that they stop sending it or similar information via other headers? Maybe this is too obvious, but given the breadth of HTTP clients in the wild, it may help to state the obvious.
Alvaro Retana Former IESG member
No Objection
(for -14)
Not sent
Benjamin Kaduk Former IESG member
No Objection
(2020-05-19 for -14)
Sent
Section 1

   There are thousands of different devices accessing the web, each with different device capabilities and preference information. These device capabilities include hardware and software characteristics, as well as dynamic user and user agent preferences. Historically,

nit: should "user-agent" be hyphenated?

   applications that wanted to allow the server to optimize content delivery and user experience based on such capabilities had to rely on passive identification (e.g., by matching the User-Agent header

nit: it feels like "allow the server" would be something that involves granting permission or the client sending an active signal (as proposed by this document), as opposed to just the application that "wanted the server to optimize" and had to make do with such limited signal as was already available.

   field (Section 5.5.3 of [RFC7231]) against an established database of user agent signatures), use HTTP cookies [RFC6265] and URL

nit: hyphenate user-agent again, used as an adjective.

   o User agent detection cannot reliably identify all static variables, cannot infer dynamic user agent preferences, requires external device database, is not cache friendly, and is reliant on

nit: singular/plural mismatch ("an external device database" or "external device databases")

   o Cookie-based approaches are not portable across applications and servers, impose additional client-side latency by requiring JavaScript execution, and are not cache friendly.

(I think I missed a step in why a cookie-based approach inherently requires JavaScript execution, though maybe it doesn't matter.)
   Proactive content negotiation (Section 3.4.1 of [RFC7231]) offers an alternative approach; user agents use specified, well-defined request headers to advertise their capabilities and characteristics, so that

Chasing the reference, it's not clear that it supports quite this strong of a statement: in addition to the explicit negotiation fields, it also allows using implicit characteristics such as client IP address and User-Agent.

Section 2.1

   access of third parties to those same header fields. Without such an opt-in, user agents SHOULD NOT send high-entropy hints, but MAY send low-entropy ones [CLIENT-HINTS-INFRASTRUCTURE].

It looks like the reference only defines a registry for low-entropy hints, and we are inferring that any hints not listed in that table are to be treated as "high-entropy". Perhaps we could reword both directions of this directive to refer only to the registry of low-entropy hints (e.g., "SHOULD NOT send hints that are not listed in [registry]")?

   Implementers need to be aware of the passive fingerprinting implications when implementing support for Client Hints, and follow the considerations outlined in the Security Considerations (Section 4) section of this document.

side note: in some sense the Accept-CH mechanism transforms it from a passive to an active fingerprinting mechanism.

Section 2.2

   information in them. When doing so, and if the resource is cacheable, the server MUST also generate a Vary response header field (Section 7.1.4 of [RFC7231]) to indicate which hints can affect the selected response and whether the selected response is appropriate for a later request.

side note: I suspect the answer I want is already present with a detailed reading of RFC 7231, but I wonder if it's worth saying something here about whether the Vary response header could/should include registered client hint header field names that were not present in the request in question.
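For readers following the Section 2.2 discussion of Vary, the sketch below shows one way a cache could fold the hints named in Vary into its lookup key. It is only an illustration under stated assumptions: the helper names vary_value and cache_key are hypothetical, Sec-CH-Example is the placeholder hint name used in the draft, and the logic is a simplification of real HTTP caching rather than the behavior the draft mandates.

```python
def vary_value(hints_used):
    """Build a Vary field value from the hint names that affected selection.

    hints_used is a list of request header field names (hypothetical input).
    """
    return ", ".join(hints_used)


def cache_key(method, url, request_headers, vary):
    """Return a simplified cache key that includes the fields named in Vary.

    Header names are compared case-insensitively. A field that is named in
    Vary but absent from the request contributes None, so a request without
    the hint does not share an entry with a request that sent it.
    """
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (method, url, tuple((n, lowered.get(n)) for n in names))


vary = vary_value(["Sec-CH-Example"])
key_a = cache_key("GET", "https://example.com/img", {"Sec-CH-Example": "1"}, vary)
key_b = cache_key("GET", "https://example.com/img", {"Sec-CH-Example": "2"}, vary)
```

Here key_a and key_b differ because the hint value differs, while a request header that is not listed in Vary would leave the key unchanged.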
Section 3.1

   Based on the Accept-CH example above, which is received in response to a user agent navigating to "https://example.com", and delivered over a secure transport, a user agent will have to persist an Accept-CH preference bound to "https://example.com". It will then use it

What level of requirement is implied by "will have to" here? IIUC, it's just that "if anything is persisted, it must be keyed on" but with no obligation to do any persistence. If so, perhaps a wording like "any persisted Accept-CH preference will be bound to" would be better?

   for navigations to e.g. "https://example.com/foobar.html", but not to e.g. "https://foobar.example.com/". It will similarly use the preference for any same-origin resource requests (e.g. to

nit: comma after "e.g." (throughout).

   "https://example.com/image.jpg") initiated by the page constructed from the navigation's response, but not to cross-origin resource requests (e.g. "https://thirdparty.com/resource.js"). This preference will not extend to resource requests initiated to "https://example.com" from other origins (e.g. from navigations to "https://other-example.com/").

Perhaps thirdparty.example and other.example, to stay within the BCP32 space?

Section 3.2

   When selecting a response based on one or more Client Hints, and if the resource is cacheable, the server needs to generate a Vary response header field ([RFC7234]) to indicate which hints can affect the selected response and whether the selected response is appropriate for a later request.

Is BCP 14 language appropriate here?

   Above example indicates that the cache key needs to include the Sec-CH-Example header field.

nit: please add the article "the" to make this a complete sentence.
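The origin binding quoted under Section 3.1 above can be sketched as a same-origin check. This is a minimal illustration, not the draft's algorithm: the function names origin and preference_applies are hypothetical, an origin is approximated as a (scheme, host, port) tuple with default ports filled in, and the URLs are the ones used in the quoted text.

```python
from urllib.parse import urlsplit

# Default ports, used so that "https://example.com" and
# "https://example.com:443" compare as the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Approximate the origin of a URL as (scheme, host, port)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def preference_applies(opt_in_url, target_url, initiator_url=None):
    """Decide whether a persisted Accept-CH preference covers a request.

    opt_in_url:    URL whose response carried Accept-CH.
    target_url:    URL being requested.
    initiator_url: URL of the page issuing a subresource request, or None
                   for a top-level navigation.
    """
    opt_in = origin(opt_in_url)
    if origin(target_url) != opt_in:
        return False  # cross-origin target: preference is not used
    if initiator_url is not None and origin(initiator_url) != opt_in:
        return False  # request comes from a page on another origin
    return True


same_site_nav = preference_applies("https://example.com",
                                   "https://example.com/foobar.html")
subdomain_nav = preference_applies("https://example.com",
                                   "https://foobar.example.com/")
```

With these inputs, same_site_nav is True and subdomain_nav is False, matching the two navigation cases in the quoted text. Delegation to other origins, which the draft mentions but does not define, is deliberately left out.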
Section 4

While I don't expect that I can tell the major browser vendors anything new about the privacy considerations to client hints, I do think that we should give some guidance to implementors of other HTTP clients, who may not have such extensive depth of knowledge, on the general landscape in which this mechanism is set. The subsections hereof do a great job covering a lot of relevant details and specific factors to consider; thank you!

I think it may also be appropriate to have some more generic lead-in text, noting that in the worst case, merely converting a passive fingerprinting mechanism to an active fingerprinting mechanism with server opt-in does not actually provide any privacy benefit (the worst case being when all servers ask for all the data and clients accede)! While we might hope that the need to jump through an extra hoop to access fingerprinting information might dissuade some servers from asking for it, it seems imprudent to assume that it will happen, so in order to obtain real privacy benefit there need to be some additional policy controls in the client and in what hints are defined/implemented. As I mentioned already, we already have a lot of the details for how to apply such policy controls, and limitations to only define hints that expose information already available in other means; what I'd like to see is the high-level picture that ties them together.

Section 4.1

   upon it. The header-based opt-in means that we can remove passive fingerprinting vectors, such as the User-Agent string (enabling active access to that information through User-Agent Client Hints [4]), or otherwise expose information already available through

I think this [4] is the same as [UA-CH]. Also, use of the first person ("we") is somewhat unusual in RFC style.
   Therefore, features relying on this document to define Client Hint headers MUST NOT provide new information that is otherwise not available to the application via other means, such as existing request headers, HTML, CSS, or JavaScript.

As written, this is a fairly weird condition. What constitutes "available to the application via other means"? Does "put up an interstitial until the user provides the information in question" count?

   o Entropy - Exposing highly granular data can be used to help identify users across multiple requests to different origins. Reducing the set of header field values that can be expressed, or restricting them to an enumerated range where the advertised value is close but is not an exact representation of the current value,

nit: "close to" seems like it would scan better.

   Different features will be positioned in different points in the space between low-entropy, non-sensitive and static information (e.g. user agent information), and high-entropy, sensitive and dynamic information (e.g. geolocation). User agents need to consider the value provided by a particular feature vs these considerations, and MAY have different policies regarding that tradeoff on a per-feature basis.

How about on a per-origin basis (and, e.g., domain reputation)? An "entropy budget" where an origin that asks for too many distinct hints won't get all of them? (I also wonder if a descriptive "may wish to have" is better than the normative "MAY", here.)

   o Implementers SHOULD restrict delivery of some or all Client Hints header fields to the opt-in origin only, unless the opt-in origin has explicitly delegated permission to another origin to request Client Hints header fields.

Am I reading things right that this document does not define any such delegation mechanisms but is just admitting the possibility of such mechanisms being defined in the future? I'd suggest clarifying up in §2.1 with a parenthetical (akin to the "outlined below" note about the opt-in mechanism).
   Implementers SHOULD support Client Hints opt-in mechanisms and MUST clear persisted opt-in preferences when any one of site data, browsing history, browsing cache, cookies, or similar, are cleared.

Who is the target audience for this SHOULD? If it's just "people implementing this document", it seems ineffectual, and if it's any broader scope it seems unenforceable.

Section 4.3

   Research into abuse of Client Hints might look at how HTTP responses that contain Client Hints differ from those with different values,

nit: what are "responses that contain Client Hints"? We have discussed Accept-CH header fields in responses, and client hints in requests, but the only mention I recall of hints in responses was in the Vary header field, and it's not clear that that is what was intended.

Section 5

   While HTTP header compression schemes reduce the cost of adding HTTP header fields, sending Client Hints to the server incurs an increase in request byte size. Servers SHOULD take that into account when

nit: I wonder if this would be more clear as:

% Sending Client Hints to the server incurs an increase in request byte
% size. Some of this increase can be mitigated by HTTP header
% compression schemes, but each new hint will still lead to some
% increased bandwidth usage. Servers SHOULD [...]

Section 7.1

I'm not sure I understand why [FETCH] is listed as a normative reference.

I find it amusing that we reference both 7231 and 7234 for Vary, though to my untrained eye the current references both seem appropriate in their respective locations.

Section 7.2

If [CLIENT-HINTS-INFRASTRUCTURE] is to be the source of truth for low-entropy (and, by deduction, high-entropy) hints, it seems like it should be normative.
Deborah Brungard Former IESG member
No Objection
(for -14)
Not sent
Magnus Westerlund Former IESG member
No Objection
(2020-05-21 for -14)
Sent
I have no significant concern here, but I would appreciate an answer on whether I understand the situation correctly. The Accept-CH header value is a structured header value and uses sh-token, which has a more restrictive syntax than the token that the HTTP specification uses for header field names. However, this restriction is not of any real practical concern, as all registered HTTP headers start with an ALPHA. I did notice that the proposed new registry in the new HTTP semantics document does not mandate, but strongly recommends, keeping within what sh-token can accept. Thus, do I assume correctly that this issue has been sufficiently discussed in the WG?
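The syntax gap described above can be illustrated with two regular expressions. This is a sketch under an explicit assumption: it uses the token grammar of RFC 7230 and the sf-token grammar as later published in RFC 8941, which may differ in detail from the sh-token of the Structured Headers draft that was current during this ballot. The function names are hypothetical.

```python
import re

# tchar from RFC 7230: the characters allowed in an HTTP token,
# written as the inside of a regex character class.
TCHAR = r"!#$%&'*+\-.^_`|~0-9A-Za-z"

# HTTP token: one or more tchar (used for header field names).
HTTP_TOKEN = re.compile(rf"[{TCHAR}]+")

# sf-token per RFC 8941 (assumed here in place of the draft's sh-token):
# first character is ALPHA or "*", the rest are tchar, ":" or "/".
SF_TOKEN = re.compile(rf"[A-Za-z*][{TCHAR}:/]*")


def is_http_token(name):
    """True if name is a syntactically valid HTTP field name."""
    return HTTP_TOKEN.fullmatch(name) is not None


def is_sf_token(name):
    """True if name can be carried as a structured-field token."""
    return SF_TOKEN.fullmatch(name) is not None


# A field name starting with a digit is a valid HTTP token but cannot be
# expressed as a token in an Accept-CH list.
digit_first = "1-Example"
gap = is_http_token(digit_first) and not is_sf_token(digit_first)
```

Under these grammars, gap is True for the made-up name "1-Example", while a name such as "Sec-CH-Example" satisfies both patterns, which is consistent with the observation that field names starting with an ALPHA are unaffected.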
Martin Duke Former IESG member
No Objection
(2020-05-20 for -14)
Sent
Sec 4: s/JavsScript/JavaScript
Robert Wilton Former IESG member
No Objection
(2020-05-21 for -14)
Sent
Hi,

Thanks for this document. It seemed relatively easy to read, although I'm not sure whether I'm totally bought into the idea since it feels like it is perhaps making HTTP a little bit less stateless. However, I'm not particularly familiar with the details of HTTP as it is outside my domain of expertise.

One issue that wasn't clear to me was how you ensure that two independent entities don't both try to standardize the same client hint. From looking at the IANA section in https://wicg.github.io/ua-client-hints/ it seems the answer is probably that the client hint headers would be expected to be registered per RFC 3864. It might be useful if this document had some text describing this, along with a reference to RFC 3864.

Regards,
Rob