Summary: Has 2 BLOCKs. Has enough positions to pass once BLOCK positions are resolved.
Ballot question: "Is this charter ready for external review?"
I am afraid that this charter is not ready for external review, hence my BLOCK. There is nothing wrong with the WG goals (this is useful work to be done); my concerns are with the way the charter is written, so they should be easy to fix:
- 'Selection among multiple encryption keys': should there also be a way to use different encryption algorithms with the encapsulation? (I noted that this bullet is explicitly scoped to within a session.)
- Like Magnus, I find "Information to form a unique nonce" pretty vague. Is it a 'nonce' or rather an 'initialization vector'?
- "This working group will not specify the signaling required to configure SFrame encryption": it is unclear to me whether the WG will specify a control channel to negotiate keys and crypto algorithms, as the current sentence appears to be about more generic configuration (e.g., supported crypto algorithms).
- Only one milestone? There is nothing about the RTP mapping document that is mentioned in the charter text.
I know we have had discussions touching on this before. But, post vacation and looking at this charter again, I think we need some additional discussion of the goals and how the charter describes them in relation to encoder sub-streams and identification of what is encapsulated.
In regard to the following charter text:
"This working group will not specify the signaling required to configure SFrame encryption. In particular, considerations related to SIP or SDP are out of scope. This is because SFrame is intended to be applied as an additional layer on top of the base levels of protection that these protocols provide. This working group will, however, define how SFrame interacts with RTP (e.g., with regard to packetization, depacketization, and recovery algorithms) to ensure that it can be used in environments such as WebRTC."
I think there is a conflict between the above paragraph and the stated goals of the work. Keeping in mind the earlier sentence "It may also be desirable to encrypt units of intermediate size (e.g., H.264 NALUs or AV1 OBUs) to allow partial frames to be usable.", creating an RTP payload format that is capable of carrying SFRAMEs that contain these units will require some interaction with the signalling. Even without these sub-stream SFRAMEs, an RTP payload format needs some description capability so that the end consumer can correctly route the protected data after decapsulation, and to indicate that the endpoint has that capability.
Is the goal here, when it comes to RTP, simply to be able to treat SFRAME as a codec in WebRTC and thus use WebRTC InsertableStreams as the receiver of the decrypted media ADUs? Would that require the WebRTC application to use proprietary signalling to know what each ADU is and then route it to a media decoder? I can see that working in a WebRTC-only context.
However, I would prefer that some thought be spent on at least having a model for what information may be needed to handle the media streams. Consider RFC 7656 (https://datatracker.ietf.org/doc/rfc7656/) and the work that was needed for us to up-level how RTP worked, and even to discuss it in a way that let us understand each other. I think SFRAME needs to discuss how it is going to handle identification of the data encapsulated by SFRAMEs for media. A single media source can be encoded in multiple formats. Each format may produce one or more sub-streams of encoding for scalability or robustness, and this needs to be conveyed.
Now consider the above challenges in the context of SFRAME over RTP. One possibility is to say that the SSRC represents just a media source. The RTP payload format provides only fragmentation of the SFRAME across multiple RTP packets, and the RTP timestamp can be used to indicate where the SFRAME belongs in the timeline of the encoding. That puts a lot of the identification on the SFRAME layer, but it minimizes the signalling interactions related to RTP. However, it limits what the SFU can do, especially when it comes to repair. Switching can be done based on the Frame-marker extension header. However, layer-related loss detection becomes impossible without additional information or the use of multiple SSRCs.
Thus, I think it is uncertain whether the charter as currently written can be executed with its stated goals.
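
As an illustration of the fragmentation-only model discussed in this comment (the SSRC identifies only the media source, and all fragments of one SFRAME share an RTP timestamp), here is a minimal sketch. It is hypothetical and not a proposed payload format: `Fragment`, `fragment_sframe`, `reassemble`, and the convention of setting a marker flag on the last fragment are assumptions made for the example, and the fields only loosely mirror RTP header fields.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits and wrap around


@dataclass
class Fragment:
    ssrc: int        # identifies the media source only (assumed model)
    seq: int         # sequence number, modulo 2**16
    timestamp: int   # identical for every fragment of one SFRAME
    marker: bool     # assumption: set on the last fragment of the SFRAME
    payload: bytes   # opaque slice of the encrypted SFRAME


def fragment_sframe(sframe: bytes, ssrc: int, first_seq: int,
                    timestamp: int, max_payload: int) -> List[Fragment]:
    """Split one opaque SFRAME into fragments of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    if not sframe:
        raise ValueError("sframe must not be empty")
    chunks = [sframe[i:i + max_payload]
              for i in range(0, len(sframe), max_payload)]
    return [
        Fragment(ssrc, (first_seq + i) % SEQ_MOD, timestamp,
                 i == len(chunks) - 1, chunk)
        for i, chunk in enumerate(chunks)
    ]


def reassemble(fragments: List[Fragment]) -> Optional[bytes]:
    """Rebuild one SFRAME from fragments that share a timestamp.

    Returns None if the marker fragment is missing or there is a gap
    between received fragments. Loss of leading fragments cannot be
    detected here, because this sketch carries no start indicator.
    """
    last = [f for f in fragments if f.marker]
    if len(last) != 1:
        return None
    end = last[0].seq
    # Order by distance from the final fragment, handling wraparound.
    ordered = sorted(fragments,
                     key=lambda f: (end - f.seq) % SEQ_MOD, reverse=True)
    distances = [(end - f.seq) % SEQ_MOD for f in ordered]
    if distances != list(range(len(ordered) - 1, -1, -1)):
        return None  # gap or duplicate in the sequence numbers
    return b"".join(f.payload for f in ordered)


frags = fragment_sframe(b"x" * 2500, ssrc=0x1234, first_seq=65535,
                        timestamp=90000, max_payload=1200)
print([(f.seq, f.marker, len(f.payload)) for f in frags])
```

Note that nothing in these fragments tells an SFU which encoding or layer the SFRAME belongs to; in this model that identification would have to come from the SFRAME layer or from additional information.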
* Information to form a unique nonce within the scope of the key
Is "Information to form a" really the best formulation? I am uncertain whether the goal is to have a specification for how to generate unique nonce values within the context of a particular key, or whether it relates to which information sources should be used when creating a nonce.
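
To make the first reading concrete: one common way for AEAD-based protocols to keep nonces unique under a key is to combine a fixed per-key salt with a per-key counter that is never reused. The sketch below is purely illustrative and is not taken from the charter or from any SFrame draft; `form_nonce`, the 12-byte nonce length, and the 64-bit counter limit are assumptions.

```python
NONCE_LEN = 12  # assumed AEAD nonce size in bytes (a typical choice for AES-GCM)


def form_nonce(salt: bytes, counter: int) -> bytes:
    """XOR a per-key salt with a big-endian counter to form a nonce.

    For a fixed salt the mapping counter -> nonce is one-to-one, so the
    nonce is unique under a key as long as the counter is never reused.
    """
    if len(salt) != NONCE_LEN:
        raise ValueError("salt must be NONCE_LEN bytes")
    if not 0 <= counter < 2**64:
        raise ValueError("counter must fit in 64 bits")
    padded = counter.to_bytes(NONCE_LEN, "big")
    return bytes(s ^ c for s, c in zip(salt, padded))


# Hypothetical salt for the demo; a real salt would be derived alongside the key.
salt = bytes(range(NONCE_LEN))
nonces = {form_nonce(salt, ctr) for ctr in range(1000)}
print(len(nonces))  # prints 1000: every counter value yields a distinct nonce
```

Under the second reading, the charter bullet would only need to say which inputs (here, the counter) are carried in the encapsulation so that the receiver can rebuild the same nonce.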
** I share Éric Vyncke's concerns about the bulleted list of what the SFRAME encapsulation will provide. My recommendation would be to reframe this text around the security properties/assurances/services this encapsulation will provide (rather than a functional list).
** If configuring the security services is out of scope, where is it anticipated that this signalling protocol work would occur?
I support Magnus's and Éric's BLOCKs. Some additional comments:
Real-time conferencing sessions increasingly require end-to-end protections that prevent intermediary servers from decrypting real-time media. The PERC WG developed a “double encryption” scheme for end-to-end encryption that was deeply tied to SRTP as its underlying transport. This entanglement has prevented widespread deployment.
I thought we were going to tweak this text (noting that RFC 8723 is only a handful of months old). It might also be worth a note about the general expected shape of the key hierarchy (e.g., one key per sender vs. full mesh).
* Selection among multiple encryption keys in use during a real-time session
* Information to form a unique nonce within the scope of the key
* Authenticated encryption using the selected key and nonce
I assume that this means "assembling preexisting crypto building blocks", not "define new crypto".
The transport-independence of this encapsulation means that it can be applied at a higher level than individual RTP payloads. For example, it may be desirable to encrypt whole frames that span multiple packets in order to amortize the overhead from framing and authentication tags. It may also be desirable to encrypt units of intermediate size (e.g., H.264 NALUs or AV1 OBUs) to allow partial frames to be usable. The working group will choose what levels of granularity are available and to what degree this can be configured.
(Available as input to the WG or available from its output?)
It is anticipated that several use cases of SFrame will involve its use with keys derived from the MLS group key exchange protocol. The working group will define a mechanism for doing SFrame encryption using keys from MLS, including, for example, the derivation of SFrame keys per MLS epoch and per sender.
Will other sources of key material be considered?
[[ comments/questions ]]
* Agree with "Information to form a unique nonce within the scope of the key" seeming a bit underdefined at the moment.
* If conferencing migrates to carriage over QUIC, would that have any impact on/overlap with this work?
This may come up in the context of the document reviews, but given the recent discussions on terminology we may want to proactively avoid use of the term 'nonce' in the WG charter.