SCIM - IETF 114

Friday, July 29, 2022 12:30pm

Chairs: Aaron Parecki \[ap\], Nancy Cam-Winget \[ncw\]

Note Takers: Pamela Dingle \[pd\], Joris Baum \[jb\]

## Agenda {#agenda}

* 5 min - Agenda Bash & Logistics: chairs
* 5 min - Use Cases document update: Pamela Dingle
* 30 min - Janelle Allen and Danny Zollner
  * soft delete
  * account status
  * HR schema action
  * roles and entitlements
  * change detection/delta query
* 5 min - SCIM Events update: Phil Hunt
  * https://datatracker.ietf.org/doc/draft-ietf-scim-events/
* 15 min - Coordination: Phil Hunt
* 40 min - Pagination: Janelle Allen and Danny Zollner
* 20 min - AOB

\[ncw - Nancy Cam-Winget\] welcome and housekeeping: make sure you are joined to the meeting, masks on unless presenting at the front, code of conduct, Note Well

\[ncw\] - Joris and Pam are scribes (yeah!)

* packed agenda - instead of reviewing drafts, chair has asked for brief updates
* discussion topics are where we are putting our time; threads have occurred about coordination, implementation, performance, pagination
* agenda in Meetecho (and in this document) is correct but slides aren't updated, so use the Meetecho agenda
* call for comments/updates? \[no answer\]

## Use Cases {#use-cases}

Current draft link: https://hackmd.io/2rpAFJnVSkyPbnO9Bm-qPQ

Pam Dingle \[pd\]: There is an actual document. No discussion today, because nobody had a chance to review. Rationale: define terminology that was not defined in the SCIM protocol specifications themselves (e.g. SCIM service provider, SCIM schema, etc.). There are new use cases relevant for SCIM. Call for review and feedback. Feel free to add changes to the HackMD in the next few hours.

Danny Zollner \[dz\]: The first version will be published to Datatracker on Monday at the latest.
## SCIM Schemas and Protocol Topics {#scim-schemas-and-protocol-topics}

\[dz\] Janelle couldn't be here, but we are editors for changes to SCIM schema/protocols and are shepherding extensions

\[dz\] Add new extensions to the protocol

* some ideas have drafts, some don't; if you want to get involved let us know

### Cursor-based pagination {#cursor-based-pagination}

\[dz\] - an expired version exists; high level: improve massive scale operations when a client needs to traverse large results

* Protocol defines cursor-based pagination (but that isn't the only option)
* Lots of discussion on the list on this topic

### Multi-values attribute pagination {#multi-values-attribute-pagination}

\[dz\] Phil Hunt wrote a draft.

Phil Hunt \[ph\]:

* extending multi-valued attributes and telling which pages you want
* case of the group saying they like it (or not)
* complexities: knowing how many rows there are
* clarifying: this is paging of attribute values, not paging of resources

\[dz\] - for example if you have a group of a million members this lets you break up the request into bite-size pieces

* Matt Peterson (not present) has proposed a different approach to not need multi-value attribute pagination - proposal is to create new top-level SCIM resources that can be returned as results

\[ph\] - there already is an attribute under the user profile called "group" (discussion on what Matt's proposal is for)

\[ncw\] should postpone discussion on pagination topic

### Soft Deletion {#soft-deletion}

\[dz\] - adds a new query parameter & response attribute, expired draft

\[ncw\] - asking Danny to pause to fix audio issues

\[ph\] - today the SCIM protocol says once you delete a resource the response should be a 404. Adding a parameter changes the behavior to return an object even though it is deleted

* need something to override the behavior
* question - is that a security issue? Should only some clients be able to execute that command?
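
The behavior change \[ph\] describes can be sketched with a toy in-memory store. This is only an illustration under stated assumptions, not the expired draft's design: the `Store` class, the `include_soft_deleted` flag (standing in for the new query parameter) and the `softDeleted` response attribute name are hypothetical. It shows the default 404 after a delete and the override that returns the object anyway.

```python
from typing import Optional, Tuple


class Store:
    """Toy in-memory resource store; not a real SCIM server."""

    def __init__(self) -> None:
        self._resources: dict = {}

    def create(self, rid: str, resource: dict) -> None:
        # "softDeleted" is an assumed name for the response attribute.
        self._resources[rid] = {**resource, "softDeleted": False}

    def delete(self, rid: str) -> int:
        # Marks the resource instead of removing it. What a repeated DELETE
        # should do was left open in the discussion and is not modeled here.
        if rid not in self._resources:
            return 404
        self._resources[rid]["softDeleted"] = True
        return 204

    def get(self, rid: str, include_soft_deleted: bool = False) -> Tuple[int, Optional[dict]]:
        # include_soft_deleted is a hypothetical stand-in for the query parameter.
        resource = self._resources.get(rid)
        if resource is None:
            return 404, None
        if resource["softDeleted"] and not include_soft_deleted:
            return 404, None  # today's behavior: deleted means not found
        return 200, dict(resource)


store = Store()
store.create("2819c223", {"userName": "bjensen"})
store.delete("2819c223")
default_status, _ = store.get("2819c223")
override_status, body = store.get("2819c223", include_soft_deleted=True)
```

Whether every client should be allowed to set the override is the security question raised above; the sketch does not model authorization.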
question from Darrel Miller \[dm\]: What is the expectation of DELETE on a resource with the softDeleted=true parameter? Is it a no-op or a hard delete?

\[ph\] - that is something a spec would define, this is a change to the spec

\[dz\] - I believe if you do

\[pm\] - The client might not be able to hard-delete it?

\[ncw\] - ???

\[dz\] - whether or not the SCIM server or client is capable of soft delete should be represented inside ServiceProviderConfig

\[Michael Prorock - mp\] - stating clear need for soft delete for use cases where there is an overlap of those that can or can't do soft delete

\[ncw\] - as chair - there needs to be clear delineation

\[ph\] - it is always true that today a SCIM server can soft delete if they want to. The issue is why the client needs to request a soft delete or not. Currently the broad use case is resurrection, so that if a resource is deleted & returned there is referential integrity

\[ap\] Clarification: is it the ability to query for soft deleted records?

\[dz\] the draft defines a way to get info from the service provider

\[pm\] question (if IDaaS platform) they could be more pres

\[ncw\] ACTION: take back to use cases and requirements document

\[Conner cr\] - the status field can be used for knowing that an entry is soft deleted, so the question is what is new/different

\[ph\] it might be more about using the right identifier if the user is restored

\[dz\] - any more comments/questions?

### Soft Deletion - Proposed additions

\[dz\] - there are suggested additions, not sure this is a right fit, folks should review

### Schema: Roles and Entitlements

\[dz\] - There is a draft that just expired, URL wrong on slide

* adds 2 new resource types to schema - /roles and /entitlements
* SCIM client can find accepted entitlements & roles for the resource. In apps linked to a SCIM server (esp SaaS) the customer can customize their roles, but the SCIM client cannot discover those customizations in a standardized way.
Creates an OOB management problem, and failed requests.
* there is an example on the slide
* new version: -02. Please check it out: new hierarchical representation to see how the structure of roles and licenses works in the application. Also attributes to represent availability (for example when entitlements are licenses with a specific number of seats).

### Topics without drafts

\[dz\] - for topics with drafts, we want to pick up and discuss whether or not to adopt the drafts that exist so we can either write new or adopt in the next 5 weeks

* following topics align with today's charter but no docs

### Topics without drafts - Change detection/Delta Query {#topics-without-drafts---change-detectiondelta-query}

* time drift is an issue with time-based queries
* good to have a way to accurately get all changes since last query

\[mp\] - looking at bulk update nature, change detection is very useful

* change since last update highly desirable

### Topics without drafts - Human Resources Schema

\[dz\] - given the data relationship HCM has with other services, there is a desire to help HCM providers label their attributes in a common way. No draft exists; the critical need is for actual HCM providers to be part of the schema development

### Topics without drafts - Account Status Context

\[dz\] - lines up with soft deletion topic

* today you only know whether active is true/false. Proposal for a complex attribute with more information (pre-hire? on leave? terminated?)
* goals to align states with shared signals ...

### Topics without drafts - Securing Reference URL access

\[dz\] - improving the protocol for reference URLs

* key example is photos - right now there is a complex attribute with a reference property
* problem with things of URL format (i.e. avatar picture) - as a cloud IDP communicating all the URLs for the whole company, the client/server needs to actively go back and ask for the URLs. What is the expectation for storing the URL over time?
* If a URL is provided, how can we ensure that only the intended SCIM participant can access the URL? Goal originally was to leave it up to implementers, but

\[ph\] - my understanding is a reference URL is just a URL. Actually fetching it opens a can of worms; the spec is silent for a reason. Maybe somebody wants to add value but that is new territory

\[dz\] - agreed this is a can of worms, but a consensus is needed. Given the number of collaboration apps that exist out there, partners want to consume this information but the barrier is high. We need a scalable way to do this, for purposes of interop

\[os - Orie Steele\] - have seen versions of this problem solved

* trusted central proxy unwraps image links and serves stable files
* sometimes network goes down and URL references can't resolve, in this case embedding data is an option
* Trusted highly available proxy is also an option (this has issues)

\[jb\] - my take is even if there is no consensus, at least it should be in the security considerations

\[dz\] - a lot of potential solutions but would like to see a group of people find a good scalable solution

\[ncw\] Phil comments that it could fit well in a "best practice document"

### Topics without drafts - BCP: Modern Security Profile

\[dz\] - interested in typing up aspects of standard wrt security. Published in 2015 - but the world has changed.

* Examples of things that we profiled include dropping support for basic auth, and dropping the password attribute from the user resource (not a never, just a not-recommended)
* Common SCIM use cases of cross-domain identity exchange should not be a place for passwords to move.
* Adding to Joris' point earlier, reference URL security could be another option
* Hoping to find other co-authors

\[ph\] - it may already be in the security considerations, but there are 2 scenarios.
1. Authenticating as a client to make SCIM calls (using a header)
2. There are valid cases to work with passwords; if they use a SCIM server as a directory this could be part of a whole MFA strategy, so it might be harder to remove schema than to augment considerations

"Usage of basic authentication should be avoided"

\[ap\] this is 2 separate topics (password is about attributes, basic auth is about client auth)

\[ph\] - better to refer to what is in the spec (which is very clear but not normative) - if people feel strongly we could ban basic auth, but getting rid of passwords is tougher, there is still LDAP out there. Hearing that people want credential schema to have more sophistication, want password management procedures/properties standardized (last login etc)

## SCIM Events update

\[ph\] WG had a call for adoption of the events draft (thanks nc), still needs more co-authors

* means ph will be talking more with authors, if anyone wants to co-author let him know
* couldn't get confirmation but have heard from a few people

\[ncw\] - since I am now an author I will relinquish chair role to ap for this draft

\[mk\] - how do I participate?

\[ncw\] - send a mail to Phil

\[ph\] - minor changes to make before you see anything new

## Coordination: Phil Hunt

\[ph\] - I have seen a lot of duplication of effort and was under the impression a decision was made, but the need is there for more discussion. Things like change detection etc could be covered by events draft as a polling technique, but what are we trying to solve?

* want to understand full set of requirements
* has been pointed out that there are broader security threats, signalling, scale is much bigger than LDAP
* now have multiple administrative domains
* (refer to slides)
* Sometimes it is complex to choose 'master' configuration, independent operation is more important

### Cursor paging

\[ph\] max results is actually optional.
Polling + reconciliation can work, decisions made on what domain A results mean to domain B

* domains independently operate today (didn't happen in the past)
* bi-directional authority can exist

### Event Paging

\[ph\] - a SET is sent through a transport mechanism (message bus or SET transfer protocol)

* reconciliation happens after every event
* can happen in real-time or could be dispatched (i.e. as digests)
* could be security, account change, or other events causing action (e.g. SOC notification)
* each message is a statement of fact, up to domains to decide what to do

### Bootstrap/Recovery {#bootstraprecovery}

\[ph\] no draft addresses this today, could be JSON export/import

* is a special case problem solved in different ways
* if replication and message bus is used, could be built-in recovery options
* describing an actual solution for this could be out of scope for the spec?

### Controller Coordinated Provisioning {#controller-coordinated-provisioning}

\[ph\] controlling domain reconciles changes between domains

### Internal Domain Replication/Coordination {#internal-domain-replicationcoordination}

\[ph\] - in the past we didn't standardize replication, but that isn't true today

* if you use multiple cloud providers, internal domain replication would be a challenge
* if you are in the same security domain, schema will be the same
* cross-domain the receiving domain might be interested but no way to tell which bits

### Bi-directional Co-ordination {#bi-directional-co-ordination}

\[ph\] - concept of who is master/slave is weaker and bi-directional coordination is more important.
* might want to think about throwing away master/slave and moving to per-attribute authority

### Coordinated Signaling {#coordinated-signaling}

\[ph\] - might want to know about security things that happen, may want to share that data

* scenario: new authentication factor added - want to be sure this isn't an attacker adding the factor, abusing it and removing it

### Comparison {#comparison}

(big comparison chart, please refer to meeting materials)

* compares cursor paging vs events

\[ph\] prefer to see a way where changes don't contain raw data, just a notice that an identified resource has changed so that interested parties know to go look

* async signal such as 'bulk request has completed' might have value
* paging protocol is simple to specify but harder to implement
* Some DBs can't support cursor paging, maybe issue with thrashing, memory
* Phil thinks you can avoid paging by doing a GET
* but can perform the operation rarely

\[dz\] - Spoke with Matt Peterson on this topic prior to meeting (Matt not present)

* one of the issues in the events draft - once the signal is sent and receiver has responded with 200 OK, the transmitter may purge the message
* in cursor paging, the request can be replayed. This helps with small downtime
* As a SCIM SP, implementing extra attributes to support cursor pagination is still easier than shared signals processing
* paraphrasing Matt - there are existing DBs that natively do cursor-based pagination and store in memory in order to implement the index pagination

\[ph\] - in the idtoken WG, this was discussed; the issue for SPs was the sheer # of events flowing out, storing over time becomes problematic. Guaranteed transfer with ack means that responsibility shifts to receiver. This is permission to forget given to transmitter. But nothing says the transmitter can't keep the event indefinitely

* if we mean surviving an hour that is one thing but surviving a month is something else.
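
The replay property \[dz\] mentions for cursor paging can be illustrated with a small sketch. It is not taken from any draft: the `FlakyProvider` class, the `nextCursor` field name and the offset-as-cursor encoding are assumptions made for illustration (a real cursor would be opaque to the client). The client keeps the last cursor it received and re-issues the same request after a simulated outage, so a short downtime loses nothing.

```python
from typing import List, Optional, Set


class FlakyProvider:
    """Toy service provider that fails once for selected cursors."""

    def __init__(self, ids: List[str], fail_once_on: Set[str]) -> None:
        self._ids = ids
        self._fail_once_on = set(fail_once_on)

    def fetch_page(self, cursor: Optional[str], count: int) -> dict:
        if cursor in self._fail_once_on:
            self._fail_once_on.discard(cursor)
            raise ConnectionError("simulated outage")
        # Assumption: the cursor is just an offset encoded as a string.
        start = int(cursor) if cursor is not None else 0
        page = self._ids[start:start + count]
        response = {"Resources": page}
        end = start + len(page)
        if end < len(self._ids):
            response["nextCursor"] = str(end)  # hypothetical field name
        return response


def traverse(provider: FlakyProvider, count: int, max_retries: int = 3) -> List[str]:
    """Walk all pages, replaying a failed request with the same cursor."""
    collected: List[str] = []
    cursor: Optional[str] = None
    while True:
        for _ in range(max_retries):
            try:
                page = provider.fetch_page(cursor, count)
                break
            except ConnectionError:
                continue  # replay: same cursor, same request
        else:
            raise RuntimeError("provider unavailable")
        collected.extend(page["Resources"])
        cursor = page.get("nextCursor")
        if cursor is None:
            return collected


provider = FlakyProvider([str(i) for i in range(10)], fail_once_on={"4"})
result = traverse(provider, count=4)
```

How long a provider must keep a cursor valid is the same "an hour versus a month" question \[ph\] raises for events; the sketch assumes the cursor never expires.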
\[ncw\] We want to trigger discussion of use cases and requirements. We want discussion about adoption of drafts. We want to get alignment about use cases that are driving requirements. Please jump into discussion!

### Pagination Use Case Discussion {#pagination-use-case-discussion}

\[dz\] Use case behind pagination - full transparency, I work with a centralized SCIM client that is authoritative and pushing data to many SCIM SPs

* pagination is needed for both push/pull
* when you want to send the data you have and learn other data from other systems, and there are millions of results, cursor pagination PLUS some kind of delta query might be needed. First get an incremental set and then break them up. It is similar to how LDAP does it, right?

\[ph\] - cn=changelog was an example of how not to do things. It was very hard to implement good security, esp with multi-tenancy it was tough, had confidential data in it. Don't want to repeat LDAP, we want to improve upon it

\[dz\] my understanding is there are other REST APIs where a delta query mechanism does exist. Not trying to re-invent, just to look for existing patterns

* scope of events draft compared to cursor-based pagination -- maybe an event flow could be used. Putting SCIM payloads into events - other examples for events were high priority data that could not wait

\[ncw\] Message from chat (from Wenting Tang) "I think cursor pagination and events both have its unique use cases. In the area of the data sync, I felt we need both even though the changing event publish should be the primary mechanism to sync the data efficiently."

QUESTION to implementers and users of wg: Phil has articulated requirements, Danny has said how it is today. Are you willing to absorb the overhead of supporting/implementing/deploying both mechanisms for certain use cases?

\[dz\] Speaking as an implementer - both are needed.
A wholesale replication feature to move all changes vs high priority ones, cannot say we would never implement at that level but suspect there is strong preference to implement in some use cases. But at scale of wholesale support of all events going through a polling model, having both polling and a listener might be a problem.

* when a hundred little cloud platform containers are running, those containers die all the time. Recreating a request is likely to be important

\[ph\] - there are 2 other mechanisms - RFC 7232 & ETags, i.e. adding a pre-condition of "get this resource if it has changed". "This condition only applies if the resource hasn't changed underneath me"

* other mechanism is meta.lastModified. Might be more crude than what others might want but

\[dz\] - hesitance to lean on last-modified because times are tough to be precise over with distributed systems

* ETags are a resource-level hash so doesn't help with change detection on large datasets

\[ph\] - yes it only works for specific resources

\[pd\] - Need to go forward with this. Proposal: Create a survey about both mechanisms and get feedback.

\[ncw\] - ACTION: Please post request for survey in the wg mailing list and let implementers reply.

\[ph\] if you go with paging, nobody will use events. Cursors are for subsets of data, otherwise you would have to lock all the rows in your DB. Could be a denial of service problem in cases where hackers attack. If we choose multiple methods we risk that half the community does one thing and half does the other. Would rather have one method than both.

\[ncw\] which is why I asked about whether implementers can absorb overhead

\[dz\] I think SET and SCIM would still be adopted because of the urgency differentiation. Agree that cursor-based pagination would be strongly adopted and implementers may not want shared signals for the same use case as pagination + delta query.
HR provider problem of urgent termination would be a scenario that encourages (and needs) events for SP to alert client.

\[Wenting Tang wt, Okta\] we are doing many integrations with HR systems; most implementations start with search, which uses pagination, but later on add use cases with real-time notifications. Not sure whether as a standard we want to force people into one option or to give people choices - there could be pub/sub systems or HR systems that provide information too.

\[ncw\] - other suggestion to Phil/Danny/Matt - comparison chart that Phil started is a good way to provide succinct data, encourage to complete that and bring back for discussion.

\[wt\] another use case - full export/import to verify issues in new channels, for example when org chart changes, may need to go back to a full SCIM export to check

\[dz\] capability provided by events draft could be a potential optimization to pagination rather than a replacement (part of a conversation with Matt)

### Chairs update {#chairs-update}

\[ncw\] we are behind on our milestones, need to update

* Aaron and Nancy will come back and get dates from editors for Use case doc
* May put questions to mailing list on deferring publication or turning use cases into living doc
* Can decide on what extensions are needed but we need drafts to be adopted before they can evolve
* Chairs looking for at least 3 individuals who are not authors to provide constructive feedback
  * Should this draft be considered to be in scope
  * Can it serve as a seed or starting point for the topic
* Will try to schedule an interim in September, will put out a doodle poll to get a date
* Hope to see us at IETF 115

### end {#end}