SCIM - IETF 114

Friday, July 29, 2022 12:30pm

Chairs: Aaron Parecki \[ap\], Nancy Cam-Winget \[ncw\]  
Notes Takers: Pamela Dingle \[pd\], Joris Baum \[jb\]

## Agenda   {#agenda}

*   5 min - Agenda Bash & Logistics: chairs
*   5 min - Use Cases document update : Pamela Dingle
*   30 min - Janelle Allen and Danny Zollner
    *   soft delete
    *   account status
    *   HR schema action
    *   roles and entitlements
    *   change detection/delta query

*   5 min - SCIM Events update : Phil Hunt
    *   https://datatracker.ietf.org/doc/draft-ietf-scim-events/
*   15 min - Coordination : Phil Hunt
*   40 min - Pagination : Janelle Allen and Danny Zollner
*   20 min - AOB

\[ncw - Nancy Cam Winget\] welcome and housekeeping, make sure you are
joined to the meeting, masks on unless presenting at the front, code of
conduct, note well  
\[ncw\] - Joris and Pam are scribes (yeah!)

*   packed agenda - instead of reviewing drafts, chair has asked for
    brief updates
*   discussion topics are where we are putting our time, threads have
    occurred about coordination, implementation, performance, pagination
*   agenda in Meetecho (and in this document) is correct but slides
    aren't updated, so use the meetecho agenda
*   call for comments/updates? \[no answer\]

## Use Cases   {#use-cases}

Current draft Link: https://hackmd.io/2rpAFJnVSkyPbnO9Bm-qPQ

Pam Dingle \[pd\]:

There is an actual document.

No discussion today, because nobody had a chance to review.

Rationale: Define terminology that was not defined in the SCIM
specifications themselves (e.g. SCIM service provider, SCIM schema, etc.).
There are also new use cases relevant for SCIM.

Call for review and feedback. Feel free to add changes to the HackMD in
the next few hours.

Danny Zollner \[dz\]: The first version will be published to Datatracker
by Monday at the latest.

## SCIM Schemas and Protocol Topics   {#scim-schemas-and-protocol-topics}

\[dz\] Janelle couldn't be here, but we are the editors for changes to the
SCIM schema/protocol and are shepherding extensions  
\[dz\] Add new extensions to the protocol

*   some ideas have drafts, some don't; if you want to get involved let
    us know

### Cursor-based pagination   {#cursor-based-pagination}

\[dz\] - an expired version exists, high level: improve massive scale
operations when a client needs to traverse large results

*   Protocol defines cursor-based pagination (but that isn't the only
    option)
*   Lots of discussion on the list on this topic
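
A minimal sketch of the client-side traversal that cursor-based pagination implies, assuming an opaque cursor token that the server returns until the result set is exhausted. The in-memory `RESOURCES` list, the `list_page` helper, and the `nextCursor` and `count` names are illustrative assumptions here, not text taken from the draft.

```python
from typing import Any, Dict, Iterator, List, Optional

# Hypothetical in-memory stand-in for a service provider's resource collection.
RESOURCES: List[Dict[str, str]] = [
    {"id": str(i), "userName": f"user{i}"} for i in range(7)
]


def list_page(cursor: Optional[str], count: int) -> Dict[str, Any]:
    """Server side: return one page. The cursor is opaque to the client
    (in this sketch it is simply an offset encoded as a string)."""
    start = int(cursor) if cursor else 0
    page = RESOURCES[start:start + count]
    end = start + len(page)
    response: Dict[str, Any] = {"Resources": page}
    if end < len(RESOURCES):
        response["nextCursor"] = str(end)  # omitted on the last page
    return response


def iterate_all(count: int) -> Iterator[Dict[str, str]]:
    """Client side: keep following nextCursor until none is returned."""
    cursor: Optional[str] = None
    while True:
        response = list_page(cursor, count)
        yield from response["Resources"]
        cursor = response.get("nextCursor")
        if cursor is None:
            break


all_ids = [resource["id"] for resource in iterate_all(count=3)]
```

Because the client only ever echoes back the token it was given, the server is free to choose how the cursor is represented internally.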

### Multi-values attribute pagination   {#multi-values-attribute-pagination}

\[dz\] Phil Hunt wrote a draft.

Phil Hunt \[ph\]:

*   extending multi-valued attributes and telling which pages you want
*   a case of the group saying whether they like it (or not)
*   complexities: knowing how many rows there are
*   clarifying: this is paging of attribute values, not paging of
    resources

\[dz\] - for example if you have a group of a million members this lets
you break up the request into bite-size pieces

*   Matt Peterson (not present) has proposed a different approach to not
    need multi-value attribute pagination - proposal is to create new
    top-level SCIM resources that can be returned as results

\[ph\] - there already is an attribute under the user profile called
"group" (discussion on what Matt's proposal is for)

\[ncw\] should postpone discussion on pagination topic

### Soft Deletion   {#soft-deletion}

\[dz\] - adds a new query parameter & response attribute, expired draft 

\[ncw\] - asking Danny to pause to fix audio issues  
\[ph\] - today the SCIM protocol says once you delete a resource the
response should be a 404. Adding a parameter changes the behavior to
return an object even though it is deleted

*   need something to override the behavior
*   question - is that a security issue? Should only some clients be
    able to execute that command?
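
A minimal sketch of the behavior change described above, assuming an in-memory store and a boolean flag standing in for the `softDeleted=true` query parameter. The function names and status-code choices are hypothetical. What a DELETE on an already soft-deleted resource should do is left open in the discussion, so the sketch arbitrarily answers 404 in that case.

```python
from typing import Dict, Optional, Tuple

# Hypothetical store: resource id -> (resource, soft_deleted flag).
STORE: Dict[str, Tuple[Dict[str, str], bool]] = {}


def create(resource_id: str, user_name: str) -> None:
    STORE[resource_id] = ({"id": resource_id, "userName": user_name}, False)


def delete(resource_id: str) -> int:
    """Soft delete: keep the record but mark it as deleted."""
    entry = STORE.get(resource_id)
    if entry is None or entry[1]:
        return 404  # arbitrary choice for this sketch, not defined by any spec
    STORE[resource_id] = (entry[0], True)
    return 204


def get(resource_id: str,
        soft_deleted: bool = False) -> Tuple[int, Optional[Dict[str, str]]]:
    """Default behavior hides deleted resources behind a 404; passing the
    flag overrides that and returns the retained object."""
    entry = STORE.get(resource_id)
    if entry is None:
        return 404, None
    resource, deleted = entry
    if deleted and not soft_deleted:
        return 404, None
    return 200, resource


create("42", "bjensen")
status_before = get("42")[0]
delete_status = delete("42")
status_after = get("42")[0]
status_override, restored_view = get("42", soft_deleted=True)
```

The security question raised above maps onto who is allowed to pass the override flag; the sketch does no authorization at all.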

question from Darrel Miller \[dm\]: What is the expectation of DELETE on
a resource with the softDeleted=true parameter? Is it no-op or a hard
delete?

\[ph\] - that is something a spec would define, this is a change to the
spec  
\[dz\] - I believe if you do   
\[pm\] - The client might not be able to hard-delete it?  
\[ncw\] - ???  
\[dz\] - whether or not the SCIM server or client is capable of soft
delete should be represented inside ServiceProvider Config  
\[Michael Prorock - mp\] - stating clear need for soft delete for use
cases where there is an overlap of those that can or can't do soft
delete  
\[ncw\] - as chair - there needs to be clear delineation  
\[ph\] - it is always true that today a SCIM server can soft delete if
it wants to. The issue is why the client needs to request a soft
delete or not. Currently the broad use case is resurrection, so that
if a resource is deleted & returned there is referential integrity  
\[ap\] Clarification: is it the ability to query for soft deleted records?

 \[dz\] the draft is defining a way to get info from the service provider  
 \[pm\] question (if IDaaS platform) they could be more pres  
 \[ncw\] ACTION: take back to use cases and requirements document  
 \[Conner cr\] - the status field can be used for knowing that an entry
is soft deleted so question is what is new/different  
 \[ph\] it might be more about using the right identifier if the user is
restored   
 \[dz\] - any more comments/questions?

### Soft Deletion - Proposed additions   {#soft-deletion---proposed-additions}

\[dz\] - there are suggested additions, not sure this is a right fit,
folks should review

### Schema: Roles and Entitlements   {#schema-roles-and-entitlements}

\[dz\] - There is a draft that just expired, URL wrong on slide

*   adds 2 new resource types to schema - /roles and /entitlements
*   SCIM client can find accepted entitlements & roles for the resource.
    In apps linked to a SCIM server (esp SaaS) the customer can
    customize their roles, but today the SCIM client cannot discover those
    customizations in a standardized way. This creates an OOB management
    problem, and failed requests.
*   there is an example on the slide
*   new version: -02. Please check it out; there is a new hierarchical
    representation to show how the structure of roles and licenses works
    in the application. Also attributes to represent availability (for
    example when entitlements are licenses with a specific number of seats).

### Topics without drafts   {#topics-without-drafts}

\[dz\] - for topics with drafts, we want to pick up and discuss
whether or not to adopt the drafts that exist so we can either write
new or adopt in the next 5 weeks

*   the following topics align with today's charter but have no documents

### Topics without drafts - Change detection/Delta Query   {#topics-without-drafts---change-detectiondelta-query}

*   time drift is an issue with time-based queries
*   good to have a way to accurately get all changes since last query

\[mp\] - looking at bulk update nature, change detection is very useful

*   change since last update highly desirable

### Topics without drafts - Human Resources Schema   {#topics-without-drafts---human-resources-schema}

\[dz\] - given the data relationship that HCM has with other services,
there is a desire to help HCM providers label their attributes in a common
way. No draft exists; the critical need is for actual HCM providers to be
part of the schema development

### Topics without drafts - Account Status Context   {#topics-without-drafts---account-status-context}

\[dz\] - lines up with soft deletion topic

*   today you only know whether active is true/false. Proposal for a
    complex attribute with more information (pre-hire? on leave?
    terminated?)
*   goals to align states with shared signals  
    ...

### Topics without drafts - Securing Reference URL access   {#topics-without-drafts---securing-reference-url-access}

\[dz\] - improving the protocol for reference URLs

*   key example is photos - right now there is a complex attribute with
    a reference property
*   problem with things in URL format (i.e. avatar picture) - if I am a
    cloud IDP and I am communicating all the URLs for the whole company,
    the client/server needs to actively go back and ask for the URLs. What
    is the expectation for storing the URL over time?
*   If a URL is provided, how can it be ensured that only the intended SCIM
    participant can access the URL? Goal originally was to leave it up to
    implementers, but

\[ph\] - my understanding is that a reference URL is just a URL. Actually
fetching it opens a can of worms; the spec is silent for a reason. Maybe
somebody wants to add value but that is new territory  
 \[dz\] - agreed this is a can of worms, but a consensus is needed. the
number of collaboration apps that exist out there - partners want to
consume this information but the barrier is high. We need a scalable way
to do this, for purposes of interop  
 \[os - Orie Steele\] - have seen versions of this problem solved

*   trusted central proxy unwraps image links and serves stable files
*   sometimes network goes down and URL references can't resolve, in
    this case embedding data is an option
*   Trusted highly available proxy is also an option (this has issues)

\[jb\] - my take is even if there is no consensus at least it should be in
the security considerations  
\[dz\] - a lot of potential solutions but would like to see a group of
people find a good scalable solution   
\[ncw\] Phil comments that it could fit well in a "best practice
document"

### Topics without drafts - BCP: Modern Security Profile   {#topics-without-drafts---bcp-modern-security-profile}

\[dz\] - interested in typing up aspects of the standard wrt security.
Published in 2015 - but the world has changed.

*   Examples of things that we profiled include dropping support for
    basic auth, and dropping the password attribute from the user resource
    (not a never, just a not-recommended)
    *   Common scim use cases of cross-domain identity exchange should
        not be a place for passwords to move.
    *   Adding to Joris' point earlier, reference URL security could be
        another option
    *   Hoping to find other co-authors

\[ph\] - it may already be in the security considerations, but there are
2 scenarios.   
 1 Authenticating as a client to make SCIM calls (using a header)  
 2 There are valid cases to work with passwords, if they use scim server
as a directory this could be part of a whole MFA strategy, so it might
be harder to remove schema than to augment considerations  
"Usage of basic authentication should be avoided"

\[ap\] this is 2 separate topics (password is about attributes, basic
auth is about client auth)  
 \[ph\] - better to refer to what is in the spec (which is very clear
but not normative) - if people feel strongly we could ban basic auth,
but getting rid of passwords is tougher, there is still LDAP out there.
Hearing that people want credential schema to have more sophistication,
want password management procedures/properties standardized (last login
etc)

## SCIM Events update   {#scim-events-update}

\[ph\] WG had a call for adoption of the events draft (thanks nc) still
needs more co-authors

*   means ph will be talking more with authors, if anyone wants to
    co-author let him know
*   couldn't get confirmation but have heard from a few people

\[ncw\] - since I am now an author I will relinquish chair role to ap
for this draft  
 \[mk\] - how do I participate?  
 \[ncw\] - send a mail to phil  
 \[ph\] - minor changes to make before you see anything new

## Coordination : Phil Hunt   {#coordination--phil-hunt}

\[ph\] - I have seen a lot of duplication of effort and was under the
impression a decision was made, but the need is there for more discussion.
Things like change detection etc could be covered by the events draft as a
polling technique, but what are we trying to solve?

*   want to understand the full set of requirements
*   has been pointed out that there are broader security threats,
    signalling, scale is much bigger than LDAP
*   now have multiple administrative domains
*   (refer to slides)
*   Sometimes it is complex to choose a 'master' configuration;
    independent operation is more important

### Cursor paging   {#cursor-paging}

\[ph\] max results is actually optional. Polling + reconciliation
can work, decisions made on what domain A's results mean to domain B

*   domains independently operate today (didn't happen in the past)
*   bi-directional authority can exist

### Event Paging   {#event-paging}

\[ph\] - a SET is sent through a transport mechanism (message bus
or SET transfer protocol)

*   reconciliation happens after every event
*   can happen in real-time or could be dispatched (ie as digests)
*   could be security, account change, or other events causing action
    (eg SOC notification)
*   each message is a statement of fact, up to domains to decide what to
    do

### Bootstrap/Recovery   {#bootstraprecovery}

\[ph\] no draft addresses today, could be JSON export/import

*   is a special case problem solved in different ways
*   if replication and message bus is used, could be built-in recovery
    options
*   describing an actual solution for this could be out of scope for the
    spec?

### Controller Coordinated Provisioning   {#controller-coordinated-provisioning}

\[ph\] controlling domain reconciles changes between domains

### Internal Domain Replication/Coordination   {#internal-domain-replicationcoordination}

\[ph\] - in the past we didn't standardize replication, but that isn't true
today

*   if you use multiple cloud providers, internal domain replication
    would be a challenge
*   if you are in the same security domain, schema will be the same
*   cross-domain the receiving domain might be interested but no way to
    tell which bits

### Bi-directional Co-ordination   {#bi-directional-co-ordination}

\[ph\] - concept of who is master/slave is weaker and bi-directional
coordination is more important.

*   might want to think about throwing away master/slave and moving to
    per-attribute authority
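
A minimal sketch of what per-attribute authority could look like during reconciliation, assuming a static policy table. The domain names `hr` and `it`, the `AUTHORITY` mapping, and the `reconcile` helper are all hypothetical and only illustrate the idea; nothing here is taken from a draft.

```python
from typing import Dict

# Hypothetical policy: which domain is authoritative for each attribute.
AUTHORITY: Dict[str, str] = {
    "title": "hr",
    "displayName": "hr",
    "emails": "it",
}


def reconcile(current: Dict[str, str],
              changes: Dict[str, str],
              source_domain: str) -> Dict[str, str]:
    """Apply only the changes for attributes that the sending domain is
    authoritative for; everything else is ignored."""
    merged = dict(current)
    for attribute, value in changes.items():
        if AUTHORITY.get(attribute) == source_domain:
            merged[attribute] = value
    return merged


current = {"title": "Engineer", "emails": "a@example.com"}
incoming = {"title": "Manager", "emails": "x@example.com"}
after_hr = reconcile(current, incoming, "hr")
after_it = reconcile(current, incoming, "it")
```

Neither domain is the overall master in this model: each one wins only for the attributes assigned to it.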

### Coordinated Signaling   {#coordinated-signaling}

\[ph\] - might want to know about security things that happen, may want
to share that data

*   scenario: new authentication factor added - want to be sure this
    isn't an attacker adding the factor, abusing it and removing it

### Comparison   {#comparison}

(big comparison chart, please refer to meeting materials)

*   compares cursor paging vs events  
    \[ph\] prefer to see a way where changes don't contain raw data, just
    a notice that an identified resource has changed so that interested
    parties know to go look
*   async signal such as 'bulk request has completed' might have value
*   paging protocol is simple to specify but harder to implement
    *   Some DBs can't support cursor paging, maybe issue with
        thrashing, memory
    *   Phil thinks you can avoid paging by doing a GET
        *   but can perform the operation rarely

\[dz\] - Spoke with Matt Peterson on this topic prior to meeting (Matt
not present)

*   one of the issues in the events draft - once the signal is sent and
    the receiver has responded with 200 OK, the transmitter may purge the
    message
*   in cursor paging, the request can be replayed. This helps with small
    downtime
*   As a SCIM SP, implementing extra attributes to support cursor
    pagination is still easier than shared signals processing
*   paraphrasing Matt - there are existing DBs that natively do
    cursor-based pagination and store in memory in order to implement
    the index pagination

\[ph\] - in the idtoken WG, this was discussed; the issue for SPs was the
sheer number of events flowing out, storing them over time becomes
problematic. Guaranteed transfer with ack means that responsibility shifts
to the receiver. This is permission to forget given to the transmitter. But
nothing says the transmitter can't keep the event indefinitely

*   if we mean surviving an hour that is one thing but surviving a month
    is something else.
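
A minimal sketch of the "permission to forget" idea, assuming the transmitter keeps each event until the receiver acknowledges it. The `Transmitter` and `Receiver` classes and the boolean return value standing in for an HTTP acknowledgement are hypothetical; retention limits (an hour versus a month) are not modeled.

```python
from typing import Callable, Dict, List


class Transmitter:
    """Holds events until the receiver acknowledges them."""

    def __init__(self) -> None:
        self.pending: Dict[str, dict] = {}

    def emit(self, jti: str, event: dict) -> None:
        self.pending[jti] = event

    def deliver(self, receiver: Callable[[str, dict], bool]) -> None:
        # Iterate over a copy so entries can be removed while looping.
        for jti, event in list(self.pending.items()):
            if receiver(jti, event):   # True stands in for a successful ack
                del self.pending[jti]  # ack received: permission to forget


class Receiver:
    def __init__(self) -> None:
        self.up = False
        self.seen: List[str] = []

    def accept(self, jti: str, event: dict) -> bool:
        if not self.up:
            return False  # receiver is down, nothing is acknowledged
        self.seen.append(jti)
        return True


tx, rx = Transmitter(), Receiver()
tx.emit("evt-1", {"op": "create"})
tx.emit("evt-2", {"op": "delete"})
tx.deliver(rx.accept)                    # receiver down: both events retained
retained_while_down = sorted(tx.pending)
rx.up = True
tx.deliver(rx.accept)                    # retry succeeds, transmitter may purge
```

Once the acknowledgement is given, recovery from later receiver-side loss is the receiver's problem, which is the trade-off being debated against replayable paged requests.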

\[ncw\] We want to trigger discussion of use cases and requirements. We
want discussion about adoption of drafts. We want to get alignment
about use cases that are driving requirements. Please jump into
discussion!

### Pagination Use Case Discussion   {#pagination-use-case-discussion}

\[dz\] Use case behind pagination - full transparency, I work with a
centralized SCIM client that is authoritative and pushing data to many
SCIM SPs

*   pagination is needed for both push/pull
*   when you want to send the data you have and learn other data from
    other systems, and there are millions of results, cursor pagination
    PLUS some kind of delta query might be needed. First get an
    incremental set and then break them up. It is similar to how LDAP
    does it, right?  
    \[ph\] - cn=changelog was an example of how not to do things. It was
    very hard to implement good security, esp with multi-tenancy it was
    tough, had confidential data in it. Don't want to repeat LDAP, we
    want to improve upon it  
    \[dz\] my understanding is there are other REST APIs where a delta
    query mechanism does exist. Not trying to re-invent, just to look for
    existing patterns
*   scope of events draft compared to cursor-based pagination -- maybe
    an event flow could be used. Putting SCIM payloads into events -
    other examples for events was high priority data that could not wait

\[ncw\] Message from chat (from Wenting Tang) "I think cursor pagination
and events both have its unique use cases. In the area of the data sync,
I felt we need both even though the changing event publish should be the
primary mechanism to sync the data efficiently."

QUESTION to implementers and users in the WG: Phil has articulated
requirements, Danny has said how it is today. Are you willing to absorb
the overhead of supporting/implementing/deploying both
mechanisms for certain use cases?

\[dz\] Speaking as an implementer - both are needed. A wholesale
replication feature to move all changes vs high priority ones, cannot
say we would never implement at that level but suspect there is strong
preference to implement in some use cases. But at scale of wholesale
support of all events going through a polling model, having both polling
and a listener might be a problem.

*   when a hundred little cloud platform containers are running, those
    containers die all the time. recreating a request is likely to be
    important  
    \[ph\] - there are 2 other mechanisms - RFC 7232 & ETags, i.e. adding
    a pre-condition of "get this resource if it has changed" or "this
    condition only applies if the resource hasn't changed underneath me"
*   other mechanism is meta.lastModified. Might be more crude than what
    others might want but  
    \[dz\] - hesitance to lean on last-modified because timestamps are
    hard to keep precise across distributed systems
    *   ETags are a resource-level hash so doesn't help with change
        detection on large datasets

\[ph\] - yes it only works for specific resources
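
A minimal sketch of the ETag precondition mentioned above, in the style of an `If-None-Match` conditional GET. Deriving the tag from a hash of the serialized resource and the `conditional_get` helper are assumptions for illustration; a server can compute its version tag however it likes.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def etag_for(resource: Dict[str, object]) -> str:
    """Hypothetical version tag: a hash over the serialized resource."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(
    resource: Dict[str, object],
    if_none_match: Optional[str] = None,
) -> Tuple[int, str, Optional[Dict[str, object]]]:
    """Return 304 with no body when the client's tag still matches,
    otherwise 200 with the current representation."""
    current = etag_for(resource)
    if if_none_match == current:
        return 304, current, None
    return 200, current, resource


user: Dict[str, object] = {"id": "42", "userName": "bjensen", "active": True}
first_status, tag, _ = conditional_get(user)
unchanged_status = conditional_get(user, tag)[0]
user["active"] = False
changed_status = conditional_get(user, tag)[0]
```

The check is per resource, so a client still has to ask about each resource individually, which is why it does not help with change detection across a large dataset.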

\[pd\] - Need to go forward with this. Proposal: Create a survey about
both mechanisms and get feedback.

\[ncw\] - ACTION: Please post request for survey in the wg mailing list
and let implementers reply.

\[ph\] if you go with paging, nobody will use events. Cursors are for
subsets of data, otherwise you would have to lock all the rows in your
DB. Could be a denial of service problem in cases where hackers attack.
If we choose multiple methods we risk that half the community does one
thing half does the other. Would rather have one method than both.  
\[ncw\] which is why I asked about whether implementers can absorb
overhead  
\[dz\] I think SET and SCIM would still be adopted because of the
urgency differentiation. Agree that cursor based pagination would be
strongly adopted and implementers may not want shared signals for the
same use case as pagination + delta query. HR provider problem of urgent
termination would be a scenario that encourages (and needs) events for SP
to alert client.  
\[Wenting Tang wt, Okta\] we are doing many integrations with HR systems;
most implementations start with search, which uses pagination, but later
add use cases with real-time notifications. Not sure whether, as a
standard, we want to force people into one option or give people choices -
there could be pub/sub systems or HR systems that provide information too.  
\[ncw\] - other suggestion to Phil/Danny/Matt - comparison chart that
Phil started is a good way to provide succinct data, encourage to
complete that and bring back for discussion.  
\[wt\] another use case - full export import to verify issues in new
channels, for example when org chart changes, may need to go back to a
full SCIM export to check  
\[dz\] capability provided by events draft could be a potential
optimization to pagination rather than a replacement (part of a
conversation with Matt)

### Chairs update   {#chairs-update}

\[ncw\] we are behind on our milestones, need to update

*   Aaron and Nancy will come back and get dates from editors for Use
    case doc
*   May put questions to mailing list on deferring publication or
    turning use cases into living doc
*   Can decide on what extensions are needed but we need drafts to be
    adopted before they can evolve
*   Chairs looking for at least 3 individuals who are not authors to
    provide constructive feedback
    *   Should this draft be considered to be in scope
    *   Can it serve as a seed or starting point for the topic

*   Will try to schedule an interim in September, will put out a doodle
    poll to get a date
*   Hope to see us at IETF 115

### end   {#end}