Federated Identity Management for Libraries (FIM4L) - Draft Guidelines & Recommendations
0 days left (ends 31 May)
Known as federated authentication, delivering Single Sign On (SSO), this process, if not configured correctly, is at odds with the responsibility of libraries to protect their patrons’ privacy.
In order to preserve patron privacy, while also making the configuration and management of federated SSO connections easier for both libraries and publishers, LIBER’s FIM4L Working Group has drafted 10 Implementation Principles for SSO. The principles drafted by the group are now open for public comment.
Please read our full draft guidelines and share your feedback by 30 April 2020. Your comments will help us create a final set of recommendations which libraries can use to give patrons seamless access while preserving privacy as much as possible. You can comment here or email feedback to firstname.lastname@example.org.
MOST DISCUSSED PARAGRAPHS
Much like with the transient ID, this is a matter of software implementation and configuration. For example, Shibboleth cryptographically derives the pairwise identifier from a value associated with the relying party entity ID and the user’s identity. Because the Shibboleth implementation is a cryptographic derivation, the IdP operator cannot determine the user from the persistent identifier. Logging, again, would be required for a Shibboleth IdP to persist the pairwise identifiers and ensure the ability to translate to a patron if misconduct occurs. If it is desired that
I wrote, "Some IDP implementations allow the IDP to identify who the patron is from the transient ID as well," which is too terse. The transient ID's behavior at the IdP depends on the IdP software and the configuration of that software. Some institutions may require logging that persists the transient NameID and the user's identity. See https://wiki.shibboleth.net/confluence/display/IDP30/AuditLoggingConfiguration which notes "The audit logging feature provides a detailed record of every request and response handled by the IdP to allow tracing of user activity, statistical analysis of usage, legal record keeping, etc." Note that in the section SAML Fields, "NameID value" and "NameID format" are explicitly called out. If a Shibboleth implementation has this logged, the IdP can correlate a transient ID with the user identity. If not, the IdP will not be able to. If it is desired that this correlation occur, it would be valuable to call out as an IdP configuration best practice.
Publishers and suppliers of licensed online resources want to provide authorized users of institutions for higher education and research with access to their services in a controlled way. The commonly used access method based on IP address has limits when users want access from anywhere and any device at any time. With the new solutions, based on federated authentication and Single Sign-On (SSO), whether individual users can be identified, and by which parties, depends on how the connection is configured. As always, libraries want to protect the privacy of their patrons, and give them control over that privacy.
In order to make configuration and management of federated authentication easier for both libraries and publishers, a number of scholarly libraries from around the world have agreed upon the following guidelines to control access to services based on licensed content.
To understand the rest of the text, the following terms are important:
- Publishers are Service Providers (SP)
- Institutions/libraries are Identity Providers (IdP)
SSO Implementation Principles
Principle 1 - The configuration and solution has to be in line with data protection regulations, in particular the General Data Protection Regulation (EU GDPR).
1. Regulation (EU) 2016/679 (General Data Protection Regulation) in the current version of the OJ L 119, 04.05.2016; cor. OJ L 127, 23.5.2018, https://gdpr-info.eu
Principle 2 - For access to services based on licensed content, next to the option of access based on IP addresses, it is recommended to use the SAML 2.0 protocol (or its follow-up technology OIDC/OAuth2 if the involved IdPs are able to handle it) to connect and control access.
Principle 3 - eduGAIN has been established as a proper means to interfederate between identity federations, and thus enables service providers to greatly expand their user base. Thus scholarly libraries should prefer publishers who are connected to eduGAIN. Libraries should encourage publishers to make use of eduGAIN.
Principle 4 - The following lists the recommended options for authentication attributes, ordered by degree of privacy control, with A preserving privacy better than B, and so on:
A - The publisher only requires a transient identifier - "privacy star". During a session the user is identified by a transient identifier (NameID) containing a unique string (for example: bd09168cf0c2e675b2def0ade6f50b7d4bb4aae) for this Service Provider (SP). If the user logs in again, a new transient identifier is generated. This allows for maximum privacy. However, it doesn’t allow the publisher to recognize a returning customer, which makes it impossible, for instance, to know which resources are downloaded by the same user. A profile page for the user therefore makes no sense with this option. It also doesn't allow the library to translate the transient ID to a patron in case of misconduct (e.g. excessive downloads). (In exceptional cases, however, some IdPs and federations can do so through thorough investigation of log files.)
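The behaviour of option A can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular IdP product: the class name `TransientIdIssuer`, the `keep_audit_log` switch and the example entity ID are all hypothetical. It shows why a new identifier appears on every login, and why translating it back to a patron is only possible if the IdP chooses to log the mapping.

```python
import secrets


class TransientIdIssuer:
    """Hypothetical IdP-side helper that issues one-time identifiers."""

    def __init__(self, keep_audit_log: bool = False):
        # Assumption: logging is an explicit, local configuration choice.
        self.keep_audit_log = keep_audit_log
        self.audit_log = {}

    def issue(self, user: str, sp_entity_id: str) -> str:
        # A fresh random value per login; nothing in it is derived from the user.
        transient_id = secrets.token_hex(16)
        if self.keep_audit_log:
            self.audit_log[transient_id] = (user, sp_entity_id)
        return transient_id

    def resolve(self, transient_id: str):
        # Returns None unless the mapping was logged at issue time.
        return self.audit_log.get(transient_id)


issuer = TransientIdIssuer()
first = issuer.issue("alice", "https://sp.example.org")
second = issuer.issue("alice", "https://sp.example.org")
```

Here `first` and `second` differ even though the same user logged in to the same SP, and `issuer.resolve(first)` returns `None` because no log was kept.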
B - The publisher requires a persistent but targeted identifier - "personalisation and subject tracking possible". A persistent identifier (ID) contains a unique string, like the transient one, identifying the user for a specific SP, but persisting over multiple sessions: on every authentication, the same ID is used for the same user. This is an option for services that need to recognize returning customers, for instance so that they can present users with their files, their orders etc. In SAML the Pairwise Subject Identifier is preferred over eduPersonTargetedID (deprecated) and the SAML 2.0 persistent NameID. When opting for a persistent ID, consider the following:
- A persistent ID allows the library (not the publisher) to translate the ID to a patron in case of misconduct.
- It is possible to lock down access for a particular user in case of misconduct.
- A persistent ID (like the Pairwise Subject Identifier, pairwise-id) is sufficient for the SP to provide personalization features. Sometimes an SP requests more information, like a name and email address. Adding personal information like name and email to enrich the user profile should be optional (not mandatory) for the user. Libraries/institutions are advised not to transfer that information during authentication, but to have the SP offer the user a profile page in their service, where users provide consent and can voluntarily provide name, email or other information. Minimize the attribute set provided to the service during the authentication flow.
- Before a service that receives a persistent identifier creates a profile for the user, the user should be asked for permission to store and process personal data, for instance via a button “personalize account”, or at least be informed by a message on data privacy. In no way should the permission request be mandatory or seemingly mandatory for the user; the user must be free to choose whether or not to have a personal profile.
2. This is in line with this argumentation.
3. E.g., "By connecting to this service, I agree that the service provider stores my person related data (ID, affiliation, entitlements sent by my IdP, my IP address sent by my client, and my actions on this platform). Only if I want to receive emails from the service or if I want to be addressed by my name, I will add my email address and name respectively, but this is not needed for any other personalisation features like 'point me to the last document and its last page I read', 'my last searches', <include your personalisation feature here>, etc. Whenever I wish to do so, I may request to see and to have deleted all data stored about me."
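The properties of option B (same ID on every login, different ID per SP, translatable by the library but not by the publisher) can be illustrated with a keyed hash. The sketch below is an assumption-laden example and not the derivation used by any specific IdP software: the function names, the choice of HMAC-SHA256, the `!` separator and the example values are all hypothetical.

```python
import hashlib
import hmac


def pairwise_id(user_id: str, sp_entity_id: str, secret: bytes, scope: str) -> str:
    """Derive a stable, SP-specific identifier in the form uniqueID@scope."""
    message = f"{sp_entity_id}!{user_id}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{digest[:32]}@{scope}"


def find_patron(reported_id, known_users, sp_entity_id, secret, scope):
    """Library-side lookup: recompute the ID for each known user and compare.

    Only a party holding the secret can do this, so the publisher cannot.
    """
    for user_id in known_users:
        candidate = pairwise_id(user_id, sp_entity_id, secret, scope)
        if hmac.compare_digest(candidate, reported_id):
            return user_id
    return None


secret = b"idp-local-secret"  # hypothetical value, kept private by the IdP
id_at_sp1 = pairwise_id("alice", "https://sp1.example.org", secret, "example.edu")
id_at_sp2 = pairwise_id("alice", "https://sp2.example.org", secret, "example.edu")
```

The two identifiers differ, so the two publishers cannot link their records for the same person, while `find_patron(id_at_sp1, ["bob", "alice"], "https://sp1.example.org", secret, "example.edu")` lets the library trace a misconduct report back to a user. An IdP that stores generated identifiers in a database or log instead would reach the same result through a lookup rather than recomputation.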
C - In addition to 4A or 4B the SP can require extra (‘non-identifiable’) information. If more information is needed to allow for billing, access control etc. identity providers can supply one or more of the following attributes (from most to least preferred):
- eduPersonEntitlement, with the specific value urn:mace:dir:entitlement:common-lib-terms
- eduPersonEntitlement, with other values, representing group or role memberships in alignment with AARC Guidelines on expressing group membership and role information
- Usage of schacLocalReportingCode attribute is recommended for statistics purposes once it is well defined.
4. Please note that this attribute is not available in many federations and IdPs, so if the SP would like to receive that attribute, it will take specific communication between SP and IdP and possibly the federation.
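As a rough illustration of option C combined with the advice to minimize the attribute set, an IdP-side release filter might look like the following. The function `release_attributes`, the allowlist and the plain-dictionary user record are hypothetical; real IdP software expresses this in its own attribute filter configuration.

```python
COMMON_LIB_TERMS = "urn:mace:dir:entitlement:common-lib-terms"

# Assumption: only the non-identifying attributes named in option C may be released.
RELEASABLE = ("eduPersonEntitlement", "schacLocalReportingCode")


def release_attributes(user_record: dict, requested: list) -> dict:
    """Return only attributes that are requested, releasable and present."""
    return {
        name: user_record[name]
        for name in RELEASABLE
        if name in requested and name in user_record
    }


record = {
    "displayName": "Alice Example",
    "mail": "alice@example.edu",
    "eduPersonEntitlement": [COMMON_LIB_TERMS],
}
released = release_attributes(record, ["mail", "displayName", "eduPersonEntitlement"])
```

Even though the SP asked for name and email, `released` contains only the entitlement, which is enough for the SP to decide whether the user falls under the licence.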
Principle 5 - SP software should be able to handle more attributes, but not require more attributes. Some publishers state “I need an email address, as my software can’t function without it”. Publishers with (older) systems that require more attributes for authentication to function should adapt their systems ASAP. Libraries are advised to stop using, or not to start using, services that require more personally identifiable information (PII) than a transient or persistent ID during authentication.
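What "handle more, but not require more" could mean on the SP side is sketched below. The function `start_session` and the representation of the received attributes as a plain dictionary with the keys `pairwise-id`, `transient-id` and `mail` are hypothetical simplifications; a real SP would obtain these values from its SAML library.

```python
def start_session(attributes: dict) -> dict:
    """Create a session from whatever the IdP released.

    Only an identifier is mandatory; every other attribute is optional.
    """
    subject = attributes.get("pairwise-id") or attributes.get("transient-id")
    if not subject:
        raise ValueError("no transient or persistent identifier received")
    return {
        "subject": subject,
        # Personalisation needs a stable ID, so it is off for transient logins.
        "personalisation": "pairwise-id" in attributes,
        # Optional: used if present, never demanded.
        "email": attributes.get("mail"),
    }


minimal = start_session({"transient-id": "bd09168cf0c2e675"})
richer = start_session({"pairwise-id": "abc123@example.edu", "mail": "alice@example.edu"})
```

Both calls succeed: the first session works with nothing but a transient ID, and the second simply makes use of the extra attribute without having depended on it.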
Principle 6 - Apart from generally working according to the GDPR, when requesting information from users, for instance in a profile page, publishers have to adhere to the most recent EU “Guidelines on Consent” to make sure that free consent is given in compliance with the GDPR.
5. Guidelines on Consent under Regulation 2016/679, https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=623051
Risks & Concerns
The above recommendations have an impact on certain risks, which we want to make explicit in this section:
● Deanonymization: If you provide a targeted ID, as recommended in Principle 4, Part B above, you have to be aware that other data, already collected by the SP, could be linked to this ID.
● Apart from the fact that under the GDPR pseudonymous IDs (and even IP addresses) are PII, users would normally see a consent or information screen when accessing an SP for the first time and would see what attribute release policy the IdP has opted for. There might be cases where everybody is fine with releasing certain PII. But if possible, give users a choice, for instance by not releasing information during authentication, but by offering a profile page within a service, where an individual can voluntarily share more information.