🙋 allow nodes to authorize each other's users within the network #8
Comments
It would be possible to have a dedicated federation node with Magpie and Twitcher running with the various services to support. This does not need any modification to the current Magpie/Twitcher code. However, this also implies that all services shared across all the nodes must be defined on that federated node. Users should also be created on that federated node. Given that, any authentication endpoint on the local node should be redirected to the federated node as well. For the token validation portion of a user, if this is still needed, Twitcher also has a |
Thanks @fmigneault, the info about the twitcher endpoint is very useful. The idea about using nginx to delegate the authentication to other nodes is a good one too and worth considering. It sounds like you're describing a centralized node that's used by all other nodes for authentication. I'd really like to avoid that if possible since it creates a single point of failure for the network. I'd like to keep the current status quo of having each node fully responsible for the data/services it provides, including authentication of its users and authorization of its services. |
@mishaschwartz Another option that could be possible is synchronizing user permissions between nodes. Another option would be to use [Twitcher's OAuth2 Tokens](https://twitcher.readthedocs.io/en/latest/api.html#module-twitcher.oauth2), but those are purely for Twitcher-controlled access, without any additional capabilities from Magpie related to users. Yet another approach is to use the |
Only question I have is regarding permission groups. Do those need to be communicated across nodes? |
@huard No they wouldn't need to be communicated. The idea is that nodes need to be able to communicate to authenticate users, but then all authorization is internal to a specific node. For example, a user might be authenticated by Node A, but be in different permission groups in Node A and Node B since they have different authorization profiles in each. I don't think that we would want a change in some permission group in Node A to affect Node B since that would take some of the authorization control away from the node administrator of Node B. Does that make sense? |
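The split described above (authentication shared between nodes, authorization kept local) can be sketched roughly as follows. This is only an illustration, not Magpie code: the `Node` class, the `(home_node, username)` identity tuple, and the group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that keeps its own, purely local authorization data."""
    name: str
    # Maps an authenticated identity (home_node, username) to local group names.
    groups: dict = field(default_factory=dict)

    def add_to_group(self, identity, group):
        self.groups.setdefault(identity, set()).add(group)

    def groups_for(self, identity):
        # Only this node's own records are consulted; no other node is queried.
        return set(self.groups.get(identity, set()))


identity = ("node-a", "alice")  # a user authenticated by Node A

node_a = Node("node-a")
node_b = Node("node-b")
node_a.add_to_group(identity, "staff")
node_b.add_to_group(identity, "guests")

# A membership change on Node A leaves Node B's view of the user untouched.
node_a.add_to_group(identity, "admins")
```

The same identity resolves to different groups depending on which node is asked, so the administrator of Node B keeps full control over what that user can do there.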
I'm thinking that the best solution here might be to implement a token system in Magpie that is similar to what we have in twitcher but is aware of users. Like @fmigneault describes here:
We could use a similar mechanism as described here to implement this, but allow passing token values in a header as well as Authorization keys:
I think that would allow us to make all changes within Magpie and the changes would only be additive so they shouldn't break any backwards compatibility. We could even make the token passing mechanism optional for Magpie (required for a Marble implementation though) and turned off by default so that current users of Magpie for other applications would see no difference. |
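As a rough sketch of what "additive and off by default" could look like, assuming a made-up header name and plain dictionaries in place of Magpie's real session and token storage (none of these names come from Magpie or Twitcher):

```python
TOKEN_HEADER = "X-Network-Token"  # hypothetical header name, not an existing Magpie header


def resolve_user(headers, cookies, sessions, tokens, tokens_enabled=False):
    """Return the user name for a request, or None if it is unauthenticated.

    `sessions` maps cookie values to users and `tokens` maps token values to
    users; both are stand-ins for whatever storage a real implementation uses.
    """
    # Existing behaviour: the cookie-based lookup is always tried first.
    user = sessions.get(cookies.get("session"))
    if user is not None:
        return user
    # New, optional behaviour: only consulted when explicitly enabled.
    if not tokens_enabled:
        return None
    return tokens.get(headers.get(TOKEN_HEADER))


sessions = {"cookie-123": "alice"}
tokens = {"token-abc": "bob"}
token_request = {TOKEN_HEADER: "token-abc"}

default_result = resolve_user(token_request, {}, sessions, tokens)
enabled_result = resolve_user(token_request, {}, sessions, tokens, tokens_enabled=True)
cookie_result = resolve_user({}, {"session": "cookie-123"}, sessions, tokens)
```

With the flag left at its default, a request carrying only a token is treated exactly as it is today, which is the backwards-compatibility property mentioned above.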
Two quick questions:
|
Yes, mostly... the idea is more that if you have an account on Ouranos, you don't need an account on UofT as well to access some resources on UofT. The node admin at UofT can decide something like: "tlogan2000 from the Ouranos node can have access to resource A, B, and C at UofT". However, to do anything browser-based (that would require setting a cookie), you would need an account at both UofT and Ouranos. The use-case that I'm imagining here is more "I have logged in to jupyterlab at Ouranos and I'd like to run a weaver workflow on the UofT node".
This is an interesting idea. I don't love the idea of automatically syncing data between user workspaces across the network because that would involve a lot of data transfer between the nodes that may not be necessary. I wouldn't be opposed to a feature that would allow a user to selectively transfer data between user workspaces in a more transparent way. Though I think we'd want to implement that feature as a second step: first give users access to multiple nodes, then allow data syncing between nodes. I'd be interested to hear other people's opinions on this as well.
No, each node would not be aware of each other's groups. So even if both nodes have a "protected" group, neither node knows anything about the other one. The only information that nodes will share is that a specific user is authenticated on another node, not anything about that user's permissions or group memberships on the other node. |
Maybe instead of syncing data between Jupyter instances, which could be quite big and therefore a waste of disk space and bandwidth if there are many instances in our Marble federation, how about sharing those data like Ouranos currently does with For the user sharing between different Magpie instances currently discussed here, does that require a code change or just additional config? |
It would require a code change |
@mishaschwartz Sorry for being unclear, but in my mind this would not be an 'auto-sync' but indeed something that the user would do selectively. Users on the Ouranos node can and do generate a reasonable amount of output (.nc files, etc.), and I agree it is likely not a great option to automatically transfer everything that resides in the user workspace |
Behind the scenes, regardless of the approach, a pseudo "tlogan2000-ouranos" (could be any name, a UUID, etc.) would have to be created as a Magpie "user" to resolve permission access. Any request (browser-based or not) requires some kind of "user" to resolve permissions against. Even when validating a group permission, it is the user's membership in given groups that is resolved for access rights. The algorithm cannot grant access if it has nothing to resolve against (i.e., the "user"). The Cookie, Basic Auth, Bearer token, etc. are only the methods to indicate who that "user" is, but are essentially equivalent after the identity (authentication) was resolved. The authorization part requires a "user". Note that this "user" could simply be a concept, a bot, or whatever else, not necessarily an actual person with a profile. Therefore, there would always be at least one "user" account for each node.
Reading from a protected HTTP endpoint per user is something being worked on by @ChaamC. It would be supported by Cowbird granted that https://github.com/bird-house/birdhouse-deploy/tree/master/birdhouse/optional-components/secure-data-proxy is properly configured (relates to bird-house/birdhouse-deploy#360). However, writing to that location (i.e., pushing data to the other node) is not supported, since WPS outputs are not intended for this purpose. This affects some of the design choices defined in the current implementation, which impacts how to manage HTTP vs FileSystem file/dir-access, which are not that trivial with the multi-services permission synchronization that Cowbird must accomplish. |
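To make the point concrete, here is a minimal sketch of resolving access through a local stand-in "user". It does not reflect Magpie's actual resolution code: the `local_user_name` naming scheme, the `is_allowed` helper, the `ouranos-visitors` group, and the resource names are all assumptions made for illustration.

```python
def local_user_name(username, home_node):
    """Build the name of the local stand-in account (naming scheme is assumed)."""
    return f"{username}-{home_node}"


def is_allowed(user, resource, user_groups, group_permissions):
    """Grant access only if one of the user's groups is permitted on `resource`."""
    if user is None:
        # Nothing to resolve against, so access cannot be granted.
        return False
    return any(
        resource in group_permissions.get(group, set())
        for group in user_groups.get(user, set())
    )


# However the request was authenticated (cookie, token, ...), the result is
# the same local stand-in identity used for authorization.
pseudo_user = local_user_name("tlogan2000", "ouranos")

user_groups = {pseudo_user: {"ouranos-visitors"}}
group_permissions = {"ouranos-visitors": {"resource-A", "resource-B", "resource-C"}}

can_read_a = is_allowed(pseudo_user, "resource-A", user_groups, group_permissions)
can_read_d = is_allowed(pseudo_user, "resource-D", user_groups, group_permissions)
no_user = is_allowed(None, "resource-A", user_groups, group_permissions)
```

The stand-in only needs a name and group memberships; it does not need a profile, which matches the note above that the "user" can be a purely conceptual account.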
Yes, good idea. The idea is to create one "anonymous" user for the network, and one for each node. |
I'm going to start working on a change to Magpie in order to implement the ideas discussed here |
Let me know if you encounter an issue regarding Magpie/Twitcher. |
Topic summary
Problem
Users will typically have an account on one node in the network. If they want to access resources from another node in the network (that aren't publicly available) they currently need to create a second account on the other node and log in there as well.
This creates an additional burden on the user:
This creates an additional burden on node administrators:
Lastly, it leads to duplication of user accounts on the network that can cause some minor issues down the line (e.g. if we decide to have a mechanism to email all network users about technical issues, then some users will receive duplicate emails.)
We should implement a system where nodes can be "remote authenticators" for each other within the network and node administrators can authorize access to resources for users registered elsewhere on the network.
Additional information
Proposed Solution
Allow nodes to authenticate users for each other using tokens and provide authorization options that can be generalized to members of specific nodes or the network in general.
Proposed cross-network authn/z process:
Proposed new authorization groups (in magpie):
Process of authorizing a user on another node:
Required changes:
Most of the above changes should probably be handled by magpie/twitcher but we should discuss the best options.
This topic may be of special interest to @fmigneault @huard. Please feel free to tag others who may want to discuss this as well