Ansible operators usage of WATCH_NAMESPACE env variable may lead to unnecessary privilege escalations #5989
Comments
Hi @ivandov, If I properly understand what you are looking for is;
All projects are done using Golang and controller runtime (Indeed Ansible/Helm operators are acctually Golang ones). The options to configure the manager (see https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/manager) do not allow: I want an Operator which is cluster scope (have access to the whole clusters for what is allowed with the RBAC config) but for only the specific Kind( Secret ) I want to limit the access and restrict it to a namespace X. What you are looking for can be done via Predicates (which is nothing more than a condition if condition X return nill to stop the reconciliation). That would mean in the Reconcile you could check if the secret is or not from the namespace where the Operator is installed. Then, if not stop it. However, as you are using Ansible it seems more complicated. You would need to make your playbook be able to check if the secret is not from the same namespace where my operator is running then ignore it. Just to add you might want to also give a look at (but it is for Golang): https://sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering/ |
Hi @ivandov, My understanding of the core problem here is the inability to "scope" down operator permissions to only have the permissions they need to operate. In your particular case that would be that your operator should only have the RBAC permissions to read secrets from namespace X, while having the permissions to create the This is a problem that we have identified in the current state of the operator ecosystem and are currently looking into different solutions. I don't have an ETA on when this functionality may be available (if at all), but there is work being done in this problem space. For now I think the best workaround would be what @camilamacedo86 suggested. |
Thanks @camilamacedo86 and @everettraven. So my team has had experience building both Golang and Ansible operators, as well as forking the This issue was written up - not because of the need for handling this form of logic in the ansible layer - but more about the security exposures of this form of operator. It is significantly broader in scope to make all ansible operators run with cluster permissions for a resource, when they only need to work with that resource in the current namespace scope. From a security perspective, if my operator's service account token leaks... a user would be able to read all secrets from all namespaces! That should not be the case if the functional need of my workload is to read secrets from the same namespace in which the operator is installed. It had seemed to me that the I'm not intimately familiar with the considerations for handling multiple In both cases, these clients would use the same service account token, which would have the properly scoped RBACs set with From an "overly-simplified" perspective, I should be able to hit the kube APIs with the proper cluster or namespace scoped URIs based on the permissions that were granted. In its current state, it should be made extremely clear that ansible operators that need to work with resources that may be considered sensitive, like |
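To make the "cluster or namespace scoped URIs" point concrete, the sketch below builds the REST path for listing a resource at either scope, following the standard Kubernetes API path layout. `ResourcePath` is a hypothetical helper written for this illustration, and the `example.com` group and `foos` plural are assumed names for the Foo CRD. A namespaced Role only authorizes the `/namespaces/<ns>/...` form; listing across all namespaces needs a cluster-wide grant.

```go
package main

import "fmt"

// ResourcePath returns the REST path used to list a resource.
// group is "" for the core API group (e.g. Secrets); namespace is ""
// for a cluster-wide list across all namespaces.
func ResourcePath(group, version, plural, namespace string) string {
	prefix := "/api/" + version
	if group != "" {
		prefix = "/apis/" + group + "/" + version
	}
	if namespace == "" {
		return prefix + "/" + plural
	}
	return prefix + "/namespaces/" + namespace + "/" + plural
}

func main() {
	// Namespace-scoped read of Secrets: what the operator actually needs.
	fmt.Println(ResourcePath("", "v1", "secrets", "operator-ns"))
	// Cluster-wide read of Secrets: what a cluster-scoped client requests.
	fmt.Println(ResourcePath("", "v1", "secrets", ""))
	// Custom resource in another namespace (group and plural are assumed).
	fmt.Println(ResourcePath("example.com", "v1alpha1", "foos", "target-ns"))
}
```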
Hi @ivandov, currently controller-runtime itself does not let us specify that one Kind, or a list of Kinds, should be allowed only from a specific namespace, which is what you are looking for. Therefore, I think we should close this one based on the above explanations. c/c @everettraven |
A limitation in |
I agree with @ivandov that the current limitation in controller-runtime shouldn't warrant closing this issue, but I think this issue would be better suited to the Backlog, since there is work being done in this problem space. @ivandov, if you are interested in reading more about the work and potential solutions we are exploring, I would recommend taking a look at these HackMD documents:
As another part of this, I am also working on a PoC for a new type of Cache that will be dynamic and able to handle RBAC scoping in the caching layer. I think that the problem that you describe is the exact type of problem we are wanting to solve with this new concept of Descoped Operators and RBAC templating/scoping (the problem being Operators not being able to have permissions scoped via RBAC and instead requiring all or nothing access to resources). As I mentioned in my previous comment, I don't have an ETA as to when any of this functionality will be available. I do agree that it would be a good idea to make this limitation clearly stated in the documentation. @ivandov would you be interested in creating a PR to update the docs to mention this? I am happy to keep this issue open for now and use it as a forum to post updates with the work that we have done regarding this problem. I hope this helps! |
Thanks @everettraven - I reviewed the Descoped Operators doc and spoke with our internal OLM Guild lead, who sounds like he's been staying in sync with the work your team is doing. I like the approach the descoped operator pattern will be taking; it makes sense. My only concern is that even with manual definition of RBACs for the Operator's service account, the limitation in Or - are you saying that the PoC caching solution that you are working on will address this somehow? If so, how? Before anything is available in a cache, some kube client will have to query the proper cluster or namespace scoped resource APIs to populate the cache... so, won't I'm out of my depth here, maybe the cache is pre-populated with all resources that are able to be queried from the available RBAC scopes? Sounds like an area where a logic/data-flow disconnect can occur. |
@ivandov You are correct that there are currently some limitations in
The idea is that the PoC caching solution that I am currently working on will address this. As for the how, That's all the details I'll share here for now, but if you are interested in looking into it some more here are the GitHub repositories I have been working on to develop and test this PoC:
One thing I want to note is that this is an early stage PoC and it is NOT a solution sanctioned by the Operator SDK project or the Operator Framework organization. I will continue updating this issue as more information is available. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
/lifecycle frozen |
Bug Report
What did you do?
This is a follow-on and related to #2461.
I deployed an Ansible Operator with no WATCH_NAMESPACE environment variable defined. This operator needs RBACs to work with some resources in the current namespace as well as resources in other namespaces.
Hypothetically, to focus on the potential security exposure or unnecessary privilege escalation here... let's say I'm creating a secret-sprayer Ansible Operator. Its purpose is to read Secrets from the current namespace and create a Custom Resource Foo in another namespace with some of the data from that Secret.
The design discussed in #2461, and also in the official operator-sdk doc around golang manager client scopes, here, seems to indicate that when the WATCH_NAMESPACE env variable is omitted, the manager client that is created will be expecting cluster-scoped authority.
It also seems that the guidance is then to move all permissions needed by your operator in the CSV into clusterPermissions in order to now work with the cluster-scoped manager.
However, in this case, the service account that is now created is given permissions to get Secrets from ALL namespaces, even though I only need it to read Secrets from the namespace in which my secret-sprayer Operator is running. I only need the clusterPermissions to create my Foo CRs in the other namespaces!
What did you expect to see?
Ansible Operators that could correctly run with clients that interact at both the namespace scope and cluster scope as necessary.
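As a rough illustration of what "both scopes as necessary" could mean, the sketch below picks a scope per kind and defaults to the operator's own namespace. `Scope` and `Router` are hypothetical types made up for this example; they do not exist in operator-sdk or controller-runtime, and this is not how the manager currently behaves.

```go
package main

import "fmt"

// Scope describes where a client may look for a given kind.
type Scope struct {
	ClusterWide bool
	Namespace   string // set only when ClusterWide is false
}

// Router is a hypothetical helper (not an operator-sdk or
// controller-runtime type) that picks a scope per kind.
type Router struct {
	OperatorNamespace string
	ClusterKinds      map[string]bool // kinds granted via clusterPermissions
}

// ScopeFor defaults to the operator's own namespace and widens to the
// whole cluster only for kinds that were explicitly marked as such.
func (r Router) ScopeFor(kind string) Scope {
	if r.ClusterKinds[kind] {
		return Scope{ClusterWide: true}
	}
	return Scope{Namespace: r.OperatorNamespace}
}

func main() {
	r := Router{
		OperatorNamespace: "operator-ns",
		ClusterKinds:      map[string]bool{"Foo": true},
	}
	for _, kind := range []string{"Secret", "Foo"} {
		fmt.Printf("%s -> %+v\n", kind, r.ScopeFor(kind))
	}
}
```

Defaulting to the narrow scope means a kind that was never classified cannot silently end up with cluster-wide reads.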
What did you see instead? Under which circumstances?
Golang manager/client errors indicating failures to work with Secrets at the cluster scope, even though the operator's CSV had proper namespace-scoped permissions defined for the Secret.
Environment
Operator type:
language ansible
Kubernetes cluster type:
OpenShift 4.10
$ operator-sdk version
v1.8
Possible Solution
Create multiple manager clients, one namespace-scoped for the RBAC rules defined in the CSV's permissions block, and another for RBAC rules that are defined in the CSV's clusterPermissions block.
Additional context
I used Secrets in the hypothetical example because you could see how this could be an unnecessary privilege escalation. My operator only needs to read Secrets from the current namespace, but I'm forced to grant it permissions to read Secrets from all namespaces.
In terms of threat modeling and ensuring privilege escalations are blocked as much as possible, this is an unintended side-effect for the way I'd like my workload to run.
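The difference in exposure can be shown with a deliberately simplified model of RBAC evaluation. The `Grant` type and `Allowed` function below are hypothetical and ignore most of real RBAC (API groups, resource names, aggregation); an empty `Namespace` stands for a rule bound cluster-wide. The resource name `foos` and the namespace names are assumptions for illustration.

```go
package main

import "fmt"

// Grant is a simplified RBAC rule together with the scope of its binding.
// Namespace "" models a rule that is bound cluster-wide.
type Grant struct {
	Verb      string
	Resource  string
	Namespace string
}

// Allowed reports whether any grant permits verb on resource in namespace.
func Allowed(grants []Grant, verb, resource, namespace string) bool {
	for _, g := range grants {
		if g.Verb != verb || g.Resource != resource {
			continue
		}
		if g.Namespace == "" || g.Namespace == namespace {
			return true
		}
	}
	return false
}

func main() {
	// What the cluster-scoped manager forces today: everything cluster-wide.
	forced := []Grant{
		{Verb: "get", Resource: "secrets", Namespace: ""},
		{Verb: "create", Resource: "foos", Namespace: ""},
	}
	// What the workload needs: Secrets locally, Foo CRs anywhere.
	desired := []Grant{
		{Verb: "get", Resource: "secrets", Namespace: "operator-ns"},
		{Verb: "create", Resource: "foos", Namespace: ""},
	}
	for _, ns := range []string{"operator-ns", "kube-system"} {
		fmt.Printf("get secrets in %s: forced=%v desired=%v\n",
			ns,
			Allowed(forced, "get", "secrets", ns),
			Allowed(desired, "get", "secrets", ns))
	}
}
```

With the forced grants, a leaked service account token can read Secrets in any namespace; with the desired grants the same token is limited to the operator's own namespace while Foo creation still works everywhere.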