feat: implement distributed image store #7120
base: main
This is a PoC/experiment.

Each Talos node runs a `registryd` component which acts both as a registry and as a fan-out service. For local requests, `registryd` serves manifests/blobs from the containerd content store. For incoming requests, `registryd` fans out requests to other nodes (cluster members), finding the first one which has the content.

I had to disable content store deduplication, as otherwise containerd drops the original layers immediately.

One question that is not fully solved is how to inject `registryd`. In my testing I injected it as the first endpoint in the registry mirror scheme, so if `registryd` has nothing, `containerd` falls back to the "upstream" registry/mirror. Some work is still needed to support this for `*` redirects.

There are unresolved issues with images protected by authorization. At the moment `registryd` never resolves tags (it defers that to the upstream registry), but it might still deliver images without pull secrets to anyone who knows the proper digest.

How do we secure `registryd` from access from outside of the cluster?

Signed-off-by: Andrey Smirnov <[email protected]>
This requires something like in the machine config (first endpoint is
|
Of course final solution should be opt-in, configurable with a single flag:
Open questions:
|
break
}

if err != nil {
this seems not needed, which err case is this handling?
this handles the case when we get `IsNotFound` for all namespaces we checked for
I just found this and had a few thoughts, how would this work if the config has something like this?
And don't mirror endpoints require listing every single source separately?
@ruifung I don't think I got your question, but this is an early PoC, not a real implementation yet, so some details are not known.
Afaik we can just use this as a forward for the image store?
@smira I'm not sure if you saw this project or not but it works great on Talos. It seems like what you want to do here, maybe you'll find some ideas looking through the source.
That's great software actually, thanks for the tip!
yes, this was the inspiration, but there is probably more we could do more easily; this is not done yet
@smira is there something specific in Spegel that you do not want, which is causing you to implement your own embedded registry?
it's not that Spegel has anything wrong, but rather it's a generic solution, while on Talos Linux we have more control and more information, e.g. we have the discovery data. So it should be easier to implement and run it on Talos. Also it's our philosophy to keep things simple for the end users: just flip the switch and you get a distributed image cache.
I agree with you, my thought was that Talos could embed Spegel the same way k3s does. You don't even have to use the libp2p router if you have some other way of routing the traffic. Most components are interfaces so it should be pretty easy to just replace the router with a custom implementation.
I think this is a great suggestion, which is also easier to maintain.
I've done some implementation of Spegel now, and I have to say: it basically does precisely what you describe here... It's pretty much "apply and forget".
This PR is stale because it has been open 45 days with no activity.
I'm pretty sure this isn't stale and @smira is still working on or thinking about this.
Has there been any decision made on Spegel vs. implementing your own solution?
He literally said above they had already made a decision.
This PR is stale because it has been open 45 days with no activity.