Add ability to run local instances in a cluster #1486

Closed
victoitor opened this issue Dec 9, 2024 · 6 comments


@victoitor

When Incus is set up as a cluster, all instances are clustered and will not start unless enough members of the cluster are online. A local instance would not be part of the cluster and would start independently of whether the cluster is available.

One interesting use case for this, related to #1440, is restricting the resource usage of the Incus database. If one wants the database to run on a restricted set of cores or with a limited amount of RAM, a VM can be run on each cluster node, joined to the cluster as a member, and the database restricted to only those VMs. The problem is a chicken-and-egg situation: the cluster needs the database nodes (the VMs) in order to start, but the VMs cannot start if the cluster is not initialized. Having local instances would allow those VMs to start independently of the cluster, after which the cluster can form with a quorum of database nodes.

In some sense, this would make Incus always run a local database, with a clustered database possibly available for cluster operations only.

@stgraber
Member

This isn't possible. Incus always has two databases:

  • local (very limited local configuration and certificates needed to bring up the global database)
  • global (contains everything else, all instances, networks, storage, ...)

On a standalone server, the global database only has a single machine participating in it.

In a cluster, the global database is spread across the cluster: some servers get to vote on database changes (which require a majority), one of those voters is the leader, a couple of other servers keep a backup copy of the database, and all remaining servers don't get any of the global database at all.

If you have a global database configured with 3 voters, a minimum of 2 must be online for the database to be available; with anything less than that, it's not possible to bring up the database at all.
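
To put numbers on that majority requirement, here's a quick sketch of the quorum arithmetic (illustrative only, not code from the Incus tree): with n voters, floor(n/2) + 1 of them must be online.

```go
package main

import "fmt"

// quorum returns the minimum number of voters that must be online for a
// majority-based database like Incus's global database to accept changes:
// floor(n/2) + 1.
func quorum(voters int) int {
	return voters/2 + 1
}

func main() {
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("%d voters -> %d must be online\n", n, quorum(n))
	}
	// 1 voters -> 1 must be online
	// 3 voters -> 2 must be online
	// 5 voters -> 3 must be online
}
```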

This design works very well for Incus as it keeps the code paths the same whether we're in a clustered or standalone environment; it makes clustering slightly more tricky, but that's generally something one can ignore when adding new features to Incus.

If we were to store the bulk of the data only in the local database, then any request that needs to see all the data, say `incus list`, would now require every single server to be queried for its local data. This kind of request happens constantly, if only to do simple things like filling in the `used_by` value on all the API objects. Being able to fill in a good 90%+ of that data solely from the shared database is what allows Incus to be so fast and scale so well, even when dealing with dozens of servers and thousands of instances.
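
As a rough illustration of the difference (toy numbers and hypothetical helpers, nothing from the actual codebase):

```go
package main

import "fmt"

// Toy model of the load difference. Answering from the shared global
// database costs one query per API request; a design where each server
// holds only its own data costs one query on every server for every
// request that needs a cluster-wide view (like `incus list`).
func queriesSharedDB(requests int) int { return requests }

func queriesFanOut(requests, servers int) int { return requests * servers }

func main() {
	const requests, servers = 1000, 30
	fmt.Println("shared DB queries:", queriesSharedDB(requests))          // 1000
	fmt.Println("fan-out queries:  ", queriesFanOut(requests, servers)) // 30000
}
```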

@victoitor
Author

Understood! Thanks for the reply.

@victoitor
Author

victoitor commented Dec 10, 2024

By the way, I was reading your reply again and wanted to clarify a few things. First of all, I didn't know how Incus' databases worked, so my first proposal came across differently than intended. I'll explain it a bit more just to make sure, though I still have a feeling it might be too much work for too little gain.

From what you mentioned, I don't think it's appropriate to store any more configuration in the local database; it should keep its current use as is. Furthermore, the intent is not to "store the bulk of the data only in the local database" or to move a significant amount of data to a new database. The data movement should actually be minimal.

Given what you mentioned, a better way to explain it would be to create a third database, say cluster-local, existing only on clusters and only created on request (say, the first time a command is run with a --cluster-local flag). Operations on the cluster would run as normal against the global database, but a command called with --cluster-local would use the cluster-local database as if it were the global one. The difference between the global and cluster-local databases would be that the cluster-local one would behave like a standalone Incus instance rather than a clustered setup, so instances could run locally regardless of the cluster state. Again, as I mentioned before, the use case I had in mind was VMs added to the cluster as members that can run the database while staying up independently of the global database.
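
To make the dispatch I have in mind concrete, a minimal sketch (the --cluster-local flag, the cluster-local database and the helper below are all hypothetical):

```go
package main

import "fmt"

// Database is a stand-in for an Incus database connection; hypothetical.
type Database struct{ name string }

// pickDatabase sketches the proposed dispatch: normal commands keep using
// the global (clustered) database, while commands run with the hypothetical
// --cluster-local flag would transparently use a third, node-local database
// that behaves like a standalone Incus instance.
func pickDatabase(global, clusterLocal *Database, clusterLocalFlag bool) *Database {
	if clusterLocalFlag {
		return clusterLocal
	}
	return global
}

func main() {
	global := &Database{name: "global"}
	local := &Database{name: "cluster-local"}
	fmt.Println(pickDatabase(global, local, false).name) // global
	fmt.Println(pickDatabase(global, local, true).name)  // cluster-local
}
```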

Is there any simpler solution that would make sense, like having a local part of the global database, where changes to that part would only be controlled locally?

@stgraber
Member

I think that'd get extremely difficult to keep track of.
Those local-only instances would then need to use only local-only networks, be created from local-only images, live inside local-only projects and use local-only storage pools.

At that point, you've effectively had to duplicate your entire Incus setup just to run those local-only instances, so you may as well just run a separate standalone Incus server.

@stgraber
Member

(Those instances would only be allowed to rely on local-only networks, pools, images, profiles, projects, ... because if that server goes offline for some reason, the rest of the cluster has no way to access that information: no way for the cluster as a whole to keep enforcing resource limits, correctly report consumption or know what's safe to change, since it can't know what's running on the one server that's now offline.)

@victoitor
Author

Got it. Thanks
