VolumeSnapshot support #179
You may want to check this feature: https://seaweedfs.com/docs/admin/features/#point-in-time-recovery . It can continuously back up data and create a snapshot at any point in time. We need to investigate how to best fit this into the Kubernetes model, especially how to restore a snapshot.
I've looked at the PIT recovery, however from what I can see it is not available in the free plan. I believe SeaweedFS needs to make it easier to support a 3-2-1 backup strategy in a Kubernetes-native way. There are countless examples where this has been critical, e.g. recently when Google deleted the Australian pension fund's cloud account; what saved them was that they had a backup at another provider.

As a concrete example (and most likely a future ticket) of why VolumeSnapshots, or some Kubernetes-native way to back up SeaweedFS, is important: we run a Talos cluster where 3 nodes run SeaweedFS with 1 master, 3 filers, and 3 volume servers, plus an external S3 storage tier. Just yesterday we had an issue where a single node was wiped during an upgrade (the default for Talos unless you set a flag to preserve the filesystem during upgrades), and this seems to have left SeaweedFS in a bad state that we couldn't recover from, even after a full reinstall of SeaweedFS.

We use the Rancher Local Storage Provisioner, and I don't believe this is something it is meant to handle, since the node state is the same but the whole underlying storage is gone. I believe the issue is caused by a mix of several components. One is that the provisioner isn't able to delete the storage (even after deleting the PVC + PV) because its helper needs privileged access to do that on Talos (which uses strict pod admission policies by default); the helper used to run privileged before the Local Storage Provisioner 0.26.0 update. So I believe SeaweedFS partially recovered some data after the wipe, which was then still present during the reinstall (we let SeaweedFS run 5-6 hours before reinstalling). Secondly, we have an S3 storage tier that we didn't wipe after the reinstall, which might also cause issues (edit: wiping it still didn't fix the issue).
Thirdly, we had only set defaultReplication 003 for the master but not for the filer (I'd really appreciate an example in the Helm chart for a production-ready setup, since these things are easy to miss the first time around), so data loss should be expected. My biggest concern, however, is that SeaweedFS didn't report any errors and seemed to work fine based on the logs; we only knew there was an issue because we saw a mix of errors.

A Kubernetes-native method of backing up and restoring the whole SeaweedFS system would most likely get us out of the bad state (which I believe is on the roadmap for the operator), and restoring with a VolumeSnapshot would weed out the permission issues, as it restores the whole PVC at the block level with the correct permissions (which a direct-copy method like Restic's can't guarantee).

I'll continue to debug the issue: we have wiped the S3 storage tier, will try a different storage backend than the Local Storage Provisioner, and I plan to reset the other nodes that SeaweedFS runs on today to see if we're able to recover it. We have a unique opportunity to do some chaos testing and debug the issue, since the cluster is set up declaratively and is meant to be production-ready with logging and metrics on every system, but doesn't run any critical workloads yet. If I get enough time to figure it out, I will create a separate ticket (there are too many variables I need to weed out before I can write a decent one); hopefully I'll be able to create something reproducible and figure out how to recover it. You can contact me by email (listed in my profile) if you want more information or assistance over a chat like Signal/Discord/Matrix.

edit: After a bit of thinking, I see I mixed up Restic/S3 external backups with snapshots/cloning, which was inaccurate on my part; they're complementary and serve different purposes. SeaweedFS also has other backup methods that should work for Kubernetes.

I've revised the original comment a bit to be more accurate.
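On the replication settings that were missed: a production-oriented values sketch might look like the one below. The key names (`master.defaultReplication`, `filer.defaultReplicaPlacement`) are assumptions about the chart's schema, not taken from this thread; verify them against the chart's `values.yaml` before use.

```yaml
# Hypothetical Helm values sketch -- key names are assumptions,
# check against the actual SeaweedFS chart before deploying.
master:
  # "003": keep 3 replicas on different servers in the same rack
  defaultReplication: "003"
filer:
  # writes that go through the filer need their own replication
  # setting; if only the master is configured, filer-created files
  # may end up unreplicated
  defaultReplicaPlacement: "003"
```

The point of setting both is exactly the failure mode described above: replication configured in one component but not the other is easy to miss and only shows up after data loss.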
First, thanks for the detailed information! There have been many issues created with just an error message, which gives no context at all. It'll be nice to have a reproducible case. Usually for SeaweedFS cluster problems I would recommend using docker compose, but for this CSI driver I am not sure what the best approach is. For a reliable backup, you can use:
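For the docker compose route, a minimal cluster for reproducing behaviour could be sketched as below. This is an assumption-laden sketch (image tag, ports, and flags are the commonly documented defaults), not a production configuration:

```yaml
# Minimal SeaweedFS cluster for local reproduction -- a sketch only.
version: "3"
services:
  master:
    image: chrislusf/seaweedfs
    command: "master -ip=master"
    ports:
      - "9333:9333"
  volume:
    image: chrislusf/seaweedfs
    command: "volume -mserver=master:9333 -ip=volume"
    depends_on:
      - master
  filer:
    image: chrislusf/seaweedfs
    command: "filer -master=master:9333"
    ports:
      - "8888:8888"
    depends_on:
      - master
```

A compose setup like this can reproduce cluster-level state problems (e.g. wiping one service's data directory mid-run), but not the Talos/CSI-specific parts of the report above.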
To follow up, the issue might have been caused by Local Storage Provisioner v0.29.0 and an assumption it makes. Why reinstalling the whole SeaweedFS cluster didn't work is also a bit weird. I know that Postgres has a "join" job that creates the PVC and prepares it before starting the database, which meant that deleting the PVC wouldn't recover the cluster if the join job and the main workload were scheduled on different nodes. Could SeaweedFS have something similar?

I believe you could replicate it using Talos v1.7.4 and Local Storage Provisioner v0.29.0. Talos also has a command to spin up a cluster locally, but it doesn't support the upgrade API when run in a container, which would be needed to test whether the upgrade caused the issue. Testing upgrades locally would require a more complex setup using either QEMU or VirtualBox; Talos has documentation on how to set that up.

Either way, it seems the issue we experienced might not have been a bug in SeaweedFS specifically, but rather a problem with the underlying system. We will most likely give SeaweedFS another try at some point during the next 6-12 months, and I will most likely have more information then. If you're able to replicate it in the meantime, I believe it could help make the system more robust by adding sanity checks for these types of edge cases.
Related to #79
Are there any plans to add VolumeSnapshots, and if so, is there an ETA?
Currently there are quite a lot of workarounds, e.g. backups via VolSync + Restic and direct copy, which mounts the PVC and copies the data "manually" over to a new PVC or to external storage for backup.
Whenever an application is updated there's always a risk that something breaks, requiring a rollback of both the application and its storage. To guarantee that backups are consistent, some applications have to be shut down completely, and sometimes root access is even needed to copy all the files.
Compared to how e.g. Restic direct copy works, VolumeSnapshots are faster, more secure (no root pod is needed, which also reduces the chance of permission issues when restoring), give a better consistency guarantee since the whole PVC is copied, and require fewer resources. This becomes increasingly important the more applications you need to back up. Replication is also not a failsafe, and backups are required in a production system, e.g. if the storage system or network gets fully saturated for some reason and data cannot be synchronised.
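For context, the Kubernetes-native flow being requested uses the standard `snapshot.storage.k8s.io/v1` API. The object and driver names below are placeholders, and the SeaweedFS CSI driver would need to implement the CSI snapshot capability for this to function:

```yaml
# Standard Kubernetes snapshot objects (snapshot.storage.k8s.io/v1).
# Names and the driver string are placeholders.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: seaweedfs-snapclass        # placeholder name
driver: seaweedfs-csi-driver       # placeholder driver name
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snap
spec:
  volumeSnapshotClassName: seaweedfs-snapclass
  source:
    persistentVolumeClaimName: data-pvc   # existing PVC to snapshot
```

Backup tools like VolSync and Velero can then drive these objects instead of mounting and copying file by file.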
There's also the fact that most other CSI drivers now support VolumeSnapshots and even PVC cloning; it's overall just easier to work with in a Kubernetes-native way, and a lot of Kubernetes backup tools have focused most of their development effort over the last few years on VolumeSnapshots.
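Both restore-from-snapshot and PVC cloning mentioned above use the standard `dataSource` field on a PVC. Again, the storage class and object names are placeholders:

```yaml
# Restore: a new PVC provisioned from an existing VolumeSnapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-restored
spec:
  storageClassName: seaweedfs-storage   # placeholder
  dataSource:
    name: data-snap                     # an existing VolumeSnapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
# Clone: a new PVC copied directly from an existing PVC.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-clone
spec:
  storageClassName: seaweedfs-storage   # placeholder
  dataSource:
    name: data-pvc                      # an existing PVC
    kind: PersistentVolumeClaim
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

Because the restore happens at provisioning time on the block/volume level, the original pod security context and file permissions come back intact, which is the permission-safety argument made earlier in the thread.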