
Check storage before clone, get-content, and zip #28

Open
achilleas-k opened this issue Dec 4, 2019 · 0 comments

Before each operation that potentially downloads enough data to fill the target storage, the service should check the available storage.

There are three steps where this would be useful:

  1. git clone (gin get)
  2. git annex get (gin get-content)
  3. zip

Step 1 might be an issue since it's not straightforward to know the size of the repository before cloning. We might have to solve that by adding the functionality to report repository sizes on GIN Web (Gogs).

Step 2 is pretty straightforward. For one, git annex will refuse to download a file if there is not enough local storage. That said, the refusal only happens once space is already running short, so when several files are being downloaded the disk can end up nearly full before the download stops. The service should therefore check the total download size before running get-content. Git annex provides this information via git annex info (which also supports --json output).
When we add the repo size reporting functionality to GIN Web, this check could be merged into step 1.
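
A minimal sketch of the size check for step 2, in Python for illustration only. It assumes that `git annex info --json --bytes` reports the total under the key "size of annexed files in working tree" and that the value is a byte count (either a number or a string starting with digits). Both the key and the value format should be verified against the installed git annex version. The function names are hypothetical.

```python
import json
import subprocess

# Assumed key name in the JSON output of `git annex info`; verify locally.
SIZE_KEY = "size of annexed files in working tree"


def parse_annex_size(info, key=SIZE_KEY):
    """Return the byte count stored under `key` in a parsed info dict."""
    value = info[key]
    if isinstance(value, int):
        return value
    # Assumed string forms: "12345" or "12345 bytes"; use the leading token.
    tokens = str(value).split()
    if not tokens or not tokens[0].isdigit():
        raise ValueError(f"cannot parse size from {value!r}")
    return int(tokens[0])


def annexed_size(repo_path):
    """Run git annex info in repo_path and return the annexed size in bytes."""
    result = subprocess.run(
        ["git", "annex", "info", "--json", "--bytes"],
        cwd=repo_path,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_annex_size(json.loads(result.stdout))
```

This returns the size of all annexed files in the working tree, which is an upper bound on what get-content still has to download.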

Step 3 essentially means we would require twice the space required for step 2. Assuming no compression (worst case), the zip file would take up the same amount of storage as the repository, so the storage needed to clone & get-content will be doubled when creating the zip file.

In short, if we know the storage requirements for a repository ahead of time, we can safely clone and zip as long as the available space is at least twice the repository size.
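
The rule above can be sketched as a simple comparison. This is an illustration in Python, not the service's actual code. The function names and the `factor` parameter are hypothetical, and the factor of 2 is the worst case of a zip with no compression.

```python
import shutil


def free_space(path):
    """Return the free bytes on the filesystem that contains path."""
    return shutil.disk_usage(path).free


def can_clone_and_zip(repo_bytes, free_bytes, factor=2):
    """True if free_bytes covers the repository plus an uncompressed zip."""
    return free_bytes >= factor * repo_bytes
```

A caller would combine the two, for example `can_clone_and_zip(repo_bytes, free_space(target_dir))`, where `repo_bytes` comes from whatever size report is available and `target_dir` is the storage location.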

NOTE: This becomes tricky when multiple repositories are being registered simultaneously. We should consider having workers report the storage space they're planning to use to the other workers.
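
One possible shape for the reporting between workers is a shared ledger of planned usage, sketched below in Python. It assumes the workers are threads in a single process. Separate processes would need a shared store instead of an in-memory dict. The class and method names are hypothetical.

```python
import threading


class StorageLedger:
    """Tracks the bytes each job plans to use so workers do not overcommit."""

    def __init__(self, get_free_bytes):
        # get_free_bytes: callable returning the current free bytes on disk.
        self._get_free_bytes = get_free_bytes
        self._reserved = {}
        self._lock = threading.Lock()

    def reserve(self, job_id, nbytes):
        """Record the plan and return True if it fits; otherwise return False."""
        with self._lock:
            planned = sum(self._reserved.values())
            if self._get_free_bytes() - planned < nbytes:
                return False
            self._reserved[job_id] = nbytes
            return True

    def release(self, job_id):
        """Drop a job's reservation once it has finished or failed."""
        with self._lock:
            self._reserved.pop(job_id, None)
```

A running job's reservation keeps counting in full even while its data is already landing on disk, so the check errs on the side of refusing work rather than overfilling the storage.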
