borg 1.x segment files

borg 1.x used segment files (log-like appending) plus an index mapping:

object id --> (segment_name, offset_in_segment)

borg2 status quo: objects stored separately

borg2 is much simpler:

borgstore (k/v store with misc. backends)
objects can be directly found by their id (e.g. the id is mapped to the fs path / file name) - see the sketch after this list
no transactions, no log-like appending - but correct write order
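For illustration, here is a minimal sketch of how such a direct id-to-path scheme can work. The two-level prefix nesting, the `data` directory and the function names are assumptions made for this example; the actual borgstore layout may differ.

```python
# Minimal sketch (not the actual borgstore code): a flat k/v store that maps
# an object id directly to a filesystem path, without any extra index.
# The two-level prefix nesting is only an assumption to keep directories small.
from pathlib import Path

def object_path(base: Path, object_id: bytes) -> Path:
    hex_id = object_id.hex()
    # e.g. data/12/34/1234abcd... - nesting by id prefix (assumed layout)
    return base / "data" / hex_id[:2] / hex_id[2:4] / hex_id

def put(base: Path, object_id: bytes, data: bytes) -> None:
    path = object_path(base, object_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)  # one stored file per object

def get(base: Path, object_id: bytes) -> bytes:
    return object_path(base, object_id).read_bytes()  # direct lookup by id
```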
Pros:
simplicity
no need for some sort of "index" (which could be corrupted or out of date)
no segment file compaction needed, the server-side filesystem manages space allocation
Cons:
leads to a large number of relatively small objects being transferred and stored individually in the repository
latency and other per-request overheads have quite a speed impact for remote repositories
depending on the storage type / filesystem, there will be more or less storage space overhead due to block size, especially for many very small objects
dealing with lots of objects / making lots of API calls can be expensive with some cloud storage providers
borg2 alternative idea
the client assembles packs locally and transfers a pack to the store when it has reached the desired size or when there is no more data to write.
pack files have a per-pack index appended (pointing to the objects contained in the pack), so the per-pack index can be read without reading the full pack.
the per-pack index would also contain the RepoObj metadata (e.g. compression type/level, etc.)
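As an illustration of the pack-with-appended-index idea, here is a rough sketch of one possible on-disk layout. The layout, the field names and the use of JSON for the index are assumptions made for this example, not a committed borg2 design.

```python
# Sketch of one possible pack layout (an assumption, not a committed design):
#   [object data ...][per-pack index][8-byte offset of the index]
# The fixed-size trailer lets a reader seek straight to the per-pack index
# without reading the object data in front of it.
import json
import struct

def write_pack(path, objects):
    """objects: iterable of (object_id_hex, data_bytes, meta_dict)."""
    index = {}
    with open(path, "wb") as f:
        for object_id, data, meta in objects:
            offset = f.tell()
            f.write(data)
            # the per-pack index entry also carries RepoObj-style metadata,
            # e.g. compression type/level (the key names here are made up)
            index[object_id] = {"offset": offset, "size": len(data), "meta": meta}
        index_offset = f.tell()
        f.write(json.dumps(index).encode())       # JSON only for simplicity here
        f.write(struct.pack("<Q", index_offset))  # trailer: where the index starts

def read_pack_index(path):
    """Read only the per-pack index, without reading the packed objects."""
    with open(path, "rb") as f:
        f.seek(-8, 2)                                  # jump to the trailer
        (index_offset,) = struct.unpack("<Q", f.read(8))
        f.seek(index_offset)
        return json.loads(f.read()[:-8])               # strip the trailer bytes
```

Reading a single object then only needs one lookup in the per-pack index plus one seek/read of the entry's offset and size inside the pack.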
Pros:
far fewer objects in the store, fewer API calls, less latency impact
Cons:
more complex in general
will need an additional global index mapping object_id -> pack_id, offset_in_pack (see the sketch after this list)
will need more memory for that global index
space is managed client-side, causing more (network) I/O: compact will need to read a pack, drop unused entries, write it back to the store and update the indexes
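To make the extra bookkeeping concrete, here is a sketch of the global index and a client-side compaction pass. It reuses the write_pack / read_pack_index helpers from the earlier sketch; the in-memory dict, the function names and the is_used callback are illustrative assumptions, not a proposed implementation.

```python
# Sketch only: global index plus client-side compaction, reusing the
# write_pack / read_pack_index helpers sketched above.

# global index: object_id -> (pack_id, offset_in_pack)
global_index: dict[str, tuple[str, int]] = {}

def compact(pack_path, pack_id, is_used):
    """Rewrite one pack, keeping only objects for which is_used(object_id) is True.

    This is the client-side cost mentioned above: the pack is read back,
    unused entries are dropped, and the pack plus the indexes are written again.
    """
    index = read_pack_index(pack_path)
    kept = []
    with open(pack_path, "rb") as f:
        for object_id, entry in index.items():
            if is_used(object_id):
                f.seek(entry["offset"])
                kept.append((object_id, f.read(entry["size"]), entry["meta"]))
            else:
                global_index.pop(object_id, None)      # drop unused entry
    write_pack(pack_path, kept)                        # rewrite without unused objects
    for object_id, entry in read_pack_index(pack_path).items():
        global_index[object_id] = (pack_id, entry["offset"])   # refresh offsets
```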
Side note: the desired pack "size" could be given by the number of objects in the pack (N) or by the overall size of all objects in the pack (S). For the special case of N == 1, it would be a slightly different implementation (using a different file format) of what we currently have in borg2; it would not necessarily need that global index, and compact would still be very easy.
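A tiny sketch of that flush criterion; the default values of N and S below are made-up placeholders, not proposed numbers.

```python
# Sketch of the "desired pack size" criterion from the side note.
# N (object count) and S (total bytes) are illustrative placeholders.
def should_flush(num_objects: int, total_size: int,
                 N: int = 10_000, S: int = 500_000_000) -> bool:
    # N == 1 degenerates to one object per pack, i.e. close to the current
    # borg2 behaviour, just with a different file format and no global index.
    return num_objects >= N or total_size >= S
```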
Related: #191

I would really appreciate the use of packs. Currently borg2 is "incompatible" with most USB hard disks that use SMR recording. I used a Toshiba 4 TB external USB hard drive for borg2 testing, and a borg check was only about 50% done after 12 hours, when I killed it (I needed the USB port). The repository was only about 1.3 TB.