What is the trade-off for choosing chunk size and slice size? #5101
Unanswered · merryituxz asked this question in Q&A
Replies: 1 comment
- Maybe the chunk size references HDFS v1's block size. I remember that in Hadoop books the block size was set to 64MB, more in line with the disk sector size.

Why is the chunk size limited to 64MB? What issues might arise if the chunk size were 128MB or 32MB?
The same question applies to the choice of slice size. If the maximum slice size were limited to 32MB or 128MB (assuming the corresponding chunk size is also 128MB), what could happen?
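For reference, here is a minimal sketch in Go of the layout the question assumes: a file is split into fixed-size chunks, writes inside a chunk form slices (a slice never crosses a chunk boundary), and slice data is uploaded to object storage in fixed-size blocks. The constants and the locate helper are illustrative only, not JuiceFS's actual code.

```go
package main

import "fmt"

// Illustrative constants mirroring the sizes discussed in the question
// (64MB chunks, 4MB blocks); they are not taken from JuiceFS's source.
const (
	chunkSize = 64 << 20 // logical chunk within a file
	blockSize = 4 << 20  // object-storage upload unit
)

// locate maps a file offset to its chunk index, the offset inside that
// chunk, and the block index inside the chunk.
func locate(fileOffset int64) (chunkIdx, innerOff, blockIdx int64) {
	chunkIdx = fileOffset / chunkSize
	innerOff = fileOffset % chunkSize
	blockIdx = innerOff / blockSize
	return
}

func main() {
	for _, off := range []int64{0, 5 << 20, 200 << 20} {
		c, o, b := locate(off)
		fmt.Printf("offset %d -> chunk %d, offset-in-chunk %d, block %d\n", off, c, o, b)
	}
	// Because a slice is bounded by its chunk, the chunk size caps both the
	// slice length and the number of blocks per slice (64MB / 4MB = 16 here).
	// A 128MB chunk would double that bound; a 32MB chunk would halve it.
	fmt.Println("max blocks per slice:", chunkSize/blockSize)
}
```

Under these assumptions, changing the chunk or maximum slice size mainly shifts the balance between how many chunks and slices the metadata engine has to track and the worst-case amount of data a single slice can hold before it is committed.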
Additionally, suppose a slice's metadata update fails (all transaction retries fail, say because the metadata engine crashes). In that case, a slice-sized amount of garbage data would be left in the object storage, and this garbage would not be detected by GC. The reason is that every time a slice accumulates 4MB of data, that portion is uploaded as a Block to the object storage, but at that point the slice's metadata has not yet been written to the metadata engine, so the failure could leave at most 64MB of garbage data. This data seems to be unrecoverable through GC.
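To illustrate the failure window described above, here is a hedged sketch of that write ordering, using hypothetical objectStore and metaEngine interfaces rather than JuiceFS's real API: blocks are uploaded first, and the slice is recorded in the metadata engine only afterwards, so a commit that never succeeds leaves the uploaded blocks unreferenced (up to one chunk, i.e. 64MB, in the worst case).

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical interfaces for illustration only; the real write path
// in JuiceFS differs in detail.
type objectStore interface {
	put(key string, data []byte) error
}

type metaEngine interface {
	commitSlice(sliceID uint64, blockKeys []string) error
}

// flushSlice shows the ordering the question is about: block data is
// uploaded first, and the slice is recorded in the metadata engine only
// afterwards. If the final commit fails permanently, the uploaded blocks
// are orphaned, because no metadata references them.
func flushSlice(store objectStore, meta metaEngine, sliceID uint64, data []byte) error {
	const blockSize = 4 << 20 // 4MB upload unit, as in the question

	var keys []string
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		// Hypothetical key layout, chosen only for the example.
		key := fmt.Sprintf("chunks/%d/%d", sliceID, off/blockSize)
		if err := store.put(key, data[off:end]); err != nil {
			return err // upload failed; already-uploaded blocks may be orphaned
		}
		keys = append(keys, key)
	}

	// Failure window: if this commit never succeeds (metadata engine down,
	// all retries exhausted), up to len(data) bytes of blocks, at most one
	// chunk (64MB) here, sit in the object store with no metadata pointing
	// at them.
	if err := meta.commitSlice(sliceID, keys); err != nil {
		return errors.New("slice commit failed; uploaded blocks are now unreferenced: " + err.Error())
	}
	return nil
}

func main() {
	fmt.Println("see flushSlice for the ordering that creates the leak window")
}
```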