What is the trade-off for choosing chunk size and slice size? #5101
Unanswered · merryituxz asked this question in Q&A
Replies: 1 comment
- Maybe the chunk size references HDFS v1's block size. I remember that in Hadoop books the block size was set to 64MB, more in line with the disk sector size.

Why is the chunk size limited to 64MB? What issues might arise if the chunk size were 128MB or 32MB?
The same question applies to the choice of slice size. If the maximum slice size were limited to 32MB or 128MB (assuming the corresponding chunk size is also 128MB), what could happen?
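For reference, here is a minimal sketch in Go of the layout the question assumes: a file is split into fixed-size chunks, writes inside a chunk form slices (a slice never crosses a chunk boundary), and slice data is uploaded to object storage in fixed-size blocks. The constants and the locate helper are illustrative only, not JuiceFS's actual code.

```go
package main

import "fmt"

// Illustrative constants mirroring the sizes discussed in the question
// (64MB chunks, 4MB blocks); they are not taken from JuiceFS's source.
const (
	chunkSize = 64 << 20 // logical chunk within a file
	blockSize = 4 << 20  // object-storage upload unit
)

// locate maps a file offset to its chunk index, the offset inside that
// chunk, and the block index inside the chunk.
func locate(fileOffset int64) (chunkIdx, innerOff, blockIdx int64) {
	chunkIdx = fileOffset / chunkSize
	innerOff = fileOffset % chunkSize
	blockIdx = innerOff / blockSize
	return
}

func main() {
	for _, off := range []int64{0, 5 << 20, 200 << 20} {
		c, o, b := locate(off)
		fmt.Printf("offset %d -> chunk %d, offset-in-chunk %d, block %d\n", off, c, o, b)
	}
	// Because a slice is bounded by its chunk, the chunk size caps both the
	// slice length and the number of blocks per slice (64MB / 4MB = 16 here).
	// A 128MB chunk would double that bound; a 32MB chunk would halve it.
	fmt.Println("max blocks per slice:", chunkSize/blockSize)
}
```

Under these assumptions, changing the chunk or maximum slice size mainly shifts the balance between how many chunks and slices the metadata engine has to track and the worst-case amount of data a single slice can hold before it is committed.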
Additionally, suppose a slice's metadata update fails (all transaction retries fail, say because the metadata engine crashes). In that case, a slice-sized amount of garbage data would be left in the object storage, and this garbage would not be detected by GC. The reason is that every time a slice accumulates 4MB of data, that portion is uploaded as a Block to the object storage, but at that point the slice's metadata has not yet been written to the metadata engine, so the failure could leave at most 64MB of garbage data. This data seems to be unrecoverable through GC.
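To illustrate the failure window described above, here is a hedged sketch of that write ordering, using hypothetical objectStore and metaEngine interfaces rather than JuiceFS's real API: blocks are uploaded first, and the slice is recorded in the metadata engine only afterwards, so a commit that never succeeds leaves the uploaded blocks unreferenced (up to one chunk, i.e. 64MB, in the worst case).

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical interfaces for illustration only; the real write path
// in JuiceFS differs in detail.
type objectStore interface {
	put(key string, data []byte) error
}

type metaEngine interface {
	commitSlice(sliceID uint64, blockKeys []string) error
}

// flushSlice shows the ordering the question is about: block data is
// uploaded first, and the slice is recorded in the metadata engine only
// afterwards. If the final commit fails permanently, the uploaded blocks
// are orphaned, because no metadata references them.
func flushSlice(store objectStore, meta metaEngine, sliceID uint64, data []byte) error {
	const blockSize = 4 << 20 // 4MB upload unit, as in the question

	var keys []string
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		// Hypothetical key layout, chosen only for the example.
		key := fmt.Sprintf("chunks/%d/%d", sliceID, off/blockSize)
		if err := store.put(key, data[off:end]); err != nil {
			return err // upload failed; already-uploaded blocks may be orphaned
		}
		keys = append(keys, key)
	}

	// Failure window: if this commit never succeeds (metadata engine down,
	// all retries exhausted), up to len(data) bytes of blocks, at most one
	// chunk (64MB) here, sit in the object store with no metadata pointing
	// at them.
	if err := meta.commitSlice(sliceID, keys); err != nil {
		return errors.New("slice commit failed; uploaded blocks are now unreferenced: " + err.Error())
	}
	return nil
}

func main() {
	fmt.Println("see flushSlice for the ordering that creates the leak window")
}
```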