Implement partition compaction grouper #6172
Signed-off-by: Alex Le <[email protected]>
Overall this looks good to me, just a few comments. For users migrating: would it just work to change the configuration and deploy? I would imagine yes, since we would not find partition data on any existing block and would treat them all as partitionId 0. Correct?
Yes. Switching back and forth between partitioning and non-partitioning should not cause any issues. At most, the block with the largest time range would be recompacted one more time.
How does it work while a deployment is in progress? We can have some compactors creating blocks with partitions and others creating blocks without, and they see different visit markers. Would that create duplicate compactions during the deployment?
If both are compacting blocks in the largest time range, it would create duplicate blocks. Any lower-level blocks would be compacted into higher levels properly after the deployment.
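To make the migration answer above concrete, here is a minimal sketch of the fallback it describes: a block with no partition data is treated as partition 0. The `BlockMeta` and `PartitionInfo` types and the `effectivePartition` helper are hypothetical names for illustration, not the PR's actual code, and treating such a block as one partition out of one is an assumption.

```go
package main

import "fmt"

// PartitionInfo is a hypothetical stand-in for the partition data a
// partitioning compactor would record for a result block.
type PartitionInfo struct {
	PartitionID    int
	PartitionCount int
}

// BlockMeta is a hypothetical, trimmed-down block metadata record.
// Partition is nil for blocks that were written without partitioning.
type BlockMeta struct {
	ULID      string
	Partition *PartitionInfo
}

// effectivePartition returns the recorded partition info, or partition 0
// when the block carries none. The PartitionCount of 1 is an assumption.
func effectivePartition(m BlockMeta) PartitionInfo {
	if m.Partition == nil {
		return PartitionInfo{PartitionID: 0, PartitionCount: 1}
	}
	return *m.Partition
}

func main() {
	legacy := BlockMeta{ULID: "legacy-block"}
	partitioned := BlockMeta{
		ULID:      "partitioned-block",
		Partition: &PartitionInfo{PartitionID: 2, PartitionCount: 4},
	}
	fmt.Println(effectivePartition(legacy).PartitionID)
	fmt.Println(effectivePartition(partitioned).PartitionID)
}
```

With a fallback like this, switching the feature on or off only changes how existing blocks are grouped, which matches the point that at most the largest time range block is recompacted once more.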
Thanks for this work
Thanks. These are mostly nits.
We should also update https://cortexmetrics.io/docs/configuration/v1guarantees/ to mention that this is an experimental feature, but we can do that after you finish all the partition compactor PRs.
}

func UpdatePartitionedGroupInfo(ctx context.Context, bkt objstore.InstrumentedBucket, logger log.Logger, partitionedGroupInfo PartitionedGroupInfo) (*PartitionedGroupInfo, error) {
	existingPartitionedGroup, _ := ReadPartitionedGroupInfo(ctx, bkt, logger, partitionedGroupInfo.PartitionedGroupID)
Is it fine to ignore the error here?
The error is ignored so that the partitioned group info is always updated. There is no harm in writing the latest version of the partitioned group info, which is supposed to be the correct grouping based on the latest bucket store. We skip the update when the file already exists only as a best effort to finish the previously generated plan. Even if the previous partitioned group info were overwritten midway, the new file should take the already-compacted partitions into account.
Makes sense. Can you add a comment explaining the reason?
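The behavior discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `memStore` is a hypothetical in-memory stand-in for the object storage bucket, and `GroupInfo` and `updateGroupInfo` are simplified, made-up counterparts of `PartitionedGroupInfo` and `UpdatePartitionedGroupInfo`.

```go
package main

import (
	"errors"
	"fmt"
)

// GroupInfo is a hypothetical, simplified partitioned group info record.
type GroupInfo struct {
	GroupID        uint32
	PartitionCount int
}

// memStore is a hypothetical in-memory stand-in for the bucket.
type memStore struct {
	files     map[uint32]GroupInfo
	failReads bool // when true, every read fails (simulated outage)
}

func (s *memStore) read(id uint32) (*GroupInfo, error) {
	if s.failReads {
		return nil, errors.New("simulated read failure")
	}
	g, ok := s.files[id]
	if !ok {
		return nil, errors.New("partitioned group info not found")
	}
	return &g, nil
}

func (s *memStore) write(g GroupInfo) { s.files[g.GroupID] = g }

// updateGroupInfo keeps an existing plan when one can be read, so the
// previously generated plan gets a chance to finish. The read error is
// deliberately ignored: if the file is missing or unreadable, the latest
// grouping is written, which reflects the current state of the bucket.
func updateGroupInfo(s *memStore, latest GroupInfo) *GroupInfo {
	existing, _ := s.read(latest.GroupID)
	if existing != nil {
		return existing
	}
	s.write(latest)
	return &latest
}

func main() {
	s := &memStore{files: map[uint32]GroupInfo{}}
	first := updateGroupInfo(s, GroupInfo{GroupID: 1, PartitionCount: 2})
	second := updateGroupInfo(s, GroupInfo{GroupID: 1, PartitionCount: 4})
	fmt.Println(first.PartitionCount, second.PartitionCount)
}
```

In this sketch a failed read and a missing file are handled identically, which is the trade-off the reviewer asked about: the worst case is that an in-flight plan is replaced by a newer one.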
Thanks. I think we need some documentation about how this works, but that can be done after everything is implemented.
What this PR does:
This PR implements the partition compaction grouper.
Introduced new files for partition compaction:
The partitionedGroupID in the file is unique for a particular time range.
Here is the high-level algorithm of the partition compaction grouper:
Introduced meta_extensions to save the partition information of the result block in meta.json. This information can be used to better assign the block to the proper partition in the next round of compaction.
Which issue(s) this PR fixes:
NA
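As an illustration of the meta_extensions idea in the description, the sketch below round-trips partition information through a JSON document shaped like a block's meta.json. Every struct name and JSON key here (including `meta_extensions`, `partition_info`, and the fields under it) is an assumption made for the example; the real layout is defined by the PR's code and is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// partitionInfo holds hypothetical partition fields for a result block.
type partitionInfo struct {
	PartitionedGroupID uint32 `json:"partitioned_group_id"`
	PartitionCount     int    `json:"partition_count"`
	PartitionID        int    `json:"partition_id"`
}

// metaExtensions is a hypothetical extensions container.
type metaExtensions struct {
	PartitionInfo *partitionInfo `json:"partition_info,omitempty"`
}

// blockMeta is a hypothetical, heavily trimmed meta.json document.
type blockMeta struct {
	ULID       string          `json:"ulid"`
	Extensions *metaExtensions `json:"meta_extensions,omitempty"`
}

// encodeMeta serializes the metadata the way a compactor might write it.
func encodeMeta(m blockMeta) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

// decodePartition reads the partition info back, returning nil when the
// document carries no extensions (for example, a non-partitioned block).
func decodePartition(raw string) (*partitionInfo, error) {
	var m blockMeta
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, err
	}
	if m.Extensions == nil {
		return nil, nil
	}
	return m.Extensions.PartitionInfo, nil
}

func main() {
	raw, err := encodeMeta(blockMeta{
		ULID: "example-block",
		Extensions: &metaExtensions{PartitionInfo: &partitionInfo{
			PartitionedGroupID: 7, PartitionCount: 4, PartitionID: 1,
		}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(raw)
}
```

A later compaction round could call something like `decodePartition` on each source block to decide which partition the block belongs to, falling back to a default when nothing is recorded.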
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]