Adjust max concurrent queries for ClickHousePlugin #143
Conversation
Force-pushed from e0dbb66 to a4b62ef
Codecov Report
@@ Coverage Diff @@
## master #143 +/- ##
==========================================
+ Coverage 87.02% 87.85% +0.82%
==========================================
Files 122 137 +15
Lines 7648 9177 +1529
==========================================
+ Hits 6656 8062 +1406
- Misses 992 1115 +123
... and 3 files with indirect coverage changes
Force-pushed from 5b663bd to ce2aee3
process_part(part_desc, client)
for part_descs in zip_longest(
    *[
        list_parts_to_attach(snapshot_result, self.disks, tables_by_uuid)
        for snapshot_result in backup_manifest.snapshot_results
    ]
This change is good, but I think it becomes hard to follow with the zip_longest, and it does not fully reach the apparent goal of having at most max_concurrent_attach / node_count concurrent queries on each node.
Maybe something like this instead:
# Do all nodes at the same time
await asyncio.gather(
    *[
        # But on each node, limit the number of concurrent attach queries
        gather_limited(
            self.max_concurrent_attach_per_node,
            [
                execute_with_timeout(
                    client,
                    self.attach_timeout,
                    f"ALTER TABLE {table_identifier} ATTACH PART {escape_sql_string(part_name)}".encode(),
                )
                for table_identifier, part_name in list_parts_to_attach(
                    snapshot_result, self.disk_paths, tables_by_uuid
                )
            ],
        )
        for client, snapshot_result in zip(self.clients, backup_manifest.snapshot_results)
    ]
)
(In the context of Astacus, it's ok to create a new setting and deprecate/ignore the old one)
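The gather_limited helper is referenced in the suggestion above but not shown in the diff. One plausible implementation of such a per-node limiter, sketched with an asyncio.Semaphore and assuming the signature implied by the call site (a limit followed by a list of awaitables; the actual Astacus helper may differ), is:

```python
import asyncio
from typing import Awaitable, Iterable, TypeVar

T = TypeVar("T")


async def gather_limited(limit: int, awaitables: Iterable[Awaitable[T]]) -> list[T]:
    """Run the awaitables concurrently, but never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(awaitable: Awaitable[T]) -> T:
        # Each task waits for a semaphore slot before starting real work,
        # so at most `limit` of them are past this point at any moment.
        async with semaphore:
            return await awaitable

    return await asyncio.gather(*(run_one(awaitable) for awaitable in awaitables))
```

With one such call per node wrapped in an outer asyncio.gather, each node runs up to max_concurrent_attach_per_node queries at once while all nodes progress in parallel, which is exactly the behavior the zip_longest version fails to guarantee.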
Thanks! It looks like I was a bit too focused on transposing the loops and missed the whole point. Fixed.
@root_validator(pre=True)
def _apply_max_concurrent_query_defaults(cls, values):  # pylint: disable=E0213
    default_value = 10 * len(values["clickhouse"].nodes) if "clickhouse" in values else 100
I'm not sure we should have complex conditional defaults. Given that we can afford to deprecate and ignore old parameters, I'd rather just move to a _per_node concurrency limit for all limits.
Thanks! I added _per_node and changed the code in steps. Would that be better?
Force-pushed from 5c2118c to d75866f
Thanks! The comments were addressed.
Minor config issue but LGTM otherwise.
max_concurrent_drop_databases: int = 100
max_concurrent_drop_databases_per_node: int = 10
create_databases_timeout: float = 60.0
max_concurrent_create_databases: int = 100
max_concurrent_create_databases_per_node: int = 10
sync_databases_timeout: float = 60.0
restart_replica_timeout: float = 300.0
max_concurrent_restart_replica: int = 100
max_concurrent_restart_replica_per_node: int = 10
restore_replica_timeout: float = 300.0
max_concurrent_restore_replica: int = 100
max_concurrent_restore_replica_per_node: int = 10
freeze_timeout: float = 3600.0
unfreeze_timeout: float = 3600.0
# Deprecated parameter, ignored
attach_timeout: float = 300.0
max_concurrent_attach: int = 100
max_concurrent_attach_per_node: int = 10
sync_tables_timeout: float = 3600.0
max_concurrent_sync: int = 100
max_concurrent_sync_per_node: int = 10
All the old parameters should be kept (and ignored) to avoid failing if a newer astacus is accidentally installed in a place where the config still uses the old parameters.
I don't know how or why, but I misplaced the # Deprecated parameter, ignored comment: it should be just before use_system_unfreeze (none of the attach/sync parameters are deprecated). You can move it to the right place and then put all the newly deprecated options after it.
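The keep-and-ignore pattern described here can be sketched with a small pydantic model. The class name and field selection below are illustrative, not the actual Astacus config class:

```python
from pydantic import BaseModel


class ClickHouseBackupSettings(BaseModel):
    """Illustrative config model; field names follow the diff above."""

    # Active settings: timeouts and per-node concurrency limits.
    attach_timeout: float = 300.0
    max_concurrent_attach_per_node: int = 10

    # Deprecated parameter, ignored.
    # Declared (but never read) so a config file written for an older
    # astacus still parses when a newer version is installed over it.
    max_concurrent_attach: int = 100
```

An old config that still sets max_concurrent_attach then loads without a validation error; the value is simply never consulted by the restore code.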
Thanks Kevin! Changed.
(again, sorry for the delay with the changes)
Nothing to be sorry about :)
Force-pushed from d75866f to e4cc044
Thanks! The comment was addressed.