Adjust max concurrent queries for ClickHousePlugin #143
Conversation
Force-pushed from e0dbb66 to a4b62ef
Codecov Report
@@ Coverage Diff @@
## master #143 +/- ##
==========================================
+ Coverage 87.02% 87.85% +0.82%
==========================================
Files 122 137 +15
Lines 7648 9177 +1529
==========================================
+ Hits 6656 8062 +1406
- Misses 992 1115 +123
... and 3 files with indirect coverage changes
Force-pushed from 5b663bd to ce2aee3
process_part(part_desc, client)
for part_descs in zip_longest(
    *[
        list_parts_to_attach(snapshot_result, self.disks, tables_by_uuid)
        for snapshot_result in backup_manifest.snapshot_results
    ]
This change is good, but I think it becomes hard to follow with the zip_longest, and it does not fully reach the apparent goal of having at most max_concurrent_attach / node_count concurrent queries on each node.
Maybe something like this instead:
# Do all nodes at the same time
await asyncio.gather(
    *[
        # But on each node, limit the number of concurrent attach queries
        gather_limited(
            self.max_concurrent_attach_per_node,
            [
                execute_with_timeout(
                    client,
                    self.attach_timeout,
                    f"ALTER TABLE {table_identifier} ATTACH PART {escape_sql_string(part_name)}".encode(),
                )
                for table_identifier, part_name in list_parts_to_attach(
                    snapshot_result, self.disk_paths, tables_by_uuid
                )
            ],
        )
        for client, snapshot_result in zip(self.clients, backup_manifest.snapshot_results)
    ]
)
(In the context of Astacus, it's ok to create a new setting and deprecate/ignore the old one)
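The gather_limited helper is referenced in the suggestion above but not shown in the diff. One plausible implementation of such a per-node limiter, sketched with an asyncio.Semaphore and assuming the signature implied by the call site (a limit followed by a list of awaitables; the actual Astacus helper may differ), is:

```python
import asyncio
from typing import Awaitable, Iterable, TypeVar

T = TypeVar("T")


async def gather_limited(limit: int, awaitables: Iterable[Awaitable[T]]) -> list[T]:
    """Run the awaitables concurrently, but never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(awaitable: Awaitable[T]) -> T:
        # Each task waits for a semaphore slot before starting real work,
        # so at most `limit` of them are past this point at any moment.
        async with semaphore:
            return await awaitable

    return await asyncio.gather(*(run_one(awaitable) for awaitable in awaitables))
```

With one such call per node wrapped in an outer asyncio.gather, each node runs up to max_concurrent_attach_per_node queries at once while all nodes progress in parallel, which is exactly the behavior the zip_longest version fails to guarantee.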
Thanks! It looks like I was a bit too focused on transposing the loops and missed the whole point. Fixed.
@root_validator(pre=True)
def _apply_max_concurrent_query_defaults(cls, values):  # pylint: disable=E0213
    default_value = 10 * len(values["clickhouse"].nodes) if "clickhouse" in values else 100
I'm not sure we should have complex conditional defaults. Given that we can afford to deprecate and ignore old parameters, I'd rather just move to a _per_node concurrency limit for all limits.
Thanks! I added _per_node and changed the code in steps. Would that be better?
Force-pushed from 5c2118c to d75866f
Thanks! The comments were addressed.
Minor config issue but LGTM otherwise.
max_concurrent_drop_databases: int = 100
max_concurrent_drop_databases_per_node: int = 10
create_databases_timeout: float = 60.0
max_concurrent_create_databases: int = 100
max_concurrent_create_databases_per_node: int = 10
sync_databases_timeout: float = 60.0
restart_replica_timeout: float = 300.0
max_concurrent_restart_replica: int = 100
max_concurrent_restart_replica_per_node: int = 10
restore_replica_timeout: float = 300.0
max_concurrent_restore_replica: int = 100
max_concurrent_restore_replica_per_node: int = 10
freeze_timeout: float = 3600.0
unfreeze_timeout: float = 3600.0
# Deprecated parameter, ignored
attach_timeout: float = 300.0
max_concurrent_attach: int = 100
max_concurrent_attach_per_node: int = 10
sync_tables_timeout: float = 3600.0
max_concurrent_sync: int = 100
max_concurrent_sync_per_node: int = 10
All the old parameters should be kept (and ignored) to avoid failing if a newer astacus is accidentally installed in a place where the config still uses the old parameters.
I don't know how or why, but I misplaced the # Deprecated parameter, ignored comment: it should be just before use_system_unfreeze (none of the attach/sync parameters are deprecated). You can move it to the right place and then put all the newly deprecated options after it.
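The keep-and-ignore pattern described here can be sketched with a small pydantic model. The class name and field selection below are illustrative, not the actual Astacus config class:

```python
from pydantic import BaseModel


class ClickHouseBackupSettings(BaseModel):
    """Illustrative config model; field names follow the diff above."""

    # Active settings: timeouts and per-node concurrency limits.
    attach_timeout: float = 300.0
    max_concurrent_attach_per_node: int = 10

    # Deprecated parameter, ignored.
    # Declared (but never read) so a config file written for an older
    # astacus still parses when a newer version is installed over it.
    max_concurrent_attach: int = 100
```

An old config that still sets max_concurrent_attach then loads without a validation error; the value is simply never consulted by the restore code.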
Thanks Kevin! Changed.
(again, sorry for the delay with the changes)
Nothing to be sorry about :)
Force-pushed from d75866f to e4cc044
Thanks! The comment was addressed.