You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm proposing that Vitess can automate temporal partition rotation based on user defined rules.
What are temporal partitioned tables?
Tables that are PARTITIONED BY RANGE and over temporal (time-based) values. This would be either:
PARTITION BY RANGE COLUMNS (col_name) over a single column, that is either DATE or DATETIME (technically MySQL also allows TIME but that is not so interesting to partition rotation).
PARTITION BY RANGE (func(col_name)) WHERE func is one of selected functions such as TO_DAYS, and the column is again either DATE or DATETIME.
Note: with MySQL 8.4 we can also support UNIX_TIMESTAMP(timestamp_column)
For practical reasons, we will have a predefined set of allowed functions/expressions. For example, we will support PARTITION BY RANGE (TO_DAYS(my_column)) or likewise YEAR(my_column) but we will not support ROUND(SQRT(TO_DAYS(my_column)+3.14)). We aim for practical operational scenarios. Most users will rotate hourly, daily, possibly weekly, monthly, yearly.
What is the proposal?
We will have a per-table rule:
name of table
rotation interval (hourly, daily, ...)
number of ahead-of-time partitions (prepare this many futuristic empty partitions)
retention (e.g. 30 days)
control flags such as enabled/disabled
Vitess will periodically look at all the rules, and check all referenced tables. For each table, if applicable, it will generate a sequence of Online DDL migrations, with internal UUID and in-order-execution, that ensure the table is in required state. Since it will do this periodically, most of the this will be a no-op since the table will already have all the required future partitions, and will have dropped expired partitions.
Where are rules to be stored?
I'm thinking as part of Keyspace record in topo, much like the throttler configuration. The config will be copied from Keyspace to SrvKeyspace as needed, again just like the throttler configuration.
What are expected problems?
I'm not sure how to handle errors. For example:
what if a table was dropped but the rule is still there?
what if the table is not partitioned?
what if the rule does not comply with the table definition? e.g. request to rotate hourly, but the table uses TO_DAYS expression, making the minimal interval 24h?
So where should these errors go? As this is a background operation, there's no occasion to respond to the user with the list of errors.
Alternative approaches
With rules still in place, maybe Vitess should not auto-rotate. Instead, maybe we should have vtctldclient RotatePartitionedTables command, that will:
analyze all the required changes
run them sequentially (either asynchronously via Online DDL or somehow directly)
report errors to the user
The partition analysis should still be made on a per-shard basis, to ensure independence of shards and eventual consistency, much like we delegate all Online DDL changes to shards.
Existing work
schemadiff PR to analyze temporal range partitioned tables and to generate required creation of ahead-of-time partitions, and purge expired partitions: #17426. This also validates intervals and other constraints.
The text was updated successfully, but these errors were encountered:
I'm proposing that Vitess can automate temporal partition rotation based on user defined rules.
What are temporal partitioned tables?
Tables that are
PARTITIONED BY RANGE
and over temporal (time-based) values. This would be either:PARTITION BY RANGE COLUMNS (col_name)
over a single column, that is eitherDATE
orDATETIME
(technically MySQL also allowsTIME
but that is not so interesting to partition rotation).PARTITION BY RANGE (func(col_name))
WHEREfunc
is one of selected functions such asTO_DAYS
, and the column is again eitherDATE
orDATETIME
.For practical reasons, we will have a predefined set of allowed functions/expressions. For example, we will support
PARTITION BY RANGE (TO_DAYS(my_column))
or likewiseYEAR(my_column)
but we will not supportROUND(SQRT(TO_DAYS(my_column)+3.14))
. We aim for practical operational scenarios. Most users will rotate hourly, daily, possibly weekly, monthly, yearly.What is the proposal?
We will have a per-table rule:
Vitess will periodically look at all the rules, and check all referenced tables. For each table, if applicable, it will generate a sequence of Online DDL migrations, with internal UUID and
in-order-execution
, that ensure the table is in required state. Since it will do this periodically, most of the this will be a no-op since the table will already have all the required future partitions, and will have dropped expired partitions.Where are rules to be stored?
I'm thinking as part of
Keyspace
record intopo
, much like the throttler configuration. The config will be copied fromKeyspace
toSrvKeyspace
as needed, again just like the throttler configuration.What are expected problems?
I'm not sure how to handle errors. For example:
TO_DAYS
expression, making the minimal interval24h
?So where should these errors go? As this is a background operation, there's no occasion to respond to the user with the list of errors.
Alternative approaches
With rules still in place, maybe Vitess should not auto-rotate. Instead, maybe we should have
vtctldclient RotatePartitionedTables
command, that will:The partition analysis should still be made on a per-shard basis, to ensure independence of shards and eventual consistency, much like we delegate all Online DDL changes to shards.
Existing work
schemadiff
PR to analyze temporal range partitioned tables and to generate required creation of ahead-of-time partitions, and purge expired partitions: #17426. This also validates intervals and other constraints.The text was updated successfully, but these errors were encountered: