-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: alter the schema registry of table with connector #15025
Conversation
bail_not_implemented!("Alter table with source having schema registry"); | ||
return Err(ErrorCode::NotSupported( | ||
"alter table with schema registry".to_string(), | ||
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I guess the user may actually want to refresh the schema i.e. pulling from the schema registry to get a latest schema. Do we have such syntax support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have such syntax support?
Yes, that's exactly what this pr has done: impl ALTER TABLE .. FORMAT .. ENCODE .. (schema.xxx = ...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, he might not want to change the schema.location
or message
but only do a refresh from the schema registry (because he has updated the schema in the schema registry server before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. In current impl, users need to repeat the same config as they create the table/source to do a refresh operation. Maybe we can add a shortcut for this purpose:
ALTER ... FORMAT .. ENCODE ..;
: emptyformat_encode_options
implies a simple refresh.ALTER ... REFRESH SCHEMA REGISTRY;
: new syntax to avoid confusion in originalFORMAT ENCODE
syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a plan to allow users to specify a completely different set of FORMAT
and ENCODE
when altering a table in the future? If not, I would suggest using a syntax like ALTER ... REFRESH SCHEMA REGISTRY
for clarity and convenience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I think they can get all the fields easily from the system table.
Both LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a decision here?
FYI, we are going to implement alter source src format xxx encode xxx (connector = 'new_connector')
. As users may have a source for history data and another for incremental data, they can continue the consumption by altering the connector.
So we gonna mix schema refresh
and connector reset
in the same sql, and probably more cases in the future. The semantic can be confusing then. I suggest we designate the sql individually.
Let's start a vote here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ALTER SOURCE ... REFRESH SCHEMA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ALTER SOURCE ... FORMAT .. ENCODE ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I believe these two features could be clearly distinguished. 🤔
Refresh schema (or update any other minor configuration fields)
- The connector is still the same.
- The state of the source executors is kept.
- Users would not like to repeat unchanged properties.
Alter connector
- The connector is replaced with a completely different one.
- The state of the source executors is reset.
- Users would not like to reuse any existing properties.
When it comes to designing syntax, I suppose we should choose different syntaxes based on the specific requirements. For example, we may leave the general syntax of ALTER SOURCE ... FORMAT .. ENCODE ..
for the latter one. For the former feature, a sugar like REFRESH SCHEMA
looks good to me if we don't find any other similar requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM
let [definition]: [_; 1] = Parser::parse_sql(&definition) | ||
.context("unable to parse original table definition")? | ||
.try_into() | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why unwrap()
instead of return Err
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the definition is converted from SQL AST in alter_definition_format_encode
, and should be error-free as converted back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code here is directly duplicated from alter_table_column
. Could you please try your best to reuse some code snippets between them? Otherwise, it can be painful to maintain.
bail_not_implemented!("Alter table with source having schema registry"); | ||
return Err(ErrorCode::NotSupported( | ||
"alter table with schema registry".to_string(), | ||
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, he might not want to change the schema.location
or message
but only do a refresh from the schema registry (because he has updated the schema in the schema registry server before)
Co-authored-by: Eric Fu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind updating the PR title to make it more clear? IMO, "sr" and "conn" are not commonly used abbreviations.
let [definition]: [_; 1] = Parser::parse_sql(&definition) | ||
.context("unable to parse original table definition")? | ||
.try_into() | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code here is directly duplicated from alter_table_column
. Could you please try your best to reuse some code snippets between them? Otherwise, it can be painful to maintain.
bail_not_implemented!("Alter table with source having schema registry"); | ||
return Err(ErrorCode::NotSupported( | ||
"alter table with schema registry".to_string(), | ||
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a plan to allow users to specify a completely different set of FORMAT
and ENCODE
when altering a table in the future? If not, I would suggest using a syntax like ALTER ... REFRESH SCHEMA REGISTRY
for clarity and convenience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
ALTER SOURCE src_user FORMAT PLAIN ENCODE PROTOBUF( | ||
schema.registry = 'http://message_queue:8081', | ||
message = 'test.User' | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, altering table with same format & encoding but different properties is also supported in this PR, right?
ALTER [SOURCE|TABLE] <name> FORMAT ... ENCODE ... (
(... new properties ...)
);
If so, please update the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I reverted the implementation of ALTER TABLE FORMAT ENCODE
as we discussed in slack that it should be suspended and needs more consideration. This pr is only about ALTER TABLE REFRESH SCHEMA
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Only ALTER SOURCE
works, ALTER TABLE ... FORMAT ... ENCODE
is not in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me summarize. So after this PR gets merged, it will be like
-
ALTER TABLE <name> FORMAT ... ENCODE ... ( ... )
-
ALTER SOURCE <name> FORMAT ... ENCODE ... ( ... )
-
ALTER TABLE <name> REFRESH SCHEMA
-
ALTER SOURCE <name> REFRESH SCHEMA
Right? Shall we complete the rest two in the future? Asked because I think we should keep the syntax of create source & table
unified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right.
ALTER SOURCE <name> REFRESH SCHEMA
is on progress.
ALTER TABLE <name> FORMAT ... ENCODE ... ( ... )
will start after further discussion.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Resolve #11800.
This pr supports to alter a table with conn by refreshing schema registry.
If a downstream fragment references a column that is either missing or has undergone a type change in the updated schema, the command will be declined.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.