Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: alter the schema registry of table with connector #15025

Merged
merged 16 commits into from
Mar 8, 2024

Conversation

Rossil2012
Copy link
Contributor

@Rossil2012 Rossil2012 commented Feb 6, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Resolve #11800.

This pr supports to alter a table with conn by refreshing schema registry.

If a downstream fragment references a column that is either missing or has undergone a type change in the updated schema, the command will be declined.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

ALTER TABLE t REFRESH SCHEMA;

bail_not_implemented!("Alter table with source having schema registry");
return Err(ErrorCode::NotSupported(
"alter table with schema registry".to_string(),
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I guess the user may actually want to refresh the schema i.e. pulling from the schema registry to get a latest schema. Do we have such syntax support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have such syntax support?

Yes, that's exactly what this pr has done: impl ALTER TABLE .. FORMAT .. ENCODE .. (schema.xxx = ...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, he might not want to change the schema.location or message but only do a refresh from the schema registry (because he has updated the schema in the schema registry server before)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. In current impl, users need to repeat the same config as they create the table/source to do a refresh operation. Maybe we can add a shortcut for this purpose:

  1. ALTER ... FORMAT .. ENCODE ..;: empty format_encode_options implies a simple refresh.
  2. ALTER ... REFRESH SCHEMA REGISTRY;: new syntax to avoid confusion in original FORMAT ENCODE syntax.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a plan to allow users to specify a completely different set of FORMAT and ENCODE when altering a table in the future? If not, I would suggest using a syntax like ALTER ... REFRESH SCHEMA REGISTRY for clarity and convenience.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think they can get all the fields easily from the system table.
Both LGTM.

Copy link
Contributor Author

@Rossil2012 Rossil2012 Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a decision here?

FYI, we are going to implement alter source src format xxx encode xxx (connector = 'new_connector'). As users may have a source for history data and another for incremental data, they can continue the consumption by altering the connector.

So we gonna mix schema refresh and connector reset in the same sql, and probably more cases in the future. The semantic can be confusing then. I suggest we designate the sql individually.

Let's start a vote here.

Copy link
Contributor Author

@Rossil2012 Rossil2012 Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ALTER SOURCE ... REFRESH SCHEMA

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ALTER SOURCE ... FORMAT .. ENCODE ..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I believe these two features could be clearly distinguished. 🤔

Refresh schema (or update any other minor configuration fields)

  • The connector is still the same.
  • The state of the source executors is kept.
  • Users would not like to repeat unchanged properties.

Alter connector

  • The connector is replaced with a completely different one.
  • The state of the source executors is reset.
  • Users would not like to reuse any existing properties.

When it comes to designing syntax, I suppose we should choose different syntaxes based on the specific requirements. For example, we may leave the general syntax of ALTER SOURCE ... FORMAT .. ENCODE .. for the latter one. For the former feature, a sugar like REFRESH SCHEMA looks good to me if we don't find any other similar requirements.

Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM

src/frontend/src/handler/alter_table_with_sr.rs Outdated Show resolved Hide resolved
let [definition]: [_; 1] = Parser::parse_sql(&definition)
.context("unable to parse original table definition")?
.try_into()
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unwrap() instead of return Err?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the definition is converted from SQL AST in alter_definition_format_encode, and should be error-free as converted back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code here is directly duplicated from alter_table_column. Could you please try your best to reuse some code snippets between them? Otherwise, it can be painful to maintain.

bail_not_implemented!("Alter table with source having schema registry");
return Err(ErrorCode::NotSupported(
"alter table with schema registry".to_string(),
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, he might not want to change the schema.location or message but only do a refresh from the schema registry (because he has updated the schema in the schema registry server before)

Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind updating the PR title to make it more clear? IMO, "sr" and "conn" are not commonly used abbreviations.

let [definition]: [_; 1] = Parser::parse_sql(&definition)
.context("unable to parse original table definition")?
.try_into()
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code here is directly duplicated from alter_table_column. Could you please try your best to reuse some code snippets between them? Otherwise, it can be painful to maintain.

bail_not_implemented!("Alter table with source having schema registry");
return Err(ErrorCode::NotSupported(
"alter table with schema registry".to_string(),
"try `ALTER TABLE .. FORMAT .. ENCODE .. (...)` instead".to_string(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a plan to allow users to specify a completely different set of FORMAT and ENCODE when altering a table in the future? If not, I would suggest using a syntax like ALTER ... REFRESH SCHEMA REGISTRY for clarity and convenience.

@Rossil2012 Rossil2012 changed the title feat: alter table sr with conn feat: alter the schema registry of table with connector Feb 21, 2024
@Rossil2012 Rossil2012 added the user-facing-changes Contains changes that are visible to users label Mar 7, 2024
@Rossil2012 Rossil2012 requested review from fuyufjh and BugenZhao March 7, 2024 11:40
Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines 45 to 48
ALTER SOURCE src_user FORMAT PLAIN ENCODE PROTOBUF(
schema.registry = 'http://message_queue:8081',
message = 'test.User'
);
Copy link
Member

@fuyufjh fuyufjh Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, altering table with same format & encoding but different properties is also supported in this PR, right?

ALTER [SOURCE|TABLE] <name> FORMAT ... ENCODE ... (
    (... new properties ...)
);

If so, please update the PR description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I reverted the implementation of ALTER TABLE FORMAT ENCODE as we discussed in slack that it should be suspended and needs more consideration. This pr is only about ALTER TABLE REFRESH SCHEMA.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Only ALTER SOURCE works, ALTER TABLE ... FORMAT ... ENCODE is not in this PR.

Copy link
Member

@fuyufjh fuyufjh Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me summarize. So after this PR gets merged, it will be like

  • ALTER TABLE <name> FORMAT ... ENCODE ... ( ... )
  • ALTER SOURCE <name> FORMAT ... ENCODE ... ( ... )
  • ALTER TABLE <name> REFRESH SCHEMA
  • ALTER SOURCE <name> REFRESH SCHEMA

Right? Shall we complete the rest two in the future? Asked because I think we should keep the syntax of create source & table unified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right.
ALTER SOURCE <name> REFRESH SCHEMA is on progress.
ALTER TABLE <name> FORMAT ... ENCODE ... ( ... ) will start after further discussion.

@Rossil2012 Rossil2012 added this pull request to the merge queue Mar 8, 2024
@xxchan xxchan self-requested a review March 8, 2024 05:08
Merged via the queue into main with commit 3f102e1 Mar 8, 2024
33 of 34 checks passed
@Rossil2012 Rossil2012 deleted the kanzhen/alter-table-sr-with-conn branch March 8, 2024 05:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature user-facing-changes Contains changes that are visible to users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking: Alter Source/table with connector
5 participants