Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discussion(connector): persist avro/proto definition in meta #12982

Open
xiangjinwu opened this issue Oct 20, 2023 · 8 comments
Open

discussion(connector): persist avro/proto definition in meta #12982

xiangjinwu opened this issue Oct 20, 2023 · 8 comments
Assignees
Milestone

Comments

@xiangjinwu
Copy link
Contributor

xiangjinwu commented Oct 20, 2023

Do we still choose to fetch the descriptor from registry every time we create a new formatter instead of persisting it in meta when creating the sink?

Originally posted by @wenym1 in #12858 (comment)

@github-actions github-actions bot added this to the release-1.4 milestone Oct 20, 2023
@hzxa21
Copy link
Collaborator

hzxa21 commented Oct 23, 2023

Just to confirm, the benefit of persisting avro/proto definition in meta is not needing to fetch it from schema registry on restart/recovery?

@wenym1
Copy link
Contributor

wenym1 commented Oct 24, 2023

Just to confirm, the benefit of persisting avro/proto definition in meta is not needing to fetch it from schema registry on restart/recovery?

It's also for consistency and stability. If we don't persisting the metadata in meta, if the value is changed in the external system, RW may be implicitly affected.

@tabVersion
Copy link
Contributor

Just to confirm, the benefit of persisting avro/proto definition in meta is not needing to fetch it from schema registry on restart/recovery?

It's also for consistency and stability. If we don't persisting the metadata in meta, if the value is changed in the external system, RW may be implicitly affected.

Yes, but maybe we just need to persist the version number instead of the whole descriptor set.

@xiangjinwu
Copy link
Contributor Author

xiangjinwu commented Jan 9, 2024

Yes, but maybe we just need to persist the version number instead of the whole descriptor set.

I was assuming this to be the pre-req of #11800/#14056/#14057 but it actually went to another direction that allows to alter to a completely new URL/subject rather than bumping the version number.

Also some related questions to answer:

  • What if not using schema registry?
    • There is no version number and we would have to persist the definition itself.
    • Leave it unhandled because non-registry already has other limitations as well.
  • What about other sinks?
    • For example, doris sink also queries downstream for schema before writing to it. Are we also worried it would change implicitly and would like to handle them similarly?

@fuyufjh
Copy link
Member

fuyufjh commented Mar 1, 2024

Strong +1 for this. It would be very counter-intuitive if RisingWave restarts and fails because some schema definition URL is expired/invalid.

@fuyufjh
Copy link
Member

fuyufjh commented Mar 1, 2024

By the way, correspondingly, we shall provide a command ALTER TABLE/SOURCE ... REFRESH SCHEMA so that users can update the schema easily.

Related #15025. After that, REFRESH SCHEMA can be considered as a simpler shortcut for altering the schema configs but change nothing.

@xiangjinwu xiangjinwu modified the milestones: release-1.7, release-1.8 Mar 6, 2024
@xiangjinwu xiangjinwu modified the milestones: release-1.8, release-1.9 Apr 8, 2024
@xiangjinwu xiangjinwu modified the milestones: release-1.9, release-1.10 May 13, 2024
@st1page
Copy link
Contributor

st1page commented Jun 26, 2024

When executing CREATE SINK INTO TABLE, the SQL statement must be fully processed again to generate a plan.

let (mut graph, mut table, source) =
In this case, it will also fetch the external defineiton again which could be invaild.

@xxchan
Copy link
Member

xxchan commented Sep 11, 2024

I elaborated what problems may happen when initial schema not persisted here:

https://github.com/risingwavelabs/risingwave/pull/18419/files#r1753603090

(And this PR fixed a bug caused by the exact problem

@xiangjinwu xiangjinwu modified the milestones: release-2.1, release-2.2 Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants