Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Removing warning from feature server #4281

Merged
merged 2 commits into from
Jun 17, 2024
Merged

Conversation

franciscojavierarceo
Copy link
Member

@franciscojavierarceo franciscojavierarceo commented Jun 15, 2024

What this PR does / why we need it:

Remove deprecation from write.

The endpoints should behave as follows:

  • push: async write
  • write: sync write
  • materialize: async batch write

This is the architecture I recommend:

image

Which issue(s) this PR fixes:

Fixes

Signed-off-by: Francisco Javier Arceo <[email protected]>
Signed-off-by: Francisco Javier Arceo <[email protected]>
@franciscojavierarceo franciscojavierarceo changed the title chore: Removing logging from feature server chore: Removing warning from feature server Jun 15, 2024
@tokoko
Copy link
Collaborator

tokoko commented Jun 15, 2024

Can you expand on this? Do you mean a change in OnlineStore abstract class similar to what we did will online_read method? (There's now online_read and online_read_async). If so, I don't really get what does that have to do with the feature server. Client doesn't really have any knowledge whether a server endpoint is sync or async, there still would just be a single endpoint for write on the server, right?

@franciscojavierarceo
Copy link
Member Author

franciscojavierarceo commented Jun 16, 2024

@tokoko the client should be choosing whether to write async or sync, as you mentioned it has no knowledge about the behavior behind the endpoint. So no change to the OnlineStore is needed.

@tokoko
Copy link
Collaborator

tokoko commented Jun 17, 2024

@franciscojavierarceo I'm not sure I understand what an async write to an http server means. I'm guessing you don't mean using asyncio or something like that as that's a client-side thing and would not require any different treatment on the server (in other words, a single endpoint would be enough). What else then? Are you thinking of an endpoint that returns a response before actually writing to the online db?

@franciscojavierarceo
Copy link
Member Author

This is more about architectural patterns for data producers choosing to write to the online store and the three different options available.

They have different guarantees (latency, availability, and consistency) and tradeoffs about the data.

When a caller makes a synchronous write to a db it creates tight coupling and usually will result in retries on failures (or just entire crashes).

For asynchronous writes, the caller moves on and if the write fails, data may be lost entirely. For simplicity, I'd bucket Kafka events in this group even though that's not quite accurate (but everywhere I've worked has only use asynchronous writes with Kafka--though I believe technically you can wait for acknowledgement from a Kafka broker).

For batch writes, they are effectively large scale asynchronous writes but happening purely independently of some user experience and typically on an arbitrary schedule determined by some human process or some random choice by a developer.

These write patterns have massive consequences to the usage and guarantees of the features and machine learning models Feast users will use.

Let me know if that's more clear.

For

@tokoko
Copy link
Collaborator

tokoko commented Jun 17, 2024

I do understand those distinctions, but I'm having a hard time linking those to the changes proposed in this PR. I think the simplest way we would go about enabling an async write interface would be to offer the users a way to produce message payloads to a kafka topic that some feast service would then consume and write to the db afterwards. We actually sort of already have a stream processor contrib that does exactly that, don't we? I don't really understand how we can use our existing http feature server to accomplish any of that.

@franciscojavierarceo
Copy link
Member Author

I do understand those distinctions, but I'm having a hard time linking those to the changes proposed in this PR.

Ah, yes I think the "push" naming convention is confusing and I think "write" is better and I intend on creating documentation to outline this is all. My bad, I should have clarified that.

I think the simplest way we would go about enabling an async write interface would be to offer the users a way to produce message payloads to a kafka topic that some feast service would then consume and write to the db afterwards. We actually sort of already have a stream processor contrib that does exactly that, don't we? I don't really understand how we can use our existing http feature server to accomplish any of that.
Agree with this! This could be a huge thing that could be cool to do.

@franciscojavierarceo franciscojavierarceo enabled auto-merge (squash) June 17, 2024 18:40
@franciscojavierarceo franciscojavierarceo merged commit 4cb24fc into master Jun 17, 2024
26 checks passed
@tokoko tokoko deleted the write-online branch July 16, 2024 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants