Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Access] ws controller error handling #6798

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

illia-malachyn
Copy link
Contributor

@illia-malachyn illia-malachyn commented Dec 11, 2024

In this PR I identified and fixed all the places where we have to communicate with a client by sending an either OK or error response. I implemented new data flows (unsubscribe, list_subscription) and added tests for each. What's more, I unified and merged both code and tests with Ulyana's keepalive routine PR (mentioned below). Besides, I added a client ID to messages to see how the controller would look like and to avoid refactoring in the future.

As there's lots of work done, I'm pretty sure some scenarios might have bugs. So, it should be well reviewed. (maybe some cases were not covered in tests)

What's missing/to be done:

  1. Controller documentation
    We need to document a WebSocket controller, its routines, error handling, and shutdown process. Maybe even a scenario of every action.
  2. Tests documentation & simplification
    Because of the asynchronous nature of the controller, the tests become quite hard to understand. We need to write good docs for it. I'd also think of creating better abstractions (maybe wrapping mocks?) and refactoring them to reduce complexity (think of using https://pkg.go.dev/github.com/stretchr/testify/mock#Call.NotBefore)
  3. Error codes
    Firstly, error code names should be better. Also, I don't know if we should use them. Maybe sentinel errors are a better approach.

Closes #6642 #6637 #6724

This PR depends on #6757 #6762

@codecov-commenter
Copy link

codecov-commenter commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 82.38636% with 31 lines in your changes missing coverage. Please review.

Project coverage is 41.06%. Comparing base (66e6db0) to head (a6f604a).

Files with missing lines Patch % Lines
engine/access/rest/websockets/controller.go 83.33% 25 Missing and 3 partials ⚠️
...ckets/data_providers/mock/data_provider_factory.go 40.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6798      +/-   ##
==========================================
+ Coverage   41.01%   41.06%   +0.04%     
==========================================
  Files        2104     2104              
  Lines      185030   185076      +46     
==========================================
+ Hits        75892    75997     +105     
+ Misses     102767   102712      -55     
+ Partials     6371     6367       -4     
Flag Coverage Δ
unittests 41.06% <82.38%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Guitarheroua Guitarheroua self-requested a review December 11, 2024 12:25
@illia-malachyn illia-malachyn changed the title Illia malachyn/6642 ws controller error handling [Access] ws controller error handling Dec 11, 2024
@illia-malachyn illia-malachyn marked this pull request as ready for review December 12, 2024 16:30
@illia-malachyn illia-malachyn self-assigned this Dec 13, 2024
@illia-malachyn
Copy link
Contributor Author

illia-malachyn commented Dec 15, 2024

One thing I am not sure of is error codes. I introduced them to categorize and be able to compare actual/expected errors in unit tests. I need some advice on how to do it better. @peterargue

}

func (c *Controller) handleSubscribe(ctx context.Context, msg models.SubscribeMessageRequest) {
dp, err := c.dataProviderFactory.NewDataProvider(ctx, msg.Topic, msg.Arguments, c.communicationChannel)
// register new provider
provider, err := c.dataProviderFactory.NewDataProvider(ctx, msg.Topic, msg.Arguments, c.multiplexedStream)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I wrong, or we talk about using a writeResponse instead of the channel in data providers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, we thought that the use of a callback here would prevent a possible write to a closed channel but it turned out it would not. So, it doesn't matter what approach to use

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works as implemented. please add comments around the channel's lifecycle about how you know it's safe. particularly covering how and why the producers are not responsible for closing the channel.

engine/access/rest/websockets/models/unsubscribe.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/models/error.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/models/base_message.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/error_codes.go Show resolved Hide resolved
// drain the channel as some providers may still send data to it after this routine shutdowns
// so, in order to not run into deadlock there should be at least 1 reader on the channel
go func() {
for range c.multiplexedStream {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rather than draining in a second loop here, you can just refactor the existing loop to drop messages after the context is closed (instead of returning)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to drain the channel in any case, not only when ctx is canceled

engine/access/rest/websockets/controller.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/controller.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/controller_test.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/controller_test.go Outdated Show resolved Hide resolved
engine/access/rest/websockets/controller_test.go Outdated Show resolved Hide resolved
@peterargue
Copy link
Contributor

@Guitarheroua can you fix these conflicts so we can merge in master?

Copy link
Contributor

@Guitarheroua Guitarheroua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

engine/access/rest/websockets/error_codes.go Show resolved Hide resolved
@@ -8,6 +8,8 @@ type ListSubscriptionsMessageRequest struct {
// ListSubscriptionsMessageResponse is the structure used to respond to list_subscriptions requests.
// It contains a list of active subscriptions for the current WebSocket connection.
type ListSubscriptionsMessageResponse struct {
BaseMessageResponse
Subscriptions []*SubscriptionEntry `json:"subscriptions,omitempty"`
ClientMessageID string `json:"message_id"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should match with the other message types

Suggested change
ClientMessageID string `json:"message_id"`
ClientMessageID string `json:"message_id,omitempty"`

Code int `json:"code"`
Message string `json:"message"`
Action string `json:"action,omitempty"`
SubscriptionID string `json:"subscription_id,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given SubscriptionID is also included in BaseMessageResponse, is it also needed here?

// and avoid timeouts.
//
// Expected errors during normal operation:
// - context.Canceled if the client disconnected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context.Canceled is never returned

err := c.conn.WriteControl(websocket.PingMessage, time.Now().Add(WriteWait))
if err != nil {
if errors.Is(err, websocket.ErrCloseSent) {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this an error condition?

if !ok {
return fmt.Errorf("communication channel closed, no error occurred")
return fmt.Errorf("multiplexed stream closed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this an error condition?

}

func (c *Controller) parseAndValidateMessage(message json.RawMessage) (models.BaseMessageRequest, interface{}, error) {
func (c *Controller) parseAndValidateMessage(ctx context.Context, message json.RawMessage) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does more than parseAndValidateMessage. how about renaming to handleMessage?

//
// No errors are expected during normal operation. All errors are considered benign.
// Expected errors during normal operation:
// - context.Canceled if the client disconnected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't look like this method returns context.Canceled

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Access] Design error handling for web socket connection
5 participants