Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encapsulate format logic within Statement and helper classes #162

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

leboshi
Copy link
Contributor

@leboshi leboshi commented Sep 6, 2024

Summary

Handling different I/O formats added a lot of complexity to the codebase: whenever a specific format is needed or requested, that format has to be passed around to whatever handles the response so that it can be parsed correctly. Various methods around parsing responses according to different formats have been growing and collecting within the SchemaStatements module, where they don't really seem to belong. process_response also has nested case statements that are hard to grok.

This PR extracts three helper classes:

  • FormatManager to determine when to automatically apply a FORMAT clause to a SQL statement,
  • ResponseProcessor to parse the response according to the requested format and handle errors, and
  • Statement to couple them together and maintain the format state.

The first two are internal to Statement and have no reason to be used directly outside of it.

Breaking Changes

  • Move DEFAULT_RESPONSE_FORMAT constant into ClickhouseAdapter, since it's now used by helper classes instead of only within the SchemaStatements module.

Deprecations

  • Passing format: keyword to execute is deprecated (with no specific horizon yet) in favor of the new with_response_format wrapper method. The deprecation warning points users to the new method.
  • execute and do_execute had effectively identical signatures, so do_execute has been deprecated.

Next Steps

The skip_format? check in the FormatManager really just includes checks that were already being run plus a few more we've found. They are not comprehensive. Ideally, we should go through the documentation to identify exactly which statements do or don't accept a FORMAT clause.

The chain of conditionals is also less than ideal. It's ugly but manageable for now. Whenever the complete list of inclusions/exclusions materializes, it may be better to apply a filter pattern to clean up the checks.

@PNixx
Copy link
Owner

PNixx commented Oct 23, 2024

@leboshi need resolve conflicts

@leboshi leboshi force-pushed the refactor/encapsulate-format-logic branch from 2b4c2dc to 7fb1a7f Compare November 22, 2024 00:30
@leboshi
Copy link
Contributor Author

leboshi commented Nov 22, 2024

Sorry that took so long! It's been a busy couple months.

module ConnectionAdapters
module Clickhouse
class Statement
class FormatManager
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This incorrect logic for request:

SELECT * FROM table WHERE column LIKE 'insert into%'

We have a method exec_insert which clearly indicates that this is an insert and what response format should be used.

spec/cluster/migration_spec.rb Outdated Show resolved Hide resolved
end

def format_from_json_compact_each_row_with_names_and_types(body)
rows = body.split("\n").map { |row| parse_json_payload(row) }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

body.each_line.map should be more efficient since it doesn't need to allocate an array before iterating over it to parse the JSON.

@leboshi
Copy link
Contributor Author

leboshi commented Dec 6, 2024

OK, all the above should be resolved! Thanks to both of you for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants