Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling for which producers can correctly encode which queries #9

Open
richtia opened this issue Dec 19, 2022 · 1 comment
Open

Comments

@richtia
Copy link
Member

richtia commented Dec 19, 2022

I think there are some bits here that are being handled manually that should be handled in a more automated fashion.

With the hand-written tuples for which producer can correctly encode which query, we have the means to track regressions (DuckDB suddenly fails to run logb, e.g.) but not improvements (Isthmus can now encode logb).

This is a highly non-trivial problem, because the outcomes of the tests of the producers are essentially the text fixtures for the consumers.

We've been using pytest-snapshot to test that Ibis produces "good" or "golden" SQL for various expressions (https://pypi.org/project/pytest-snapshot/) and I wonder if that would be of help here.

Testing producers would mean generating substrait blobs, then comparing them to known good / valid snapshots of those blobs.

Testing consumers would consist of loading the snapshots blobs and attempting to execute.

I know I'm not covering everything that needs covering in the test matrix here, but I think it would be a very good idea to start sketching out more sustainable patterns.

Having said all of ^^^^that^^^^, I don't think that should block this PR.

I do think that we should be attempting to run all producer tests on all SQL snippets, and not manually filtering them down pre-test. If isthmus is going to fail one of those tests because it uses a different SQL dialect, so be it -- we can get creative in the xfail markers and distinguish between "tests that fail that should pass in the future" and "tests that fail that will always fail".

Alternatively, we might make use of sqlglot to translate string sql between dialects -- it's very good at that.

Originally posted by @gforsyth in #6 (review)

@richtia
Copy link
Member Author

richtia commented Mar 28, 2023

Partially resolved by #14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant