Improve efficiency of CI checks (so we can add MORE!) #13845
Comments
Running tests after a merge has a problem: who is going to look at the results? It seems to me that we should either give merge queue a shot, or at least understand why we're not doing that.
I agree with @findepi: once it is merged it is a deferred problem, and it is much harder to find the root cause. The challenge is that we focused on moving most test cases to sqllogictests, which is good because there is no test duplication anymore. But the other side of the coin is that we cannot unit test some packages without the end-to-end sqllogictest tests, and those require all of DF to be compiled.
Yes, I also 100% agree.
Maybe we could still run the more advanced tests before merging a PR, but not run them all during development, review, and iteration. This would ensure that if a PR introduced a problem it wouldn't be merged to main, so the error is not deferred. I expect in most cases the "extended" tests will simply pass and things will be fine. Occasionally the extended tests would fail and we would have to go dig into why before the PR was merged. It seems like wishful thinking though, as I am not sure there is capacity to create and maintain such a workflow 🤔
I was also thinking that this "when to run more extended tests" question may have stopped the great work @2010YOUY01 did to run with SQLancer. I suspect it is somewhat thankless work to run, triage, and file those tickets, and it is totally understandable that interests move on.
That's exactly what merge queues are supposed to do.
Note that benchmarks are not qualitative, but quantitative. If we have a merge queue, we should be able to run some benchmarks, but not necessarily all of them.
It sounds like then "implement a basic merge queue" would be a good next step!
Maintaining an extended workflow shouldn't be too bad tbh. I think having a workflow that runs outside of PRs (in other words, nightly) could be useful as well for expensive tests that rarely break (or just take a long time to run). Think sqlite tests, etc.
I think the cost / annoyance will be when the tests start failing: someone has to care enough to look into the failure and triage / file tickets.
Other alternatives:
Split test executions in multiple jobs
We can generate a Docker image with the code and the tests compiled, and then have separate jobs that pull the image from the embedded GitHub Docker registry and each run a subset of the tests. This way the tests could be parallelized.
Using more powerful workers
I have no knowledge of the current status, but in general workers from GitHub are pretty slow. If the Apache Foundation Infra has a Kubernetes cluster, we could run GitHub workers there (also useful if we want to use GPUs for testing).
Ah yes, I can definitely see that as a problem for sure.
Thanks for the ideas @edmondop -- FWIW the ASF doesn't have its own hosted workers (individual projects could do that -- arrow did for a time -- but it requires dedicated time / effort from someone / some company to maintain them, which I don't think we have in DataFusion at this time).
BTW I think we may get our first experience with tests that don't run on PRs as part of
What I envision doing is running on each commit to main, and then we'll have to have the discipline to look at any failures that happen and fix them. I am not 100% sure we'll be able to muster the ability to do so, but I hope to give it a try.
Is your feature request related to a problem or challenge?
There is a tension between adding more tests and PR / code velocity (more tests --> longer CI).
DataFusion runs many tests on every change to every PR. For example, my most recent PR ran 24 tests (link), consuming over an hour of worker time.
This has several challenges
Another observation is that there are several tests that rarely fail in PRs but offer important coverage, such as the Windows and Mac tests. We even disabled the Windows test due to its (lack of) speed.
Describe the solution you'd like
I would like to improve the efficiency of existing CI jobs, as well as have a mechanism to run both new and existing tests that offer important coverage but take too long to run on each CI run.
Describe alternatives you've considered
Here are some options from my past lives:
Option 1: Change some tests to only run on merges to main
In this option, we could change some checks to only run on merges to main. For example, we could run the Windows tests only on merges to main rather than also on PRs.
Instead of all jobs being triggered like
we would change some tests to run like
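As a rough sketch of this idea, assuming standard GitHub Actions trigger syntax: the file names, workflow names, job names, and test commands below are illustrative placeholders, not DataFusion's actual configuration. The first workflow keeps running on every PR, while the second is restricted to pushes to main.

```yaml
# Hypothetical file: .github/workflows/pr-checks.yml
# Fast checks keep running on every PR and on pushes to main.
name: PR checks
on:
  push:
    branches: [main]
  pull_request:
jobs:
  linux-test:          # illustrative job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test
---
# Hypothetical file: .github/workflows/main-only.yml
# Slow, rarely failing checks run only after a merge lands on main.
name: Main-only checks
on:
  push:
    branches: [main]
jobs:
  windows-test:        # illustrative job name
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test
```

The only difference between the two is the absence of the pull_request trigger in the second workflow, so moving a job from one group to the other is a matter of moving it between files.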
pros:
cons:
We could probably add some sort of job that would automatically make revert PRs for any code that broke the main tests to help triage
Option 2: Implement more sophisticated merge flow
In this option, I would imagine a workflow like
This might already be supported by GitHub in Merge Queues
There are probably bots like https://github.com/rust-lang/homu that could automate something like this
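For the GitHub-native route, a minimal sketch might look like the following, assuming the repository has a merge queue enabled on main. GitHub Actions fires the merge_group event when a PR is added to the queue; the file name, job name, and test command are hypothetical.

```yaml
# Hypothetical file: .github/workflows/extended.yml
# Extended checks run only for merge queue entries, not on every PR push.
name: Extended checks
on:
  merge_group:
jobs:
  extended-tests:      # illustrative job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder command; the real extended suite would go here.
      - run: cargo test --workspace
```

How such a job interacts with required status checks on the PR itself depends on the repository's branch protection settings, which this sketch does not cover.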
pros:
cons:
Option 3: Your idea here
Additional context
No response