Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use salsa accumulators for diagnostics #14760

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Dec 3, 2024

Summary

This is a reimplementation of #14116 now that Salsa salsa accumulators are cheaper (close to "free" for queries without diagnostics).

The main benefit of salsa accumulators is that they're an easy way to emit a diagnostic from anywhere inside a salsa query, and salsa takes care to deduplicate diagnostics if the same query is called multiple times.

Performance

This still significantly regresses the incremental performance. I created another salsa PR to reduce the regression to 9%. I don't think we can do much better except reducing the number of inputs by using coarse-grained salsa dependencies because:

  • Salsa currently tracks reads on a per "field" level for salsa structs. This very fine-grained tracking costs us here because the accumulator has to iterate and resolve all of them to know if they have any accumulated values. The coarse-grained salsa feature will help with this (and reduce the overhead in other places as well)
  • We only check 4 files (plus standard library files that are all unchanged and have no diagnostics). The regression would be less proportionally if there were more files without diagnostics.
  • The diagnostics are from the largest file and, to make it worse, from the module scope, which has the most definitions.

Considering this, I'm still leaning towards migrating to salsa accumulators because of what they unlock: We can now emit diagnostics from any part of the code. This includes the module resolver (e.g. emit a warning if we find an invalid pth file?)

Other changes

This PR upgrades Salsa to a version with "cheap" accumulators. The new salsa version now requires that Db structs implement Clone for its parallel db support (which doesn't seem to work yet).

Test Plan

cargo test

@MichaReiser MichaReiser added internal An internal refactor or improvement red-knot Multi-file analysis & type inference labels Dec 3, 2024
@MichaReiser MichaReiser force-pushed the micha/salsa-accumulators branch 2 times, most recently from 7db3778 to 1f6247c Compare December 3, 2024 17:54
Copy link

codspeed-hq bot commented Dec 3, 2024

CodSpeed Performance Report

Merging #14760 will degrade performances by 10.43%

Comparing micha/salsa-accumulators (35f3815) with main (1685d95)

Summary

❌ 1 (👁 1) regressions
✅ 31 untouched benchmarks

Benchmarks breakdown

Benchmark main micha/salsa-accumulators Change
👁 red_knot_check_file[incremental] 4 ms 4.4 ms -10.43%

Copy link
Contributor

github-actions bot commented Dec 3, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@MichaReiser MichaReiser force-pushed the micha/salsa-accumulators branch from e0f43ee to 35f3815 Compare December 4, 2024 09:47
@MichaReiser MichaReiser marked this pull request as ready for review December 4, 2024 10:15
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of report_ everywhere makes me feel like I'm reading the pyright codebase haha. But in most cases it seems like it probably is the best verb to use. Still, there's a couple of places where I think we could use some better names:

pub struct CompileDiagnostic(std::sync::Arc<dyn Diagnostic>);

impl CompileDiagnostic {
pub fn report<T>(db: &dyn Db, diagnostic: T)
Copy link
Member

@AlexWaygood AlexWaygood Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is report the best verb here? It feels to me like we're adding a diagnostic to be reported later, rather than reporting it to the user immediately. Maybe this could be called CompileDiagnostic::add()?

format_args!("Name `{id}` used when not defined"),
);
/// Reports a diagnostic for the given node and file if diagnostic reporting is enabled for this file.
pub(crate) fn report_type_diagnostic(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, add_type_diagnostic would be a more natural name for me

@MichaReiser
Copy link
Member Author

Hmm, I'm looking into suppressions right now, specifically how we'd recognize unused suppression comments.
We could ignore inline and file-level suppression comments when deciding whether or not to emit a diagnostic and then filter them out as part of a post-processing step. The diagnostics would then allow us to "account" for the seen suppressions and list comments without any suppressions. However, this has the downside that more files have accumulated values, meaning the performance regression remains relevant for more files.

Another alternative to emitting the diagnostic is to emit a Suppressed accumulated value. That's cheaper because it can be a more lightweight value (and suppressing a diagnostic can be used to work-around a bug in the diagnostic generation), but it has the same downside as using the diagnostic accumulator: Collecting all values requires an extra traversal

diagnostic
}));
let mut diagnostics = check_types::accumulated::<CompileDiagnostic>(db, test_file.file);
// Filter out diagnostics that are not related to the current file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only tangentially related: do we have a feeling for how many unrelated-file diagnostics we generate typically? If this would be a large fraction of all diagnostics, we should probably detect that earlier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be any number of diagnostics and salsa doesn't provide a way to detect this earlier.

Salsa accumulators work by traversing the entire dependency tree of a query. In our case, this is check_file. check_file can branch out into arbitrary files when resolving imports (and checking those files in turn). The only way for us to truncate this earlier is by not using accumulators because salsa doesn't know about the file boundaries.

I don't think this is very terrible because the CLI uses check_workspace to get all diagnostics. The LSP uses check_file today but I don't think it should. The wasm playground uses check_file... this is probably fine

use crate::Db;

type AnnotationParseResult = Result<Parsed<ModExpression>, TypeCheckDiagnostics>;
type AnnotationParseResult = Result<Parsed<ModExpression>, ()>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Something like

Suggested change
type AnnotationParseResult = Result<Parsed<ModExpression>, ()>;
struct StringAnnotationParseError;
type AnnotationParseResult = Result<Parsed<ModExpression>, StringAnnotationParseError>;

would maybe make matches at the call sites of this function a bit easier to understand

Comment on lines +192 to +207
diagnostics.sort_unstable_by(|a, b| {
let a_file = a.file();
let b_file = b.file();

a_file.cmp(&b_file).then_with(|| {
a_file
.path(db)
.as_str()
.cmp(b_file.path(db).as_str())
.then_with(|| {
a.range()
.unwrap_or_default()
.start()
.cmp(&b.range().unwrap_or_default().start())
})
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions:

  • Do we really want an unstable sort here? Two different diagnostics could compare equal according to the ordering defined here if they refer to the same range, or even just the same range start.
  • Why do we order by file and then by the path of the file? Can multiple files refer to the same path?
  • If performance is not a major concern here, we could potentially do something like
    diagnostics.sort_unstable_by_key(|diagnostic| {
        let file = diagnostic.file();
        let path = file.path(db).as_str();
        let range_start = diagnostic.range().unwrap_or_default().start();
        (file, path, range_start)
    });

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we order by file and then by the path of the file? Can multiple files refer to the same path?

Oh, that's just wrong. The idea was to short-cut if the files are identical.

Do we really want an unstable sort here? Two different diagnostics could compare equal according to the ordering defined here if they refer to the same range, or even just the same range start.

I don't think it's important for us to preserve the original order if there are multiple diagnostics at the same line, for as long as the order is deterministic (Running Red Knot multiple times should give you the same result). We should probably include the rule code as well as disambiguator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for as long as the order is deterministic

Exactly, that was my concern.

We should probably include the rule code as well as disambiguator

👍

@MichaReiser MichaReiser marked this pull request as draft December 10, 2024 13:02
MichaReiser added a commit that referenced this pull request Dec 18, 2024
## Summary

I'm currently on the fence about landing the #14760 PR because it's
unclear how we'd support tracking used and unused suppression comments
in a performant way:
* Salsa adds an "untracked" dependency to every query reading
accumulated values. This has the effect that the query re-runs on every
revision. For example, a possible future query
`unused_suppression_comments(db, file)` would re-run on every
incremental change and for every file. I don't expect the operation
itself to be expensive, but it all adds up in a project with 100k+ files
* Salsa collects the accumulated values by traversing the entire query
dependency graph. It can skip over sub-graphs if it is known that they
contain no accumulated values. This makes accumulators a great tool for
when they are rare; diagnostics are a good example. Unfortunately,
suppressions are more common, and they often appear in many different
files, making the "skip over subgraphs" optimization less effective.

Because of that, I want to wait to adopt salsa accumulators for type
check diagnostics (we could start using them for other diagnostics)
until we have very specific reasons that justify regressing incremental
check performance.

This PR does a "small" refactor that brings us closer to what I have in
#14760 but without using accumulators. To emit a diagnostic, a method
needs:

* Access to the db
* Access to the currently checked file

This PR introduces a new `InferContext` that holds on to the db, the
current file, and the reported diagnostics. It replaces the
`TypeCheckDiagnosticsBuilder`. We pass the `InferContext` instead of the
`db` to methods that *might* emit diagnostics. This simplifies some of
the `Outcome` methods, which can now be called with a context instead of
a `db` and the diagnostics builder. Having the `db` and the file on a
single type like this would also be useful when using accumulators.

This PR doesn't solve the issue that the `Outcome` types feel somewhat
complicated nor that it can be annoying when you need to report a
`Diagnostic,` but you don't have access to an `InferContext` (or the
file). However, I also believe that accumulators won't solve these
problems because:

* Even with accumulators, it's necessary to have a reference to the file
that's being checked. The struggle would be to get a reference to that
file rather than getting a reference to `InferContext`.
* Users of the `HasTy` trait (e.g., a linter) don't want to bother
getting the `File` when calling `Type::return_ty` because they aren't
interested in the created diagnostics. They just want to know what
calling the current expression would return (and if it even is a
callable). This is what the different methods of `Outcome` enable today.
I can ask for the return type without needing extra data that's only
relevant for emitting a diagnostic.

A shortcoming of this approach is that it is now a bit confusing when to
pass `db` and when an `InferContext`. An option is that we'd make the
`file` on `InferContext` optional (it won't collect any diagnostics if
`None`) and change all methods on `Type` to take `InferContext` as the
first argument instead of a `db`. I'm interested in your opinion on
this.

Accumulators are definitely harder to use incorrectly because they
remove the need to merge the diagnostics explicitly and there's no risk
that we accidentally merge the diagnostics twice, resulting in
duplicated diagnostics. I still value performance more over making our
life slightly easier.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal An internal refactor or improvement red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants