Use salsa accumulators for diagnostics #14760

MichaReiser · 2024-12-03T17:47:23Z

Summary

This is a reimplementation of #14116 now that Salsa salsa accumulators are cheaper (close to "free" for queries without diagnostics).

The main benefit of salsa accumulators is that they're an easy way to emit a diagnostic from anywhere inside a salsa query, and salsa takes care to deduplicate diagnostics if the same query is called multiple times.

Performance

This still significantly regresses the incremental performance. I created another salsa PR to reduce the regression to 9%. I don't think we can do much better except reducing the number of inputs by using coarse-grained salsa dependencies because:

Salsa currently tracks reads on a per "field" level for salsa structs. This very fine-grained tracking costs us here because the accumulator has to iterate and resolve all of them to know if they have any accumulated values. The coarse-grained salsa feature will help with this (and reduce the overhead in other places as well)
We only check 4 files (plus standard library files that are all unchanged and have no diagnostics). The regression would be less proportionally if there were more files without diagnostics.
The diagnostics are from the largest file and, to make it worse, from the module scope, which has the most definitions.

Considering this, I'm still leaning towards migrating to salsa accumulators because of what they unlock: We can now emit diagnostics from any part of the code. This includes the module resolver (e.g. emit a warning if we find an invalid pth file?)

Other changes

This PR upgrades Salsa to a version with "cheap" accumulators. The new salsa version now requires that Db structs implement Clone for its parallel db support (which doesn't seem to work yet).

Test Plan

cargo test

codspeed-hq · 2024-12-03T17:55:21Z

CodSpeed Performance Report

Merging #14760 will degrade performances by 10.43%

_{Comparing micha/salsa-accumulators (35f3815) with main (1685d95)}

Summary

❌ 1 (👁 1) regressions
✅ 31 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`micha/salsa-accumulators`	Change
👁	`red_knot_check_file[incremental]`	4 ms	4.4 ms	-10.43%

github-actions · 2024-12-03T18:03:56Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

AlexWaygood

The use of report_ everywhere makes me feel like I'm reading the pyright codebase haha. But in most cases it seems like it probably is the best verb to use. Still, there's a couple of places where I think we could use some better names:

AlexWaygood · 2024-12-04T11:43:46Z

crates/ruff_db/src/diagnostic.rs

+pub struct CompileDiagnostic(std::sync::Arc<dyn Diagnostic>);
+
+impl CompileDiagnostic {
+    pub fn report<T>(db: &dyn Db, diagnostic: T)


is report the best verb here? It feels to me like we're adding a diagnostic to be reported later, rather than reporting it to the user immediately. Maybe this could be called CompileDiagnostic::add()?

AlexWaygood · 2024-12-04T11:44:39Z

crates/red_knot_python_semantic/src/types/diagnostic.rs

-            format_args!("Name `{id}` used when not defined"),
-        );
+/// Reports a diagnostic for the given node and file if diagnostic reporting is enabled for this file.
+pub(crate) fn report_type_diagnostic(


similarly here, add_type_diagnostic would be a more natural name for me

MichaReiser · 2024-12-04T13:36:50Z

Hmm, I'm looking into suppressions right now, specifically how we'd recognize unused suppression comments.
We could ignore inline and file-level suppression comments when deciding whether or not to emit a diagnostic and then filter them out as part of a post-processing step. The diagnostics would then allow us to "account" for the seen suppressions and list comments without any suppressions. However, this has the downside that more files have accumulated values, meaning the performance regression remains relevant for more files.

Another alternative to emitting the diagnostic is to emit a Suppressed accumulated value. That's cheaper because it can be a more lightweight value (and suppressing a diagnostic can be used to work-around a bug in the diagnostic generation), but it has the same downside as using the diagnostic accumulator: Collecting all values requires an extra traversal

sharkdp · 2024-12-04T15:02:10Z

crates/red_knot_test/src/lib.rs

-                diagnostic
-            }));
+            let mut diagnostics = check_types::accumulated::<CompileDiagnostic>(db, test_file.file);
+            // Filter out diagnostics that are not related to the current file.


Only tangentially related: do we have a feeling for how many unrelated-file diagnostics we generate typically? If this would be a large fraction of all diagnostics, we should probably detect that earlier?

This could be any number of diagnostics and salsa doesn't provide a way to detect this earlier.

Salsa accumulators work by traversing the entire dependency tree of a query. In our case, this is check_file. check_file can branch out into arbitrary files when resolving imports (and checking those files in turn). The only way for us to truncate this earlier is by not using accumulators because salsa doesn't know about the file boundaries.

I don't think this is very terrible because the CLI uses check_workspace to get all diagnostics. The LSP uses check_file today but I don't think it should. The wasm playground uses check_file... this is probably fine

sharkdp · 2024-12-04T15:32:02Z

crates/red_knot_python_semantic/src/types/string_annotation.rs

 use crate::Db;

-type AnnotationParseResult = Result<Parsed<ModExpression>, TypeCheckDiagnostics>;
+type AnnotationParseResult = Result<Parsed<ModExpression>, ()>;


Minor: Something like

Suggested change

type AnnotationParseResult = Result<Parsed<ModExpression>, ()>;

struct StringAnnotationParseError;

type AnnotationParseResult = Result<Parsed<ModExpression>, StringAnnotationParseError>;

would maybe make matches at the call sites of this function a bit easier to understand

sharkdp · 2024-12-04T15:46:42Z

crates/red_knot_workspace/src/workspace.rs

+        diagnostics.sort_unstable_by(|a, b| {
+            let a_file = a.file();
+            let b_file = b.file();
+
+            a_file.cmp(&b_file).then_with(|| {
+                a_file
+                    .path(db)
+                    .as_str()
+                    .cmp(b_file.path(db).as_str())
+                    .then_with(|| {
+                        a.range()
+                            .unwrap_or_default()
+                            .start()
+                            .cmp(&b.range().unwrap_or_default().start())
+                    })
+            })


A few questions:

Do we really want an unstable sort here? Two different diagnostics could compare equal according to the ordering defined here if they refer to the same range, or even just the same range start.

Why do we order by file and then by the path of the file? Can multiple files refer to the same path?

If performance is not a major concern here, we could potentially do something like
diagnostics.sort_unstable_by_key(|diagnostic| { let file = diagnostic.file(); let path = file.path(db).as_str(); let range_start = diagnostic.range().unwrap_or_default().start(); (file, path, range_start) });

Why do we order by file and then by the path of the file? Can multiple files refer to the same path?

Oh, that's just wrong. The idea was to short-cut if the files are identical.

Do we really want an unstable sort here? Two different diagnostics could compare equal according to the ordering defined here if they refer to the same range, or even just the same range start.

I don't think it's important for us to preserve the original order if there are multiple diagnostics at the same line, for as long as the order is deterministic (Running Red Knot multiple times should give you the same result). We should probably include the rule code as well as disambiguator

for as long as the order is deterministic

Exactly, that was my concern.

We should probably include the rule code as well as disambiguator

👍

## Summary I'm currently on the fence about landing the #14760 PR because it's unclear how we'd support tracking used and unused suppression comments in a performant way: * Salsa adds an "untracked" dependency to every query reading accumulated values. This has the effect that the query re-runs on every revision. For example, a possible future query `unused_suppression_comments(db, file)` would re-run on every incremental change and for every file. I don't expect the operation itself to be expensive, but it all adds up in a project with 100k+ files * Salsa collects the accumulated values by traversing the entire query dependency graph. It can skip over sub-graphs if it is known that they contain no accumulated values. This makes accumulators a great tool for when they are rare; diagnostics are a good example. Unfortunately, suppressions are more common, and they often appear in many different files, making the "skip over subgraphs" optimization less effective. Because of that, I want to wait to adopt salsa accumulators for type check diagnostics (we could start using them for other diagnostics) until we have very specific reasons that justify regressing incremental check performance. This PR does a "small" refactor that brings us closer to what I have in #14760 but without using accumulators. To emit a diagnostic, a method needs: * Access to the db * Access to the currently checked file This PR introduces a new `InferContext` that holds on to the db, the current file, and the reported diagnostics. It replaces the `TypeCheckDiagnosticsBuilder`. We pass the `InferContext` instead of the `db` to methods that *might* emit diagnostics. This simplifies some of the `Outcome` methods, which can now be called with a context instead of a `db` and the diagnostics builder. Having the `db` and the file on a single type like this would also be useful when using accumulators. This PR doesn't solve the issue that the `Outcome` types feel somewhat complicated nor that it can be annoying when you need to report a `Diagnostic,` but you don't have access to an `InferContext` (or the file). However, I also believe that accumulators won't solve these problems because: * Even with accumulators, it's necessary to have a reference to the file that's being checked. The struggle would be to get a reference to that file rather than getting a reference to `InferContext`. * Users of the `HasTy` trait (e.g., a linter) don't want to bother getting the `File` when calling `Type::return_ty` because they aren't interested in the created diagnostics. They just want to know what calling the current expression would return (and if it even is a callable). This is what the different methods of `Outcome` enable today. I can ask for the return type without needing extra data that's only relevant for emitting a diagnostic. A shortcoming of this approach is that it is now a bit confusing when to pass `db` and when an `InferContext`. An option is that we'd make the `file` on `InferContext` optional (it won't collect any diagnostics if `None`) and change all methods on `Type` to take `InferContext` as the first argument instead of a `db`. I'm interested in your opinion on this. Accumulators are definitely harder to use incorrectly because they remove the need to merge the diagnostics explicitly and there's no risk that we accidentally merge the diagnostics twice, resulting in duplicated diagnostics. I still value performance more over making our life slightly easier.

MichaReiser added internal An internal refactor or improvement red-knot Multi-file analysis & type inference labels Dec 3, 2024

MichaReiser force-pushed the micha/salsa-accumulators branch 2 times, most recently from 7db3778 to 1f6247c Compare December 3, 2024 17:54

MichaReiser force-pushed the micha/salsa-accumulators branch 3 times, most recently from 1d26d75 to 040f4d9 Compare December 4, 2024 08:35

MichaReiser mentioned this pull request Dec 4, 2024

Approximate sizes of stack and visisted when collecting accumulated values salsa-rs/salsa#622

Merged

MichaReiser force-pushed the micha/salsa-accumulators branch 3 times, most recently from 5cbbfc2 to e0f43ee Compare December 4, 2024 09:35

MichaReiser mentioned this pull request Dec 4, 2024

introduce parallel salsa salsa-rs/salsa#568

Merged

Use salsa accumulators for diagnostics

35f3815

MichaReiser force-pushed the micha/salsa-accumulators branch from e0f43ee to 35f3815 Compare December 4, 2024 09:47

MichaReiser marked this pull request as ready for review December 4, 2024 10:15

MichaReiser requested review from carljm, AlexWaygood and sharkdp as code owners December 4, 2024 10:15

AlexWaygood approved these changes Dec 4, 2024

View reviewed changes

sharkdp reviewed Dec 4, 2024

View reviewed changes

carljm mentioned this pull request Dec 4, 2024

[red-knot] Infer precise types for len() calls #14599

Merged

dhruvmanila approved these changes Dec 5, 2024

View reviewed changes

MichaReiser marked this pull request as draft December 10, 2024 13:02

MichaReiser mentioned this pull request Dec 13, 2024

Introduce InferContext #14956

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use salsa accumulators for diagnostics #14760

Use salsa accumulators for diagnostics #14760

MichaReiser commented Dec 3, 2024 •

edited

Loading

codspeed-hq bot commented Dec 3, 2024 •

edited

Loading

github-actions bot commented Dec 3, 2024 •

edited

Loading

AlexWaygood left a comment •

edited

Loading

AlexWaygood Dec 4, 2024 •

edited

Loading

AlexWaygood Dec 4, 2024

MichaReiser commented Dec 4, 2024

sharkdp Dec 4, 2024

MichaReiser Dec 4, 2024

sharkdp Dec 4, 2024

sharkdp Dec 4, 2024

MichaReiser Dec 4, 2024

sharkdp Dec 4, 2024

	type AnnotationParseResult = Result<Parsed<ModExpression>, ()>;
	struct StringAnnotationParseError;
	type AnnotationParseResult = Result<Parsed<ModExpression>, StringAnnotationParseError>;

Use salsa accumulators for diagnostics #14760

Are you sure you want to change the base?

Use salsa accumulators for diagnostics #14760

Conversation

MichaReiser commented Dec 3, 2024 • edited Loading

Summary

Performance

Other changes

Test Plan

codspeed-hq bot commented Dec 3, 2024 • edited Loading

Merging #14760 will degrade performances by 10.43%

Summary

Benchmarks breakdown

github-actions bot commented Dec 3, 2024 • edited Loading

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

AlexWaygood left a comment • edited Loading

Choose a reason for hiding this comment

AlexWaygood Dec 4, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MichaReiser commented Dec 4, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MichaReiser commented Dec 3, 2024 •

edited

Loading

codspeed-hq bot commented Dec 3, 2024 •

edited

Loading

github-actions bot commented Dec 3, 2024 •

edited

Loading

`ruff-ecosystem` results

AlexWaygood left a comment •

edited

Loading

AlexWaygood Dec 4, 2024 •

edited

Loading