Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modular contracts #221

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

sbrugman
Copy link
Contributor

@sbrugman sbrugman commented Feb 23, 2024

Closes #133

The implementation makes use of the multiple sibling modules checks of the layers contract.

For each submodule of the provided module, we squash the (copy of) the graph.

For every dependency that would be illegal by find_illegal_dependencies_for_layers, we check if there is a path in the reverse direction by find_shortest_chains. If so, then this violates the modular contract.

Implementation deviates from the issue as it considers the full depth of the tree (the user can provide submodules). Does this make sense?

Uses functionality from #220 that we can import if merged.

@sbrugman sbrugman force-pushed the modular-contract branch 4 times, most recently from c4d3a7a to c3932c6 Compare February 23, 2024 20:15
@sbrugman sbrugman marked this pull request as draft February 23, 2024 20:22
@sbrugman sbrugman force-pushed the modular-contract branch 3 times, most recently from db3412f to 9ebf0c9 Compare February 23, 2024 20:38
@sbrugman sbrugman marked this pull request as ready for review February 23, 2024 20:48
Copy link
Owner

@seddonym seddonym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see some thought put into this, and I like your inventive and clever way to use find_illegal_dependencies_for_layers.

As per my longer comment, I think we need to (i) understand better what we want to report in the case of failure and (ii) prevent it taking forever / giving millions of results on large code bases. I'll give it a try on a real code base and report back, but keen to hear your thoughts in the meantime.

from importlinter.domain.imports import Module


# TODO: import from helpers once https://github.com/seddonym/import-linter/pull/220 is merged
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI A good way to do this is just to branch off the other branch, and then edit the base in the Github UI so it's merging into that PR. Once that PR gets merged, the base will update.


type_name = "modular"

modules = fields.ListField(subfield=fields.ModuleField())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not a SetField?

def check(self, graph: ImportGraph, verbose: bool) -> ContractCheck:
violations = {}
for module in self.modules: # type: ignore
direct_submodules = _resolve_wildcards(f"{module.name}.*", graph)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call them 'children' rather than 'direct submodules' for consistency. And better to use graph.find_children to get them.

)
violations[module.name] = sorted(
{
f"{dependency.imported} <- {dependency.importer}"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep the notation consistent - the importer is always listed before the imported, and with the arrow in the other direction.

More generally I reckon it would be better to put richer objects in the metadata - take a look at what the independence contract does.

squashed_graph.squash_module(m.name)

dependencies = squashed_graph.find_illegal_dependencies_for_layers(
layers=({y.name for y in direct_submodules},),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clever way to solve the problem, but it took me a while to understand! It could do with a comment to explain what's going on here. 😉

{
f"{dependency.imported} <- {dependency.importer}"
for dependency in dependencies
if squashed_graph.find_shortest_chains(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very costly operation on large code bases, while the find_illegal_dependencies_for_layers is highly optimized (it uses Rust) and has already, I think, done most of the work for us.

I have a feeling we could employ an algorithm along these lines, though as you'll see I am not sure on the details yet on the final step:

  1. Find all dependencies between children (using your 'children are independent' approach with find_illegal_dependencies_for_layers). Now we have a set of PackageDependencies (https://grimp.readthedocs.io/en/stable/usage.html#PackageDependency).
  2. Build a different Grimp graph manually consisting only of those children (so, only a handful of nodes). Then identify any cycles that exist between the children (maybe using a combination of the package dependency results and a call to find_shortest_chains?) That will be quick as there should just be a handful of nodes and edges.
  3. Then comes the hard part. We need to figure out which parts of the cycles are interesting and which to report on. Which imports are illegal and which not? How do we manage large numbers of cycles? How do we make it deterministic (possibly hard, given that the routes within the package dependencies returned by find_illegal_dependencies_for_layers are not completely deterministic?

Really interested to explore these issues more. I will post back as I have more ideas, but would welcome other input to help define concretely what we want to happen here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I ran it on a large codebase.

I ran it at the top level, which has a fairly small amount of child modules, and the good news is it didn't take long, around 3 seconds. However I get this output: it's difficult to understand what the cycle is, do you agree?

child modules of mypackage must be modular and thus circular dependencies are not allowed:

- mypackage.application <- mypackage.comms
- mypackage.application <- mypackage.domain
- mypackage.application <- mypackage.interfaces
- mypackage.application <- mypackage.plugins
- mypackage.celery_config <- mypackage.domain
- mypackage.celery_config <- mypackage.interfaces
- mypackage.celery_config <- mypackage.plugins
- mypackage.celery_config <- mypackage.services
- mypackage.celery_signals <- mypackage.celery_config
- mypackage.comms <- mypackage.application
- mypackage.comms <- mypackage.data
- mypackage.comms <- mypackage.domain
- mypackage.comms <- mypackage.interfaces
- mypackage.comms <- mypackage.plugins
- mypackage.core <- mypackage.application
- mypackage.core <- mypackage.data
- mypackage.core <- mypackage.domain
- mypackage.core <- mypackage.interfaces
- mypackage.core <- mypackage.plugins
- mypackage.core <- mypackage.settings
- mypackage.core <- mypackage.utils
- mypackage.core <- mypackage.vendors
(and lots more)

I also ran it on one of the child modules, which has 141 children itself. I ran the contract for an hour and a half before killing the process.

@seddonym
Copy link
Owner

seddonym commented Mar 6, 2024

Checking in on this. It's looking possible that my workplace will need something along these lines so I'm likely to have capacity to work on it if time is tight for you.

Something that came up in discussion, the term 'modular' is perhaps overloaded. I'm considering calling this kind of contract 'acyclic' instead. Any thoughts?

@sbrugman
Copy link
Contributor Author

sbrugman commented Mar 7, 2024

Feel free to take any of the PRs and continue. Time is tight, and it would be amazing to have these as collaborative features.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New contract type: modular
2 participants