-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modular contracts #221
base: master
Are you sure you want to change the base?
Modular contracts #221
Conversation
c4d3a7a
to
c3932c6
Compare
db3412f
to
9ebf0c9
Compare
9ebf0c9
to
38b1118
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to see some thought put into this, and I like your inventive and clever way to use find_illegal_dependencies_for_layers
.
As per my longer comment, I think we need to (i) understand better what we want to report in the case of failure and (ii) prevent it taking forever / giving millions of results on large code bases. I'll give it a try on a real code base and report back, but keen to hear your thoughts in the meantime.
from importlinter.domain.imports import Module | ||
|
||
|
||
# TODO: import from helpers once https://github.com/seddonym/import-linter/pull/220 is merged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI A good way to do this is just to branch off the other branch, and then edit the base in the Github UI so it's merging into that PR. Once that PR gets merged, the base will update.
|
||
type_name = "modular" | ||
|
||
modules = fields.ListField(subfield=fields.ModuleField()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not a SetField
?
def check(self, graph: ImportGraph, verbose: bool) -> ContractCheck: | ||
violations = {} | ||
for module in self.modules: # type: ignore | ||
direct_submodules = _resolve_wildcards(f"{module.name}.*", graph) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call them 'children' rather than 'direct submodules' for consistency. And better to use graph.find_children
to get them.
) | ||
violations[module.name] = sorted( | ||
{ | ||
f"{dependency.imported} <- {dependency.importer}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the notation consistent - the importer is always listed before the imported, and with the arrow in the other direction.
More generally I reckon it would be better to put richer objects in the metadata - take a look at what the independence contract does.
squashed_graph.squash_module(m.name) | ||
|
||
dependencies = squashed_graph.find_illegal_dependencies_for_layers( | ||
layers=({y.name for y in direct_submodules},), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a clever way to solve the problem, but it took me a while to understand! It could do with a comment to explain what's going on here. 😉
{ | ||
f"{dependency.imported} <- {dependency.importer}" | ||
for dependency in dependencies | ||
if squashed_graph.find_shortest_chains( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very costly operation on large code bases, while the find_illegal_dependencies_for_layers
is highly optimized (it uses Rust) and has already, I think, done most of the work for us.
I have a feeling we could employ an algorithm along these lines, though as you'll see I am not sure on the details yet on the final step:
- Find all dependencies between children (using your 'children are independent' approach with
find_illegal_dependencies_for_layers
). Now we have a set of PackageDependencies (https://grimp.readthedocs.io/en/stable/usage.html#PackageDependency). - Build a different Grimp graph manually consisting only of those children (so, only a handful of nodes). Then identify any cycles that exist between the children (maybe using a combination of the package dependency results and a call to
find_shortest_chains
?) That will be quick as there should just be a handful of nodes and edges. - Then comes the hard part. We need to figure out which parts of the cycles are interesting and which to report on. Which imports are illegal and which not? How do we manage large numbers of cycles? How do we make it deterministic (possibly hard, given that the routes within the package dependencies returned by
find_illegal_dependencies_for_layers
are not completely deterministic?
Really interested to explore these issues more. I will post back as I have more ideas, but would welcome other input to help define concretely what we want to happen here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I ran it on a large codebase.
I ran it at the top level, which has a fairly small amount of child modules, and the good news is it didn't take long, around 3 seconds. However I get this output: it's difficult to understand what the cycle is, do you agree?
child modules of mypackage must be modular and thus circular dependencies are not allowed:
- mypackage.application <- mypackage.comms
- mypackage.application <- mypackage.domain
- mypackage.application <- mypackage.interfaces
- mypackage.application <- mypackage.plugins
- mypackage.celery_config <- mypackage.domain
- mypackage.celery_config <- mypackage.interfaces
- mypackage.celery_config <- mypackage.plugins
- mypackage.celery_config <- mypackage.services
- mypackage.celery_signals <- mypackage.celery_config
- mypackage.comms <- mypackage.application
- mypackage.comms <- mypackage.data
- mypackage.comms <- mypackage.domain
- mypackage.comms <- mypackage.interfaces
- mypackage.comms <- mypackage.plugins
- mypackage.core <- mypackage.application
- mypackage.core <- mypackage.data
- mypackage.core <- mypackage.domain
- mypackage.core <- mypackage.interfaces
- mypackage.core <- mypackage.plugins
- mypackage.core <- mypackage.settings
- mypackage.core <- mypackage.utils
- mypackage.core <- mypackage.vendors
(and lots more)
I also ran it on one of the child modules, which has 141 children itself. I ran the contract for an hour and a half before killing the process.
Checking in on this. It's looking possible that my workplace will need something along these lines so I'm likely to have capacity to work on it if time is tight for you. Something that came up in discussion, the term 'modular' is perhaps overloaded. I'm considering calling this kind of contract 'acyclic' instead. Any thoughts? |
Feel free to take any of the PRs and continue. Time is tight, and it would be amazing to have these as collaborative features. |
Closes #133
The implementation makes use of the multiple sibling modules checks of the layers contract.
For each submodule of the provided module, we squash the (copy of) the graph.
For every dependency that would be illegal by
find_illegal_dependencies_for_layers
, we check if there is a path in the reverse direction byfind_shortest_chains
. If so, then this violates the modular contract.Implementation deviates from the issue as it considers the full depth of the tree (the user can provide submodules). Does this make sense?
Uses functionality from #220 that we can import if merged.