-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] annotation wrapper class #282
base: main
Are you sure you want to change the base?
[WIP] annotation wrapper class #282
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this. I have some inline comments but my high level feedback would be:
A lot of what I write below is gut feeling, I don't like making suggestions based on this without some further experimentation but in the interest of time I will. I probably miss stuff (or am flat out wrong) but I hope the suggestions/ideas are useful.
I am not a fan of Annotated{Src,Bool,Int,Dict,...} in general, this is true for this approach and for #280. I am worried about the complexity and how we need to overload magic methods and then we have a getitem on a bool etc).
But I do like the idea of AnnotatedNode (or OtkNode) and AnnoatedNote.value because it gives access to the real thing and all operations will work as expected on ".value" plus the inconvenience of having to do "node.value" is actually useful as it hints the user of the API that AnnotatedNode is not the thing to work with but just a wrapper that carries the value and extra data.
I would love to experiment with something slightly smaller:
- Just have "AnnotatedNode" with the minimal needed interface, probably hash/eq. And AnnotatedNode takes a value and (optionally an exiting AnnotatedNode from which it will inherit the annotation history)
- The API of AnnoatedNode would be just
.value
and.src
withsrc
a dataclass that carries a list of filename,line for now (and the two magic methods so that Nodes can be put into a dict and compared) - when we need to check the type (like in transform.py:resolve() we use
isinstance(data.value, dict)
- we will most likely need convenience OtkDict, OtkList or navigation the tree will be extremly painful, they just need to provide:
def __getitem__(self, item): return self.value.__getitem__
and we should construct them only in the yaml loader for list/mapping (or explicitely in transform.py)
I did a little bit of experimenting in https://github.com/osbuild/otk/compare/main...mvo5:annotations-3?expand=1 - its extremly messy and incomplete but I would love to see this explored a bit more. If that works out I think it's preferable over the subclass approvch in #280 but if we really need classes for all python types here too I think the subclass one is better (but I hope we won't)
Sorry for this slightly messy review and my rambling, hope it's helpful still
class AnnotatedNode(Generic[AnnotatedNodeT]): | ||
|
||
# to be able to print relative paths | ||
_basedirs = [os.path.curdir] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(unrelated but I can't help noticing) - using a class attribute like this is problematic pattern as this is shared access all instance, I think it's best to avoid this kind of shared state as it may lead to subtle surprises down the line.
_basedirs = [os.path.curdir] | ||
|
||
def __init__(self, value: Optional[AnnotatedNodeT] = None, annotations: Optional[dict[str, Any]] = None): | ||
self.value: AnnotatedNodeT = self.deep_convert(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that self.value
should be the real value, not anything we convert. If it's a bool, it is a bool, dict etc. This way we avoid the issue that AnnoatedBool(true) is True
is actually (and surprisingly) False
. With the real value at value this beomces the correct AnnotatedValue(True).value is True == True
.
Fwiw, a similar comment as in #280 (comment) - I think the yaml buildup will give us the right types here and we don't need to convert (except for the loading of data fromthe external which we can probably do there instead of here)
AnnotatedNodeT = TypeVar('AnnotatedNodeT') | ||
|
||
|
||
class AnnotatedNode(Generic[AnnotatedNodeT]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would just call it OtkValue
or OtkNode
(as we are not writing a generic library here (yet))
|
||
def __init__(self, value: Optional[AnnotatedNodeT] = None, annotations: Optional[dict[str, Any]] = None): | ||
self.value: AnnotatedNodeT = self.deep_convert(value) | ||
self.annotations: dict[str, Any] = annotations or {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just use a dataclass like here with the minimal amount of informatin we need
self.squash_annotations([self.value]) # type: ignore[attr-defined] | ||
self.value = self.value.value # type: ignore[attr-defined] | ||
|
||
def __iter__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If .value
the real value, could we try to get as minimal as we can on the "magic" methods here? It seems an OtkNode would need __eq__
and __hash__
so that values it can be get/set into dicts. Then we would need a Otk{List,Dict} with __getitem__
so that we can navigate the resulting json in a "natural" way but beyond that I am not sure we need more if we always use ".value" for the real manipulations.
if not isinstance(item, AnnotatedNode): | ||
continue | ||
for k in item.annotations.keys(): | ||
if k == "src": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging via string manipulation is not ideal and the special handling of "src" even less. I think we should go YAGNI on this too and just have "src" as a dataclass and then we can have a list of dataclasses if there are multiple annotations (and a maybe a str to format them nicley?)
self.annotations[f"{prefix}column"] = column | ||
self.annotations[f"{prefix}column_end"] = column_end | ||
|
||
def squash_annotations(self, annotation_list: list) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe merge_annotations
as the other ones are merged?
Just an alternative implementation of #280
I don't really like the approach to have
.value
all over the place. Let's see which approach is more suitable for a future usageThis PR is just one squashed commit until we decide on which we continue.