Changes to add named fields and some other improvements #67
-
Hi @tombolano,

Thanks for the feedback and sorry for the delay! I'm definitely very interested in everything you did, but unfortunately my free time has been scarce lately... Anyway, I certainly hope that we can integrate most of what you did. I'll try to have at least a quick look at your code next week.

Just one question at this point: I was also interested in constructor keyword arguments and named attributes at one stage (I do agree that they are nice to have!) and kind of lost interest when I saw that this information was absent from the Haskell type definitions (and I didn't want to settle for a manual naming system that would become obsolete with every new version of the Pandoc types). Do you derive the attribute names automatically? Sometimes there is a sensible name to give (for example name

Cheers,
Sébastien
-
Well, this is a very well thought-out piece! Thanks for this very detailed answer. I see where you're coming from and definitely agree with your decision process (infer what can be inferred automatically, clean up the Lua filter naming inconsistencies, etc.). Two details:
I am definitely interested in reviewing this in detail. I can think of three related issues to consider at the moment:

Representation as Strings

I don't much like representing pandoc items with keyword arguments by default. When

>>> import pandoc
>>> text = "Hello, World!"
>>> doc = pandoc.read(text)

I think that I'd rather have

>>> doc
Pandoc(Meta({}), [Para(content=[Str(text='Hello,'), Space(), Str(text='World!')])])

instead of

>>> doc
Pandoc(meta=Meta(table={}), blocks=[Para(content=[Str(text='Hello,'), Space(), Str(text='World!')])])

but this is mostly a matter of taste. I guess that a mechanism like NumPy's printoptions / set_printoptions could allow the user to switch between both representations? (Note that by the same mechanism we could introduce some pretty-printing that would probably alleviate my distaste for named fields.)

On a more practical note: the current library tests rely on the examples in the documentation, which use the compact/positional representation. If we were to change the default to named arguments, all tests would fail at the moment.

Type Discoverability

I really, really like that I can forget the details of the pandoc type hierarchy and find this info in my Python console:

>>> from pandoc.types import *
>>> Meta
Meta({Text: MetaValue})
>>> Pandoc
Pandoc(Meta, [Block])
>>> AlignCenter
AlignCenter()
>>> Attr
Attr = (Text, [Text], [(Text, Text)])
>>> Cell
Cell(Attr, Alignment, RowSpan, ColSpan, [Block])

To use named arguments with the same degree of convenience (or greater), this type representation must be adapted. For example, something like:

>>> Pandoc
Pandoc(meta: Meta, blocks: [Block])

(should it also be configurable?)

Default Values

Named constructor arguments open the way for default values. For example, I'd much rather have

>>> doc = Pandoc(blocks=blocks)

than

>>> doc = Pandoc(Meta({}), blocks)

I have not thought very deeply about it, but I guess that at least some of the cases would be no-brainers (for example, make every list and every map empty by default?). Your input on this is welcome! 🤗
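To make the two ideas above concrete (a switchable representation and empty defaults for lists and maps), here is a minimal sketch. It does not use the library's actual classes: `Node`, `_print_options`, and this `set_printoptions` are hypothetical names invented for illustration, and the element types are simplified stand-ins for the real pandoc types.

```python
from dataclasses import dataclass, field, fields

# Hypothetical module-level switch, loosely modelled on the idea of
# NumPy's set_printoptions; not part of the pandoc library.
_print_options = {"named": False}


def set_printoptions(named: bool) -> None:
    """Choose between positional (compact) and named representations."""
    _print_options["named"] = named


class Node:
    """Hypothetical base class providing a configurable __repr__."""

    def __repr__(self) -> str:
        parts = []
        for f in fields(self):
            value = getattr(self, f.name)
            if _print_options["named"]:
                parts.append(f"{f.name}={value!r}")
            else:
                parts.append(repr(value))
        return f"{type(self).__name__}({', '.join(parts)})"


# repr=False keeps the inherited Node.__repr__ instead of the generated one.
@dataclass(repr=False)
class Meta(Node):
    table: dict = field(default_factory=dict)  # every map empty by default


@dataclass(repr=False)
class Str(Node):
    text: str


@dataclass(repr=False)
class Space(Node):
    pass


@dataclass(repr=False)
class Para(Node):
    content: list = field(default_factory=list)  # every list empty by default


@dataclass(repr=False)
class Pandoc(Node):
    meta: Meta = field(default_factory=Meta)
    blocks: list = field(default_factory=list)


# Meta can be omitted thanks to the default value.
doc = Pandoc(blocks=[Para([Str("Hello,"), Space(), Str("World!")])])
print(doc)  # compact / positional form
set_printoptions(named=True)
print(doc)  # named form
```

With this kind of switch, the documentation examples (and the tests that depend on them) could keep the compact form as the default, while users who prefer named fields opt in explicitly.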
-
Very nice. I have not unpacked everything you've done yet (for example I need to read more about rich), but I really like it so far! At this stage, I think that it would be simpler to add you as a collaborator to my repo, so that you can create an experimental branch with your contributions and make changes with minimal friction. Would that be ok with you?
-
Hello @boisgera, I have uploaded my work to the new branch 'experimental'. Apart from the work already discussed, it contains the following changes:
However note that there are some things still not documented:
The only breaking change with respect to the master version is the docstrings of the types, which, as I said, now print the field names and default values, and I don't know if this could even be considered "breaking". However, this can be changed to be exactly as in master by running

Best regards,
-
Lists of lists
I like it for every reason but one, and it's a big one : the name is probably misleading. A "list of blocks" is probably most likely interpreted as "list of objects of type I tried to make ChatGPT guess the type of several names and one that works better than At the moment, I'd favor
(I gave a look at the
Maps
OK, let's pick
Commit
Please do, thanks a lot! 👍 If you're ok with it (we can take more time to discuss & brainstorm if needed), I'll change the
-
Hello, this is a great library, I found it some days ago because I wanted to use the Pandoc AST in Python and this library is exactly what I needed.
One thing that I like about Pandoc is the Lua filters interface, which allows accessing the fields of the elements by their names. Since this library lacked this feature I added it, but before that I also did a general review of the code and changed some other things. My changes are in https://github.com/tombolano/pandoc/tree/experimental. As a summary these are the most relevant changes:
- Avoid creating temporary input and output files for communicating with pandoc; use stdin and stdout instead.
- Fixed a bug that caused the configure and make_types functions to be called twice when loading the types module.
- Modified the apply code to use a single function for the tree traversal. The typical recursive post-order tree traversal function follows the pattern shown at https://en.wikipedia.org/wiki/Tree_traversal#Post-order_implementation. The current code does the post-order tree traversal correctly, but it is split into two functions, _apply_children and apply_ (inside apply). I found this code a bit confusing and changed it to use a single function similar to the one in that example, called _apply_post_order.
- Implemented concrete data types as dataclasses. This offers the following features:
  - Link !Attr ![Inline] !Target, but in the Pandoc Lua filters API it has fields content, target, title, and attr.
  - __init__, __repr__, and __eq__ methods are added automatically to the classes.
  - __match_args__ variable is created automatically.
  - pprint module can be used by default to pretty-print dataclasses, so documents can be pretty-printed easily.
  As a drawback, the implementation of the __getitem__ and __setitem__ methods is now a little more complicated, since we cannot rely directly on a list.

Note that the changes do not affect the API of the library; it is still the same, so any previous code should work as before. The only difference is that, because the data types are now implemented with dataclasses, the field names are also printed when printing a type. For example, consider this code from the examples in the documentation:
Now with the changes the result is the following:
@boisgera, if you are interested in any of these changes, you can take a look at my commits; I made sure to explain everything in detail in the commit messages. If you are interested, I can submit pull requests for the changes, or you can pick and apply any changes you want yourself.
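For readers who want a feel for the traversal and dataclass changes listed above without opening the branch, here is a minimal sketch. It is not the code from the branch: `Positional`, `apply_post_order`, and `shout` are hypothetical names, the element types are simplified stand-ins, and details of the real `apply` are deliberately left out.

```python
from dataclasses import dataclass, field, fields


class Positional:
    """Hypothetical mixin: index-based access on top of named dataclass fields."""

    def __getitem__(self, index):
        return getattr(self, fields(self)[index].name)

    def __setitem__(self, index, value):
        setattr(self, fields(self)[index].name, value)


@dataclass
class Str(Positional):
    text: str


@dataclass
class Emph(Positional):
    content: list = field(default_factory=list)


@dataclass
class Para(Positional):
    content: list = field(default_factory=list)


def apply_post_order(f, elt):
    """Single-function post-order traversal: children first, then the node.

    Note that this sketch updates element fields in place.
    """
    if isinstance(elt, list):
        elt = [apply_post_order(f, child) for child in elt]
    elif isinstance(elt, Positional):
        for fld in fields(elt):
            setattr(elt, fld.name, apply_post_order(f, getattr(elt, fld.name)))
    return f(elt)


order = []  # records the order in which elements are visited


def shout(elt):
    """Example transformation: upper-case every Str, leave the rest as-is."""
    if isinstance(elt, Positional):
        order.append(type(elt).__name__)
    if isinstance(elt, Str):
        return Str(elt.text.upper())
    return elt


doc = Para([Str("hello"), Emph([Str("world")])])
result = apply_post_order(shout, doc)

print(order)         # leaves are visited before their parents
print(result)        # named fields appear in the generated repr
print(result[0][0])  # positional access still works via __getitem__
```

The point of the single function is that the recursion and the call to `f` live in one place, so the post-order property (children transformed before their parent) is visible at a glance.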