Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to mkdocs #81

Draft
wants to merge 59 commits into
base: main
Choose a base branch
from
Draft

Migrate to mkdocs #81

wants to merge 59 commits into from

Conversation

thomasmarwitz
Copy link

@thomasmarwitz thomasmarwitz commented Aug 14, 2024

Demonstrate how mkdocs-jupyter plugin can be used to execute and render Jupyter Notebooks in a similar way to MyST.

Output: https://metalearners--81.org.readthedocs.build/en/81/

TODO:

Copy link

codecov bot commented Aug 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.73%. Comparing base (8b371de) to head (1bf0671).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #81      +/-   ##
==========================================
- Coverage   94.73%   94.73%   -0.01%     
==========================================
  Files          15       15              
  Lines        1806     1805       -1     
==========================================
- Hits         1711     1710       -1     
  Misses         95       95              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@thomasmarwitz thomasmarwitz changed the title Remove sphinx specific formatting, adjust heading syntax Migrate to mkdocs Aug 21, 2024
@thomasmarwitz

This comment was marked as resolved.

@kklein
Copy link
Collaborator

kklein commented Aug 27, 2024

Thanks for looking into this @thomasmarwitz !
Is it possible that the subscripts no longer need escaping?

See

Screenshot 2024-08-27 at 9 57 13 AM
from this PR compared to
Screenshot 2024-08-27 at 9 58 59 AM

from https://metalearners.readthedocs.io/en/latest/background.html#x-learner

@thomasmarwitz
Copy link
Author

@kklein Thanks for pointing that out, looks like I completely missed this. You are right, you don't need to escape subscript anymore.

@kklein
Copy link
Collaborator

kklein commented Aug 27, 2024

Aside from the math and enumeration topic mentioned above, what do you see as hurdles and next steps?
Might it make sense to test whether our readthedocs setup works fine with mkdocs?

@pavelzw
Copy link
Member

pavelzw commented Aug 27, 2024

BTW an alternative to readthedocs could be github pages in combination with https://github.com/jimporter/mike. Disadvantage would be that there is no preview for PRs. Advantage would be that there is no external configuration necessary.

@kklein
Copy link
Collaborator

kklein commented Aug 27, 2024

Sounds interesting! Can we have branch-specific deployments with GitHub-pages? Given our somewhat heavy use of formulas, we rely a lot on rendered docs in PRs.

@thomasmarwitz
Copy link
Author

I don't know whether branch based deployments can be accomplished out of the box with mike and gh-pages, that would be interesting to keep an eye on.

I just tried out to host the mkdocs documentation with readthedocs and that worked: https://metalearners--81.org.readthedocs.build/en/81/

@thomasmarwitz

This comment was marked as outdated.

@thomasmarwitz

This comment was marked as resolved.

@kklein
Copy link
Collaborator

kklein commented Aug 29, 2024

But perhaps this is even desired behavior? In the sense that if you change the implementation, you should also adapt the docstring?

I can totally see that one might want to ask that question. In our case we mostly described the 'contract' of a given method/function in the docstrings, i.e. what a user may provide as input and what they can expect as an output. We don't usually say much about the 'how's.

At times, changing the implementation does not change this contract.

E.g. one might think of the folllowing:

class Shape:
    contour: list[Point]
    def surface_area(self):
        """Return the surface area in square meters."""
        return numerical_integration(self.contour)
        
class Rectange(Shape)
    def surface_area(self):
        return distance(self.cotour[2], self.contour[0]) * distance(self.contour[3], self.contour[1])

In this case we used said sphinx feature in order to DRY and reduce the risk of inconsistencies due to redundancy.

@pavelzw
Copy link
Member

pavelzw commented Aug 29, 2024

Can we have branch-specific deployments with GitHub-pages? Given our somewhat heavy use of formulas, we rely a lot on rendered docs in PRs.

I don't think so, so in this case readthedocs might be more fitting.
Note though, that with mkdocs you can easily spin up a dev docs instance using pixi run docs or pixi run mkdocs serve.

mkdocs.yml Outdated Show resolved Hide resolved
docs/styles/custom.css Outdated Show resolved Hide resolved
docs/styles/custom.css Outdated Show resolved Hide resolved
pixi.toml Outdated Show resolved Hide resolved
@thomasmarwitz thomasmarwitz force-pushed the mkdocs branch 2 times, most recently from 0bcbb39 to cb6c281 Compare November 26, 2024 14:31
@thomasmarwitz thomasmarwitz marked this pull request as ready for review November 29, 2024 14:47
@thomasmarwitz thomasmarwitz marked this pull request as draft November 29, 2024 14:47
@kklein
Copy link
Collaborator

kklein commented Dec 2, 2024

Thanks for your work @thomasmarwitz ! A couple of observations on the rendered docs:

@thomasmarwitz
Copy link
Author

thomasmarwitz commented Dec 3, 2024

@kklein Thanks for examining the current progress so carefully. I checked each point to see whether there is some way of solving it. I haven't implemented everything as changing all URLs, adjusting all docstrings etc. takes some time.

We can disable the dark and auto-mode as a temporary work-around. The site will then be always displayed in light mode.
A solution can be the neat conditional rendering of images depending on light/dark mode, this would require a dark-mode friendly version of the images, though.

Thanks for pointing that out. As it turned out, prettier adjusted the tabwidth in markdown, that's why the version on github was always broken. I found a way to override the tabwidth for markdown with 4.

  • For some reason 'Covariates' in the glossary is treated surprisingly, see here.

Fixed, typo in heading.

I have to replace all sphinx-like references with mkdocs / normal links. Just not there yet.

  • The examples no longer show the output of the code cells, e.g. here.

I think, this point needs some discussion.

During development, I turned off the execute option during docs build as the building of the optuna notebook (through mkdocs) takes >20 min on my mac (even by excluding optuna, the remaning notebooks need 5 min to be all converted). This leads to all notebooks being used in their current state, some may have no output, but some have e.g. optuna.

When building the documentation in CI, we could execute all notebooks beforehand to ensure everything is up to date and then build the documentation. Executing all notebooks everytime, makes the mkdocs serve rather inconvenient as I expect high iteration speeds from such a live server.

WDYT? (also @pavelzw as you are a huge fan of the serve option)

  • The Table of Content of this Linear Regression paragraph has three colon-subparagraphs which I don't see in the content.

Appears to be a bug, as a work-around we can remove Headings using "``" to wrap code-like elements. If we use that feature frequently, I can open an issue.

This is also sphinx-specific terminology that I haven't replaced completely with the mkdocs equivalent.

  • Some notes aren't picked up properly, see e.g. the {note} reference in this paragraph.

Mkdocs has nice admonition rendering: https://squidfunk.github.io/mkdocs-material/reference/admonitions/#+type:note. In jupyter notebooks, the syntactic sugar to render those admonitions is not (sadly) not available. A solution I found is to refer to the "compiled" html elements directly e.g.

<!-- mkdocs note -->
<div class="admonition note">
    <p class="admonition-title">Note</p>
    <p style="margin-top: 0.6rem">The fact that we have a fixed propensity score for all observations is not true for this dataset, we just use it for illustrational purposes.</p>
</div>

This produces a note, similar to sphinx:
image

Again, this is not really pretty. If we find ourselves using that a lot, we can also think about opening an issue to support the more concise markdown syntax.

This is also sphinx-specific terminology that I haven't replaced completely with the mkdocs equivalent.

  • I can't find the parameter and return value type hints in the API docs, e.g. in the evaluate method docs.

I found a setting to render the type hints:

image

If we add black as a dependency in the docs feature, we get a formatted signature (that looks much better imo):

image

For a table like display (there are also other display options) like this:
image

The function needs a numpy, google or sphinx docstring: https://mkdocstrings.github.io/python/usage/configuration/docstrings/#docstring_style. There is a tendency that google docstrings has more features coming.

The nice thing here is that we don't need this google style or whatever everywhere. One can sprinkle that in if e.g. a param needs some explanation as in the screenshot. In all other cases, the param names and types seem pretty solid.

@thomasmarwitz
Copy link
Author

I'm afraid this is taking so long - that's mainly because migration all sphinx specific syntax to mkdocs (even in docstrings or jupyter notebooks) takes some time and I can only spend a certain amount of time per week on this.

@kklein I'll ping you here explicitly once I've migrated all docstrings and jupyter notebooks.

@kklein
Copy link
Collaborator

kklein commented Dec 5, 2024

I think, this point needs some discussion.

I totally second your take that the runtime caused by the execution of the notebooks upon building of the docs is a pain.

At the same time, I think that the output of the cells creates a lot of value for a reader of the docs.

If we can find a solution which provides the outputs while reducing the amount of time used for docs-building that'd be great.

To give you some context: We didn't start off by using jupyter notebooks at all for these code examples. Rather, we executed the corresponding code blocks by hand and mode the code blocks as well as the outputs into rst. This approach clearly has the downside of

  • there being no alerting mechanism if code doesn't run (anymore)
  • there being a lot of development overhead

If we add black as a dependency in the docs feature, we get a formatted signature (that looks much better imo):

Looks great; I think adding black as a docs dependency is not problem at all. :)

@thomasmarwitz
Copy link
Author

thomasmarwitz commented Dec 10, 2024

If we can find a solution which provides the outputs while reducing the amount of time used for docs-building that'd be great.

@kklein I talked w/ @pavelzw about how we could address this. I agree that having outputs is very valuable to the reader.

Our idea is:

  • Have an additional check here in CI that checks whether the execution count for each cell of the jupyter notebooks is not null, i.e. the notebook has been executed and, fail if we encounter an unexecuted cell. Thus, it becomes the task of the person changing something in the jupyter notebook or adding a new one to execute it at least once.
  • In the docs build, we can just copy over the output of all already executed cells.

@kklein
Copy link
Collaborator

kklein commented Dec 11, 2024

@thomasmarwitz sgtm :)

@thomasmarwitz thomasmarwitz force-pushed the mkdocs branch 2 times, most recently from a506cd4 to ff0a208 Compare December 13, 2024 09:33
Check via execution count in a bash script.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants