Background
In a recent conversation with EarthCube, @rabernat, @yuvipanda, @sgibson91 and I discussed some ideas around infrastructure for reproducible pipelines. In another meeting, I spoke with @cgentemann about how AGU is also trying to sort out its "publishing with notebooks" story.
It seems like there is an opportunity for 2i2c to offer some guidance (and development!) that gives both of these communities (and others!) a path forward. This is a discussion to see if others are excited about this direction, potentially as a first step in our Pangeo collaboration, and to see what next steps might be.
An Idea
Here's one idea as a start. I imagine two things:
- A set of documentation, similar to Zero to JupyterHub, that describes an end-to-end solution for reproducible publishing, using open tools and services as much as possible.
- A working implementation of this pipeline that 2i2c runs to show it off, potentially as a prototype for either EarthCube or AGU.
This would be some combination of these two setups:
The workflow itself could be something like:
- Each author submits a link to a Binder-ready repository.
- They also submit a path, relative to the repository's root, to a notebook that serves as the "paper".
- These are collected in a YAML file, either created manually or by asking authors to submit a PR to a repository. (For example, Jupyter Book's gallery uses a single YAML file to build itself, and new entries are added via PRs.)
- The entries in the YAML file get turned into pages in a Jupyter Book, so the left sidebar has one item for each submission and each page is rendered via Jupyter Book.
- We set up some kind of BinderHub integration, similar to what Ryan Abernathey showed off, to execute each notebook as a part of the book building process.
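To make the workflow above a bit more concrete, here's a rough Python sketch of the "entries become pages" step. It is only an illustration under some assumptions: the `Submission` fields, the `submissions/` folder, and the helper names are all hypothetical, the entries are written as plain Python objects rather than parsed from YAML, and the output assumes Jupyter Book's `jb-book` table-of-contents layout and the public mybinder.org launch-URL pattern.

```python
from dataclasses import dataclass
from urllib.parse import quote, urlparse


@dataclass
class Submission:
    """One entry from the (hypothetical) submissions file."""
    title: str
    repo_url: str       # link to a Binder-ready GitHub repository
    notebook_path: str  # the "paper" notebook, relative to the repo root
    ref: str = "HEAD"   # branch, tag, or commit to build


def repo_parts(sub: Submission) -> tuple[str, str]:
    """Split a GitHub URL into (org, repo), dropping a trailing '.git'."""
    org, repo = urlparse(sub.repo_url).path.strip("/").split("/")[:2]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return org, repo


def binder_url(sub: Submission, hub: str = "https://mybinder.org") -> str:
    """Build a launch link that opens the paper notebook on a BinderHub."""
    org, repo = repo_parts(sub)
    return f"{hub}/v2/gh/{org}/{repo}/{sub.ref}?filepath={quote(sub.notebook_path)}"


def toc_yaml(subs: list[Submission]) -> str:
    """Emit a table of contents with one sidebar item per submission.

    Assumes the build step has already placed each paper at
    submissions/<org>-<repo> inside the book (a hypothetical layout).
    """
    lines = ["format: jb-book", "root: index", "chapters:"]
    for sub in subs:
        org, repo = repo_parts(sub)
        lines.append(f"- file: submissions/{org}-{repo}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = Submission(
        title="Example paper",
        repo_url="https://github.com/example-org/example-paper",
        notebook_path="notebooks/paper.ipynb",
    )
    print(binder_url(example))
    print(toc_yaml([example]))
```

In a real pipeline the list of `Submission` objects would be loaded from the YAML file that authors edit via PR, and actually fetching and executing each notebook would be a separate step that this sketch doesn't cover.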
I think after implementing a system like this we'd likely need tweaks (e.g., another option is to let people submit Jupyter Books directly, with their own configuration and such). But this seems like a reasonable start.
Curious if others think that this vision is worth working towards - perhaps regardless of the specific outcomes of the EarthCube / AGU collaborations.