This repo is intended to contain short "vignettes" illustrating statistical concepts. It is very much a work in progress. Things may change quickly and often...
The name comes from the fact that, in principle, each vignette should be readable in a short amount of time. Perhaps five minutes.
The overall goal is that by making vignettes short in this way we can try to make learning more "modular".
Each vignette should, ideally, focus on introducing a single concept, or a small number of related concepts,
that are easily digestible provided other prerequisite concepts are mastered.
One reason for this is to try to make learning easier: break complex ideas down into smaller,
more easily digestible chunks. Another is that it encourages re-use of material.
Just as software engineers write software in a "modular" way, with each function performing a well-defined role,
the idea is that these vignettes make learning "modular". If you don't like the way one vignette introduces
the concept then you can write a different one and just replace that one part. And in principle if this takes
off we can have large numbers of authors, each contributing a small number of vignettes.
Modularizing facilitates sharing the load.
In principle we can have multiple vignettes for the same concept and users can choose which one they like.
The idea of breaking learning down into small chunks is kind of obvious, but I was personally inspired by watching videos with my kids: http://www.artofproblemsolving.com/videos/prealgebra Maybe we can make learning statistics this easy and this much fun? However, I decided against video because a) I'm not as funny as this guy, and b) it is harder to collaborate and update videos.
If you are interested in these ideas, please get in touch, [email protected] (remove the marsupial).
The directory structure is probably more complex than it needs to be, because it was based on a template, known as ashlar, which we use for projects more generally. For most purposes you will only need to look at the vignettes, which are in the "analysis" subdirectory.
Everything below this is a product of this repo being forked from ashlar. You can probably ignore it, and it may be removed in the future...
Table of Contents generated with DocToc
- ashlar: A workflow template for statistical computing projects
- Making your own ashlar
- Cloning ashlar
- Reset git remote directory
- Producing and publishing the website
- Resources
ashlar is our attempt to streamline workflow and to do reproducible research here at the University of Chicago Stephens Lab.
Cloning ashlar
ashlar is inspired by singleCellSeq, a collaborative project between biologists, bioinformaticians and statisticians that aims to explore and understand batch effects in single-cell RNA sequencing data. Both projects adopt the popular rmarkdown website layout.
I suggest cloning into a new folder to distinguish your work from the example repository.
git clone https://github.com/jhsiao999/ashlar.git ashlar-trial
At this point, the clone's remote (origin) still points to ashlar. Change the remote URL so that it points to your own repository, named to match your local directory.
git remote rm origin
git remote add origin https://github.com/jhsiao999/ashlar-trial.git
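As an alternative to removing and re-adding the remote, git can replace the URL in place and report where origin points afterwards. The following is a minimal sketch run in a throwaway repository so that it needs no network access; the scratch directory is an assumption made for illustration, and in practice you would run the set-url and get-url commands inside your own clone.

```shell
# Sketch: repoint "origin" in a throwaway repository (no network access needed).
demo_dir="$(mktemp -d)"   # hypothetical scratch location
git init -q "$demo_dir"
git -C "$demo_dir" remote add origin https://github.com/jhsiao999/ashlar.git

# Replace the URL in place instead of removing and re-adding the remote.
git -C "$demo_dir" remote set-url origin https://github.com/jhsiao999/ashlar-trial.git

# Confirm where origin now points.
origin_url="$(git -C "$demo_dir" remote get-url origin)"
echo "$origin_url"
```

Running git remote -v in your clone gives the same information for both fetch and push.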
Create a repository at github.com. Then push the contents of the entire directory to the master branch. We use the -f option of git add to force-add HTML files, such as index.html (the table of contents), to the master branch, because the default .gitignore in ashlar ignores HTML files.
git add -f --all
git commit -m "first commit"
git push origin master
Open index.html. This is the homepage of your unpublished website. You are DONE!
If you choose this option, you only have the master branch. The .gitignore is set up not to push HTML, PNG, PDF and similar files to the remote master branch, so edit the .gitignore if you want to add these files to the remote repository.
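To see how .gitignore rules decide which files reach the remote, the sketch below builds a scratch repository and queries git directly. The three patterns are assumed stand-ins for the rules ashlar ships with, not a copy of its actual .gitignore, and the file names are only examples.

```shell
# Sketch: inspect .gitignore behaviour in a scratch repository.
demo_dir="$(mktemp -d)"   # hypothetical scratch location
git init -q "$demo_dir"

# Assumed rules, standing in for the ones ashlar ships with.
printf '%s\n' '*.html' '*.png' '*.pdf' > "$demo_dir/.gitignore"
touch "$demo_dir/index.html" "$demo_dir/new-analysis.Rmd"

# git check-ignore exits 0 when a path is ignored and 1 when it is not.
if git -C "$demo_dir" check-ignore -q index.html; then html_before="ignored"; else html_before="not ignored"; fi
if git -C "$demo_dir" check-ignore -q new-analysis.Rmd; then rmd_before="ignored"; else rmd_before="not ignored"; fi
echo "index.html: $html_before; new-analysis.Rmd: $rmd_before"

# A negation rule placed after the broad rule re-includes a single file,
# so a plain git add (no -f) can stage it.
printf '%s\n' '!index.html' >> "$demo_dir/.gitignore"
git -C "$demo_dir" add index.html
staged="$(git -C "$demo_dir" ls-files index.html)"
echo "staged: $staged"
```

A negation rule is a lighter-weight choice than deleting the broad pattern when only a few generated files, such as index.html, should be versioned.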
Create a branch named gh-pages. GitHub turns the HTML files in the gh-pages branch into a website, so you now have a project website for free. Note that this feature makes all contents of the gh-pages branch public, even if the master branch is private.
git checkout -b gh-pages
git add -f --all
git commit -m "Build site"
git push origin gh-pages
The site address ends in /analysis because the site contents are kept in the analysis directory.
https://jhsiao999.github.io/ashlar-trial/analysis
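The address above follows the usual pattern for a GitHub project site: user name, repository name, then the subdirectory that holds the pages. A small sketch of that pattern follows; the helper name pages_url is hypothetical and is not part of ashlar.

```shell
# Hypothetical helper: build the GitHub Pages address for a project site
# whose pages live in a subdirectory of the gh-pages branch.
# Usage: pages_url USER REPO [SUBDIR]   (SUBDIR defaults to "analysis")
pages_url() {
  printf 'https://%s.github.io/%s/%s\n' "$1" "$2" "${3:-analysis}"
}

pages_url jhsiao999 ashlar-trial
```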
This two-branch workflow is set up to keep the source files (such as Rmds) separate from the html pages and the output figures. It allows me to keep clean repositories: master for the source, and gh-pages only for the website.
I mostly use RStudio to generate the HTML files, but when a large number of analysis files need to be updated, I use the simple make command instead. Below are the two paths I have used most recently to update GitHub Pages.
Path 1: I mostly use this one when there's only a small analysis file to be updated.
## Work at the gh-pages branch, push website content to gh-pages,
## push source to the master
git checkout gh-pages
cd analysis
make # (or use knitr to render the Rmds)
git add -f *Rmd *html figure/*
git commit -m "add new analysis"
git push origin gh-pages
git checkout master
git merge gh-pages
git add new-analysis.Rmd index.Rmd
git commit -m "add new analysis"
git push origin master
Path 2: I use this when the site has not been updated in a while, and I need to compile a large number of Rmds.
## Work at the master branch, keep all htmls local
## Push source to the master branch, use make to generate htmls
## for the gh-pages branch
git checkout master
cd analysis
git add new-analysis.Rmd index.Rmd
git commit -m "add new analysis"
git push origin master
git checkout gh-pages
git merge master
make
git add -f *Rmd *html figure/*
git commit -m "add new analysis"
git push origin gh-pages
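Both paths depend on being on the intended branch before building and committing; running the build step on master by mistake would mix generated files into the source branch. One way to guard against that is a small check like the sketch below. The function name require_branch is hypothetical and is not part of ashlar or git.

```shell
# Hypothetical guard: succeed only when the repository in directory $2
# (default: the current directory) has branch $1 checked out.
require_branch() {
  expected="$1"
  repo_dir="${2:-.}"
  # symbolic-ref fails on a detached HEAD; treat that as "no branch".
  current="$(git -C "$repo_dir" symbolic-ref --short HEAD 2>/dev/null)" || current=""
  if [ "$current" != "$expected" ]; then
    echo "expected branch '$expected' but found '${current:-none}'" >&2
    return 1
  fi
}

# Example use before the build step (not executed here):
#   require_branch gh-pages && make && git add -f *Rmd *html figure/*
```

Chaining with && means the build and the git add are skipped whenever the check fails.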