The goals for the week 3 lab are:
- Install and use the Jupyter notebook application on your VM.
- Be able to create, edit, and run notebooks.
When running jobs on many CPUs, we need to take into account how the CPUs communicate with each other. One way is to share memory, which works well for a modest number of CPUs but doesn't scale beyond that. Another way is to connect machines over a network and use MPI (Message Passing Interface) to communicate between processes on different machines. The computer architecture that works best depends on the problem you are trying to solve. Some problems can be framed as many independent operations that only need to synchronize infrequently (e.g. Science United or SETI@home). These kinds of problems are the easiest to scale up to massive sizes, since we just need a lot of CPUs and not much in the way of interconnections. Problems that resist such a framing need many CPUs that are tightly interconnected, and network speed becomes a major factor. These are the problems that supercomputers are designed for. Many of them are simulations, where each point in a grid depends on what is happening at neighboring points.
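To make the contrast concrete, here is a small pure-Python sketch. It is a toy, not real parallel code: everything runs in one process, and the function names (`independent_task`, `diffuse`, `diffuse_split`) are made up for illustration. The first part shows work units that never need each other's data. The second part splits a 1D grid between two hypothetical workers and counts the boundary values they would have to send each other (over a network, for example with MPI) on every step.

```python
# Toy illustration only: one process, no real networking or MPI.

def independent_task(n):
    """A work unit that needs no data from any other work unit."""
    return sum(i * i for i in range(n))


# Independent tasks can run in any order, on any machine, and give the same answers.
tasks = [10, 20, 30]
in_order = {n: independent_task(n) for n in tasks}
shuffled = {n: independent_task(n) for n in reversed(tasks)}
same_answers = in_order == shuffled


def diffuse(grid):
    """One step of a 1D averaging stencil; the two end points stay fixed."""
    new = list(grid)
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new


def diffuse_split(left, right):
    """The same step, with the grid owned by two hypothetical workers.

    Each worker needs one value from its neighbor (a "halo" cell) before it
    can update its own edge point. On a real cluster these two values would
    be messages sent between machines on every step.
    """
    halo_for_left = right[0]    # message: right worker -> left worker
    halo_for_right = left[-1]   # message: left worker -> right worker
    new_left = diffuse(left + [halo_for_left])[:-1]     # drop the halo cell
    new_right = diffuse([halo_for_right] + right)[1:]   # drop the halo cell
    return new_left, new_right


# A cold rod with one hot spot, simulated whole and split in two.
grid = [0.0] * 4 + [90.0] + [0.0] * 3
whole = list(grid)
left, right = grid[:4], grid[4:]
messages = 0
for _ in range(10):
    whole = diffuse(whole)
    left, right = diffuse_split(left, right)
    messages += 2   # one halo value in each direction per step

split_matches_whole = (left + right == whole)
print(same_answers, split_matches_whole, messages)
```

The independent tasks exchange nothing, while the split grid needs a message in each direction on every step. With more workers and larger grids, that per-step traffic is why network speed matters for this kind of simulation.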
OK! Let's look at Python and Jupyter notebooks, an increasingly standard way of using Python.
First, update your cicf files:
cd cicf
git pull
We need to install Jupyter on our VMs. If you didn't do this last week:
sudo apt install jupyter
Go into the week3 directory and start Jupyter:
cd week3
jupyter notebook
Select the introduction notebook, Introduction.ipynb.
Read through the notebook and run the commands.
When finished, look at the Python tutorial notebook, QuickTourOfPython.ipynb, and then the third notebook, PythonPackages.ipynb.
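It can help to know that an .ipynb file is just JSON text. The sketch below builds a minimal two-cell notebook as a Python dictionary and round-trips it through the standard `json` module. The cell contents are made up for illustration, and the dictionary follows the version 4 notebook format as I understand it. Treat it as a rough picture of what Jupyter saves, not a replacement for creating notebooks in the browser.

```python
import json

# A minimal notebook in the version 4 format: one markdown cell, one code cell.
# The cell text here is purely illustrative.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Scratch notebook\n"],
        },
        {
            "cell_type": "code",
            "execution_count": None,   # not run yet
            "metadata": {},
            "outputs": [],             # filled in by Jupyter when the cell runs
            "source": ["print('hello from week 3')\n"],
        },
    ],
}

# Serialize to the text that would be stored on disk, then read it back.
text = json.dumps(notebook, indent=1)
reloaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in reloaded["cells"]]
print(cell_types)
```

Saving `text` to a file with an .ipynb extension (for example a hypothetical scratch.ipynb) should produce something Jupyter can open, which is also why notebooks can be tracked with git like any other text file.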
- The always useful Software Carpentry Python course
- Jupyter Manual
- An excellent Python for Scientific Computing tutorial.
- Python Reference for the most recent version of Python. (N.B. we are using an older version, 3.9, on the VMs.)
- NumPy for Absolute Beginners
- MatPlotLib
- Literate Programming is a specific instance of the idea that code and documentation should be more closely mixed together. Here is a Knuth paper and an amazing website (not Knuth's) with more information than you ever thought existed on program documentation.
- Markdown Cheat Sheet
- A fun LIGO data tutorial and all of the released LIGO data
- If you do the LIGO tutorial, the data is in a file format called HDF5. This format supports efficient storage and transfer of numeric data.