writing getting-starting notebook
tclose committed Dec 29, 2024
1 parent 03546ed commit 35641b1
Showing 1 changed file with 140 additions and 22 deletions.
162 changes: 140 additions & 22 deletions new-docs/source/tutorial/getting-started.ipynb
@@ -6,42 +6,160 @@
"source": [
"# Getting started\n",
"\n",
"A *Task* is the basic runnable component in Pydra, and can execute either a Python function,\n",
"shell command or workflows consisting of combinations of all three types."
"## Running your first task\n",
"\n",
"The basic runnable component of Pydra is a *task*. Tasks are conceptually similar to\n",
"functions, in that they take inputs, process them and then return results. However,\n",
"unlike functions, tasks are parameterised before they are executed in a separate step.\n",
"This enables parameterised tasks to be linked together into workflows that are checked for\n",
"errors before they are executed, and modular execution workers and environments to specified\n",
"independently of the task being performed.\n",
"\n",
"Pre-defined task definitions are installed under the `pydra.tasks.*` namespace by separate\n",
"task packages (e.g. `pydra-fsl`, `pydra-ants`, ...). Pre-define task definitions are run by\n",
"\n",
"* importing the class from the `pydra.tasks.*` package it is in\n",
"* instantiate the class with the parameters of the task\n",
"* \"call\" resulting object to execute it as you would a function (i.e. with the `my_task(...)`)\n",
"\n",
"To demonstrate with a toy example, of loading a JSON file with the `pydra.tasks.common.LoadJson` task, this we first create an example JSON file"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sample JSON file created at '0UAqFzWsDK4FrUMp48Y3tT3Q.json' with contents: {\"a\": true, \"b\": \"two\", \"c\": 3, \"d\": [7, 0.5598136790149003, 6]}\n",
"Loaded contents: {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.5598136790149003, 6]}\n"
]
}
],
"outputs": [],
"source": [
"from fileformats.application import Json\n",
"from pydra.tasks.common import LoadJson\n",
"from pathlib import Path\n",
"from tempfile import mkdtemp\n",
"import json\n",
"\n",
"# Create a sample JSON file to test\n",
"json_file = Json.sample()\n",
"JSON_CONTENTS = {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.5598136790149003, 6]}\n",
"\n",
"# Print the path of the sample JSON file and its contents for reference\n",
"print(f\"Sample JSON file created at {json_file.name!r} with contents: {json_file.read_text()}\")\n",
"test_dir = Path(mkdtemp())\n",
"json_file = test_dir / \"test.json\"\n",
"with open(json_file, \"w\") as f:\n",
" json.dump(JSON_CONTENTS, f)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can load the JSON contents back from the file using the `LoadJson` task definition\n",
"class"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Import the task definition\n",
"from pydra.tasks.common import LoadJson\n",
"\n",
"# Parameterise the task specification to load the JSON file\n",
"# Instantiate the task definition, providing the JSON file we want to load\n",
"load_json = LoadJson(file=json_file)\n",
"\n",
"# Run the task to load the JSON file\n",
"result = load_json()\n",
"\n",
"# Print the output interface of the of the task (LoadJson.Outputs)\n",
"print(f\"Loaded contents: {result.output.out}\")"
"# Access the loaded JSON output contents and check they match original\n",
"assert result.output.out == JSON_CONTENTS"
]
},
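The parameterise-then-execute pattern used with `LoadJson` can be illustrated without Pydra itself. The following is a minimal sketch in plain Python, not Pydra's implementation: `LoadJsonSketch`, `sketch_file` and the other names are hypothetical stand-ins introduced only for illustration. It shows inputs being stored when the object is created, with the actual work deferred until the object is called.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class LoadJsonSketch:
    """Hypothetical stand-in for a task definition (not part of Pydra)."""

    file: Path  # the input is only stored when the object is created

    def __call__(self):
        # Execution is a separate step, triggered by "calling" the object
        with open(self.file) as f:
            return json.load(f)


# Create a small JSON file to load
sketch_dir = Path(tempfile.mkdtemp())
sketch_file = sketch_dir / "example.json"
sketch_file.write_text(json.dumps({"a": True, "b": "two"}))

# Step 1: parameterise (nothing is read from disk yet)
load_sketch = LoadJsonSketch(file=sketch_file)

# Step 2: execute
loaded = load_sketch()
```

Because the parameterised object exists before anything runs, it can be inspected or passed around first, which is the property the tutorial text attributes to tasks.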
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Iterating over inputs\n",
"\n",
"It is straightforward to apply the same operation over a set of inputs using the `split()`\n",
"method. For example, if we wanted to re-grid all the NIfTI images stored in a directory,\n",
"such as the sample ones generated by the code below"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fileformats.medimage import Nifti\n",
"\n",
"nifti_dir = test_dir / \"nifti\"\n",
"nifti_dir.mkdir()\n",
"\n",
"for i in range(10):\n",
" Nifti.sample(nifti_dir, seed=i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we can by importing the `MrGrid` shell-command task from the `pydra-mrtrix3` package\n",
"and then splitting over the list of files in the directory"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pydra.tasks.mrtrix3 import MrGrid\n",
"\n",
"# Instantiate the task definition, \"splitting\" over all NIfTI files in the test directory\n",
"mrgrid = MrGrid(voxel=0.5).split(input=nifti_dir.iterdir())\n",
"\n",
"# Run the task to resample all NIfTI files\n",
"result = mrgrid()\n",
"\n",
"# Print the locations of the output files\n",
"print(\"\\n\".join(str(p) for p in result.output.output))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is also possible to iterate over inputs in pairs, if for example you wanted to use\n",
"different voxel sizes for different images, both the list of images and the voxel sizes\n",
"are passed to the `split()` method and their combination is specified by a tuple \"splitter\"\n",
"(see [Splitting and combining](../explanation/splitting-combining.html) for more details\n",
"on splitters)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define a list of voxel sizes to resample the NIfTI files to, must be the same length\n",
"# as the number of NIfTI files\n",
"VOXEL_SIZES = [0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 1.0, 1.0, 1.0, 1.25]\n",
"\n",
"mrgrid_varying_sizes = MrGrid().split(\n",
" (\"input\", \"voxel\"),\n",
" input=nifti_dir.iterdir(),\n",
" voxel=VOXEL_SIZES\n",
")\n",
"\n",
"# Run the task to resample all NIfTI files with different voxel sizes\n",
"result = mrgrid()"
]
},
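The pairing of images with voxel sizes described for the tuple splitter can be pictured with plain Python. This is only a sketch of the idea of element-wise pairing, assuming it behaves like `zip` over equal-length lists; it does not call Pydra, and the file names are hypothetical placeholders.

```python
# Hypothetical file names standing in for the NIfTI images
images = ["img0.nii", "img1.nii", "img2.nii"]
voxel_sizes = [0.5, 0.75, 1.0]

# Paired iteration requires both lists to be the same length
assert len(images) == len(voxel_sizes)

# Each image is matched with the voxel size at the same position
paired_inputs = list(zip(images, voxel_sizes))

for image, voxel in paired_inputs:
    print(f"{image} -> voxel size {voxel}")
```

Each pair corresponds to one execution of the task, so three images and three voxel sizes give three runs rather than nine.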
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cache directories\n",
"\n",
"When a task runs, a hash is generated by the combination of all the inputs to the task and the task to be run."
]
},
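To make the idea of hashing a task together with its inputs concrete, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not Pydra's actual hashing scheme: `sketch_task_hash` is a hypothetical helper, and the choice of SHA-256 over a sorted JSON serialisation is made purely for the example.

```python
import hashlib
import json


def sketch_task_hash(task_name, inputs):
    """Hypothetical helper: hash a task name together with its inputs."""
    # Serialise deterministically so that equal inputs give equal hashes
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same task and inputs produce the same hash
hash_a = sketch_task_hash("LoadJson", {"file": "test.json"})
hash_b = sketch_task_hash("LoadJson", {"file": "test.json"})

# Changing an input produces a different hash
hash_c = sketch_task_hash("LoadJson", {"file": "other.json"})
```

A hash with these properties can serve as a lookup key: identical task and inputs map to the same key, while any change to the inputs yields a new one.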
{