Notes and comments from trial run 2024-02-15 #18

ns-rse · 2024-02-15T12:00:54Z

Hopefully useful capture of some of the points raised.

Introduction to Profiling

Disseminate setup instructions before course, perhaps bundle up the various files that users will download into a zip so they can download those in one go.
@willfurnass better distinction between benchmarking and profiling
Alt text on timeline profiling.

Function Level Profiling

Line Level Profiling

People might not be familiar with putting things in functions when they come from using Notebooks, but this is covered nicely in the fizzbuzz example before they undertake the exercise.
@fred should perhaps separate <script name/arguments> throughout.
@edwin emphasise how you narrow down from cProfile to line_profiler.
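A minimal sketch of the "narrow down" step, in case it helps when revising the episode. It uses only the standard library's cProfile and pstats; the functions `slow_square_sum`, `fast_constant` and `main` are hypothetical stand-ins, not code from the course.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical hot spot: builds a throwaway list before summing."""
    return sum([i * i for i in range(n)])


def fast_constant():
    """Hypothetical cheap function, included for contrast."""
    return 42


def main():
    fast_constant()
    return slow_square_sum(200_000)


# Step 1: function-level profile of the whole run.
profiler = cProfile.Profile()
profiler.runcall(main)

# Step 2: sort by cumulative time so the most expensive call paths come first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The function near the top of that report is then the one worth handing to line_profiler (for example by decorating only that function with `@profile` and running `kernprof -l -v` on the script), rather than line-profiling everything.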

Optimisation

Testing

Is it worth going into the detail of test directory structure? More generally, how much testing should be covered?

Data Structures and Algorithms

Minor typo: "They allows direct and sequential element access, with the convenience to append items." has an extra "s" on "allows".

List

Lot of detail here; make it clear that people might not get everything (Instructor callout perhaps?).
If there could be some diagrams to show the concepts it would be useful.

Generators

No great performance increase here, so this could be an example of Knuth's principle that small gains aren't worth chasing. But highlight that this isn't part of memory profiling. Perhaps ditch this section.

Sets

Whilst true that these are keys in that they are unique and hashable, perhaps use the more general term items.

Searching

Explain load factor and collisions
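One possible way to illustrate collisions, offered as a sketch rather than a description of how the episode should do it. The load factor is the fraction of hash table slots in use; a collision is two items landing on the same slot, which forces extra equality checks. The `Key` class and `comparisons_for_lookup` helper below are hypothetical, and the exact probing and resizing behaviour of CPython's set is an implementation detail that this does not try to show.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # With collide=True every key reports the same hash,
        # so every key competes for the same slot.
        return 0 if self.collide else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value


def comparisons_for_lookup(collide, n=200):
    """Build a set of n keys, then count __eq__ calls for one lookup."""
    items = {Key(i, collide) for i in range(n)}
    Key.eq_calls = 0  # ignore comparisons made while building the set
    found = Key(n - 1, collide) in items
    return found, Key.eq_calls


print("distinct hashes:", comparisons_for_lookup(collide=False))
print("all colliding:  ", comparisons_for_lookup(collide=True))
```

With distinct hashes the lookup needs very few comparisons; when every key collides the set degrades towards a linear search, which is the cost the load factor and resizing are there to keep rare.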

Minimise Python

Perhaps remove zip() from built-in operators?

Numpy

Callout could demonstrate the use of dtype when having arrays of mixed types.
What about having the examples of speed differences between lists and arrays as tasks for the attendees to do? Would break up the talking and perhaps improve concentration to have an interactive task to demonstrate the point.

Pandas

Common convention many are likely to use is import numpy as np

Keeping Python up-to-date

Pedantic typo: "such changes to the JIT and GIL will provide" is missing an "as".
Highlight that they need to be careful that there aren't any breaking changes when updating.
Will highlight benefits of number of cores used when doing NumPy calculations, see Which NumPy Functions Are Multithreaded - Super Fast Python.

Memory

alt text on first diagram is broken.

Accessing Disk

Is it worth mentioning/describing the differences in performance between CSV and other formats such as Parquet or HDF5?

Latency Overview

Clearer label for London > Canada > London

Optimisation Conclusion

Keypoints are not rendering correctly.

Useful resources to point people (from @ns-rse)

Could be useful to point people to additional material for the various topics. I (@ns-rse) know of the following for NumPy:

From Python to Numpy (the author also has a Numpy beginner tutorial)
@willfurnass : point people to advent of code and other toy examples.


willfurnass · 2024-02-15T13:13:03Z

File formats - reasons for covering:

perf gains if not having to do lots of type inference/type conversion (plus data validation easier to enforce within raw data)
perf gains if working with single Parquet or HDF5 file (containing many tables/ndarrays) vs lots of tiny files and data stored on network/parallel filesystem: lots of overheads from all the metadata lookups associated with the many tiny files, particularly on Lustre filesystems.
some binary formats allow for reading just subsets of data - perf and mem benefits.

Could suggest creating HDF5 or Parquet caches of CSV files if need to make repeated reads of files?

EDIT: pd.read_csv() vs pd.read_hdf() good for demoing perf differences.
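A rough sketch of the caching idea using only the standard library, so it runs without pandas, PyTables or pyarrow. Note the swap: pickle stands in here for HDF5/Parquet purely to show the pattern of paying the text parsing and type conversion cost once; it is not a recommendation of pickle as a data format. The helper names `load_csv` and `load_cached` are hypothetical, and the sketch assumes every column is numeric.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def load_csv(path):
    """Parse a CSV of numbers, converting every field from text to float."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        rows = [[float(x) for x in row] for row in reader]
    return header, rows


def load_cached(csv_path):
    """Return the parsed data, reusing a binary cache when it is up to date."""
    csv_path = Path(csv_path)
    cache_path = csv_path.with_suffix(".pkl")
    if cache_path.exists() and cache_path.stat().st_mtime >= csv_path.stat().st_mtime:
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)  # already typed, no text parsing needed
    data = load_csv(csv_path)
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh)
    return data


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "example.csv"
    example.write_text("x,y\n1,2\n3,4.5\n")
    first = load_cached(example)   # parses the CSV and writes the cache
    second = load_cached(example)  # served from the binary cache
    print(first == second)
```

With pandas the same shape would apply, with the cache written in Parquet or HDF5 instead, which additionally allows reading subsets of the data.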

willfurnass · 2024-02-15T13:15:43Z

Free benefits of numpy/pandas built against a good BLAS/LAPACK/FFTW library e.g. Intel MKL:
many operations might be multi-threaded by default - can experiment by requesting more cores on Stanage (up to 64 per node)
BLAS/LAPACK lib might be able to auto-detect and make use of advanced CPU features e.g. AVX512 hardware vectorisation (enabled in Intel Icelake CPUs in Stanage)

See https://github.com/RSE-Sheffield/hi-perf-ipynb/blob/master/tutorials/01-multithreading.ipynb

willfurnass · 2024-02-15T13:18:13Z

Generate a diagram of or text info on the CPU core, CPU cache, mem and peripheral device connectivity/affinity within your own machine: lstopo or lstopo-no-graphics (mentioned briefly on https://github.com/RSE-Sheffield/hi-perf-ipynb/blob/master/tutorials/01-multithreading.ipynb)

willfurnass · 2024-02-15T13:18:55Z

Native Python array datatype: rarely used anywhere; suggest don't mention.

willfurnass · 2024-02-15T13:19:48Z

> @willfurnass : point people to advent of code and other toy examples.

https://projecteuler.net/ is fab and language agnostic.

willfurnass · 2024-02-15T13:22:35Z

Re references/objects in Numpy arrays and Pandas DataFrames being bad: recommend people look to see if my_arr.dtype is object and/or object in my_df.dtypes. Particularly valuable/important check after running my_df = pd.read_csv(..) with type inference left as defaults.
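A small sketch of the array half of that check, assuming NumPy is installed. The array names and the `holds_python_objects` helper are made up for illustration.

```python
import numpy as np

numeric = np.array([1.5, 2.0, 2.5])        # homogeneous floats: a packed float64 buffer
with_missing = np.array([1.5, None, 2.5])  # the None forces dtype=object


def holds_python_objects(arr):
    """True when the array stores references to Python objects, not raw values."""
    return arr.dtype == object


print(numeric.dtype, holds_python_objects(numeric))
print(with_missing.dtype, holds_python_objects(with_missing))
```

For DataFrames, one caveat worth checking before it goes into the lesson: `in` on a pandas Series tests the index labels (here the column names), not the values, so `object in my_df.dtypes` would not do what is intended. Something like `(my_df.dtypes == object).any()` tests the dtypes themselves. Which dtype `pd.read_csv()` infers for text columns may also depend on the pandas version.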

willfurnass · 2024-02-15T13:24:45Z

Use of decorators when profiling: add suggestions for how to enable profiling on dubious-quality 3rd party code? Edit files within packages in virtualenv/conda env or something cleaner than that?
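One possible answer, sketched with the standard library only: apply the wrapper at runtime by reassigning the module attribute, so nothing inside the installed package is edited. `instrument` is a hypothetical helper and `json.dumps` merely stands in for a third-party function; a real profiler's decorator could be applied in the same way.

```python
import functools
import json
import time


def instrument(module, name):
    """Wrap module.<name> in place; return (stats, original) so it can be undone."""
    original = getattr(module, name)
    stats = {"calls": 0, "seconds": 0.0}

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start

    setattr(module, name, wrapper)
    return stats, original


# Usage: patch, run the workload, then restore the original function.
dumps_stats, original_dumps = instrument(json, "dumps")
json.dumps({"a": 1})
json.dumps = original_dumps
print(dumps_stats)
```

A limitation to mention if this is used: code that already did `from json import dumps` before the patch keeps its reference to the unwrapped function, so the patch has to happen before such imports.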

willfurnass · 2024-02-15T13:25:23Z

'function' vs 'method': use 'function' everywhere for consistency unless explicitly meaning method of an object?

willfurnass · 2024-02-15T13:26:25Z

Predator prey requires numpy which isn't included by default.

And matplotlib

willfurnass · 2024-02-15T13:28:01Z

Function profiling: could comment that easier to introduce if have somewhat modular software architecture (a reminder of issues of having functions 1000s of lines long)?

tdjames1 · 2024-02-15T13:52:07Z

fredsonnenwald · 2024-02-19T08:59:41Z

I spent some time this weekend profiling and trying to do some optimisation on one of my personal Python code projects. (Bear in mind that I am not a Python specialist and I wasn't working on particularly scientific or complex code.)

The profiling part of the course worked as advertised and helped me identify exactly which bits of my code were slow and would benefit from some effort improving. Unfortunately, the result there was that the major slow down in my code was caused by poor coding on my part and can only be improved by a better algorithm for tackling the problem, and not as far as I can see by taking advantage of any Python-specific quirks.

A few bits of the optimisation side of the course were still quite helpful however. I think by far the most useful thing I learned was about variable scope and function calls causing slow downs. The easiest and largest speed gains I got were from copying non-local variables into local ones and from inlining functions that were only called once. The scope change in particular I think sped up those functions by around 10%, and it might be worth putting more emphasis on this than just a single callout.

I think it would be beneficial to acknowledge at some point in the course that there might not be any optimisations to be made. I would not like to put a researcher in a position of "these things should be helping me but I can't get them to work I feel so disheartened".

Robadob · 2024-02-19T10:43:30Z

Thanks Fred, useful comments.

I appreciate all this feedback, not too sure when I will have time to address it though. I've got a bit of a busy month.

ns-rse · 2024-02-24T08:01:46Z

I discovered by chance that IPython, an enhanced interactive Python shell, has support for both line and memory profiling using "in-line magic" %lprun and %mprun respectively. Not entirely sure how useful it would be but thought it worth mentioning.

Robadob · 2024-02-27T13:40:00Z

> Disseminate setup instructions before course, perhaps bundle up the various files that users will download into a zip so they can download those in one go.

I think @gyengen currently plans for it to run on managed desktops in Hicks. So this may not be that simple.

ns-rse · 2024-02-27T14:35:15Z

With a wider view though is it possible that some might want to use their own laptops?

I never used the managed desktops so don't know if it's possible for people to install software in advance, i.e. they work like VMs/Remote desktops. If so it would seem sensible to ask people to download and install software and data in advance, as doing so at the start of a session wastes valuable face-to-face time.

Also this course has the possibility of feeding up-stream into the Carpentries Incubator where it could be used by others and may see contributions and so making it as general as possible would be useful.

In that regard having instructions for participants to download and install setups beforehand would be really useful.

Robadob · 2024-02-27T14:38:56Z

> With a wider view though is it possible that some might want to use their own laptops?

Yes, eventually. Still a lot to resolve before then. I'm acknowledging the feedback (not going to hide it away), just not an immediate priority. Afaik carpentries format does have a data page, which would serve this purpose. I'm just not a huge fan of having individual downloads that also need to be manually archived if they change. So would want to look at whether I can fudge carpentries CI to do that for me.

> Also this course has the possibility of feeding up-stream into the Carpentries Incubator where it could be used by others and may see contributions and so making it as general as possible would be useful.

There's already Sheffield specific stuff in here (such as the Theme), I expect carpentries incubator would end up being a fork of this repository.

ns-rse · 2024-02-27T14:44:39Z

Cool, the main reason I mentioned it is that with the Git course it can delay the start of the session if people hadn't followed the setup instructions.

If/when you get round to creating archives there seems to be a GitHub Action for everything...Create Archive · Actions · GitHub Marketplace!

Robadob · 2024-03-05T14:04:04Z

Removed the scope callout whilst removing generator functions. Need to work out where it fits.

As suggested by Fred, likely worth promoting it beyond a passing callout.
- Worth renaming the physical file minimise-python to understanding-python?


::::::::::::::::::::::::::::::::::::: callout

The use of `max_val` in the previous example moves the value of `N` from global to local scope.

The Python interpreter checks local scope first when finding variables, which makes accessing local scope variables slightly faster than global scope. This is most visible when a variable is accessed regularly, such as within a loop.

Replacing the use of `max_val` with `N` inside `test_generator()` causes the function to consistently perform a little slower than `test_list()`, whereas before the change it would normally be a little faster.

:::::::::::::::::::::::::::::::::::::::::::::
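If the callout is promoted to its own section, a self-contained comparison along these lines might work as the example. This is only a sketch: `N` and `max_val` follow the naming in the callout, while the two functions and `compare` are hypothetical, and the size of any difference will vary by machine and Python version.

```python
import timeit

N = 1_000_000  # module-level (global) value, as in the callout


def sum_using_global():
    total = 0
    for _ in range(1000):
        total += N  # N is looked up in global scope on every iteration
    return total


def sum_using_local():
    max_val = N  # copy the global into a local variable once
    total = 0
    for _ in range(1000):
        total += max_val  # local variable access inside the loop
    return total


def compare(repeats=2000):
    """Time both variants; returns (global_seconds, local_seconds)."""
    return (
        timeit.timeit(sum_using_global, number=repeats),
        timeit.timeit(sum_using_local, number=repeats),
    )


global_seconds, local_seconds = compare()
print(f"global: {global_seconds:.4f}s  local: {local_seconds:.4f}s")
```

Both functions return the same value, so any timing gap comes from the variable lookup alone, which ties in with Fred's point about it being a cheap change to try.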

Robadob mentioned this issue Feb 27, 2024

Accelerated Feedback v1 #20

Merged
