
Delete old unused dict-set
update optimisation intro/conclusion keypoints.
Robadob committed Feb 13, 2024
1 parent d3b83de commit 94736ac
Showing 4 changed files with 27 additions and 249 deletions.
20 changes: 19 additions & 1 deletion episodes/optimisation-conclusion.md
@@ -29,6 +29,24 @@ This course's website can be used as a reference manual when profiling your own

::::::::::::::::::::::::::::::::::::: keypoints

<!-- todo -->
Data Structures & Algorithms
- List comprehensions should be preferred when constructing lists.
- Where appropriate, tuples and generator functions should be preferred over Python lists.
- Dictionaries and sets are appropriate for storing a collection of unique data with no intrinsic order for random access.
- When used appropriately, dictionaries and sets are significantly faster than lists (see the sketch after these keypoints).
- If searching a list or array is required, it should be sorted and searched using `bisect_left()` (binary search).
Minimise Python Written
- Python is an interpreted language; this adds overhead to the execution of Python code at runtime. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
- NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
- Pandas' data tables store columns as arrays, so operations applied to columns can take advantage of NumPy's vectorisation.
Newer is Often Faster
- Where feasible, the latest versions of Python and packages should be used, as they can include significant free improvements to the performance of your code.
- There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
- Changes to packages may impact the results output by your code, so ensure you have a method of validation ready prior to attempting upgrades.
How the Computer Hardware Affects Performance
- Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
  - This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
- One large file is preferable to many small files.
- Memory allocation is not free; avoiding repeatedly destroying and recreating objects can improve performance.

::::::::::::::::::::::::::::::::::::::::::::::::
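The data-structure keypoints above can be illustrated with a rough timing sketch. This is not part of the lesson diff; the collection size, search value, and the `binary_contains` helper are illustrative assumptions, and absolute timings will vary by machine.

```python
"""Rough sketch: membership tests and searching, standard library only."""
from bisect import bisect_left
from timeit import timeit

n = 100_000
needle = n - 1                # worst case for a linear scan
as_list = list(range(n))      # ordered and already sorted
as_set = set(as_list)         # unordered, hashed

# Linear scan: O(n) comparisons on average.
list_time = timeit(lambda: needle in as_list, number=100)

# Hash lookup: O(1) on average.
set_time = timeit(lambda: needle in as_set, number=100)

# Binary search on a *sorted* list: O(log n) comparisons.
def binary_contains(sorted_items, value):
    i = bisect_left(sorted_items, value)
    return i != len(sorted_items) and sorted_items[i] == value

bisect_time = timeit(lambda: binary_contains(as_list, needle), number=100)

print(f"list membership:    {list_time:.6f}s")
print(f"set membership:     {set_time:.6f}s")
print(f"bisect_left search: {bisect_time:.6f}s")
```

On typical hardware the set membership and `bisect_left()` searches complete orders of magnitude faster than the linear scan, which is the point the keypoints make about choosing the right data structure.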
241 changes: 0 additions & 241 deletions episodes/optimisation-dict-set.md

This file was deleted.

13 changes: 7 additions & 6 deletions episodes/optimisation-introduction.md
@@ -136,15 +136,16 @@ In the remainder of this course we will cover:
- Sets
- Generator Functions
- Searching
- How Python Executes
- Why less Python is often faster
- How to use NumPy for performance
- How to get the most from pandas
- Minimise Python Written (a brief sketch follows this list)
- built-ins
- NumPy
- Pandas
- Newer is Often Faster
- Keeping Python and packages up to date
- How the Computer Hardware Affects Performance
- Why accessing some variables can be faster than others
- Putting latencies in perspective
- How variables are accessed & the performance implications
- Latency in perspective
- Memory allocation isn't free
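As a small taste of the "Minimise Python Written" topic listed above, the sketch below compares a pure-Python loop with a vectorised NumPy equivalent. It is not taken from the lesson; the array size, function names, and repetition count are illustrative assumptions, and it assumes NumPy is installed.

```python
"""Rough sketch: interpreted loop vs vectorised NumPy."""
import numpy as np
from timeit import timeit

values = list(range(1_000_000))
array = np.arange(1_000_000)

# Pure-Python loop: every element passes through the interpreter.
def python_sum_of_squares(data):
    total = 0
    for x in data:
        total += x * x
    return total

# Vectorised NumPy: the loop runs in compiled code.
def numpy_sum_of_squares(data):
    return int(np.sum(data * data))

print("python:", timeit(lambda: python_sum_of_squares(values), number=10))
print("numpy: ", timeit(lambda: numpy_sum_of_squares(array), number=10))
```

The vectorised version typically runs many times faster because the per-element work happens in compiled code rather than in the Python interpreter.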

::::::::::::::::::::::::::::::::::::: keypoints

2 changes: 1 addition & 1 deletion episodes/optimisation-minimise-python.md
@@ -1,5 +1,5 @@
---
title: "Minimise Python (Numpy/Pandas)"
title: "Minimise Python (NumPY/Pandas)"
teaching: 0
exercises: 0
---
