
Delete old unused dict-set
update optimisation intro/conclusion keypoints.
Robadob committed Feb 13, 2024
1 parent d3b83de commit 94736ac
Showing 4 changed files with 27 additions and 249 deletions.
20 changes: 19 additions & 1 deletion episodes/optimisation-conclusion.md
@@ -29,6 +29,24 @@ This course's website can be used as a reference manual when profiling your own

::::::::::::::::::::::::::::::::::::: keypoints

<!-- todo -->
Data Structures & Algorithms
- List comprehensions should be preferred when constructing lists.
- Where appropriate, tuples and generator functions should be preferred over Python lists.
- Dictionaries and sets are appropriate for storing a collection of unique data with no intrinsic order for random access.
- When used appropriately, dictionaries and sets are significantly faster than lists (see the sketch after these keypoints).
- If searching a list or array is required, it should be sorted and searched using `bisect_left()` (binary search).
Minimise Python Written
- Python is an interpreted language; this adds overhead to the execution of Python code at runtime. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
- NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
- Pandas' data tables store columns as arrays, so operations applied to columns can take advantage of NumPy's vectorisation.
Newer is Often Faster
- Where feasible, the latest versions of Python and packages should be used, as they can include significant free improvements to the performance of your code.
- There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
- Changes to packages may impact the results output by your code, so ensure you have a method of validation ready prior to attempting upgrades.
How the Computer Hardware Affects Performance
- Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
  - This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
- One large file is preferable to many small files.
- Memory allocation is not free; avoiding repeatedly destroying and recreating objects can improve performance.

::::::::::::::::::::::::::::::::::::::::::::::::
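The data-structure keypoints above can be illustrated with a rough timing sketch. This is not part of the lesson diff; the collection size, search value, and the `binary_contains` helper are illustrative assumptions, and absolute timings will vary by machine.

```python
"""Rough sketch: membership tests and searching, standard library only."""
from bisect import bisect_left
from timeit import timeit

n = 100_000
needle = n - 1                # worst case for a linear scan
as_list = list(range(n))      # ordered and already sorted
as_set = set(as_list)         # unordered, hashed

# Linear scan: O(n) comparisons on average.
list_time = timeit(lambda: needle in as_list, number=100)

# Hash lookup: O(1) on average.
set_time = timeit(lambda: needle in as_set, number=100)

# Binary search on a *sorted* list: O(log n) comparisons.
def binary_contains(sorted_items, value):
    i = bisect_left(sorted_items, value)
    return i != len(sorted_items) and sorted_items[i] == value

bisect_time = timeit(lambda: binary_contains(as_list, needle), number=100)

print(f"list membership:    {list_time:.6f}s")
print(f"set membership:     {set_time:.6f}s")
print(f"bisect_left search: {bisect_time:.6f}s")
```

On typical hardware the set membership and `bisect_left()` searches complete orders of magnitude faster than the linear scan, which is the point the keypoints make about choosing the right data structure.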
241 changes: 0 additions & 241 deletions episodes/optimisation-dict-set.md

This file was deleted.

13 changes: 7 additions & 6 deletions episodes/optimisation-introduction.md
@@ -136,15 +136,16 @@ In the remainder of this course we will cover:
- Sets
- Generator Functions
- Searching
- How Python Executes
- Why less Python is often faster
- How to use NumPy for performance
- How to get the most from pandas
- Minimise Python Written (a brief sketch follows this list)
- built-ins
- NumPy
- Pandas
- Newer is Often Faster
- Keeping Python and packages up to date
- How the Computer Hardware Affects Performance
- Why accessing some variables can be faster than others
- Putting latencies in perspective
- How variables are accessed & the performance implications
- Latency in perspective
- Memory allocation isn't free
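As a small taste of the "Minimise Python Written" topic listed above, the sketch below compares a pure-Python loop with a vectorised NumPy equivalent. It is not taken from the lesson; the array size, function names, and repetition count are illustrative assumptions, and it assumes NumPy is installed.

```python
"""Rough sketch: interpreted loop vs vectorised NumPy."""
import numpy as np
from timeit import timeit

values = list(range(1_000_000))
array = np.arange(1_000_000)

# Pure-Python loop: every element passes through the interpreter.
def python_sum_of_squares(data):
    total = 0
    for x in data:
        total += x * x
    return total

# Vectorised NumPy: the loop runs in compiled code.
def numpy_sum_of_squares(data):
    return int(np.sum(data * data))

print("python:", timeit(lambda: python_sum_of_squares(values), number=10))
print("numpy: ", timeit(lambda: numpy_sum_of_squares(array), number=10))
```

The vectorised version typically runs many times faster because the per-element work happens in compiled code rather than in the Python interpreter.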

::::::::::::::::::::::::::::::::::::: keypoints

2 changes: 1 addition & 1 deletion episodes/optimisation-minimise-python.md
@@ -1,5 +1,5 @@
---
title: "Minimise Python (Numpy/Pandas)"
title: "Minimise Python (NumPY/Pandas)"
teaching: 0
exercises: 0
---
