diff --git a/docs/talks/index.md b/docs/talks/index.md index 785274f..a284352 100644 --- a/docs/talks/index.md +++ b/docs/talks/index.md @@ -23,7 +23,7 @@ or, for upcoming talks, _Titles are placeholder._ -- 2025-03-20 | "Python tidbits: Small Python packages and tips you wish you knew about yesterday" by Nick Hodgskin [(:material-github: Discussion)](https://github.com/UU-IMAU/python-for-lunch/issues/21) +- 2025-03-20 | "Python tidbits: Small Python packages and tips you wish you knew about yesterday" by Nick Hodgskin ([:material-file-document: Notebook](./python-tidbits.ipynb), [:material-github: Discussion](https://github.com/UU-IMAU/python-for-lunch/issues/21)) - 2025-04-03 | "Choosing beautiful (and accessible) colour maps" by Miriam Sterl [(:material-github: Discussion)](https://github.com/UU-IMAU/python-for-lunch/issues/12) - 2025-04-17 TBD - 2025-05-01 On break (EGU) diff --git a/docs/talks/python-tidbits.ipynb b/docs/talks/python-tidbits.ipynb new file mode 100644 index 0000000..b33ac23 --- /dev/null +++ b/docs/talks/python-tidbits.ipynb @@ -0,0 +1,2047 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "# Python Tidbits: Small Python tips, tricks, and packages you wish you knew about yesterday\n", + "> by Nick Hodgskin\n", + "\n", + "This talk is mainly code examples, so that we can learn about these Python features by doing. I am using Python 3.12; most of these features work in Python 3.6 and above (exceptions, such as `zoneinfo`, are noted where they come up).\n", + "\n", + "Let's get started! We have many examples to go through." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "## Native Python Tricks\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### f-strings" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Hello, John! You are 25 years old.\n", + "Hello, Alice! You are 30 years old.\n", + "Hello, Bob! You are 25 years old.\n", + "Hello, Charlie! You are 28 years old.\n" + ] + } + ], + "source": [ + "# String concatenation\n", + "name = \"John\"\n", + "age = 25\n", + "print(\"Hello, \" + name + \"! You are \" + str(age) + \" years old.\")\n", + "\n", + "\n", + "# Python 2: % syntax\n", + "name = \"Alice\"\n", + "age = 30\n", + "greeting = \"Hello, %s! You are %d years old.\" % (name, age)\n", + "print(greeting)\n", + "\n", + "# Python 3: .format() syntax\n", + "name = \"Bob\"\n", + "age = 25\n", + "greeting = \"Hello, {}! You are {} years old.\".format(name, age)\n", + "print(greeting)\n", + "\n", + "# Python 3.6+: f-strings (the best!)\n", + "name = \"Charlie\"\n", + "age = 28\n", + "greeting = f\"Hello, {name}! 
You are {age} years old.\"\n", + "print(greeting)" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The sum of 5 and 10 is 15.\n", + "The product of 5 and 10 is 50.\n" + ] + } + ], + "source": [ + "# Bonus: f-strings can evaluate expressions inline\n", + "a = 5\n", + "b = 10\n", + "result = f\"The sum of {a} and {b} is {a + b}.\"\n", + "print(result)\n", + "\n", + "\n", + "def multiply(x, y):\n", + " return x * y\n", + "\n", + "a = 5\n", + "b = 10\n", + "result = f\"The product of {a} and {b} is {multiply(a, b)}.\"\n", + "print(result)" + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Pi rounded to 2 decimal places: 3.14\n", + "Earth's circumference (4 decimal places): 4.0030e+07 meters\n", + "Earth's circumference (4 significant digits): 4.003e+07 meters\n" + ] + } + ], + "source": [ + "# Bonus: f-strings support formatting options\n", + "pi = 3.14159265\n", + "formatted_pi = f\"Pi rounded to 2 decimal places: {pi:.2f}\"\n", + "print(formatted_pi)\n", + "\n", + "radius = 6_371_000 # 6,371 km in meters\n", + "circumference = 2 * pi * radius\n", + "print(f\"Earth's circumference (4 decimal places): {circumference:.4e} meters\")\n", + "print(f\"Earth's circumference (4 significant digits): {circumference:.4g} meters\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can find out more about formatting options at [W3Schools: Python String Formatting](https://www.w3schools.com/python/python_string_formatting.asp).\n", + "\n", + "Quick reference of the format specifiers (listed in the article):\n", + "\n", + "```\n", + ":<\t\tLeft aligns the result (within the available space)\n", + ":>\t\tRight aligns the result (within the available space)\n", + ":^\t\tCenter aligns the result (within the available space)\n", + 
":=\t\tPlaces the sign to the leftmost position\n", + ":+\t\tUse a plus sign to indicate if the result is positive or negative\n", + ":-\t\tUse a minus sign for negative values only\n", + ": \t\tUse a space to insert an extra space before positive numbers (and a minus sign before negative numbers)\n", + ":,\t\tUse a comma as a thousand separator\n", + ":_\t\tUse an underscore as a thousand separator\n", + ":b\t\tBinary format\n", + ":c\t\tConverts the value into the corresponding Unicode character\n", + ":d\t\tDecimal format\n", + ":e\t\tScientific format, with a lower case e\n", + ":E\t\tScientific format, with an upper case E\n", + ":f\t\tFixed-point number format\n", + ":F\t\tFixed-point number format, in uppercase format (show inf and nan as INF and NAN)\n", + ":g\t\tGeneral format\n", + ":G\t\tGeneral format (using an upper case E for scientific notation)\n", + ":o\t\tOctal format\n", + ":x\t\tHex format, lower case\n", + ":X\t\tHex format, upper case\n", + ":n\t\tNumber format\n", + ":%\t\tPercentage format\n", + "```\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### enumerate and zip" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0 apple\n", + "1 banana\n", + "2 cherry\n", + "0 apple\n", + "1 banana\n", + "2 cherry\n", + "1 apple\n", + "2 banana\n", + "3 cherry\n" + ] + } + ], + "source": [ + "# Use enumerate to loop over an iterable while keeping track of the index.\n", + "\n", + "# Without enumerate\n", + "fruits = ['apple', 'banana', 'cherry']\n", + "for i in range(len(fruits)):\n", + " print(i, fruits[i])\n", + "\n", + "# With enumerate\n", + "for i, fruit in enumerate(fruits):\n", + " print(i, fruit)\n", + "\n", + "# Bonus: Start indexing at a custom number\n", + "for i, fruit in enumerate(fruits, start=1):\n", + " print(i, fruit)" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": 
{}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "[(0, 'apple'), (1, 'banana'), (2, 'cherry')]\n" + ] + } + ], + "source": [ + "# under the hood\n", + "print(enumerate(fruits))\n", + "print(list(enumerate(fruits)))" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Alice 85\n", + "Bob 90\n", + "Charlie 95\n", + "Alice 85\n", + "Bob 90\n", + "Charlie 95\n" + ] + } + ], + "source": [ + "# Use zip to loop over multiple iterables in parallel.\n", + "\n", + "# Without zip\n", + "names = ['Alice', 'Bob', 'Charlie']\n", + "scores = [85, 90, 95]\n", + "for i in range(len(names)):\n", + " print(names[i], scores[i])\n", + "\n", + "# With zip\n", + "for name, score in zip(names, scores):\n", + " print(name, score)" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "pairs: [('Alice', 85), ('Bob', 90), ('Charlie', 95)]\n", + "names_unzipped: ('Alice', 'Bob', 'Charlie')\n", + "scores_unzipped: (85, 90, 95)\n" + ] + } + ], + "source": [ + "# Bonus: Unzipping\n", + "pairs = list(zip(names, scores))\n", + "print('pairs:', pairs)\n", + "names_unzipped, scores_unzipped = zip(*pairs)\n", + "print(\"names_unzipped:\", names_unzipped)\n", + "print(\"scores_unzipped:\", scores_unzipped)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### list comprehensions\n" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 4, 9, 16, 25]\n", + "[1, 4, 9, 16, 25]\n" + ] + } + ], + "source": [ + "numbers = [1, 2, 3, 4, 5]\n", + "\n", + "# Example 1: Basic list comprehension\n", + "# Squaring numbers in a list\n", + "\n", + "# using a for loop\n", + "squares = []\n", + "for x in numbers:\n", + " 
squares.append(x**2)\n", + "print(squares)\n", + "\n", + "# using a list comprehension\n", + "squares = [x**2 for x in numbers]\n", + "print(squares)" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 4]\n", + "[2, 4]\n" + ] + } + ], + "source": [ + "# Example 2: Using `if` to filter elements\n", + "# Keeping only even numbers\n", + "\n", + "# Using a for loop\n", + "evens = []\n", + "for x in numbers:\n", + " if x % 2 == 0:\n", + " evens.append(x)\n", + "print(evens)\n", + "\n", + "# list comprehension\n", + "evens = [x for x in numbers if x % 2 == 0]\n", + "print(evens)" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[-1, 2, -1, 4, -1]\n", + "[-1, 2, -1, 4, -1]\n" + ] + } + ], + "source": [ + "# Example 3: Using `if` and `else` in a list comprehension\n", + "# Replacing odd numbers with -1\n", + "\n", + "# Using a for loop\n", + "processed = []\n", + "for x in numbers:\n", + " if x % 2 == 0:\n", + " processed.append(x)\n", + " else:\n", + " processed.append(-1)\n", + "print(processed)\n", + "\n", + "# list comprehension\n", + "processed = [x if x % 2 == 0 else -1 for x in numbers]\n", + "print(processed)" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[3.2, 0.0, 4.7, 5.6]\n" + ] + } + ], + "source": [ + "# Bonus: Filtering out negative values from data\n", + "data = [3.2, -1.5, 0.0, 4.7, -2.3, 5.6]\n", + "cleaned_data = [x for x in data if x >= 0]\n", + "print(cleaned_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### sets" + ] + }, + { + "cell_type": "code", + "execution_count": 115, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + 
"output_type": "stream", + "text": [ + "unique_numbers: {1, 2, 3, 4, 5}\n", + "unique_numbers (added 6): {1, 2, 3, 4, 5, 6}\n", + "unique_numbers (added 3) {1, 2, 3, 4, 5, 6}\n", + "data_with_duplicates: [5, 1, 2, 2, 3, 4, 4]\n", + "unique_data: [1, 2, 3, 4, 5]\n" + ] + } + ], + "source": [ + "# Creating a set\n", + "unique_numbers = {1, 2, 3, 4, 5}\n", + "print(\"unique_numbers:\", unique_numbers)\n", + "\n", + "# Adding elements to a set\n", + "unique_numbers.add(6)\n", + "print(\"unique_numbers (added 6):\", unique_numbers)\n", + "\n", + "# Sets automatically handle duplicates\n", + "unique_numbers.add(3)\n", + "print(\"unique_numbers (added 3)\", unique_numbers)\n", + "\n", + "# Using sets to remove duplicates from a list\n", + "data_with_duplicates = [5, 1, 2, 2, 3, 4, 4]\n", + "print(\"data_with_duplicates:\", data_with_duplicates)\n", + "unique_data = list(set(data_with_duplicates))\n", + "print(\"unique_data:\", unique_data) # Order not preserved" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "A: {1, 2, 3, 4, 5}\n", + "B: {4, 5, 6, 7, 8}\n", + "Union (set_a | set_b): {1, 2, 3, 4, 5, 6, 7, 8}\n", + "Difference (set_a - set_b): {1, 2, 3}\n", + "Intersection (set_a & set_b): {4, 5}\n", + "Symmetric Difference (set_a ^ set_b): {1, 2, 3, 6, 7, 8}\n" + ] + } + ], + "source": [ + "# Set operations\n", + "\n", + "# Define two sets\n", + "set_a = {1, 2, 3, 4, 5}\n", + "set_b = {4, 5, 6, 7, 8}\n", + "print(\"A:\", set_a)\n", + "print(\"B:\", set_b)\n", + "\n", + "# Union: Combine elements from both sets (no duplicates)\n", + "union_set = set_a | set_b # or set_a.union(set_b)\n", + "print(\"Union (set_a | set_b):\", union_set)\n", + "\n", + "# Difference: Elements in set_a but not in set_b\n", + "difference_set = set_a - set_b # or set_a.difference(set_b)\n", + "print(\"Difference (set_a - set_b):\", difference_set)\n", + "\n", + "# Intersection: Elements 
common to both sets\n", + "intersection_set = set_a & set_b # or set_a.intersection(set_b)\n", + "print(\"Intersection (set_a & set_b):\", intersection_set)\n", + "\n", + "# Symmetric Difference: Elements in either set but not in both\n", + "symmetric_diff_set = set_a ^ set_b # or set_a.symmetric_difference(set_b)\n", + "print(\"Symmetric Difference (set_a ^ set_b):\", symmetric_diff_set)" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Unique elements in either dataset: {20, 70, 10, 60}\n" + ] + } + ], + "source": [ + "# Practical example: Finding unique elements in two datasets\n", + "data_1 = {10, 20, 30, 40, 50}\n", + "data_2 = {30, 40, 50, 60, 70}\n", + "\n", + "# Unique elements in either dataset\n", + "unique_elements = data_1 ^ data_2\n", + "print(\"Unique elements in either dataset:\", unique_elements)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### getting help straight from Python (dir(), help(), locals())\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['__add__',\n", + " '__class__',\n", + " '__class_getitem__',\n", + " '__contains__',\n", + " '__delattr__',\n", + " '__delitem__',\n", + " '__dir__',\n", + " '__doc__',\n", + " '__eq__',\n", + " '__format__',\n", + " '__ge__',\n", + " '__getattribute__',\n", + " '__getitem__',\n", + " '__getstate__',\n", + " '__gt__',\n", + " '__hash__',\n", + " '__iadd__',\n", + " '__imul__',\n", + " '__init__',\n", + " '__init_subclass__',\n", + " '__iter__',\n", + " '__le__',\n", + " '__len__',\n", + " '__lt__',\n", + " '__mul__',\n", + " '__ne__',\n", + " '__new__',\n", + " '__reduce__',\n", + " '__reduce_ex__',\n", + " '__repr__',\n", + " '__reversed__',\n", + " '__rmul__',\n", + " '__setattr__',\n", + " '__setitem__',\n", + " 
'__sizeof__',\n", + " '__str__',\n", + " '__subclasshook__',\n", + " 'append',\n", + " 'clear',\n", + " 'copy',\n", + " 'count',\n", + " 'extend',\n", + " 'index',\n", + " 'insert',\n", + " 'pop',\n", + " 'remove',\n", + " 'reverse',\n", + " 'sort']" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# 1. Listing available methods and attributes with `dir()`\n", + "my_list = [1, 2, 3]\n", + "dir(my_list) # Shows all methods and attributes of the list object" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on built-in function pop:\n", + "\n", + "pop(index=-1, /) method of builtins.list instance\n", + " Remove and return item at index (default last).\n", + "\n", + " Raises IndexError if list is empty or index is out of range.\n", + "\n" + ] + } + ], + "source": [ + "# 2. Getting detailed help with `help()`\n", + "help(my_list.pop) # Displays documentation for the `pop` method" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "List before: [1, 2, 3]\n", + "List after: [2, 3]\n", + "Popped item: 1\n" + ] + } + ], + "source": [ + "print(\"List before:\", my_list)\n", + "popped = my_list.pop(0)\n", + "print(\"List after:\", my_list)\n", + "print(\"Popped item:\", popped)" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'x': 10, 'y': 20}\n" + ] + } + ], + "source": [ + "\n", + "# 3. 
Inspecting local variables with `locals()`\n", + "def example_function():\n", + " x = 10\n", + " y = 20\n", + " print(locals()) # Shows all local variables in the current scope\n", + " # globals() would do the same but for global variables\n", + "\n", + "example_function()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### advanced sorting using keys" + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "list (unsorted): [2, 1, 3, 6, 5, 4]\n", + "list (sorted): [1, 2, 3, 4, 5, 6]\n" + ] + } + ], + "source": [ + "# normal sorting\n", + "lst = [2, 1, 3, 6, 5, 4]\n", + "print(\"list (unsorted):\", lst)\n", + "lst.sort()\n", + "print(\"list (sorted):\", lst)" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "data (unsorted): [(1, 20), (3, 15), (2, 25), (4, 10)]\n", + "data (sorted by the second element): [(4, 10), (3, 15), (1, 20), (2, 25)]\n", + "data (unsorted): [(1, 20), (3, 15), (2, 25), (4, 10)]\n", + "data (sorted by the second element): [(4, 10), (3, 15), (1, 20), (2, 25)]\n" + ] + } + ], + "source": [ + "# Example: Sorting a list of tuples by the second element\n", + "def return_second_element(x):\n", + " return x[1]\n", + "data = [(1, 20), (3, 15), (2, 25), (4, 10)]\n", + "print(\"data (unsorted):\", data)\n", + "sorted_data = sorted(data, key=return_second_element)\n", + "print(\"data (sorted by the second element):\", sorted_data)\n", + "\n", + "\n", + "# ...using an inline lambda function\n", + "data = [(1, 20), (3, 15), (2, 25), (4, 10)]\n", + "print(\"data (unsorted):\", data)\n", + "sorted_data = sorted(data, key=lambda x: x[1])\n", + "print(\"data (sorted by the second element):\", sorted_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + 
"metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "data (unsorted): [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 20}, {'name': 'Charlie', 'age': 30}]\n", + "data (sorted by age): [{'name': 'Bob', 'age': 20}, {'name': 'Alice', 'age': 25}, {'name': 'Charlie', 'age': 30}]\n", + "words (unsorted): ['apple', 'banana', 'kiwi', 'cherry']\n", + "words (sorted by length): ['kiwi', 'apple', 'banana', 'cherry']\n", + "words (sorted reverse by length): ['banana', 'cherry', 'apple', 'kiwi']\n" + ] + } + ], + "source": [ + "# Example: Sorting a list of dictionaries by a specific key\n", + "data = [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 20}, {'name': 'Charlie', 'age': 30}]\n", + "print(\"data (unsorted):\", data)\n", + "sorted_data = sorted(data, key=lambda x: x['age'])\n", + "print(\"data (sorted by age):\", sorted_data)\n", + "\n", + "# Example: Sorting strings by their length\n", + "words = ['apple', 'banana', 'kiwi', 'cherry']\n", + "print(\"words (unsorted):\", words)\n", + "sorted_words = sorted(words, key=len)\n", + "print(\"words (sorted by length):\", sorted_words)\n", + "sorted_words = sorted(words, key=len, reverse=True)\n", + "print(\"words (sorted reverse by length):\", sorted_words)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### filter and map\n", + "*Filter and map aren't necessary to know - you can get away with for loops - but they are an alternative way of doing things that may be more readable or faster for your use case.*\n" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 4, 6, 8, 10]\n" + ] + } + ], + "source": [ + "# Example: Filter even numbers from a list\n", + "numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n", + "\n", + "# Using filter with a lambda function\n", + "evens = filter(lambda x: x % 2 == 0, 
numbers)\n", + "print(list(evens)) # note the argument order is `filter(function, iterable)`, and that `filter` returns an iterator (not a list, hence the `list()` call)" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 4, 9, 16, 25]\n" + ] + } + ], + "source": [ + "# Example: Square all numbers in a list\n", + "numbers = [1, 2, 3, 4, 5]\n", + "\n", + "# Using map with a lambda function\n", + "squared = map(lambda x: x**2, numbers)\n", + "print(list(squared))" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[4, 16, 36, 64, 100]\n" + ] + } + ], + "source": [ + "# Example: Square only even numbers\n", + "numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n", + "\n", + "# Filter even numbers, then square them\n", + "result = map(lambda x: x**2, filter(lambda x: x % 2 == 0, numbers))\n", + "print(list(result))" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[72.5, 64.94, 77.0, 67.64, 86.36]\n" + ] + } + ], + "source": [ + "# A more complex example: Converting Celsius to Fahrenheit\n", + "\n", + "# Raw data: Some values are invalid (None or outliers)\n", + "data = [22.5, None, 18.3, 1000, 25.0, None, 19.8, 30.2, -999]\n", + "\n", + "# Step 1: Filter out invalid values (None and outliers)\n", + "valid_data = filter(lambda x: x is not None and -50 <= x <= 50, data)\n", + "\n", + "# Step 2: Convert Celsius to Fahrenheit\n", + "def c_to_f(celsius):\n", + " return celsius * 9/5 + 32\n", + "\n", + "fahrenheit_data = map(c_to_f, valid_data)\n", + "\n", + "# Step 3: Round to 2 decimal places\n", + "rounded_data = map(lambda x: round(x, 2), fahrenheit_data)\n", + "\n", + "# Final result\n", + "print(list(rounded_data))" + ] + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "Why is this powerful?:\n", + "\n", + "- Readability: Each step is clearly separated and easy to understand.\n", + "- Lazy Evaluation: filter and map process data on-demand, which is memory-efficient for large datasets.\n", + "- Functional Style: Avoids mutable state and side effects, making the code more predictable." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "## Python packages: Standard Library" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### pprint\n" + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[{'experiment': {'name': 'North Atlantic', 'samples': [{'id': 1, 'temperature': 298.15, 'results': [0.1, 0.2, 0.3]}, {'id': 2, 'temperature': 310.15, 'results': [0.15, 0.25, 0.35]}], 'metadata': {'author': 'Dr. Smith', 'date': '2023-10-01', 'tags': ['biophysics', 'simulation']}}}]]\n" + ] + } + ], + "source": [ + "# Example: A messy nested data structure\n", + "data = [[{\n", + " \"experiment\": {\n", + " \"name\": \"North Atlantic\",\n", + " \"samples\": [\n", + " {\"id\": 1, \"temperature\": 298.15, \"results\": [0.1, 0.2, 0.3]},\n", + " {\"id\": 2, \"temperature\": 310.15, \"results\": [0.15, 0.25, 0.35]},\n", + " ],\n", + " \"metadata\": {\n", + " \"author\": \"Dr. Smith\",\n", + " \"date\": \"2023-10-01\",\n", + " \"tags\": [\"biophysics\", \"simulation\"],\n", + " },\n", + " }\n", + "}]]\n", + "\n", + "# Standard print output (hard to read)\n", + "print(data)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[{'experiment': {'metadata': {'author': 'Dr. 
Smith',\n", + " 'date': '2023-10-01',\n", + " 'tags': ['biophysics', 'simulation']},\n", + " 'name': 'North Atlantic',\n", + " 'samples': [{'id': 1,\n", + " 'results': [0.1, 0.2, 0.3],\n", + " 'temperature': 298.15},\n", + " {'id': 2,\n", + " 'results': [0.15, 0.25, 0.35],\n", + " 'temperature': 310.15}]}}]]\n" + ] + } + ], + "source": [ + "from pprint import pprint\n", + "\n", + "# Pretty-printed output (clean and readable)\n", + "pprint(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### pathlib\n", + "See [pathlib docs](https://docs.python.org/3/library/pathlib.html) for more info." + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Sample data\n", + "\n", + "Found file: experiment_results.csv\n", + "if you want the full path: /Users/Hodgs004/coding/repos/python-for-lunch/docs/talks/data/experiment_results.csv\n", + "if you want the stem: experiment_results\n", + "if you want the extension: .csv\n" + ] + } + ], + "source": [ + "from pathlib import Path\n", + "\n", + "# Create a Path object\n", + "data_dir = Path(\"data\") # Represents a directory named \"data\"\n", + "\n", + "# Check if the directory exists\n", + "if not data_dir.exists():\n", + " data_dir.mkdir() # Create the directory if it doesn't exist\n", + "\n", + "# Create a file path\n", + "data_file = data_dir / \"experiment_results.csv\" # Use / to join paths\n", + "\n", + "# Write to the file\n", + "data_file.write_text(\"Sample data\\n\") # Write text to the file\n", + "\n", + "# Read from the file\n", + "print(data_file.read_text()) # Read text from the file\n", + "\n", + "# Iterate over files in a directory\n", + "for file in data_dir.glob(\"*.csv\"): # Find all CSV files\n", + " print(f\"Found file: {file.name}\")\n", + "\n", + "print(\"if you want the full path:\", data_file.resolve())\n", + "print(\"if you 
want the stem:\", data_file.stem)\n", + "print(\"if you want the extension:\", data_file.suffix)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "data/experiment_results.csv\n" + ] + } + ], + "source": [ + "# Path objects can be passed to many functions from external libraries.\n", + "# If they *need* a string, you can do\n", + "print(str(data_file))" + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['__bytes__', '__class__', '__delattr__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__fspath__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rtruediv__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '__truediv__', '_drv', '_flavour', '_format_parsed_parts', '_from_parsed_parts', '_hash', '_lines', '_lines_cached', '_load_parts', '_make_child_relpath', '_parse_path', '_parts_normcase', '_parts_normcase_cached', '_raw_paths', '_root', '_scandir', '_str', '_str_normcase', '_str_normcase_cached', '_tail', '_tail_cached', 'absolute', 'anchor', 'as_posix', 'as_uri', 'chmod', 'cwd', 'drive', 'exists', 'expanduser', 'glob', 'group', 'hardlink_to', 'home', 'is_absolute', 'is_block_device', 'is_char_device', 'is_dir', 'is_fifo', 'is_file', 'is_junction', 'is_mount', 'is_relative_to', 'is_reserved', 'is_socket', 'is_symlink', 'iterdir', 'joinpath', 'lchmod', 'lstat', 'match', 'mkdir', 'name', 'open', 'owner', 'parent', 'parents', 'parts', 'read_bytes', 'read_text', 'readlink', 'relative_to', 'rename', 'replace', 'resolve', 'rglob', 'rmdir', 'root', 'samefile', 'stat', 'stem', 'suffix', 'suffixes', 'symlink_to', 'touch', 'unlink', 'walk', 
'with_name', 'with_segments', 'with_stem', 'with_suffix', 'write_bytes', 'write_text']\n" + ] + } + ], + "source": [ + "# let's look at what methods are available\n", + "print(dir(Path)) # hmm, a bit difficult to read..." + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['absolute',\n", + " 'anchor',\n", + " 'as_posix',\n", + " 'as_uri',\n", + " 'chmod',\n", + " 'cwd',\n", + " 'drive',\n", + " 'exists',\n", + " 'expanduser',\n", + " 'glob',\n", + " 'group',\n", + " 'hardlink_to',\n", + " 'home',\n", + " 'is_absolute',\n", + " 'is_block_device',\n", + " 'is_char_device',\n", + " 'is_dir',\n", + " 'is_fifo',\n", + " 'is_file',\n", + " 'is_junction',\n", + " 'is_mount',\n", + " 'is_relative_to',\n", + " 'is_reserved',\n", + " 'is_socket',\n", + " 'is_symlink',\n", + " 'iterdir',\n", + " 'joinpath',\n", + " 'lchmod',\n", + " 'lstat',\n", + " 'match',\n", + " 'mkdir',\n", + " 'name',\n", + " 'open',\n", + " 'owner',\n", + " 'parent',\n", + " 'parents',\n", + " 'parts',\n", + " 'read_bytes',\n", + " 'read_text',\n", + " 'readlink',\n", + " 'relative_to',\n", + " 'rename',\n", + " 'replace',\n", + " 'resolve',\n", + " 'rglob',\n", + " 'rmdir',\n", + " 'root',\n", + " 'samefile',\n", + " 'stat',\n", + " 'stem',\n", + " 'suffix',\n", + " 'suffixes',\n", + " 'symlink_to',\n", + " 'touch',\n", + " 'unlink',\n", + " 'walk',\n", + " 'with_name',\n", + " 'with_segments',\n", + " 'with_stem',\n", + " 'with_suffix',\n", + " 'write_bytes',\n", + " 'write_text']" + ] + }, + "execution_count": 88, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def is_public(name):\n", + " is_private = name.startswith(\"_\")\n", + " return not is_private\n", + "\n", + "list(filter(is_public, dir(Path)))\n", + "\n", + "# or\n", + "[name for name in dir(Path) if is_public(name)]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + 
"source": [ + "### datetime" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Parsed Date: 2023-10-15 14:30:00 (object of type <class 'datetime.datetime'>)\n", + "Formatted Date: Sunday, October 15, 2023 at 02:30 PM (object of type <class 'str'>)\n", + "Time Difference: 7 days, 3:00:00 (object of type <class 'datetime.timedelta'>)\n", + "Current Time: 2025-03-17 16:25:10.364997\n" + ] + } + ], + "source": [ + "from datetime import datetime, timedelta\n", + "\n", + "# 1. Parsing a string into a datetime object\n", + "date_str = \"2023-10-15 14:30:00\"\n", + "parsed_date = datetime.strptime(date_str, \"%Y-%m-%d %H:%M:%S\")\n", + "print(f\"Parsed Date: {parsed_date} (object of type {type(parsed_date)})\")\n", + "\n", + "# 2. Formatting a datetime object into a string\n", + "formatted_date = parsed_date.strftime(\"%A, %B %d, %Y at %I:%M %p\")\n", + "print(f\"Formatted Date: {formatted_date} (object of type {type(formatted_date)})\")\n", + "\n", + "# 3. Calculating time differences\n", + "future_date = parsed_date + timedelta(days=7, hours=3)\n", + "time_diff = future_date - parsed_date\n", + "print(f\"Time Difference: {time_diff} (object of type {type(time_diff)})\")\n", + "\n", + "# 4. 
Getting the current time\n", + "now = datetime.now() # local time, with no timezone attached (naive)\n", + "print(f\"Current Time: {now}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "New York Time: 2025-03-17 11:25:10.364997-04:00\n" + ] + } + ], + "source": [ + "# Bonus: Working with timezones (use the standard-library `zoneinfo` in Python 3.9+, or the third-party `pytz` on older versions)\n", + "from zoneinfo import ZoneInfo # Python 3.9+\n", + "ny_time = now.astimezone(ZoneInfo(\"America/New_York\"))\n", + "print(f\"New York Time: {ny_time}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### itertools - tools to work with iterators\n", + "*See [docs](https://docs.python.org/3/library/itertools.html) for more.*\n", + "\n", + "What is an iterator?\n", + "> An iterator is an object that contains a countable number of values.\n", + "\n", + "In Python, an iterator is an object that implements the iterator protocol (i.e., it tells Python how to get from the current value to the next value). Iterators allow for efficient looping and processing of large datasets." + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": {}, + "outputs": [], + "source": [ + "import itertools" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.chain\n", + "Use chain to seamlessly combine multiple iterables into a single iterator." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[1, 2, 3, 'a', 'b', 'c']\n" + ] + } + ], + "source": [ + "list1 = [1, 2, 3]\n", + "list2 = ['a', 'b', 'c']\n", + "combined = itertools.chain(list1, list2)\n", + "\n", + "print(list(combined))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.product – Cartesian Product\n", + "Generate all possible combinations (Cartesian product) of input iterables." + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[('red', 'S'), ('red', 'M'), ('red', 'L'), ('green', 'S'), ('green', 'M'), ('green', 'L')]\n" + ] + } + ], + "source": [ + "colors = ['red', 'green']\n", + "sizes = ['S', 'M', 'L']\n", + "\n", + "combinations = itertools.product(colors, sizes)\n", + "print(list(combinations))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.combinations – Generate Combinations\n", + "Generate all possible combinations of a specific length from an iterable." + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[('a', 'b'), ('a', 'c'), ('b', 'c')]\n" + ] + } + ], + "source": [ + "data = ['a', 'b', 'c']\n", + "combinations = itertools.combinations(data, 2)\n", + "\n", + "print(list(combinations))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.permutations - Generate Permutations\n", + "Generate all possible permutations of an iterable." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 95, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a')]\n" + ] + } + ], + "source": [ + "data = ['a', 'b', 'c']\n", + "perms = itertools.permutations(data)\n", + "\n", + "print(list(perms))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.islice – Slice Iterators\n", + "Slice an iterator without converting it to a list first." + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 3, 4, 5]\n" + ] + } + ], + "source": [ + "data = range(10)\n", + "sliced = itertools.islice(data, 2, 6) # Start at index 2, stop before index 6\n", + "\n", + "print(list(sliced))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.groupby – Group Data" + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "a [('a', 1), ('a', 2)]\n", + "b [('b', 3), ('b', 4)]\n", + "c [('c', 5)]\n" + ] + } + ], + "source": [ + "data = [('a', 1), ('a', 2), ('b', 3), ('b', 4), ('c', 5)]\n", + "grouped = itertools.groupby(data, key=lambda x: x[0])\n", + "\n", + "for key, group in grouped:\n", + " print(key, list(group))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.cycle – Infinite Cycling\n", + "Cycle through an iterable indefinitely." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 98, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "red\n", + "green\n", + "blue\n", + "red\n", + "green\n" + ] + } + ], + "source": [ + "import itertools\n", + "\n", + "colors = ['red', 'green', 'blue']\n", + "cycled = itertools.cycle(colors)\n", + "\n", + "for _ in range(5):\n", + " print(next(cycled))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### itertools.tee – Duplicate an Iterator\n", + "Split an iterator into multiple independent iterators." + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[0, 1, 2, 3, 4]\n", + "[0, 1, 2, 3, 4]\n" + ] + } + ], + "source": [ + "import itertools\n", + "\n", + "data = iter(range(5))\n", + "iter1, iter2 = itertools.tee(data, 2)\n", + "\n", + "print(list(iter1))\n", + "print(list(iter2))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### more itertools" + ] + }, + { + "cell_type": "code", + "execution_count": 100, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['accumulate',\n", + " 'batched',\n", + " 'chain',\n", + " 'combinations',\n", + " 'combinations_with_replacement',\n", + " 'compress',\n", + " 'count',\n", + " 'cycle',\n", + " 'dropwhile',\n", + " 'filterfalse',\n", + " 'groupby',\n", + " 'islice',\n", + " 'pairwise',\n", + " 'permutations',\n", + " 'product',\n", + " 'repeat',\n", + " 'starmap',\n", + " 'takewhile',\n", + " 'tee',\n", + " 'zip_longest']" + ] + }, + "execution_count": 100, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[name for name in dir(itertools) if is_public(name)]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### functools - tools to work with functions\n", + "*Here we just cover `partial` and `cache`. 
See [docs](https://docs.python.org/3/library/functools.html) for more.*\n" + ] + }, + { + "cell_type": "code", + "execution_count": 101, + "metadata": {}, + "outputs": [], + "source": [ + "import functools" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### functools.partial\n", + "- Simplifies repetitive function calls with fixed parameters (e.g., fitting curves, transformations).\n", + "- Makes code cleaner and more reusable." + ] + }, + { + "cell_type": "code", + "execution_count": 102, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on class partial in module functools:\n", + "\n", + "class partial(builtins.object)\n", + " | partial(func, *args, **keywords) - new function with partial application\n", + " | of the given arguments and keywords.\n", + " |\n", + " | Methods defined here:\n", + " |\n", + " | __call__(self, /, *args, **kwargs)\n", + " | Call self as a function.\n", + " |\n", + " | __delattr__(self, name, /)\n", + " | Implement delattr(self, name).\n", + " |\n", + " | __getattribute__(self, name, /)\n", + " | Return getattr(self, name).\n", + " |\n", + " | __reduce__(...)\n", + " | Helper for pickle.\n", + " |\n", + " | __repr__(self, /)\n", + " | Return repr(self).\n", + " |\n", + " | __setattr__(self, name, value, /)\n", + " | Implement setattr(self, name, value).\n", + " |\n", + " | __setstate__(...)\n", + " |\n", + " | ----------------------------------------------------------------------\n", + " | Class methods defined here:\n", + " |\n", + " | __class_getitem__(...)\n", + " | See PEP 585\n", + " |\n", + " | ----------------------------------------------------------------------\n", + " | Static methods defined here:\n", + " |\n", + " | __new__(*args, **kwargs)\n", + " | Create and return a new object. 
See help(type) for accurate signature.\n", + " |\n", + " | ----------------------------------------------------------------------\n", + " | Data descriptors defined here:\n", + " |\n", + " | __dict__\n", + " |\n", + " | __vectorcalloffset__\n", + " |\n", + " | args\n", + " | tuple of arguments to future partial calls\n", + " |\n", + " | func\n", + " | function object to use in future partial calls\n", + " |\n", + " | keywords\n", + " | dictionary of keyword arguments to future partial calls\n", + "\n" + ] + } + ], + "source": [ + "help(functools.partial)" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "25\n", + "27\n" + ] + } + ], + "source": [ + "# Original function\n", + "def power(base, exponent):\n", + " return base ** exponent\n", + "\n", + "# Create new functions with `exponent` fixed to 2 and 3\n", + "square = functools.partial(power, exponent=2)\n", + "cube = functools.partial(power, exponent=3)\n", + "\n", + "print(square(5)) # 25\n", + "print(cube(3)) # 27" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### functools.lru_cache\n", + "- Speeds up recursive or repetitive computations (e.g., dynamic programming, simulations)\n", + "- Reduces redundant calculations in expensive functions\n", + "- Should only be used on pure functions, i.e., functions that are deterministic and have no side effects" + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on function lru_cache in module functools:\n", + "\n", + "lru_cache(maxsize=128, typed=False)\n", + " Least-recently-used cache decorator.\n", + "\n", + " If *maxsize* is set to None, the LRU features are disabled and the cache\n", + " can grow without bound.\n", + "\n", + " If *typed* is True, arguments of different types will be cached separately.\n", + " For example, f(3.0) 
and f(3) will be treated as distinct calls with\n", + " distinct results.\n", + "\n", + " Arguments to the cached function must be hashable.\n", + "\n", + " View the cache statistics named tuple (hits, misses, maxsize, currsize)\n", + " with f.cache_info(). Clear the cache and statistics with f.cache_clear().\n", + " Access the underlying function with f.__wrapped__.\n", + "\n", + " See: https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)\n", + "\n" + ] + } + ], + "source": [ + "help(functools.lru_cache)" + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "first call with 1, 2: 3\n" + ] + } + ], + "source": [ + "from time import time, sleep\n", + "\n", + "@functools.lru_cache(maxsize=None)\n", + "def some_long_running_function(a, b):\n", + " sleep(2) # Simulate a long computation\n", + " return a + b\n", + "\n", + "print(\"first call with 1, 2:\", some_long_running_function(1, 2)) # Takes 2 seconds" + ] + }, + { + "cell_type": "code", + "execution_count": 106, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "second call with 1, 2: 3\n" + ] + } + ], + "source": [ + "print(\"second call with 1, 2:\", some_long_running_function(1, 2)) # Returns immediately" + ] + }, + { + "cell_type": "code", + "execution_count": 107, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "first call with 2, 4: 6\n" + ] + } + ], + "source": [ + "print(\"first call with 2, 4:\", some_long_running_function(2, 4)) # Takes 2 seconds: new arguments, so nothing is cached yet\n" + ] + }, + { + "cell_type": "code", + "execution_count": 108, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time taken: 7.99 seconds\n", + "Fibonacci(40): 102334155\n", + "Number of function calls: 549755813888\n" + ] + } + ], + "source": [ + "# A more real-world example\n", 
+ "def fibonacci(n):\n", + " \"\"\"Inefficient recursive function to compute the n-th Fibonacci number.\n", + " \n", + " fibonacci(5) calls fibonacci(4) and fibonacci(3), but fibonacci(4) also calls fibonacci(3).\n", + " This leads to an exponential number of function calls (exactly 2 * fibonacci(n + 1) - 1; 2^(n-1) is only a rough overestimate for large n).\n", + " \"\"\"\n", + " if n < 2:\n", + " return n\n", + " return fibonacci(n - 1) + fibonacci(n - 2)\n", + "\n", + "n = 40\n", + "t = time()\n", + "fib = fibonacci(n)\n", + "print(f\"Time taken: {time() - t:.2f} seconds\")\n", + "print(f\"Fibonacci({n}): {fib}\")\n", + "print(f\"Number of function calls: {2**(n-1)}\") # rough overestimate; the exact count is 2 * fibonacci(n + 1) - 1 (331160281 for n = 40)" + ] + }, + { + "cell_type": "code", + "execution_count": 109, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time taken: 0.00 seconds\n", + "Fibonacci(40): 102334155\n" + ] + } + ], + "source": [ + "\n", + "@functools.lru_cache(maxsize=None) # maxsize=None caches all results (the default maxsize is 128)\n", + "def fibonacci(n):\n", + " if n < 2:\n", + " return n\n", + " return fibonacci(n - 1) + fibonacci(n - 2)\n", + "\n", + "n = 40\n", + "t = time()\n", + "fib = fibonacci(n)\n", + "print(f\"Time taken: {time() - t:.2f} seconds\")\n", + "print(f\"Fibonacci({n}): {fib}\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### functools.reduce" + ] + }, + { + "cell_type": "code", + "execution_count": 110, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on built-in function reduce in module _functools:\n", + "\n", + "reduce(...)\n", + " reduce(function, iterable[, initial]) -> value\n", + "\n", + " Apply a function of two arguments cumulatively to the items of a sequence\n", + " or iterable, from left to right, so as to reduce the iterable to a single\n", + " value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n", + " ((((1+2)+3)+4)+5). 
If initial is present, it is placed before the items\n", + " of the iterable in the calculation, and serves as a default when the\n", + " iterable is empty.\n", + "\n" + ] + } + ], + "source": [ + "help(functools.reduce)" + ] + }, + { + "cell_type": "code", + "execution_count": 111, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "120\n" + ] + } + ], + "source": [ + "# Multiply all numbers in a list\n", + "numbers = [1, 2, 3, 4, 5]\n", + "product = functools.reduce(lambda x, y: x * y, numbers)\n", + "\n", + "print(product)" + ] + }, + { + "cell_type": "code", + "execution_count": 112, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['GenericAlias',\n", + " 'RLock',\n", + " 'WRAPPER_ASSIGNMENTS',\n", + " 'WRAPPER_UPDATES',\n", + " 'cache',\n", + " 'cached_property',\n", + " 'cmp_to_key',\n", + " 'get_cache_token',\n", + " 'lru_cache',\n", + " 'namedtuple',\n", + " 'partial',\n", + " 'partialmethod',\n", + " 'recursive_repr',\n", + " 'reduce',\n", + " 'singledispatch',\n", + " 'singledispatchmethod',\n", + " 'total_ordering',\n", + " 'update_wrapper',\n", + " 'wraps']" + ] + }, + "execution_count": 112, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Interested in other functools features? Look up use cases for anything in the public API listed below.\n", + "[name for name in dir(functools) if is_public(name)]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Python packages: 3rd Party\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### tqdm\n", + "\n", + "After installing it using `conda install tqdm` or `pip install tqdm`..." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 113, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "100%|██████████| 100/100 [00:10<00:00, 9.47it/s]\n" + ] + } + ], + "source": [ + "from tqdm import tqdm\n", + "\n", + "def run_calculations():\n", + " sleep(0.1) # Simulate a long computation\n", + "\n", + "for _ in tqdm(range(100)):\n", + " run_calculations()" + ] + }, + { + "cell_type": "code", + "execution_count": 114, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2\n" + ] + } + ], + "source": [ + "# Bonus tip: use `_` as the name for values you don't care about. Handy in `for` loops (as above) and when unpacking.\n", + "# Example: unpacking values\n", + "data = (1, 2, 3)\n", + "_, y, _ = data\n", + "print(y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Topics not discussed, and further reading\n", + "\n", + "Things not mentioned in this talk:\n", + "- Testing (using pytest)\n", + " - this is quite a large topic and could be a talk in itself\n", + "- Jupyter Notebook tips and tricks (including Markdown)\n", + " - this is quite a large topic and could be a talk in itself\n", + "- logging\n", + " - this could form part of a talk\n", + "\n", + "\n", + "Check out the rest of the Python standard library for more interesting packages!\n", + "- [Python | The Python Standard Library](https://docs.python.org/3/library/index.html)\n", + "- [Python | Brief tour of the standard library](https://docs.python.org/3/tutorial/stdlib.html)\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "base", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.7" + } + }, 
"nbformat": 4, + "nbformat_minor": 2 +}