{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lesson 2b: Packages and modules\n", "\n", "The Python Standard Library has lots of built-in **modules** that contain useful functions and data types for doing specific tasks. You can also use modules from outside the standard library. And you will undoubtedly write your own modules!\n", "\n", "A module is contained in a file that ends with `.py`. This file can have **classes**, functions, and other objects. We will not discuss defining your own classes in this course, so your modules will essentially just contain functions.\n", "\n", "A **package** contains several related modules that are all grouped together under one name. We will extensively use the [NumPy](http://www.numpy.org), [SciPy](http://www.scipy.org/), [Pandas](http://pandas.pydata.org), and [Bokeh](http://bokeh.pydata.org) packages, among others, and I'm sure you will also use them beyond this course. As such, the first package we will consider is NumPy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: I want to compute the mean and median of a list of numbers\n", "\n", "Say I have a list of numbers and I want to compute the mean. This happens all the time; you repeat a measurement multiple times and you want to compute the mean. We could write a function to do this." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def mean(values):\n", "    \"\"\"Compute the mean of a sequence of numbers.\"\"\"\n", "    return sum(values) / len(values)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And it works as expected." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.0\n", "3.275\n" ] } ], "source": [ "print(mean([1, 2, 3, 4, 5]))\n", "print(mean((4.5, 1.2, -1.6, 9.0)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to the mean, we might also want to compute the median, the standard deviation, etc. These seem like really common tasks. Remember my advice: if you want to do something that seems really common, a good programmer (or a team of them) probably already wrote something to do that. Means, medians, standard deviations, and lots and lots and lots of other numerical things are included in the **NumPy package**. To get access to it, we have to **import** it." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import numpy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it! We now have the `numpy` package available for use. Remember, in Python everything is an object, so to access the functions and attributes available in the `numpy` package, we use dot syntax. In a Jupyter notebook or in the JupyterLab console, you can type\n", "\n", "    numpy.\n", "\n", "(note the dot) and hit tab, and you will see what is available. For NumPy, there is a huge number of options!\n", "\n", "So, let's try to use NumPy's `numpy.mean()` function to compute a mean." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.0\n", "3.275\n" ] } ], "source": [ "print(numpy.mean([1, 2, 3, 4, 5]))\n", "print(numpy.mean((4.5, 1.2, -1.6, 9.0)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! We get the same values! Now, we can use the `numpy.median()` function to compute the median." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.0\n", "2.85\n" ] } ], "source": [ "print(numpy.median([1, 2, 3, 4, 5]))\n", "print(numpy.median((4.5, 1.2, -1.6, 9.0)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is nice. It gives the median, including when we have an even number of elements in the sequence of numbers, in which case it automatically interpolates by returning the mean of the two middle values (here, (1.2 + 4.5) / 2 = 2.85). It is really important to know that it does this interpolation, since if you are not expecting it, it can give unexpected results. So, here is an important piece of advice:\n", "\n", "\n", "