Tutorial 0a: Python for scientific computing

This tutorial was generated from a Jupyter notebook. You can download the notebook here.

In this tutorial, you will set up a Python computing environment for scientific computing. There are two main ways people do this.

  1. By downloading and installing package by package with tools like apt-get, pip, etc.
  2. By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. The major such distributions are Anaconda and Enthought Canopy. Both contain IDEs.

In this class, we will use Anaconda, with its associated package manager, conda. It has recently become the de facto package manager/distribution for scientific use.

Python 2 vs Python 3

We are at an interesting point in Python's history. Python is currently in version 3.5 (as of September 13, 2015). The problem is that Python 3.x is not backwards compatible with Python 2.x. Many scientific packages were written in Python 2.x, and have been very slow to update to Python 3. However, Python 3 is Python's present and future, so all packages eventually need to work in Python 3. Today, most important scientific packages work in Python 3. All of the packages we will use do, so we will use Python 3 in this course.

For those of you who are already using Anaconda with Python 2, you can create a Python 3 environment.

Downloading and installing Anaconda

Downloading and installing Anaconda is simple.

  1. Go to the Anaconda homepage and click "Download Anaconda."
  2. You will be prompted for your email address, which you should provide. You may wish to use your Caltech email address because educational users get some of the non-free goodies in Anaconda (like MKL routines that will increase performance).
  3. Be sure to download Anaconda for Python 3.4 (or 3.5 if you are installing after the 2.5 release). Do not worry if you install Python 3.4; you can easily upgrade to 3.5 using conda when it is available.
  4. Follow the on-screen instructions for installation.

That's it! After you do that, you will have a functioning Python distribution.
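As a quick sanity check, you can confirm that the interpreter you just installed is Python 3. The snippet below is a minimal sketch, not part of the course materials: the helper name is_python3 is hypothetical, and it assumes you run it in a Python interpreter started from the command line prompt described in the next section.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple is Python 3 or later."""
    return version_info[0] >= 3


# Report the version of the interpreter that is running this code.
print("Python version:", sys.version.split()[0])
print("Running Python 3:", is_python3())
```

If the second line prints False, the python on your path is probably an older Python 2 installation rather than the one from Anaconda.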

The conda package manager

conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.

conda works from the command line. To get a command line prompt, do the following.

  • Mac: Fire up the Terminal application. It is typically in the /Applications/Utilities folder. Otherwise, hold down Command-space bar and type "terminal" in the search box, and you can select the Terminal Application.
  • Windows: Fire up PowerShell. To do this, select "Search programs and files" from the Start menu and type "powershell" and hit enter. This works on Windows 7 and presumably also on Windows 8 and 10.
  • Linux: If you're using Linux, it's a good bet you already know how to navigate a terminal.

Now that you have a command line prompt, you can start using conda. The first thing we'll do is update conda itself. To do this, enter the following on the command line:

conda update conda

If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.

Now that conda is updated, we'll use it to see what packages are installed. Type the following on the command line:

conda list

This gives a list of all packages and their versions that are installed. Now, we'll update all packages, so type the following on the command line:

conda update --all

You will be prompted to perform all of the updates. Some of them may even be downgrades. This happens when there are package conflicts in which one package requires an earlier version of another. conda works out these dependencies for you, so you can almost always say "yes" (or "y") when it prompts you.

Finally, we will use conda to install a package that is not included in the Anaconda distribution that we would like to use. We will install Seaborn, which is a nice package for data visualization. To do this, type the following on the command line:

conda install seaborn

You will again be prompted to approve the installation. Go for it! Seaborn is pretty cool.

Installing packages with pip

Some packages are not available through conda for various reasons, perhaps because they have not been submitted to the Anaconda developers or are still in early development. You can still install these packages using pip, and conda will be aware of them. (The name pip is a recursive acronym, commonly expanded as "pip installs packages.") One package we will use is pybeeswarm, used for making beeswarm plots. To install pybeeswarm, simply enter the following at the command line:

pip install pybeeswarm

This will install pybeeswarm, and you will be able to import it when you want to use it.
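To confirm that the installations worked, you can ask Python whether each package can be found without actually importing it. This is a minimal sketch: the helper missing_packages is a hypothetical name, and the import name beeswarm for pybeeswarm is an assumption you should check against that package's documentation.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names are assumptions: seaborn is imported as `seaborn`,
# and pybeeswarm is assumed to be imported as `beeswarm`.
wanted = ["numpy", "matplotlib", "seaborn", "beeswarm"]
print("Missing packages:", missing_packages(wanted))
```

An empty list means everything was found. Any name that is listed can be installed with conda or pip as described above.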

conda is a convenient package manager for many reasons, one being that many packages contain compiled code, and conda installs binaries, enabling you to skip the often troublesome compilation step. As you can imagine, some packages not covered by conda do need to be compiled on your machine. In order to do that, you need to have compilers installed on your machine that pip can access to do the compilation. A good way to do this for Macs is to install Developer Tools, which you can get from the Apple App Store for free. For Windows you can install the MinGW suite or Visual Studio. We will probably not use any packages that require compilation outside of conda, so you do not need to worry about this now.

IDEs for our class

You are welcome to use any text editor or integrated development environment (IDE) you like. I like Spyder, which is included in the Anaconda distribution. However, we have found that Spyder reliably crashes with certain Pandas objects (which we will definitely use throughout the course). Therefore, we will use Light Table along with an IPython QtConsole that can be launched from the Anaconda Launcher to do our computing in class. All of your homework will be submitted as Jupyter notebooks.

Installing Light Table

You are welcome to use whatever editor you like in class. However, note that all of the in-class demos will use Light Table with an IPython QtConsole. This is the only configuration we guarantee support for in class. You may be on your own if you use another IDE.

The IPython QtConsole can be launched through Anaconda's Launcher app, but you need to install Light Table. Light Table is a lightweight text editor with pleasant features. To install it, just download it from the Light Table website and follow the instructions.

Importantly, after it is installed, you should do some configuration for Python. When you first launch Light Table, you will get a welcome screen. Do the following steps to get your tabs set properly for Python.

  1. Hit control-spacebar.
  2. Type behaviors and select "Settings: User behaviors". This will open a new tab called user.behaviors, which you can edit.
  3. Add the following to the bottom of this file:

    [:editor.python :lt.objs.editor/tab-settings false 4 4]

  4. Save it.

It will be useful to be able to comment out blocks of code. You can set your key bindings to do this. To do this, do the following steps.

  1. Hit control-spacebar.
  2. Type keymap and select "Settings: User keymap". This will open a tab called user.keymap, which you can edit.
  3. Add the following to the bottom of this file:

    [:editor.active "ctrl-/" :toggle-comment-selection]

  4. Save it.

After that, Light Table should be ready for you to use to rock some Python!

A quick use of Spyder to test your Anaconda distribution

Right now, you will use Spyder just to check to make sure your Python distribution is working properly.

To start up Spyder, you can either launch it from the command line by typing

spyder

or you can do the following.

  • Mac: Use Anaconda's launcher. It is located in whatever directory you installed Anaconda (by default your home directory), in the anaconda folder. Just double click the Launcher application, and you will get a window where you can choose which app to launch. Click Spyder.
  • Windows: select "Search programs and files" from the Start menu and type "spyder" and hit enter. You can alternatively launch it from the Anaconda Launcher app.

Now, launch Spyder and configure the IPython settings by doing the following.

  1. Go to the Spyder Preferences menu (python $\to$ Preferences on a Mac, or Tools $\to$ Preferences on Windows). Select IPython console and then the Graphics tab.
  2. Make sure "Automatically load PyLab and NumPy modules" is unchecked.
  3. Select Automatic for the Backend (not Inline).
  4. Close the Preferences window.

We'll now run a quick test to make sure things are working properly. We will generate a plot of the Gamma distribution (which we will discuss later in class),

\begin{align} f(x\mid a,\lambda) = \frac{(\lambda x)^a\,\mathrm{e}^{-\lambda x}}{x\Gamma(a)}, \end{align}

on the domain $0\le x \le 10$ with $a = 2$ and $\lambda = 1$. This simplifies the expression to $f(x\mid 2, 1) = x \mathrm{e}^{-x}$.
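If you would like to convince yourself of this simplification numerically, the following sketch evaluates the general expression and compares it to the simplified one. The function name gamma_pdf is hypothetical and introduced only for this check. Note that the general expression divides by $x$, so the comparison uses points with $x > 0$.

```python
import math


def gamma_pdf(x, a, lam):
    """Gamma probability density, written exactly as in the equation above."""
    return (lam * x) ** a * math.exp(-lam * x) / (x * math.gamma(a))


# With a = 2 and lambda = 1, the density should equal x * exp(-x).
for x in [0.5, 1.0, 2.0, 10.0]:
    assert math.isclose(gamma_pdf(x, 2, 1), x * math.exp(-x))

print("The simplified form matches at all test points.")
```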

Now, you can go to the new file which should be open in the Spyder editor window. With the exception of the obvious omission, paste the code below into the editor window. You can run the code by clicking on the green arrow on the Spyder toolbar. You may be prompted about run settings. Under "Console," choose "Execute in current Python or IPython console." You may also be prompted to save the file, which you should do, and then it will run.

# Do not enter the next line.  This is only to prepare this tutorial.
%matplotlib inline

# Do everything following
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

# Generate x values
x = np.linspace(0.0, 10.0, 100)
y = x * np.exp(-x)

# Generate the plot
plt.plot(x, y, 'k-')
plt.margins(0.02)
plt.xlabel(r'$x$', fontsize=18)
plt.ylabel(r'$y$', fontsize=18)

# These two commands may not be necessary, depending on your configuration.
plt.draw()
plt.show()

You should have a window pop up that shows the plot above. In Windows, you may have to click an icon on the bottom tool bar to view it. If you get this plot, excellent! You now have a functioning Python environment for scientific computing!
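If you are unsure whether your plot looks right, you can compare it against a number. The curve $x\,\mathrm{e}^{-x}$ has its maximum at $x = 1$, where it equals $1/\mathrm{e} \approx 0.37$. The sketch below finds the highest point on the same 100-point grid used in the plot. It uses only the standard library, so it is an illustration under that assumption rather than the way the plot itself was made.

```python
import math

# Same grid as np.linspace(0.0, 10.0, 100): 100 evenly spaced points on [0, 10].
n_points = 100
x = [10.0 * k / (n_points - 1) for k in range(n_points)]
y = [xi * math.exp(-xi) for xi in x]

# Locate the highest point of the curve on this grid.
y_peak = max(y)
x_peak = x[y.index(y_peak)]

print("peak at x = {:.3f}, y = {:.3f}".format(x_peak, y_peak))
```

The printed peak should sit close to $x = 1$, and your plot should rise from zero to that peak before decaying toward zero at $x = 10$.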