Tutorial 0a: Configuring your computer to use Python for scientific computing

(c) 2017 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This tutorial was generated from a Jupyter notebook. You can download the notebook here.

In this lesson, you will set up a Python computing environment for scientific computing. In addition, you will set up a GitHub account, which you will use to collaborate on and submit all exercises of the course.

There are two main ways people set up Python for scientific computing.

  1. By downloading and installing package by package with tools like pip.
  2. By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. The major such distributions are Anaconda and Enthought Canopy. Both contain IDEs.

In this class, we will use Anaconda, with its associated package manager, conda. It has recently become the de facto package manager/distribution for scientific use.

Python 2 vs Python 3

We are at an interesting point in Python's history. Python is currently in version 3.6 (as of September 10, 2017). The problem is that Python 3.x is not backwards compatible with Python 2.x. Many scientific packages were written in Python 2.x and have been very slow to update to Python 3. However, Python 3 is Python's present and future, so all packages eventually need to work in Python 3. Today, most important scientific packages work in Python 3. All of the packages we will use do, so we will use Python 3 in this course.
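As a small illustration of the incompatibility, the sketch below shows two well-known behavior differences: print is a function in Python 3, and the / operator performs true division. The variable names are illustrative only and are not part of the course materials.

```python
import sys

# In Python 3, print is a function and / is true division.
# Under Python 2, print was a statement and 1 / 2 evaluated to 0.
half = 1 / 2          # true division in Python 3
floor_half = 1 // 2   # floor division, spelled explicitly

print("Running Python", sys.version_info.major)
print("1 / 2 =", half)
print("1 // 2 =", floor_half)
```

Code that relied on the Python 2 meaning of either construct has to be changed before it runs correctly under Python 3, which is part of why some packages were slow to migrate.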

For those of you who are already using Anaconda with Python 2, you can create a Python 3 environment.

A special note to Mac users

If your machine is a Mac, you will need to install Xcode, which you can get through the App Store, before installing Anaconda. Once you install Xcode, you need to launch it in order to have everything set up properly. It will take a while to launch. Once it has launched, you can close it, and you won't need it again for the rest of the course. Installing and launching Xcode sets up important components under the hood.

Downloading and installing Anaconda

Mac users: Before installing Anaconda, be sure you have Xcode installed.

Downloading and installing Anaconda is simple.

  1. Go to the Anaconda homepage and download the graphical installer.
  2. Be sure to download Anaconda for Python 3.6.
  3. You will be prompted for your email address, which you should provide. You may wish to use your Caltech email address because educational users get some of the non-free goodies in Anaconda (like MKL routines that will increase performance).
  4. Follow the on-screen instructions for installation.

That's it! After you do that, you will have a functioning Python distribution.
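One way to confirm that the new distribution is the one being used is to ask the interpreter about itself. The sketch below is only a suggestion. The helper name meets_minimum is hypothetical, and the code can be run with python once you have a command line prompt.

```python
import sys

def meets_minimum(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)

# Where the running interpreter lives and which version it reports.
print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])
print("Python 3.6 or newer:", meets_minimum())
```

If the printed interpreter path does not point into your Anaconda installation, a different Python on your system is probably being picked up first.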

Accessing the command line

Note: Do the steps below only after you have finished the installation of Anaconda.

During the course, you will need to access the command line. Doing this on a Mac or Linux is simple. If you are using Linux, it's a good bet you already know how to navigate a terminal, so we will not give specific instructions for Linux. For a Mac, you can fire up the Terminal application. It is typically in the /Applications/Utilities folder. Otherwise, hold down Command-space bar, type "terminal" in the search box, and select the Terminal application.

For Windows, download and install Git Bash. After you have installed it, simply right click anywhere on your Desktop, and you should have an option to run Git Bash. You will then have a prompt that looks very much like Mac and Linux users will have.

The conda package manager

conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.

conda works from the command line. Now that you know how to get a command line prompt, you can start using conda. The first thing we'll do is update conda itself. To do this, enter the following on the command line:

conda update conda

If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.

Now that conda is updated, we'll use it to see what packages are installed. Type the following on the command line:

conda list

This gives a list of all packages and their versions that are installed. Now, we'll update all packages, so type the following on the command line:

conda update --all

You will be prompted to perform all of the updates. There may even be some downgrades. This happens when there are package conflicts where one package requires an earlier version of another. conda is very smart and figures all of this out for you, so you can almost always say "yes" (or "y") to conda when it prompts you.

You will also need to install some packages that are not included in the default Anaconda distribution, namely PyMC3, HoloViews, and Datashader. To install these packages, enter the following commands, in succession, on the command line.

conda install -c ioam holoviews bokeh
conda install -c conda-forge pymc3
conda install -c bokeh datashader

These commands install the three packages, which we will use heavily.
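After the installs finish, a quick check from Python can confirm that the packages are visible to the interpreter. This is a minimal sketch. The helper missing_packages is a hypothetical name, and the list uses the usual import names of the packages mentioned above.

```python
import importlib.util

def missing_packages(names):
    """Return the names in `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names of the packages installed above (assumed to match).
required = ["holoviews", "bokeh", "pymc3", "datashader"]
missing = missing_packages(required)
if missing:
    print("Not yet installed:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only looks the packages up without importing them, so it runs quickly even for large libraries.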

Installing from the Python Package Index (PyPI)

Not all of the many packages available for Python can be installed and managed with conda. For some, you need to install them from the Python Package Index, PyPI, which contains hundreds of thousands of packages. To install a package, you can use the pip utility, whose name is a recursive acronym for "pip installs packages." For example, to output SVG graphics of your plots, you will need to install CairoSVG, which you can do by entering the following at the command line.

pip install cairosvg
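If more than one Python is installed on your machine, the pip found on the command line may belong to a different interpreter than the one you intend to use. A common workaround is to invoke pip as a module of a specific interpreter. The sketch below only builds and prints such a command, and the helper name pip_install_command is hypothetical.

```python
import sys

def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    # Fall back to plain "python" if the interpreter path is unavailable.
    interpreter = sys.executable or "python"
    return [interpreter, "-m", "pip", "install", package]

# Print the command rather than running it, so nothing is installed by accident.
print(" ".join(pip_install_command("cairosvg")))
```

Entering the printed command at the prompt installs the package into that particular interpreter's environment.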

The Atom text editor [not required for this class]

For almost the entirety of the course, we will do all work in Jupyter notebooks. This is not the best workflow for larger projects, where it is better to have separate Python scripts. Furthermore, you might want to build your own module of useful utilities for the course, which you will write as plain text files.

There are countless options for text editors. For example, Anaconda comes with an integrated development environment (IDE) called Spyder, which is built for scientific computing. Sublime Text is a widely used text editor (and the one I use). I recommend using Atom, which has an advantage over my preferred editor, Sublime Text, in that Atom is free.

To download and install Atom, go to its website and follow the instructions. Once it's installed, you have a good text editor to use in this class and beyond.

The default configuration of Atom will work well for you, but you have to be careful with how tabs are defined. In the Atom menu, go to

Packages -> Settings View -> Open

This will open the Settings page for Atom. Scroll toward the bottom of that page, and make sure Tab Length is set to 4. Underneath that, make sure Tab Type is set to soft. As you will soon learn, this is important in Python because indentation matters.
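To see why the tab setting matters, consider a file in which one line is indented with a tab character and another with spaces. The two can look identical in an editor, yet Python 3 refuses to run code whose indentation mixes tabs and spaces inconsistently. The sketch below, with the hypothetical helper lines_with_tabs, flags lines whose indentation contains a tab.

```python
def lines_with_tabs(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indentation is whatever precedes the first non-whitespace character.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged

# The second line is indented with four spaces, the third with a tab.
example = "def f(x):\n    y = x + 1\n\treturn y\n"
print(lines_with_tabs(example))  # prints [3]
```

With soft tabs enabled, the editor inserts spaces whenever you press the Tab key, so this kind of mismatch does not arise in files you write yourself.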

Usage of Git/GitHub

We will make extensive use of Git during the course. We will use GitHub to host the repositories. You need to set up a GitHub account and get yourself acquainted with the basics of Git. To do this, see this tutorial from my Intro to Programming Bootcamp.

Once you have a GitHub account, send an email to bois at caltech dot edu with your account ID to get access to the BE/Bi 103 Group on GitHub. Within this group, you will form a team. Your team consists of your partners for homework submission.