{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lesson 1d: Lists and tuples\n",
"\n",
"We often want to group things together. For example, we may want to group all of the results from a control experiment and all of the results from a test experiment. As we will see in future lessons, data frames are very good for that kind of grouping. They are more complex objects, though, so it helps to first understand Python's native data types for holding collections of data.\n",
"\n",
"In this lesson, we will explore two important data types in Python, **lists** and **tuples**. They are both **sequences** of objects. Just like a string is a sequence (that is, an ordered collection) of characters, lists and tuples are sequences of arbitrary objects, called **items** or **elements**. They are a way to make a single object that contains many other objects."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lists\n",
"\n",
"As usual, it is easiest to explore new topics by example. We'll start by creating a list."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### List creation\n",
"\n",
"We create lists by putting Python values or expressions inside square brackets, separated by commas. For example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"list"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list_1 = [1, 2, 3, 4]\n",
"type(my_list_1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We observe here that although the elements of the list are `int`s, the type of the list is `list`. Actually, any Python expression can be inside a list (including another list!):"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2.4, 'a string', ['a string in another list', 5]]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list_2 = [1, 2.4, 'a string', ['a string in another list', 5]]\n",
"my_list_2"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[5, 15, 16]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list_3 = [2+3, 5*3, 4**2]\n",
"my_list_3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`my_list_2` contains an `int`, a `float`, a `str`, and another `list`. Our third list contains expressions, which are evaluated when the list is created, so the list stores their values.\n",
"\n",
"We can also create a list by type conversion. For example, we can change a string into a list of characters."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['A', ' ', 's', 't', 'r', 'i', 'n', 'g', '.']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_str = 'A string.'\n",
"list(my_str)"
]
},
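{
"cell_type": "markdown",
"metadata": {},
"source": [
"Type conversion is not limited to strings. As a minimal sketch (the variable names are illustrative and not part of the lesson), `list()` accepts any iterable, such as a `range`, and returns a new list of its items:\n",
"\n",
"```python\n",
"# Illustrative names; any iterable can be passed to list()\n",
"codon = 'AUG'\n",
"bases = list(codon)         # one item per character\n",
"numbers = list(range(4))    # the integers 0 through 3\n",
"\n",
"print(bases)      # ['A', 'U', 'G']\n",
"print(numbers)    # [0, 1, 2, 3]\n",
"```\n",
"\n",
"Note that converting a string splits it into single characters; it does not split on spaces or punctuation."
]
},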
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### List operators\n",
"\n",
"Operators on lists behave much like operators on strings. The **`+`** operator on lists means list concatenation."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[1, 2, 3] + [4, 5, 6]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The **`*`** operator on lists means replication: the list is repeated the given number of times and the copies are concatenated."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 1, 2, 3, 1, 2, 3]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[1, 2, 3] * 3"
]
},
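{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the parallel with strings concrete, here is a small sketch that applies both operators to lists and to strings (the names `a`, `b`, `combined`, and `repeated` are illustrative). Both operators build a new list and leave their operands unchanged.\n",
"\n",
"```python\n",
"a = [1, 2, 3]\n",
"b = [4, 5, 6]\n",
"\n",
"combined = a + b    # concatenation creates a new list\n",
"repeated = a * 2    # replication creates a new list\n",
"\n",
"print(combined)     # [1, 2, 3, 4, 5, 6]\n",
"print(repeated)     # [1, 2, 3, 1, 2, 3]\n",
"print(a)            # [1, 2, 3], unchanged\n",
"\n",
"# The same operators on strings, for comparison\n",
"print('ab' + 'cd')  # abcd\n",
"print('ab' * 2)     # abab\n",
"```"
]
},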
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Membership operators\n",
"\n",
"Membership operators are used to determine if an item is in a list. The two membership operators are:\n",
"\n",
"|English|operator|\n",
"|:-------|:----------:|\n",
"|is a member of | `in`|\n",
"|is not a member of | `not in`|\n",
"\n",
"\n",
"The result of the operator is `True` or `False`. Let's look at `my_list_2` again: "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list_2 = [1, 2.4, 'a string', ['a string in another list', 5]]\n",
"1 in my_list_2"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"['a string in another list', 5] in my_list_2"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'a string in another list' in my_list_2"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"7 not in my_list_2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Importantly, we see that the string `'a string in another list'` is not in `my_list_2`. This is because that string itself is not one of the four items of `my_list_2`. The string `'a string in another list'` is in a *list* that is an item in `my_list_2`.\n",
"\n",
"Now, these membership operators offer a great convenience for conditionals. Remember our example about stop codons?"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This codon is neither a start nor stop codon.\n"
]
}
],
"source": [
"codon = 'UGG'\n",
"\n",
"if codon == 'AUG':\n",
"    print('This codon is the start codon.')\n",
"elif codon == 'UAA' or codon == 'UAG' or codon == 'UGA':\n",
"    print('This codon is a stop codon.')\n",
"else:\n",
"    print('This codon is neither a start nor stop codon.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can rewrite this much more cleanly, and with a lower chance of bugs, using a list and the `in` operator."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This codon is neither a start nor stop codon.\n"
]
}
],
"source": [
"# Make a list of stop codons\n",
"stop_codons = ['UAA', 'UAG', 'UGA']\n",
"\n",
"# Specify codon\n",
"codon = 'UGG'\n",
"\n",
"# Check to see if it is a start or stop codon\n",
"if codon == 'AUG':\n",
"    print('This codon is the start codon.')\n",
"elif codon in stop_codons:\n",
"    print('This codon is a stop codon.')\n",
"else:\n",
"    print('This codon is neither a start nor stop codon.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The simple expression\n",
"\n",
"```python\n",
"codon in stop_codons\n",
"```\n",
" \n",
"replaced the more verbose\n",
"\n",
"```python\n",
"codon == 'UAA' or codon == 'UAG' or codon == 'UGA'\n",
"```\n",
"\n",
"Much nicer!"
]
},
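{
"cell_type": "markdown",
"metadata": {},
"source": [
"The conditionals above only reach the `else` branch, because `'UGG'` is not a stop codon. As a quick sketch, we can evaluate the membership expressions on their own. Membership in a list compares items by equality, so the match is case sensitive.\n",
"\n",
"```python\n",
"stop_codons = ['UAA', 'UAG', 'UGA']\n",
"\n",
"print('UAG' in stop_codons)        # True\n",
"print('UGG' in stop_codons)        # False\n",
"print('UGG' not in stop_codons)    # True\n",
"print('uag' in stop_codons)        # False, since 'uag' != 'UAG'\n",
"```"
]
},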
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### List indexing\n",
"\n",
"Imagine that we would like to access an item in a list. Because a list is ordered, we can ask for the first item, the second item, the *n*th item, the last item, etc. This is done with bracket notation: we write the name of the list, followed by the location (index) of the desired item enclosed in square brackets:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"2.4"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = [1, 2.4, 'a string', ['a string in another list', 5]]\n",
"\n",
"my_list[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wait a minute! Shouldn't `my_list[1]` give the first item in the list? It seems to give the second. This is because **indexing in Python starts at zero**. This is very important. (Historical note: [Why Python uses 0-based indexing](http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.html).)\n",
"\n",
"