Lesson 4 exercises


[1]:
import pandas as pd

Exercise 4.1

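In the Lesson 3 exercises, we worked with the Anderson-Fisher iris data set. I will load it and view it now. The file has two header rows (the species, then the measured quantity), so I pass header=[0, 1] to read them as a two-level column index.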

[2]:
df = pd.read_csv('../data/anderson-fisher-iris.csv', header=[0, 1])

df.head()
[2]:
              setosa                                                                  versicolor                                                                   virginica
   sepal length (cm)   sepal width (cm)  petal length (cm)   petal width (cm)  sepal length (cm)   sepal width (cm)  petal length (cm)   petal width (cm)  sepal length (cm)   sepal width (cm)  petal length (cm)   petal width (cm)
0                5.1                3.5                1.4                0.2                7.0                3.2                4.7                1.4                6.3                3.3                6.0                2.5
1                4.9                3.0                1.4                0.2                6.4                3.2                4.5                1.5                5.8                2.7                5.1                1.9
2                4.7                3.2                1.3                0.2                6.9                3.1                4.9                1.5                7.1                3.0                5.9                2.1
3                4.6                3.1                1.5                0.2                5.5                2.3                4.0                1.3                6.3                2.9                5.6                1.8
4                5.0                3.6                1.4                0.2                6.5                2.8                4.6                1.5                6.5                3.0                5.8                2.2
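
If the two-level header is unfamiliar, it may help to see how it arises from a file with two header rows. The sketch below is only an illustration: the CSV text is a made-up, cut-down stand-in for the real file (two species, two measurements, two rows), and the name toy is hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: two header rows, then data.
csv_text = """setosa,setosa,versicolor,versicolor
sepal length (cm),sepal width (cm),sepal length (cm),sepal width (cm)
5.1,3.5,7.0,3.2
4.9,3.0,6.4,3.2
"""

# header=[0, 1] tells pandas that the first two rows are both header rows
toy = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# The column index now has two levels; each column label is a tuple
print(toy.columns.nlevels)
print(toy.columns.tolist())

# A single column is selected with a (species, measurement) tuple
print(toy[('setosa', 'sepal length (cm)')].tolist())
```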

Explain in words what each of the following code cells does as we work toward tidying this data frame.

[3]:
df.columns.names = ['species', None]
[4]:
df = df.stack(level='species')
[5]:
df = df.reset_index(level='species')
[6]:
df = df.reset_index(drop=True)
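
One way to check your explanations is to run the same four steps on a very small data frame and look at the result after each one. The sketch below assumes a made-up toy frame with two species and two measurements; the names toy, columns, and show are hypothetical and not part of the lesson. Recent pandas versions may print a FutureWarning for stack(); that does not affect this small example.

```python
import pandas as pd

# Hypothetical toy frame with the same two-level column layout as the iris data
columns = pd.MultiIndex.from_product(
    [['setosa', 'versicolor'], ['sepal length (cm)', 'sepal width (cm)']]
)
toy = pd.DataFrame(
    [[5.1, 3.5, 7.0, 3.2], [4.9, 3.0, 6.4, 3.2]], columns=columns
)


def show(label, frame):
    """Print a data frame together with its shape and index names."""
    print(f'--- {label}: shape {frame.shape}, index names {list(frame.index.names)}')
    print(frame)
    print()


show('start', toy)

# Same four steps as cells [3] through [6], applied to the toy frame
toy.columns.names = ['species', None]
show('after cell 3', toy)

toy = toy.stack(level='species')
show('after cell 4', toy)

toy = toy.reset_index(level='species')
show('after cell 5', toy)

toy = toy.reset_index(drop=True)
show('after cell 6', toy)
```

Comparing the shape, the index, and the column labels between consecutive printouts shows what each cell changed.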

Exercise 4.2

What is the difference between merging and concatenating data frames?
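
To explore the question, it can help to apply both operations to the same pair of small data frames and compare the results. The frames left and right below are made up for illustration, and the sketch uses only the default inner merge on a shared column named key; other merge options exist and may behave differently.

```python
import pandas as pd

# Two hypothetical frames that share a 'key' column but overlap only partly
left = pd.DataFrame({'key': ['a', 'b', 'c'], 'x': [1, 2, 3]})
right = pd.DataFrame({'key': ['b', 'c', 'd'], 'y': [20, 30, 40]})

# Concatenate along the rows, renumbering the index from zero
stacked = pd.concat([left, right], ignore_index=True, sort=False)

# Merge on the shared column (how='inner' is the default)
merged = pd.merge(left, right, on='key')

print(stacked)
print()
print(merged)
```

Look at how many rows each result has, which columns contain missing values, and what role the values in key play in each case.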

Exercise 4.3

Write down any questions or points of confusion that you have.

Computing environment

[7]:
%load_ext watermark
%watermark -v -p pandas,jupyterlab
CPython 3.7.4
IPython 7.8.0

pandas 0.24.2
jupyterlab 1.1.4