Homework 2.2: Microtubule catastrophe and ECDFs (30 pts)

Gardner, Zanic, and coworkers investigated the dynamics of microtubule catastrophe, the switching of a microtubule from a growing to a shrinking state. In particular, they were interested in the time between the start of growth of a microtubule and the catastrophe event. They monitored microtubules by using tubulin (the monomer that comprises a microtubule) that was labeled with a fluorescent marker. As a control to make sure that fluorescent labels and exposure to laser light did not affect the microtubule dynamics, they performed a similar experiment using differential interference contrast (DIC) microscopy. They measured the time until catastrophe with labeled and unlabeled tubulin.

In this problem, we will look at the data used to generate Fig. 2a of their paper. In the end, you will generate a plot similar to that figure.

a) Write a function with the call signature ecdf_vals(data) that takes a one-dimensional NumPy array or pandas Series of data (the same implementation will work for both) and returns the x and y values for plotting the ECDF in the “dots” style, as in Fig. 2a of the Gardner, Zanic, et al. paper. As a reminder,

ECDF(x) = fraction of data points ≤ x.
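
To make the definition concrete, here is a minimal sketch that evaluates the ECDF at a single point for a small made-up data set. The array values and the helper name ecdf_at are hypothetical, chosen only for illustration; this is not the ecdf_vals() function asked for in part (a), which must return the full set of x and y values for plotting.

```python
import numpy as np

# Made-up toy measurements, for illustration only (not from the data set)
toy_data = np.array([3.0, 1.0, 4.0, 1.0, 5.0])


def ecdf_at(data, x):
    """Hypothetical helper: fraction of entries in `data` that are <= x."""
    # `data <= x` is a Boolean array; its mean is the fraction of True entries
    return np.mean(data <= x)


print(ecdf_at(toy_data, 1.0))  # 2 of the 5 points are <= 1.0
print(ecdf_at(toy_data, 4.5))  # 4 of the 5 points are <= 4.5
```

Note that repeated values (the two entries equal to 1.0 here) each count toward the fraction, which is worth keeping in mind when deciding how to handle ties in your own function.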

When you write this function, you may only use base Python and the standard library, in addition to NumPy and pandas. That is, you cannot just write a function that calls iqplot.ecdf().

b) Use the ecdf_vals() function that you wrote to plot the ECDFs shown in Fig. 2a of the Gardner, Zanic, et al. paper. By looking at this plot, do you think that the fluorescent labeling makes a difference in the onset of catastrophe? (We will do a more careful statistical inference later in the course, but for now, does it pass the eye test? Eye tests are an important part of EDA.) You can access the data set here: https://s3.amazonaws.com/bebi103.caltech.edu/data/gardner_time_to_catastrophe_dic_tidy.csv
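
As a possible starting point for part (b), the sketch below shows one way to split a tidy data frame into labeled and unlabeled groups before computing an ECDF for each. It reads a tiny made-up CSV from a string so that it runs offline. The column names "time to catastrophe (s)" and "labeled", and all of the values, are assumptions for illustration; after loading the real file with pd.read_csv() on the URL above, check df.columns and adjust accordingly.

```python
import io

import pandas as pd

# Made-up stand-in for the real file; column names and values are assumptions
toy_csv = io.StringIO(
    "time to catastrophe (s),labeled\n"
    "470.0,True\n"
    "1415.0,True\n"
    "130.0,False\n"
    "600.0,False\n"
    "355.0,False\n"
)
df = pd.read_csv(toy_csv)

# Collect the catastrophe times for each value of the (assumed) label column
times_by_label = {
    labeled: group["time to catastrophe (s)"].to_numpy()
    for labeled, group in df.groupby("labeled")
}

for labeled, times in times_by_label.items():
    print(f"labeled={labeled}: {len(times)} measurements")
```

Each array in times_by_label could then be passed to your ecdf_vals() function, with the two resulting sets of points drawn on the same axes for comparison.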