BE/Bi 103, Fall 2016: Homework 5

Due 1pm, Sunday, November 6

(c) 2016 Justin Bois. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained therein is licensed under an MIT license.

This homework was generated from a Jupyter notebook. You can download the notebook here. You can also view it here.

Problem 5.1: Outliers in FRET binding curve (20 pts)

We often want to ascertain how tightly two proteins are bound by measuring their dissociation constant, $K_d$. This is usually done by performing a titration experiment followed by a regression. For example, imagine two proteins, $a$ and $b$, that may bind to each other in the reaction

\begin{align} ab \rightleftharpoons a + b \end{align}

with dissociation constant $K_d$. At equilibrium

\begin{align} K_d = \frac{c_a\,c_b}{c_{ab}}, \end{align}

where $c_i$ is the concentration of species $i$. If we add known amounts of $a$ and $b$ to a solution such that the total concentration of $a$ is $c_a^0$ and the total concentration of $b$ is $c_b^0$, we can compute the equilibrium concentrations of all species. Specifically, in addition to the equation above, we have conservation of mass equations,

\begin{align} c_a^0 &= c_a + c_{ab}\\[1em] c_b^0 &= c_b + c_{ab}, \end{align}

fully specifying the problem. We can solve the three equations for $c_{ab}$ in terms of the known quantities $c_a^0$ and $c_b^0$, along with the parameter we are trying to measure, $K_d$. We get

\begin{align} c_{ab} = \frac{2c_a^0\,c_b^0}{K_d+c_a^0+c_b^0 + \sqrt{\left(K_d+c_a^0+c_b^0\right)^2 - 4c_a^0\,c_b^0}}. \end{align}
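As a quick numerical sanity check of this expression, the sketch below evaluates $c_{ab}$ for made-up concentrations and confirms that the result reproduces $K_d$ through the equilibrium and conservation of mass relations. The function name `c_ab_equilibrium` and all numerical values are hypothetical choices for illustration, in arbitrary but consistent concentration units.

```python
import math


def c_ab_equilibrium(c_a0, c_b0, K_d):
    """Equilibrium concentration of the ab complex.

    c_a0, c_b0: total concentrations of a and b; K_d: dissociation constant.
    All three must share the same concentration units.
    """
    s = K_d + c_a0 + c_b0
    return 2 * c_a0 * c_b0 / (s + math.sqrt(s**2 - 4 * c_a0 * c_b0))


# Hypothetical inputs, chosen only for illustration
c_a0, c_b0, K_d = 50.0, 100.0, 10.0

c_ab = c_ab_equilibrium(c_a0, c_b0, K_d)

# Free concentrations follow from conservation of mass
c_a = c_a0 - c_ab
c_b = c_b0 - c_ab

# Plugging back into the equilibrium expression should recover K_d
K_d_check = c_a * c_b / c_ab
print(c_ab, K_d_check)
```

This form of the solution, with the square root in the denominator, avoids subtracting two nearly equal numbers when $c_a^0 c_b^0$ is small compared to $\left(K_d + c_a^0 + c_b^0\right)^2$.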

The technique, then, is to hold $c_a^0$ fixed and measure $c_{ab}$ for various $c_b^0$. We can then perform a regression to get $K_d$.

In order to do this, though, we need some readout of $c_{ab}$. For this problem, we will use FRET (fluorescence resonance energy transfer) to monitor how much of $a$ is bound to $b$. Specifically, we consider $a$ to have a fluorophore and $b$ to be its receptor. When the two are unbound, we get a fluorescence signal per molecule of $f_0$. When they are bound, the receptor absorbs the light coming out of the fluorophore, so we get less fluorescence per molecule, which we will call $f_q$ (for "quenched"). Let $f$ be the average per-fluorophore fluorescence signal. Then, the measured fluorescence signal, $F$, is

\begin{align} F = c_a^0\,V f = \left(c_a \,f_0 + c_{ab}\, f_q\right)V, \end{align}

where $V$ is the reaction volume. We define by $e$ the FRET efficiency,

\begin{align} e = 1 - \frac{f}{f_0}. \end{align}

If we measure $F_0$, the measured fluorescence when there is no $b$ protein in the sample, we can compute the FRET efficiency from the measured values $F$ and $F_0$ as

\begin{align} e = 1 - \frac{c_a^0\,V f}{c_a^0\,Vf_0} = 1 - \frac{F}{F_0}. \end{align}
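The ratio above can be computed directly from measured signals. Below is a minimal sketch, assuming a single constant instrument background that is subtracted from both $F$ and $F_0$ before taking the ratio. How the background is actually recorded in the data set is not stated here, so the function name `fret_efficiency_from_signal`, its arguments, and all readings are hypothetical.

```python
import numpy as np


def fret_efficiency_from_signal(F_raw, F0_raw, background):
    """FRET efficiency e = 1 - F/F_0 from raw fluorescence readings.

    F_raw: raw readings at each titration point.
    F0_raw: raw reading with no b protein present.
    background: background signal (assumed constant) subtracted from both.
    """
    F = np.asarray(F_raw, dtype=float) - background
    F0 = float(F0_raw) - background
    return 1.0 - F / F0


# Hypothetical readings in arbitrary fluorescence units
background = 100.0
F0_raw = 1100.0
F_raw = np.array([1100.0, 900.0, 600.0, 350.0])

e = fret_efficiency_from_signal(F_raw, F0_raw, background)
print(e)
```

Skipping the background subtraction would bias every efficiency toward zero, since the same offset would inflate both the numerator and the denominator of $F/F_0$.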

Substituting in our expressions for $F$ and $F_0$, we get

\begin{align} e = 1 - \frac{\left(c_a \,f_0 + c_{ab}\, f_q\right)V}{c_a^0\,V f_0} = 1 - \frac{c_a}{c_a^0} - \frac{c_{ab}}{c_a^0}\,\frac{f_q}{f_0}. \end{align}

Using the fact that $c_a^0 = c_a + c_{ab}$, this becomes

\begin{align} e = \left(1-\frac{f_q}{f_0}\right)\frac{c_{ab}}{c_a^0}. \end{align}

In other words, the FRET efficiency is proportional to the fraction of $a$ that is bound, or

\begin{align} e = \alpha \, \frac{c_{ab}}{c_a^0} = \frac{2\alpha\,c_b^0}{K_d+c_a^0+c_b^0 + \sqrt{\left(K_d+c_a^0+c_b^0\right)^2 - 4c_a^0\,c_b^0}}, \end{align}

where $\alpha = 1-f_q/f_0$. So, we perform a regression with two phenomenological parameters to fit, $\alpha$ and $K_d$. Note that $0 \le \alpha \le 1$.
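The theoretical curve can be written as a short function of $c_b^0$, which is the piece a regression would call repeatedly. This is only a sketch: the function name `fret_efficiency` and the parameter values are assumptions for illustration, not values from the experiment. The example checks two limits implied by the formula, namely $e = 0$ when $c_b^0 = 0$ and $e \to \alpha$ when $b$ is in large excess.

```python
import numpy as np


def fret_efficiency(c_b0, c_a0, alpha, K_d):
    """Theoretical FRET efficiency as a function of total b concentration."""
    s = K_d + c_a0 + c_b0
    return 2 * alpha * c_b0 / (s + np.sqrt(s**2 - 4 * c_a0 * c_b0))


# Hypothetical parameter values in arbitrary concentration units
c_a0 = 50.0
alpha, K_d = 0.8, 10.0

# Titration points, ending with a very large excess of b
c_b0 = np.array([0.0, 10.0, 50.0, 200.0, 1e6])
e = fret_efficiency(c_b0, c_a0, alpha, K_d)
print(e)
```

Because $\alpha$ sets the plateau and $K_d$ sets how quickly the curve approaches it, a titration that never gets close to saturation will tend to leave the two parameters strongly correlated.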

a) Load in the data for one of these FRET efficiency titration curves. You can download the data set here. Compute the background-subtracted fluorescence and the FRET efficiency. These are real data, but they are from an unpublished experiment here on campus. If you are interested in the proteins we are studying, please ask me and we can discuss.

b) Perform regressions to find $K_d$ with and without an outlier detection scheme. You should present the results of your regression both graphically and with numerical summaries (e.g., mode with HPD). How do the results differ depending on whether or not you were trying to detect outliers?
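To illustrate why the choice of likelihood matters in part (b), here is a small sketch that compares a Gaussian likelihood with a heavy-tailed Cauchy likelihood on synthetic data. The Cauchy likelihood is just one possible outlier-tolerant choice and is not necessarily the scheme intended for this problem. Everything in the block is hypothetical: the data are noise-free values generated from the model with one point corrupted by hand, the scale parameters are fixed rather than inferred, and a brute-force grid search stands in for a proper regression with credible regions.

```python
import numpy as np


def fret_efficiency(c_b0, c_a0, alpha, K_d):
    """Theoretical FRET efficiency for the binding model in the text."""
    s = K_d + c_a0 + c_b0
    return 2 * alpha * c_b0 / (s + np.sqrt(s**2 - 4 * c_a0 * c_b0))


def gaussian_log_like(resid, sigma):
    """Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * np.sum((resid / sigma) ** 2)


def cauchy_log_like(resid, beta):
    """Cauchy log-likelihood, up to an additive constant."""
    return -np.sum(np.log1p((resid / beta) ** 2))


# Synthetic data (hypothetical values), exact except for one corrupted point
c_a0 = 50.0
alpha_true, K_d_true = 0.8, 10.0
c_b0 = np.array([5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 640.0])
e_obs = fret_efficiency(c_b0, c_a0, alpha_true, K_d_true)
e_obs[3] -= 0.25  # hand-made outlier

# Parameter grid for the brute-force search
alphas = np.linspace(0.5, 1.0, 51)
K_ds = np.linspace(1.0, 50.0, 99)


def best_fit(log_like, scale):
    """Return the (alpha, K_d) grid point that maximizes the log-likelihood."""
    best = (-np.inf, None, None)
    for a in alphas:
        for K in K_ds:
            resid = e_obs - fret_efficiency(c_b0, c_a0, a, K)
            ll = log_like(resid, scale)
            if ll > best[0]:
                best = (ll, a, K)
    return best[1], best[2]


# Scale parameters fixed at an assumed value of 0.01
alpha_gauss, K_d_gauss = best_fit(gaussian_log_like, 0.01)
alpha_cauchy, K_d_cauchy = best_fit(cauchy_log_like, 0.01)

print("Gaussian:", alpha_gauss, K_d_gauss)
print("Cauchy:  ", alpha_cauchy, K_d_cauchy)
```

The Gaussian log-likelihood penalizes a residual quadratically, so a single bad point can drag the fit, whereas the Cauchy penalty grows only logarithmically. In a full analysis the scale parameter would normally be treated as an unknown, and the posterior would be summarized with samples rather than a single best grid point.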


Problem 5.2: Ritonavir revisited with MCMC (10 pts)

In Problem 3.2b, we plotted the posterior distribution for a regression of viral load in an HIV patient using a contour plot. Use MCMC to make a similar plot. You do not need to make the contours (though you can if you like, e.g., using corner); you can use the density of your MCMC samples to illustrate the posterior. Remember, the data set may be downloaded here.
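The idea of using sample density to illustrate a posterior can be sketched with a toy example. The block below runs a bare-bones random-walk Metropolis sampler on a correlated two-dimensional Gaussian, which is only a stand-in target and not the viral load model from Problem 3.2. The function names, step size, and number of steps are all hypothetical choices, and in practice a dedicated MCMC package would likely be a better tool than this hand-rolled sampler.

```python
import numpy as np


def log_post(theta):
    """Toy log posterior: correlated 2D Gaussian (stand-in target only)."""
    x, y = theta
    rho = 0.5
    return -0.5 * (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)


def metropolis(log_post, theta0, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step_size * rng.standard_normal(theta.size)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, exp(lp_prop - lp))
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples[i] = theta
    return samples


rng = np.random.default_rng(0)
samples = metropolis(log_post, [0.0, 0.0], 20000, 1.0, rng)
samples = samples[2000:]  # discard an assumed burn-in period

# Binned sample density approximates the posterior on a grid
density, x_edges, y_edges = np.histogram2d(
    samples[:, 0], samples[:, 1], bins=40, density=True
)
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])
```

A scatter plot of `samples` with low opacity, or the binned `density` drawn as an image, shows the same shape a contour plot of the posterior would.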


Problem 5.3 (70 pts + 40 pts extra credit)

This problem features unpublished research, which the authors wish to remain private until their work is published. BE/Bi 103 students may access the password-protected homework here.