Homework 2.3: Nonidentifiability in HIV modeling (35 pts)



Last term, we discussed the nonidentifiability of a model for HIV clearance using a single patient’s data. Refresh your memory of the model by reading that lecture from last term up to its “The statistical model” section. The data set is available here: https://s3.amazonaws.com/bebi103.caltech.edu/data/hiv_data.csv. It consists of \((t_i, V_i)\) pairs, where \(t_i\) is the time point and \(V_i\) is the viral load in the patient’s blood.

As we did last term, we will take the likelihood to be a Normal distribution with a location parameter given by the theoretical curve:

\begin{align} &\mu_i = V(t_i;V_0,c,\delta) = V_0e^{-ct_i} + \frac{cV_0}{c-\delta}\left[\frac{c}{c-\delta}(e^{-{\delta}t_i} - e^{-ct_i}) - {\delta}t_ie^{-ct_i}\right] \;\forall i, \\[1em] &V_i \sim \text{Norm}(\mu_i, \sigma) \;\forall i. \end{align}
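The theoretical curve above can be coded up directly. What follows is a minimal sketch, not a reference implementation: the function name `viral_load` and the parameter values are hypothetical and chosen only for illustration. Note that the expression is singular at \(c = \delta\), and this sketch does not treat that case.

```python
import numpy as np


def viral_load(t, V0, c, delta):
    """Theoretical viral load V(t; V0, c, delta) from the expression above.

    The expression is singular at c == delta; that case is not handled here.
    """
    t = np.asarray(t, dtype=float)
    decay_c = np.exp(-c * t)
    decay_delta = np.exp(-delta * t)

    # Term in square brackets in the expression for V(t)
    bracket = c / (c - delta) * (decay_delta - decay_c) - delta * t * decay_c

    return V0 * decay_c + c * V0 / (c - delta) * bracket


# Hypothetical parameter values, for illustration only
t = np.linspace(0.0, 7.0, 8)  # days
V = viral_load(t, V0=1.0e5, c=3.0, delta=0.5)
```

A quick sanity check is that the curve returns \(V_0\) at \(t = 0\), since the bracketed term vanishes there.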

There are four parameters in this model: the initial viral load, \(V_0\); the clearance rate due to cell death, \(\delta\); the clearance rate due to the Ritonavir drug, \(c\); and the scale parameter of the Normal distribution, \(\sigma\). We need to specify priors for all of them. For our purposes here, we will take them to be completely uninformative.

\begin{align} &g(V_0, c, \delta) = \text{constant},\\[1em] &g(\sigma) = 1/\sigma. \end{align}

a) Write down an expression proportional to the marginalized posterior \(g(V_0, c, \delta \mid \{t_i, V_i\})\). You can do this analytically using the fact that

\begin{align} \int_0^\infty \frac{\mathrm{d}\sigma}{\sigma^{n+1}}\,\mathrm{e}^{-x/2\sigma^2} \propto x^{-n/2}. \end{align}
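The identity above follows from a change of variables. As a sketch, assuming \(x > 0\) and \(n > 0\), substitute \(u = x/2\sigma^2\), so that \(\sigma = (x/2)^{1/2}u^{-1/2}\) and \(\mathrm{d}\sigma = -\tfrac{1}{2}(x/2)^{1/2}u^{-3/2}\,\mathrm{d}u\). The minus sign is absorbed by the reversal of the integration limits.

\begin{align} \int_0^\infty \frac{\mathrm{d}\sigma}{\sigma^{n+1}}\,\mathrm{e}^{-x/2\sigma^2} &= \frac{1}{2}\left(\frac{x}{2}\right)^{1/2}\int_0^\infty \left(\frac{2u}{x}\right)^{(n+1)/2} u^{-3/2}\,\mathrm{e}^{-u}\,\mathrm{d}u \\[1em] &= \frac{1}{2}\left(\frac{2}{x}\right)^{n/2}\int_0^\infty u^{n/2-1}\,\mathrm{e}^{-u}\,\mathrm{d}u \\[1em] &= 2^{n/2-1}\,\Gamma(n/2)\,x^{-n/2}. \end{align}

The prefactor \(2^{n/2-1}\,\Gamma(n/2)\) does not depend on \(x\), which gives the stated proportionality.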

b) Compute the log of the marginalized posterior \(g(c, \delta \mid \{t_i, V_i\})\) on the domain \(c, \delta \in [0, 10]\) days⁻¹. You will have to use numerical quadrature to marginalize out \(V_0\), and you will need to perform this quadrature separately for each pair of \((c, \delta)\) values at which you evaluate the marginalized posterior.
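One possible way to organize this calculation is sketched below. It rests on several assumptions that are not part of the problem statement: that marginalizing over \(\sigma\) in part (a) leaves a posterior proportional to \(\left[\sum_i (V_i - \mu_i)^2\right]^{-n/2}\); that \(V_0\) can be restricted to a finite grid (`V0_grid`, with a made-up upper bound); and that a plain trapezoidal rule on that grid is accurate enough (an adaptive quadrature routine could be swapped in). The function names are hypothetical, and the data are synthetic stand-ins generated from the model, not the patient's measurements. Grid points with \(c = \delta\), where the curve is singular, are set to NaN.

```python
import numpy as np


def unit_curve(t, c, delta):
    """V(t) / V0 for the theoretical curve; singular at c == delta."""
    ec, ed = np.exp(-c * t), np.exp(-delta * t)
    return ec + c / (c - delta) * (c / (c - delta) * (ed - ec) - delta * t * ec)


def log_marg_post(c, delta, t, V, V0_grid):
    """Log of the unnormalized posterior of (c, delta) with V0 integrated out.

    Assumes the sigma-marginalized posterior is proportional to
    [sum_i (V_i - mu_i)^2]^(-n/2) and integrates over V0 with the
    trapezoidal rule on V0_grid.
    """
    if np.isclose(c, delta):
        return np.nan  # the curve is singular here

    # mu has shape (len(V0_grid), len(t)): one theoretical curve per V0 value
    mu = V0_grid[:, np.newaxis] * unit_curve(t, c, delta)
    log_integrand = -0.5 * len(t) * np.log(np.sum((V - mu) ** 2, axis=1))

    # Factor out the maximum before exponentiating to avoid underflow
    log_max = log_integrand.max()
    y = np.exp(log_integrand - log_max)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(V0_grid))

    return log_max + np.log(integral)


# Synthetic stand-in data (NOT the patient's data), generated from the model
rng = np.random.default_rng(3252)
t = np.linspace(0.0, 7.0, 15)  # days
V = 1.0e5 * unit_curve(t, 3.0, 0.5) + rng.normal(0.0, 2.0e3, size=len(t))

# Assumed finite integration range for V0 and evaluation grid for (c, delta)
V0_grid = np.linspace(0.0, 3.0e5, 2001)
c_vals = np.linspace(0.0, 10.0, 50)
delta_vals = np.linspace(0.0, 10.0, 50)

# Rows index delta, columns index c
log_post = np.array(
    [[log_marg_post(c, d, t, V, V0_grid) for c in c_vals] for d in delta_vals]
)
```

Working with the log of the integrand and subtracting its maximum keeps the exponentials in a safe numerical range, since the sum of squared residuals raised to the power \(-n/2\) can otherwise underflow.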

c) Plot the marginalized posterior \(g(c, \delta)\) and comment on how the plot makes clear the nonidentifiability of the model.

d) What additional piece of prior information might make the model identifiable?